World Community Grid - View Thread - OpenPandemics

Thread: OpenPandemics - GPU Stress Test

Thread Status: Active
Total posts in this thread: 781

This topic has been viewed 164664 times and has 780 replies

siu77
Cruncher
Russia
Joined: Mar 12, 2012
Post Count: 19
Status: Offline

Re: OpenPandemics - GPU Stress Test

Quite a long time ago I watched a video by a developer from another project, Poem@Home, in which he explained GPGPU "for dummies" in this way. Unfortunately, I can't find it.

Based on what you wrote, it seems the calculations have become more advanced since then. But I still don’t understand what “same line of code” means.

Could you explain this in more detail? But for dummies. :)

Also, there is no such thing as a "cell" that you describe. GPGPUs have RAM, caches, and cores just like a CPU. The difference is that the cores are grouped into work groups that run in parallel (Nvidia calls the 32-thread groups that execute in lockstep "warps").

The memory cell was meant. The memory area. Whatever...

[May 5, 2021 2:35:42 PM]

ca05065
Senior Cruncher
Joined: Dec 4, 2007
Post Count: 325
Status: Offline

Re: OpenPandemics - GPU Stress Test

The last OPNG work unit in my queue has just completed. The work request immediately changed from requesting CPU and GPU work to only requesting CPU work. The device settings have not changed since 29th January.

[May 5, 2021 3:04:43 PM]

PMH_UK
Veteran Cruncher
UK
Joined: Apr 26, 2007
Post Count: 741
Status: Offline

Re: OpenPandemics - GPU Stress Test

I suggest checking and re-saving the settings even if they appear correct; there have been some odd issues.
My log still shows:
Requesting new tasks for CPU and Intel GPU
...
No tasks are available for OpenPandemics - COVID-19 - GPU

Paul.

[May 5, 2021 3:18:18 PM]

ca05065
Senior Cruncher
Joined: Dec 4, 2007
Post Count: 325
Status: Offline

Re: OpenPandemics - GPU Stress Test

The device settings were as expected with all graphics card usage sections set to 'yes'. I set them all to 'no' and saved them. I returned all settings to 'yes' and saved again. I forced several update requests on my PC but it still only requests CPU work units.

[May 5, 2021 8:28:47 PM]

PMH_UK
Veteran Cruncher
UK
Joined: Apr 26, 2007
Post Count: 741
Status: Offline

Re: OpenPandemics - GPU Stress Test

Maybe something crashed.
Does the GPU still appear in BOINC or clinfo?

Paul.

[May 5, 2021 8:47:06 PM]

goben_2003
Advanced Cruncher
Joined: Jun 16, 2006
Post Count: 145
Status: Offline

Re: OpenPandemics - GPU Stress Test

Time is as time does. If you contribute by processing 100 GPU WUs at 30 seconds each then you have done 50 minutes processing.

To say that if you had done them on the CPU it would have taken 150 hours is ridiculous.

If we are talking about how fast a core is, yes. A Raspberry Pi is much slower than many other CPUs. It has 4 cores and is able to get roughly 4 days of time credit, just like some earlier 2-core/4-thread Intel i5s.

The issue here, though, is not about speed. It is about the number of cores.
With a CPU you get time spent × number of cores (really threads), not just 1 hour per CPU. The GPUs are also multicore, but time credit pretends they only have 1 core. It should be per core, just like CPUs.
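
To make the arithmetic concrete, here is a minimal sketch of the accounting described above. It is only an illustration of the argument, not World Community Grid's actual credit code; the function names and the 24-hour example are assumptions chosen for the example.

```python
def cpu_time_credit_hours(wall_clock_hours, busy_threads):
    """Run time credited when every busy CPU thread accrues wall-clock time."""
    return wall_clock_hours * busy_threads


def gpu_time_credit_hours(wall_clock_hours):
    """Run time credited when the whole GPU is counted as a single device."""
    return wall_clock_hours * 1


# A 4-thread CPU crunching for 24 hours is credited about 4 days.
print(cpu_time_credit_hours(24, 4) / 24)  # 4.0
# A GPU crunching for the same 24 hours is credited 1 day,
# no matter how many cores it has.
print(gpu_time_credit_hours(24) / 24)     # 1.0
```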

[May 5, 2021 8:57:04 PM]

goben_2003
Advanced Cruncher
Joined: Jun 16, 2006
Post Count: 145
Status: Offline

Re: OpenPandemics - GPU Stress Test

I also agree that it would be great if we did most of the work on the GPUs. AFAIK/IIRC, the current GPU and CPU work units are very similar, with the main difference being that GPU work units pack in much, much more work. In the future, the researchers hope to use the GPUs for the more complex molecules, which would take much more time on CPUs than they would on GPUs.

CPU isn't going away, but it would certainly be nice if, at some point, GPUs handled the bulk of the work through OPNG and the freed-up CPU power went to other projects.

Edit: There is a post from one of the researchers that describes the differences in CPU vs GPU. It's a bit old now (almost 1 year old) but I'm guessing it's still valid.

It would be a shame if WCG continued to push OPN CPU work units if:
- OPN and OPNG work units are fundamentally doing exactly the same thing (same functionality)
- The performance of OPNG is X times better than OPN (==> that's a fact)

I don't know if the first condition is true, but if it is, then it would be a waste of resources to do OPN work with CPUs. There are other projects hunting for CPU power which cannot be converted to GPU. It's a matter of using the limited resources in the most effective way...

If I understand correctly, the first condition is true. Here is something from one of the betas.

Hi uplinger,

is each step in the GPU WUs comparable to an "old" OPN CPU WU or are these steps smaller?

thanks!

They are equivalent to the CPU work units. They solve the same problem just a different way. But for each CPU work unit, they would run multiple ligands in a work unit (sometimes less if they were difficult). For GPU they run lots of ligands in a single work unit due to the speed of the calculations.

Thanks,
-Uplinger

[May 5, 2021 9:00:13 PM]

ca05065
Senior Cruncher
Joined: Dec 4, 2007
Post Count: 325
Status: Offline

Re: OpenPandemics - GPU Stress Test

@PMH_UK
I have re-booted my PC.
I have performed a BOINC install repair. This has resulted in BOINC using the stdoutdae.txt file for the first time since 31st March.
The GPU is seen by BOINC:
31-Mar-2021 15:10:52 [---] CUDA: NVIDIA GPU 0: GeForce GT 1030 (driver version 461.92, CUDA version 11.2, compute capability 6.1, 2048MB, 1661MB available, 1127 GFLOPS peak)
31-Mar-2021 15:10:52 [---] OpenCL: NVIDIA GPU 0: GeForce GT 1030 (driver version 461.92, device version OpenCL 1.2 CUDA, 2048MB, 1661MB available, 1127 GFLOPS peak)
The work request is still only requesting CPU work:
05-May-2021 23:23:45 [World Community Grid] [sched_op] Starting scheduler request
05-May-2021 23:23:45 [World Community Grid] Sending scheduler request: To fetch work.
05-May-2021 23:23:45 [World Community Grid] Reporting 1 completed tasks
05-May-2021 23:23:45 [World Community Grid] Requesting new tasks for CPU
05-May-2021 23:23:45 [World Community Grid] [sched_op] CPU work request: 3662600.76 seconds; 0.00 devices
05-May-2021 23:23:45 [World Community Grid] [sched_op] NVIDIA GPU work request: 0.00 seconds; 0.00 devices
05-May-2021 23:23:46 [World Community Grid] Scheduler request completed: got 1 new tasks
05-May-2021 23:23:46 [World Community Grid] [sched_op] Server version 701
05-May-2021 23:23:46 [World Community Grid] Project requested delay of 121 seconds
05-May-2021 23:23:46 [World Community Grid] [sched_op] estimated total CPU task duration: 8581 seconds
05-May-2021 23:23:46 [World Community Grid] [sched_op] estimated total NVIDIA GPU task duration: 0 seconds
05-May-2021 23:23:46 [World Community Grid] [sched_op] handle_scheduler_reply(): got ack for task OPN1_0045854_03668_0
05-May-2021 23:23:46 [World Community Grid] [sched_op] Deferring communication for 00:02:01
05-May-2021 23:23:46 [World Community Grid] [sched_op] Reason: requested by project

Is there anything else to check?
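
For anyone who wants to spot this condition without reading the log by eye, here is a minimal sketch that pulls the per-resource work request out of `[sched_op]` lines like the ones above. The regular expression and the helper name `work_requests` are assumptions written for this example and only match the line format shown in this post; the two sample lines are copied from the log above.

```python
import re

# Matches lines such as
# "... [sched_op] NVIDIA GPU work request: 0.00 seconds; 0.00 devices"
WORK_REQUEST = re.compile(
    r"\[sched_op\] (?P<resource>.+?) work request: "
    r"(?P<seconds>[\d.]+) seconds; (?P<devices>[\d.]+) devices"
)


def work_requests(log_lines):
    """Return {resource: seconds requested} for each work request line."""
    requests = {}
    for line in log_lines:
        match = WORK_REQUEST.search(line)
        if match:
            requests[match.group("resource")] = float(match.group("seconds"))
    return requests


log = [
    "05-May-2021 23:23:45 [World Community Grid] [sched_op] CPU work request: 3662600.76 seconds; 0.00 devices",
    "05-May-2021 23:23:45 [World Community Grid] [sched_op] NVIDIA GPU work request: 0.00 seconds; 0.00 devices",
]

requests = work_requests(log)
# Resources for which the client asked for no work at all.
idle = [name for name, seconds in requests.items() if seconds == 0]
print(requests)  # {'CPU': 3662600.76, 'NVIDIA GPU': 0.0}
print(idle)      # ['NVIDIA GPU']
```

A zero-second request for the GPU, as here, means the client itself is not asking for GPU work, so the "no tasks available" message from the server is not the cause.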

[May 5, 2021 10:29:34 PM]

poppinfresh99
Cruncher
Joined: Feb 29, 2020
Post Count: 49
Status: Offline

Re: OpenPandemics - GPU Stress Test

The memory cell was meant. The memory area. Whatever...

I see now what you meant by "cell". By the way, memory is a bit more complicated on a GPU than on a CPU. Each thread has its own private memory (registers), each work group (of, let's say, 32 threads run in parallel) has its own shared memory, and then there is the global RAM for the whole GPU (sometimes shared with the CPU). It's even more complicated than this!

For the following very simple (and meaningless) GPU code, a[], b[], c[], and d[] are arrays stored in GPU global RAM, and id is the id of the thread...
1: x = a[id] * b[id] - c[id]
2: if (x > 0)
3: y = 2 * x
4: else y = 0
5: d[id] = y
All 32 threads in the work group will do line 1 at the same time on their own data.
Then all threads will do line 2.
The threads that then need to run line 3 will do that while the others wait. However, if all 32 threads have x <= 0, then line 3 is skipped.
If any threads have x <= 0, they do line 4 while the other threads wait.
Finally, all threads write to the output array d[] at the same time.
Anyway, x and y are stored in the memory private to each thread. The memory shared within a work group is not used here; it is usually filled with lookup tables that are either calculated or copied from global RAM when the work group starts.
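
The lockstep behaviour described above can be imitated on a CPU. The following is a minimal sketch in plain Python, not real GPU code: the function `run_warp` and the "steps" it records are assumptions made up for this illustration, and the example uses 4 threads instead of 32 to keep the output short.

```python
def run_warp(a, b, c):
    """Simulate one work group executing the five numbered lines in lockstep.

    Returns the output array d and a list of (line, active_thread_count)
    for every line that was actually issued.
    """
    n = len(a)
    ids = range(n)
    steps = []

    # Line 1: every thread computes x on its own data.
    x = [a[i] * b[i] - c[i] for i in ids]
    steps.append((1, n))

    # Line 2: every thread evaluates the condition, producing a mask.
    take_if = [x[i] > 0 for i in ids]
    steps.append((2, n))

    y = [None] * n
    # Line 3: issued only if at least one thread has x > 0;
    # the other threads sit idle during this step.
    if any(take_if):
        for i in ids:
            if take_if[i]:
                y[i] = 2 * x[i]
        steps.append((3, sum(take_if)))

    # Line 4: issued only if at least one thread has x <= 0.
    if not all(take_if):
        for i in ids:
            if not take_if[i]:
                y[i] = 0
        steps.append((4, n - sum(take_if)))

    # Line 5: every thread writes its result.
    d = list(y)
    steps.append((5, n))
    return d, steps


# Mixed signs: both branches are issued, so the group spends 5 steps.
d, steps = run_warp([1, 2, 3, 4], [1, 1, 1, 1], [0, 5, 0, 5])
print(d)      # [2, 0, 6, 0]
print(steps)  # [(1, 4), (2, 4), (3, 2), (4, 2), (5, 4)]
```

When every thread agrees on the branch, one of lines 3 and 4 is never issued, which is why code whose threads take the same path runs faster on a GPU.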

[May 6, 2021 3:34:54 AM]

Bryn Mawr
Senior Cruncher
Joined: Dec 26, 2018
Post Count: 309
Status: Offline

Re: OpenPandemics - GPU Stress Test

Duh, yes.

As I said, the comparison should be the sum of the time actually taken by all of the GPU WUs against the time taken by all of the CPU WUs, not the number of GPU WUs times the time they would have taken had they been CPU WUs, which is what the poster appeared to be saying.

Certainly, if you do 100 GPU WUs and they take 30 seconds each, you should get 50 minutes of credit, and not 25 because you were doing two at a time. But that was not how the post I responded to was phrased: “but that could be gotten round by allocating notional time credits to GPU units that are equivalent to time to process as CPU only”.
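
As a quick check of the numbers in this post, here is a small sketch contrasting summed task run time with elapsed wall-clock time. The helper names are hypothetical, and the wall-clock figure assumes equal-length tasks that keep both slots busy the whole time.

```python
def summed_run_time_minutes(task_seconds):
    """Total of the individual task run times, in minutes."""
    return sum(task_seconds) / 60


def wall_clock_minutes(task_seconds, concurrent_tasks):
    """Elapsed time when tasks run side by side.

    Assumes equal-length tasks that keep every slot busy.
    """
    return sum(task_seconds) / concurrent_tasks / 60


tasks = [30] * 100  # 100 GPU WUs at 30 seconds each
print(summed_run_time_minutes(tasks))  # 50.0
print(wall_clock_minutes(tasks, 2))    # 25.0
```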

----------------------------------------
[Edit 2 times, last edit by Bryn Mawr at May 6, 2021 4:09:37 AM]

[May 6, 2021 4:02:59 AM]
