Total posts in this thread: 77
This topic has been viewed 240221 times and has 76 replies
nanoprobe
Master Cruncher
Classified
Joined: Aug 29, 2008
Post Count: 2998
Re: OpenPandemics GPU is LIVE!

I don't know why WCG didn't test that app but it exists and it's on their github.


Best guess would be the researchers never sent tasks for the CUDA app to beta test.
----------------------------------------
In 1969 I took an oath to defend and protect the U S Constitution against all enemies, both foreign and Domestic. There was no expiration date.


[Apr 8, 2021 5:29:41 PM]
uplinger
Former World Community Grid Tech
Joined: May 23, 2005
Post Count: 3952
Re: OpenPandemics GPU is LIVE!

There are a few reasons the CUDA version was not tested or used.

From early on, having only one version to maintain is a lot easier for testing and general support. Thus, the focus on OpenCL allows all 3 GPU types to participate instead of just one.

Due to this focus, the OpenCL version is actually faster than the CUDA version.

Thanks,
-Uplinger
[Apr 8, 2021 5:48:10 PM]
Ian-n-Steve C.
Senior Cruncher
United States
Joined: May 15, 2020
Post Count: 180
Re: OpenPandemics GPU is LIVE!

There are a few reasons the CUDA version was not tested or used.

From early on, having only one version to maintain is a lot easier for testing and general support. Thus, the focus on OpenCL allows all 3 GPU types to participate instead of just one.

Due to this focus, the OpenCL version is actually faster than the CUDA version.

Thanks,
-Uplinger


I totally understand ease of support having to only manage and maintain a single code base. but I don't think you can definitively say that the OpenCL version is faster than the CUDA version if you never even tested it.

if you mean "faster" in terms of faster completion of the entire project by considering having a larger user-base, and not just that the nvidia-opencl is faster than nvidia-cuda (which i would strongly contest), then you'd need to weigh the speedup of CUDA vs how many AMD and Intel devices would cease to contribute. do you have any statistics to share about what percentage of total flops the project sees from each device type? without knowing that, you really can't make that kind of conclusion.
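To make that trade-off concrete, here is a minimal sketch of the arithmetic. It assumes a CUDA-only app would lose every AMD and Intel device, and that throughput scales linearly with the speedup. The function names and the example shares are hypothetical, not project statistics.

```python
def cuda_only_relative_throughput(nvidia_share, cuda_speedup):
    """Project throughput of a CUDA-only app relative to today's OpenCL app.

    nvidia_share: fraction of current GPU throughput that comes from
                  NVIDIA cards (hypothetical input, between 0 and 1).
    cuda_speedup: how many times faster CUDA is than OpenCL on those cards.
    """
    if not 0.0 < nvidia_share <= 1.0:
        raise ValueError("nvidia_share must be in (0, 1]")
    if cuda_speedup <= 0.0:
        raise ValueError("cuda_speedup must be positive")
    # Only the NVIDIA share survives, scaled up by the CUDA speedup.
    return nvidia_share * cuda_speedup


def break_even_speedup(nvidia_share):
    """Speedup CUDA needs just to match the current all-vendor throughput."""
    if not 0.0 < nvidia_share <= 1.0:
        raise ValueError("nvidia_share must be in (0, 1]")
    return 1.0 / nvidia_share
```

For example, if NVIDIA cards supplied half of the GPU throughput, CUDA would need to be twice as fast just to break even. Shipping both apps side by side would avoid the trade-off, at the cost of maintaining two code bases.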

if you don't want to complicate your support efforts with two app versions for different device types, like I said, I understand. but the bigger issue a lot of us have is with the constant 0-100 behavior. and that can be better optimized. I hope the team is considering improvements rather than taking the "it's good enough" approach.
----------------------------------------

EPYC 7V12 / [5] RTX A4000
EPYC 7B12 / [5] RTX 3080Ti + [2] RTX 2080Ti
EPYC 7B12 / [6] RTX 3070Ti + [2] RTX 3060
[2] EPYC 7642 / [2] RTX 2080Ti
[Apr 8, 2021 6:39:57 PM]
Bryn Mawr
Senior Cruncher
Joined: Dec 26, 2018
Post Count: 346
Re: OpenPandemics GPU is LIVE!

I totally understand ease of support having to only manage and maintain a single code base. but I don't think you can definitively say that the OpenCL version is faster than the CUDA version if you never even tested it.


There are many layers of testing involved before you get to beta and one of those is performance testing so yes, they would know definitively which is faster.
[Apr 8, 2021 8:39:32 PM]
mhammond
Advanced Cruncher
USA
Joined: Dec 22, 2011
Post Count: 130
Re: OpenPandemics GPU is LIVE!

I was really looking forward to running GPU tasks on my laptops, but I will be opting out after trying it. Over the last few days, ever since I started running on the GPU, my laptop has frequently powered off without warning. It isn't crashing either: on restart it does not show the usual safe-mode options that are offered after an unexpected power-down. I have 2 Africa Rainfall tasks that have been running for almost 3 days now because of their infrequent checkpoints combined with the shutdowns, and my last two days' results have been lower than any 2-day period for years.

I looked for anyone else reporting this issue and didn't see it. I don't have time to weed through everything, and I'm not sure this is the best place or thread, but I wanted to get it out there.

Will gladly and happily try again in the future.

regards,
mike
----------------------------------------

[Apr 8, 2021 8:49:44 PM]
spRocket
Senior Cruncher
Joined: Mar 25, 2020
Post Count: 277
Re: OpenPandemics GPU is LIVE!

if you don't want to complicate your support efforts with two app versions for different device types, like I said, I understand. but the bigger issue a lot of us have is with the constant 0-100 behavior. and that can be better optimized. I hope the team is considering improvements rather than taking the "it's good enough" approach.


I managed to catch my system running a batch of GPU units, and saw that when one finished, the next immediately started. This is on a GTX 960, for what it's worth. Interestingly, I see 100% GPU utilization, but less than half of the maximum power consumption, and the fan loafs along at 22%, with units completing in 4-6 minutes or so. I'm not overclocking it, or at least I'm not trying to do so.

ETA: I recall seeing about 90-95 W back when I was running GPUGRID.

My nvidia-smi output with an OPNG work unit running:

|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GTX 960     Off  | 00000000:0E:00.0  On |                  N/A |
| 22%   68C    P2    74W / 160W |    216MiB /  4043MiB |    100%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      3307      G   /usr/lib/xorg/Xorg                 16MiB |
|    0   N/A  N/A      3665      G   /usr/bin/gnome-shell                8MiB |
|    0   N/A  N/A      4522      C   ...ux-gnu__opencl_nvidia_102      130MiB |
|    0   N/A  N/A      7125      G   /usr/lib/xorg/Xorg                 53MiB |
+-----------------------------------------------------------------------------+

----------------------------------------
[Edit 1 times, last edit by spRocket at Apr 8, 2021 9:00:42 PM]
[Apr 8, 2021 8:54:24 PM]
PMH_UK
Veteran Cruncher
UK
Joined: Apr 26, 2007
Post Count: 772
Re: OpenPandemics GPU is LIVE!

My laptop goes into standby when running GPU units if the lid is closed.
(I use external monitor via KVM switch).
When not running GPU units it runs OK.

Paul.
----------------------------------------
Paul.
[Apr 8, 2021 9:14:17 PM]
Jorlin
Advanced Cruncher
Deutschland
Joined: Jan 22, 2020
Post Count: 89
Re: OpenPandemics GPU is LIVE!


I managed to catch my system running a batch of GPU units, and saw that when one finished, the next immediately started. This is on a GTX 960, for what it's worth. Interestingly, I see 100% GPU utilization, but less than half of the maximum power consumption, and the fan loafs along at 22%, with units completing in 4-6 minutes or so. I'm not overclocking it, or at least I'm not trying to do so.

ETA: I recall seeing about 90-95 W back when I was running GPUGRID.

My nvidia-smi output with an OPNG work unit running:


The 960 is already a very cool running card.
Yes, the power consumption / heat production on these WUs is quite low.
I'm running two simultaneously on a GTX 960 and three on a 1050 Ti. The core is at 100% but runs cooler than with a single PrimeGrid task that I limit to 50% of CPU resources (so that one isn't running at 100% core, since it has to wait for the CPU to feed it).
----------------------------------------

[Apr 8, 2021 9:25:25 PM]
Ian-n-Steve C.
Senior Cruncher
United States
Joined: May 15, 2020
Post Count: 180
Re: OpenPandemics GPU is LIVE!

Nvidia-smi only provides instantaneous output. Watch the load fluctuation with nvidia settings, or use nvidia-smi with arguments to poll at a higher rate and dump the output to a file. You’ll see GPU load fluctuating from 0-100% constantly.
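As a rough sketch of the polling approach, the Python below wraps `nvidia-smi` with its `--query-gpu`, `--format` and `-lms` (loop interval in milliseconds) options and writes each sample to a file. The helper names, the 100 ms interval, the 5% idle threshold and the `gpu_load.csv` path are illustrative choices, and the parsing assumes the driver reports a numeric `utilization.gpu` value.

```python
import subprocess
import time


def build_command(interval_ms=100):
    """nvidia-smi invocation that prints one CSV sample every interval_ms."""
    return [
        "nvidia-smi",
        "--query-gpu=timestamp,utilization.gpu",
        "--format=csv,noheader,nounits",
        "-lms", str(interval_ms),
    ]


def parse_line(line):
    """Split 'timestamp, utilization' into (str, float)."""
    timestamp, util = (field.strip() for field in line.split(","))
    return timestamp, float(util)


def idle_fraction(samples, threshold=5.0):
    """Share of samples in which GPU load was below threshold percent."""
    if not samples:
        raise ValueError("no samples collected")
    idle = sum(1 for _, util in samples if util < threshold)
    return idle / len(samples)


def poll(duration_s=30.0, interval_ms=100, log_path="gpu_load.csv"):
    """Sample GPU load for duration_s seconds and dump the raw lines to a file."""
    samples = []
    proc = subprocess.Popen(build_command(interval_ms),
                            stdout=subprocess.PIPE, text=True)
    deadline = time.monotonic() + duration_s
    try:
        with open(log_path, "w") as log:
            for line in proc.stdout:
                log.write(line)
                samples.append(parse_line(line))
                if time.monotonic() >= deadline:
                    break
    finally:
        # nvidia-smi loops forever with -lms, so stop it explicitly.
        proc.terminate()
        proc.wait()
    return samples
```

Calling `idle_fraction(poll(60))` while an OPNG task is running gives a single number for how often the card was sampled near idle, which is easier to compare between machines than a one-off snapshot.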

From what others have mentioned, this is because multiple WUs are prepackaged in a single package. If this is the case, they could use something like a mutex lock to preload data and prepare for the next computation while the current one is ongoing. This would allow the GPU to remain at near 100% the entire time. We did this with the SETI CUDA app and recorded as low as 1 ms (probably the limit of our measurement ability) between one WU and the next.
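The idea of preparing the next job while the current one computes can be sketched as below. This is not the SETI app's code or anything from the OpenPandemics source: it swaps the hand-written mutex for a bounded `queue.Queue` (which does its own locking), and `run_package`, `prepare` and `compute` are hypothetical names standing in for data loading and the GPU kernel call.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the package


def run_package(jobs, prepare, compute, depth=1):
    """Run compute(prepare(job)) for each job, preparing ahead in a thread.

    depth is how many prepared jobs may wait in the buffer; 1 means the
    loader stays exactly one job ahead of the compute loop.
    """
    ready = queue.Queue(maxsize=depth)
    errors = []

    def loader():
        try:
            for job in jobs:
                # Blocks while the buffer is full, so memory use stays bounded.
                ready.put(prepare(job))
        except Exception as exc:
            errors.append(exc)
        finally:
            ready.put(_DONE)  # always wake the consumer, even on failure

    threading.Thread(target=loader, daemon=True).start()

    results = []
    while True:
        item = ready.get()
        if item is _DONE:
            break
        # While this runs, the loader is already preparing the next job.
        results.append(compute(item))
    if errors:
        raise errors[0]
    return results
```

With this structure the gap between jobs shrinks to roughly the cost of a queue hand-off, provided preparing a job takes less time than computing one. If preparation is the slower step, the GPU will still idle between jobs.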
----------------------------------------

EPYC 7V12 / [5] RTX A4000
EPYC 7B12 / [5] RTX 3080Ti + [2] RTX 2080Ti
EPYC 7B12 / [6] RTX 3070Ti + [2] RTX 3060
[2] EPYC 7642 / [2] RTX 2080Ti
[Apr 8, 2021 9:32:43 PM]
bozz4science
Advanced Cruncher
Germany
Joined: May 3, 2020
Post Count: 104
Re: OpenPandemics GPU is LIVE!

Yeah, that's indeed what causes the 0-100 fluctuations. You can easily verify this on your results page: open the projects drop-down menu, select OpenPandemics, and click the "VALID" link on one of your computed OPNG tasks. You'll see that multiple jobs are packed into one WU, with the runtime for each and a short log in between.
----------------------------------------

AMD Ryzen 3700X @ 4.0 GHz / GTX1660S
Intel i5-4278U CPU @ 2.60GHz
[Apr 8, 2021 10:12:20 PM]