World Community Grid Forums
Thread Status: Active Total posts in this thread: 162
nanoprobe
Master Cruncher Classified Joined: Aug 29, 2008 Post Count: 2998 Status: Offline
Back when the HCC GPU app was in use, Nvidia was lagging far behind AMD in its OpenCL implementation because it was focused on developing CUDA. Their double-precision (DP) capabilities were also less than half those of AMD, except on their very high-end industrial cards. I haven't kept up with all that, but hopefully they have moved their OpenCL development along to be more useful. That being said, it would not surprise me if some of their cards will not run this app even if they are claimed to be OpenCL 1.2 compatible.
----------------------------------------
In 1969 I took an oath to defend and protect the U S Constitution against all enemies, both foreign and Domestic. There was no expiration date.
||
|
Crystal Pellet
Veteran Cruncher Joined: May 21, 2008 Post Count: 1323 Status: Offline
On a low low end card: OpenCL: AMD/ATI GPU 0: AMD Radeon(TM) R3 Graphics (Mullins)
BETA_OPNG_0021068_00061 - 54 ligands processed without interruption
CPU time 1183 seconds - Elapsed time 5.25 hours

BETA_OPNG_0021068_00001 - 68 ligands processed, restarted twice from saved checkpoints
The first restart was just a test. The second restart was needed because the graphics driver crashed; although the task kept running, there was no real progress. Since I have checkpoint_debug enabled in cc_config.xml, I easily discovered that there was no progress during job 41.
CPU time 1952 seconds - Elapsed time 5.19 hours

BETA_OPNG_0021068_00007 - 46 ligands processed without interruption
CPU time 1082 seconds - Elapsed time 4.12 hours

BETA_OPNG_0021068_00012 - 65 ligands, just started . . .
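For anyone who wants to watch checkpoints the same way: `checkpoint_debug` is a BOINC client log flag that lives under `<log_flags>` in `cc_config.xml`. The Python sketch below only builds the minimal file content. The helper name `build_cc_config` is made up for illustration, and the sketch does not merge with options you may already have in your own `cc_config.xml`, so treat it as a starting point rather than a drop-in file.

```python
import xml.etree.ElementTree as ET


def build_cc_config(flags):
    """Return minimal cc_config.xml text that switches on the given log flags.

    build_cc_config is a hypothetical helper for illustration only.
    """
    root = ET.Element("cc_config")
    log_flags = ET.SubElement(root, "log_flags")
    for name in flags:
        # Each log flag is an element whose text is 1 (on) or 0 (off).
        ET.SubElement(log_flags, name).text = "1"
    return ET.tostring(root, encoding="unicode")


config_text = build_cc_config(["checkpoint_debug"])
print(config_text)
```

After saving the content into `cc_config.xml` in the BOINC data directory, the client has to re-read its config files (or be restarted) before the checkpoint messages appear in the event log.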
||
|
bozz4science
Advanced Cruncher Germany Joined: May 3, 2020 Post Count: 104 Status: Offline
> Yes. I am running three per GPU on AMD Radeon HD 7990 rig with 8 GPUs. No failures. Perfect.

As soon as the production-ready version of the GPU app is deployed, I will give it a try, starting with 2 concurrent WUs at a time.

> Yes. It won't help it. I would never EVER overclock on a critical scientific project like this one.

Upon reflection on my prior question, I now see this in a more nuanced way than before. Especially as the runtimes are really short, the sub-tasks that place short, few-second 100% load bursts on the GPU are really not the best use case for OC to increase WU throughput. By letting 2 WUs (or any n>1 WUs) compute concurrently, I think that the GPU load can be more evenly balanced and sustained at a high rate.

I would never overclock like crazy, but usually apply only a memory clock offset to revert the penalty that NVIDIA places on compute workloads in P0 power state to the memory clock. Ever so slightly, I might slightly adjust the OC on core clock upwards to see what that does to runtimes. I think GPU OC is also a common practice on other GPU-distributed computing projects, no?

And out of curiosity. If someone were to OC their GPUs like crazy based on some benchmark testing for gaming let's say for the sake of the argument, that won't ever turn out to be stable for a GPU-compute workload, wouldn't those WUs result in an error and be assigned an 'invalid' flag by the WU validator?

> The researchers are planning to send out more difficult ligands for GPU which from my recommendation should sit around 5 minutes per ligand on average.

That's awesome. Looking forward to seeing my GPUs sweat.

> What is being written is very small amount of data, this should not wear out an SSD.

That's reassuring. I didn't really check for that when my last beta WUs were computed.

AMD Ryzen 3700X @ 4.0 GHz / GTX1660S
Intel i5-4278U CPU @ 2.60GHz
[Edit 2 times, last edit by bozz4science at Mar 2, 2021 4:59:42 PM]
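Running several WUs per GPU in BOINC is usually set up through an `app_config.xml` in the project directory, where `gpu_usage` is the fraction of a GPU each task reserves (0.5 for two at a time, roughly 0.33 for three). The Python sketch below builds such a file. The helper `build_app_config` is made up for illustration, and the app name is deliberately a placeholder because the real application name is not given in this thread.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name, tasks_per_gpu, cpu_per_task=1.0):
    """Return app_config.xml text that lets tasks_per_gpu tasks share one GPU.

    build_app_config is a hypothetical helper; app_name must be replaced
    with the project's real application name (unknown here).
    """
    if tasks_per_gpu < 1:
        raise ValueError("tasks_per_gpu must be at least 1")
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    gpu_versions = ET.SubElement(app, "gpu_versions")
    # Each task reserves 1/n of a GPU, so n tasks fit on one device.
    ET.SubElement(gpu_versions, "gpu_usage").text = f"{1.0 / tasks_per_gpu:.2f}"
    # CPU share reserved per GPU task; 1.0 is an assumption, not a recommendation.
    ET.SubElement(gpu_versions, "cpu_usage").text = f"{cpu_per_task:.2f}"
    return ET.tostring(root, encoding="unicode")


two_at_a_time = build_app_config("APP_NAME_PLACEHOLDER", 2)
print(two_at_a_time)
```

Starting with two tasks and watching runtimes and error counts before going to three keeps the experiment easy to roll back.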
||
|
nanoprobe
Master Cruncher Classified Joined: Aug 29, 2008 Post Count: 2998 Status: Offline
Going to agree with Jim 1348 about overclocking the GPUs. Very little gain if any for the increased power consumption and heat. Experience says there will be errors.
> I would never overclock like crazy, but usually apply only a memory clock offset to revert the penalty that NVIDIA places on compute workloads in P0 power state to the memory clock. Ever so slightly, I might slightly adjust the OC on core clock upwards to see what that does to runtimes. I think GPU OC is also a common practice on other GPU-distributed computing projects, no?

None that I know of currently.

> And out of curiosity. If someone were to OC their GPUs like crazy based on some benchmark testing for gaming let's say for the sake of the argument, that won't ever turn out to be stable for a GPU-compute workload, wouldn't those WUs result in an error and be assigned an 'invalid' flag by the WU validator?

They would be flagged as errors, not invalids. A task is flagged as invalid when its result is compared to known valid results and does not match. Even if flagged as invalid, those tasks run to completion. A task that errors out will stop at the error point and not complete.
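The error/invalid distinction can be pictured with a small Python sketch. This is not the actual WCG validator: the `TaskResult` type, the single numeric `score`, and the `tolerance` value are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TaskResult:
    completed: bool          # False if the task stopped at an error point
    score: Optional[float]   # hypothetical numeric output; None if nothing was produced


def classify(result, reference_score, tolerance=1e-3):
    """Label a result as 'error', 'invalid' or 'valid' (illustrative only)."""
    # A task that errored out never finished, so there is nothing to compare.
    if not result.completed or result.score is None:
        return "error"
    # A finished task is compared against a known valid result.
    if abs(result.score - reference_score) > tolerance:
        return "invalid"
    return "valid"


print(classify(TaskResult(completed=False, score=None), -7.5))
print(classify(TaskResult(completed=True, score=-7.2), -7.5))
print(classify(TaskResult(completed=True, score=-7.5), -7.5))
```

In this picture an unstable overclock could land in either bucket: a crash gives an error, while a run that finishes with corrupted numbers would only be caught at the comparison step.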
[Edit 1 times, last edit by nanoprobe at Mar 2, 2021 5:28:18 PM]
||
|
uplinger
Former World Community Grid Tech Joined: May 23, 2005 Post Count: 3952 Status: Offline
Hello all,
I have turned on the validator. I noticed almost instantly that there was an issue, however about 197 results were marked as invalid incorrectly. I stopped the validation, fixed the bug and started it up again. I have marked the results that were invalid to be rerun for validation and they should clean up.

If you notice any weirdness on your results, please bring them to my attention so I can review the logs and the results.

Thanks,
-Uplinger
||
|
bozz4science
Advanced Cruncher Germany Joined: May 3, 2020 Post Count: 104 Status: Offline
Thanks for keeping us up to date Uplinger! Curious to see if all my PV WUs will change their status to valid.
Over at GPU Grid there is a lively exchange on all things 'overclocking': comparing risks and benefits, personal experiences, best practices, etc. It is in another context, sure, but it is still a GPU computing project. Some of the older crunchers over there who have been participating for years do overclock, at least by setting an offset for the memory clock on NVIDIA cards to get back to "effectively" the default stock memory clock settings (due to the penalty incurred by the switched power state).
||
|
Grumpy Swede
Master Cruncher Svíþjóð Joined: Apr 10, 2020 Post Count: 2214 Status: Offline
At the moment not so concentrated on WCG. I noticed that I had lots of dustbunnies in the corners, so for now it's time to grab the vacuum cleaner. And while I'm at it, I might as well mop the floors too.
Back later.

Edit, added before I start: How can this one of mine be marked as "Too Late", when I finished it very much in time, and all the others errored out on the WU?
https://www.worldcommunitygrid.org/ms/device/...s.do?workunitId=538631913

[Edit 1 times, last edit by Grumpy Swede at Mar 2, 2021 6:00:44 PM]
||
|
Crystal Pellet
Veteran Cruncher Joined: May 21, 2008 Post Count: 1323 Status: Offline
> I noticed almost instantly that there was an issue, however about 197 results were marked as invalid incorrectly. I stopped the validation, fixed the bug and started it up again.
> Thanks,
> -Uplinger

I suppose from these In(in)valids resends were sent out asap . . .
||
|
uplinger
Former World Community Grid Tech Joined: May 23, 2005 Post Count: 3952 Status: Offline
> I noticed almost instantly that there was an issue, however about 197 results were marked as invalid incorrectly. I stopped the validation, fixed the bug and started it up again.
> Thanks,
> -Uplinger

> I suppose from these In(in)valids resends were sent out asap . . .

Yes there were some. Couldn't stop them fast enough.

Thanks,
-Uplinger
||
|
Vester
Senior Cruncher USA Joined: Nov 18, 2004 Post Count: 325 Status: Offline
I have 11 pending validation because the other computers have not returned their results.
BETA_OPNG_0021099_00068_1 4HD7990 Pending Validation 3/2/21 01:30:03 3/2/21 02:08:58 0.41 / 0.44 2.6 / 0.0
BETA_OPNG_0021077_00214_1 4HD7990 Pending Validation 3/2/21 00:47:42 3/2/21 01:16:13 0.04 / 0.23 2.6 / 0.0
BETA_OPNG_0021073_00163_1 4HD7990 Pending Validation 3/2/21 00:39:24 3/2/21 01:06:50 0.06 / 0.28 2.6 / 0.0
BETA_OPNG_0021073_00181_1 4HD7990 Pending Validation 3/2/21 00:39:24 3/2/21 01:02:39 0.04 / 0.21 2.6 / 0.0
BETA_OPNG_0021067_00122_0 4HD7990 Pending Validation 3/2/21 00:28:59 3/2/21 00:47:41 0.05 / 0.22 2.6 / 0.0
BETA_OPNG_0021064_00042_0 4HD7990 Pending Validation 3/2/21 00:22:42 3/2/21 00:37:21 0.04 / 0.22 2.6 / 0.0
BETA_OPNG_0021064_00036_0 4HD7990 Pending Validation 3/2/21 00:22:41 3/2/21 00:33:08 0.03 / 0.14 2.6 / 0.0
BETA_OPNG_0021039_00300_1 4HD7990 Pending Validation 3/1/21 23:29:25 3/1/21 23:35:36 0.02 / 0.10 494.7 / 0.0
BETA_OPNG_0021036_00192_1 4HD7990 Pending Validation 3/1/21 23:17:23 3/1/21 23:31:30 0.03 / 0.14 2.6 / 0.0
BETA_OPNG_0021036_00107_1 4HD7990 Pending Validation 3/1/21 23:17:23 3/1/21 23:33:32 0.04 / 0.19 2.6 / 0.0
BETA_OPNG_0021034_00153_0 4HD7990 Pending Validation 3/1/21 23:10:54 3/1/21 23:23:39 0.04 / 0.20 2.6 / 0.0
||
|