World Community Grid Forums
Thread Status: Active Total posts in this thread: 822
kittyman
Advanced Cruncher Joined: May 14, 2020 Post Count: 140 Status: Offline |
> Ok, I think I have everything back to normal now...I accidentally changed the number loaded to 200000 work units; this was meant to just be changed to 2000. This reflects the number of average work units in a batch so that we can maintain the 200 batches per day for OPNG. Again, I apologize for the confusion this may have caused...
> Thanks, -Uplinger

No worries! The kitties forgive you for sure. They were starting to worry that my GPU had broken because it wasn't doing anything. (I don't think the kitties would mind if you duplicated this mistake once a day....LOL.) |
||
|
Ian-n-Steve C.
Senior Cruncher United States Joined: May 15, 2020 Post Count: 180 Status: Offline |
> Ok, I think I have everything back to normal now...I accidentally changed the number loaded to 200000 work units; this was meant to just be changed to 2000. This reflects the number of average work units in a batch so that we can maintain the 200 batches per day for OPNG. Again, I apologize for the confusion this may have caused...
> Thanks, -Uplinger

How long do you think until more batches per day are released? Or larger batches. Whatever gets us to the point of having the GPUs consistently busy.

Is the limiting factor the server load? How did the server fare during this mistake?

Or is the limiting factor just an effort to keep CPU work available for CPU users? If that’s the case, I have to wonder why you’d intentionally delay the science results just to placate slower devices.

EPYC 7V12 / [5] RTX A4000
EPYC 7B12 / [5] RTX 3080Ti + [2] RTX 2080Ti
EPYC 7B12 / [6] RTX 3070Ti + [2] RTX 3060
[2] EPYC 7642 / [2] RTX 2080Ti |
||
|
ericcui1
Cruncher Joined: Aug 26, 2017 Post Count: 5 Status: Offline |
> Ok, I think I have everything back to normal now...I accidentally changed the number loaded to 200000 work units; this was meant to just be changed to 2000. This reflects the number of average work units in a batch so that we can maintain the 200 batches per day for OPNG. Again, I apologize for the confusion this may have caused...
> Thanks, -Uplinger

Well, people here are more worried about not getting any. So, a happy mistake! |
||
|
nanoprobe
Master Cruncher Classified Joined: Aug 29, 2008 Post Count: 2998 Status: Offline |
> If that’s the case, I have to wonder why you’d intentionally delay the science results just to placate slower devices

That's why Keith is the lead tech and you're not.

In 1969 I took an oath to defend and protect the U S Constitution against all enemies, both foreign and Domestic. There was no expiration date. |
||
|
dkapetansky
Cruncher Joined: Jun 23, 2011 Post Count: 25 Status: Offline |
Since this impromptu stress test has occurred, and the world didn't end, would it make any sense just to leave the configuration parameter at the higher value and unleash the grid to do what it does best? Who would complain if project progress surged ahead?
|
||
|
Ian-n-Steve C.
Senior Cruncher United States Joined: May 15, 2020 Post Count: 180 Status: Offline |
> > If that’s the case, I have to wonder why you’d intentionally delay the science results just to placate slower devices
>
> That's why Keith is the lead tech and you're not.

He's the authority and you're not, which is precisely why I asked him and not you.

So let's set the ad hominem aside and let him speak for himself. |
||
|
Ian-n-Steve C.
Senior Cruncher United States Joined: May 15, 2020 Post Count: 180 Status: Offline |
> Since this impromptu stress-test has occurred- and the world didn't end- would it make any sense just to leave the configuration parameter at the higher value, and unleash the grid to do what it does best? Who would complain if project progress surged ahead?

Say it louder for the people in the back. |
||
|
Grumpy Swede
Master Cruncher Svíþjóð Joined: Apr 10, 2020 Post Count: 2154 Status: Offline |
Look at your invalids, and server aborted first...

Things did not end well. This is not normal:
https://www.worldcommunitygrid.org/ms/device/...s.do?workunitId=618783001
And:
https://www.worldcommunitygrid.org/ms/device/...s.do?workunitId=618613464
And there are tons more out there.

[Edit 1 times, last edit by Grumpy Swede at Apr 14, 2021 2:20:18 AM] |
||
|
uplinger
Former World Community Grid Tech Joined: May 23, 2005 Post Count: 3952 Status: Offline |
Good evening,

Our plan is to maintain the pace (which was increased to 2000 work units every 30 minutes) so that we do not overload the pipeline. All of this requires consideration and planning. For example, creating too many results at one time could overload a database, it could cause the file upload handler to become overloaded, it could fill up the researchers' servers, etc. There are lots of parts that need to be considered in everything. Some of the things we need to consider could cause harm to other projects running here on World Community Grid.

As with many of our research projects, we start them at a pace that makes sense for us and the researchers.

This heavy scheduling did, however, bring to light an issue with the scheduler that I will need to fix going forward.

Thanks, -Uplinger |
||
|
Ian-n-Steve C.
Senior Cruncher United States Joined: May 15, 2020 Post Count: 180 Status: Offline |
Thanks for the reply. Sounds like you'll continue to increase WU availability as your confidence in your process and system stability grows, and that's good to hear.

You could also likely leave the WU distribution quantity the same (2000 per 30 minutes) and just increase the WU size (more jobs per WU, or harder jobs per WU) to increase the work being done without negatively impacting your infrastructure. That's kind of a win-win. |
||