World Community Grid Forums
Thread Status: Active Total posts in this thread: 781
hnapel
Advanced Cruncher Netherlands Joined: Nov 17, 2004 Post Count: 82 Status: Offline |
Good morning, I should have posted sooner, since I've been up for 2 hours. We are still working on tweaking some of the load balancer values. It seems like things are running a bit smoother right now. We are only about 20 minutes into the latest changes. Please let us know if you are noticing anything on your end. Thanks, -Uplinger
Uploads are going smoother now. I've got 4 PCs (with GPUs) running on this project; I just 'retried' all pending uploads and they all went through. It certainly looks quieter on the upload front right now. |
||
|
spRocket
Senior Cruncher Joined: Mar 25, 2020 Post Count: 274 Status: Offline |
So far, so good here as well. Just turned in a stack of "big" OPNG units and got a stack of mixed units back with no hiccups.
|
||
|
Michael Goetz
Cruncher United States Joined: Dec 11, 2017 Post Count: 35 Status: Offline |
"c:\program files\boinc\boinccmd.exe" --network_available
Then add loop/repeat controls as appropriate to your desires and scripting language.
Thanks for the tip! Does that also retry stalled transfers?
Yes. That's the whole point. :) Also, if you're using BOINCTasks and right-click on any file transfer, there's an option labelled "Retry All". There are several BOINC features that, strangely, are not supported by the official BOINC GUI but are supported by third-party GUIs such as BOINCTasks. This is one of them.
[Edit 1 times, last edit by Michael Goetz at Apr 27, 2021 2:28:14 PM] |
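The "loop/repeat controls" mentioned above could look something like the following in Python. This is only a minimal sketch: the executable path is the one from the post, `--network_available` is the same flag, and the function name `retry_transfers`, the attempt count, and the wait interval are illustrative assumptions rather than anything BOINC prescribes.

```python
import subprocess
import time
from typing import Callable, List, Sequence

# Path taken from the post above; adjust it for your own installation.
BOINCCMD = r"c:\program files\boinc\boinccmd.exe"


def retry_transfers(
    attempts: int = 3,
    interval_s: float = 300.0,
    run: Callable[[Sequence[str]], int] = subprocess.call,
    sleep: Callable[[float], None] = time.sleep,
) -> List[int]:
    """Run `boinccmd --network_available` several times, pausing in between.

    `attempts` and `interval_s` are arbitrary example values (assumptions).
    Returns the exit code of each boinccmd invocation.
    """
    codes = []
    for i in range(attempts):
        # Tell the client the network is available so it retries deferred transfers.
        codes.append(run([BOINCCMD, "--network_available"]))
        if i < attempts - 1:  # no need to wait after the final attempt
            sleep(interval_s)
    return codes
```

Calling `retry_transfers(attempts=6, interval_s=600)` would nudge the client six times, ten minutes apart. The `run` and `sleep` parameters exist only so the loop can be exercised without a BOINC installation.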
||
|
Richard Haselgrove
Senior Cruncher United Kingdom Joined: Feb 19, 2021 Post Count: 360 Status: Offline |
'Retry Pending Transfers' is available in the official BOINC Manager, but it's a menu item rather than a button.
Advanced view, Tools menu. |
||
|
Ian-n-Steve C.
Senior Cruncher United States Joined: May 15, 2020 Post Count: 180 Status: Offline |
One Ellesmere with HDD and one Ellesmere with SSD. The SSD one is crashing very often, because of a lot of checkpoints. BOINC is set to 1200 sec. for backup, but OPNG ignores this. Now the long-running OPNG tasks are running on it (1 hour!). Something is wrong with checkpointing and SSD. https://www.worldcommunitygrid.org/ms/device/...s.do?workunitId=639284992
Either something is wrong with your SSD, or something else is wrong with the system with the SSD. My systems are much faster, running 6-8 GPUs and producing many more writes to the SSD, but with no issues. SSDs in general are capable of many orders of magnitude more IOPS than an HDD. Your problem is likely system-specific, not SSD-specific.
No problem with Einstein@Home!
And Einstein has longer-running tasks, which might not expose the issues with your SSD. You can’t really compare apples and oranges. Like I said, I’m processing at a MUCH higher volume on OPNG, with no SSD issues. If it were a generic SSD issue, someone like me with many more writes would see it too, but we don’t. That points to your issue being related to something with your system specifically.
EPYC 7V12 / [5] RTX A4000 EPYC 7B12 / [5] RTX 3080Ti + [2] RTX 2080Ti EPYC 7B12 / [6] RTX 3070Ti + [2] RTX 3060 [2] EPYC 7642 / [2] RTX 2080Ti |
||
|
uplinger
Former World Community Grid Tech Joined: May 23, 2005 Post Count: 3952 Status: Offline |
While I have a few minutes, I'm going to go back through about 200 posts from everyone :) That may be the biggest bottleneck :P Also, no worries on putting more stress on the system. That's what all this fun is about....
<sarcastically long pause> .....oh yeah, and the science :P Thanks, -Uplinger |
||
|
stevemtu
Cruncher Joined: Sep 7, 2005 Post Count: 12 Status: Offline |
For the first time since yesterday, I now have no tasks waiting to upload or download. Looking better.
|
||
|
Grumpy Swede
Master Cruncher Svíþjóð Joined: Apr 10, 2020 Post Count: 2167 Status: Offline |
The validator doesn't seem to be interested in validating these larger WU's from batches 13345 - 41773. The previous lower batches were validated very soon after they were finished. No wingman, I'm _0 on all of these WU's, but the validator isn't interested in even trying to validate.
----------------------------------------
Edit: Why does this "large" WU start at "job" #56? Never seen that before either. All other previous WU's always started at job #1. https://www.worldcommunitygrid.org/ms/device/...og.do?resultId=1657133303
These "new" type of WU's do have some very different behaviour.
Edit 2: And the validator still doesn't try to validate them. I wonder if the validator is set up deliberately not to validate these "larger" WU's?
[Edit 3 times, last edit by Grumpy Swede at Apr 27, 2021 2:50:46 PM] |
||
|
Jim1348
Veteran Cruncher USA Joined: Jul 13, 2009 Post Count: 1066 Status: Offline |
Also, no worries on putting more stress on the system. That's what all this fun is about....
I can do that. |
||
|
uplinger
Former World Community Grid Tech Joined: May 23, 2005 Post Count: 3952 Status: Offline |
The validator doesn't seem to be interested in validating these larger WU's from batches 13345 - 41773. The previous lower batches were validated very soon after they were finished. No wingman, I'm _0 on all of these WU's, but the validator isn't interested in even trying to validate. Edit: Why does this "large" WU start at "job" #56? Never seen that before either. All other previous WU's always started at job #1. https://www.worldcommunitygrid.org/ms/device/...og.do?resultId=1657133303
It shows starting at 56 because BOINC only uploads the last X bytes of the stderr to us. This limits what you see on the website. As for the validator, I have bumped it up to 8 cores and I'm monitoring where it is at. Thanks, -Uplinger |
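To illustrate why keeping only the last X bytes of stderr makes a log appear to begin partway through, here is a rough Python sketch. The `Job #N` line format, the byte limit, and both function names are hypothetical; the post gives neither the real limit nor the real log layout.

```python
import re
from typing import Optional


def keep_tail(stderr_text: str, max_bytes: int) -> str:
    """Keep only the final `max_bytes` bytes of the text, as a size cap would."""
    if max_bytes <= 0:
        return ""
    data = stderr_text.encode("utf-8")
    # errors="ignore" drops a multi-byte character that the cut may have split.
    return data[-max_bytes:].decode("utf-8", errors="ignore")


def first_visible_job(log_text: str) -> Optional[int]:
    """Return the number of the first complete 'Job #N' line, if there is one."""
    match = re.search(r"^Job #(\d+)", log_text, flags=re.MULTILINE)
    return int(match.group(1)) if match else None


# Hypothetical log with 100 jobs, one line per job.
full_log = "".join(f"Job #{n}: done\n" for n in range(1, 101))
truncated = keep_tail(full_log, 640)
```

With a cap like this, the earliest jobs are cut off, so `first_visible_job(truncated)` reports a later job number than `first_visible_job(full_log)` even though every job ran.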
||