World Community Grid Forums
Thread Status: Active Total posts in this thread: 315
|
widdershins
Veteran Cruncher Scotland Joined: Apr 30, 2007 Post Count: 674 Status: Offline |
"The current error and invalid rate for this beta is low and looking good. We will not be adding any new work units until probably early next week."
Hmm, of the beta units I snagged, 3 errored out, 3 are in PV, and 4 are in progress, all on a machine that normally has a 100% reliable return rate. It seems like there might still be one or two wrinkles to iron out. |
||
|
Jean-David Beyer
Senior Cruncher USA Joined: Oct 2, 2007 Post Count: 338 Status: Offline |
"Checkpoints can be seen from the status entry for the workunit on the website.
My previous post in this thread included one."
All I see there is this. (Mine is the first one.)

Project Name: beta
Created: 05/29/2019 17:44:39
Name: BETA_ARP1_0000458_000
Minimum Quorum: 2
Replication: 2

| Result Name | OS type | OS version | App Version Number | Status | Sent Time | Time Due / Return Time | CPU Time / Elapsed Time (hours) | Claimed / Granted BOINC Credit |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| BETA_ARP1_0000458_000_1-- | Linux | 2.6.32-754.14.2.el6.x86_64 | 719 | Valid | 5/29/19 17:47:17 | 6/1/19 17:03:23 | 26.71 | 302.9 / 167.3 |
| BETA_ARP1_0000458_000_0-- | Linux | 4.4.0-148-generic | 719 | Valid | 5/29/19 17:47:15 | 5/31/19 06:29:31 | 28.45 | 31.7 / 167.3 |
||
|
PMH_UK
Veteran Cruncher UK Joined: Apr 26, 2007 Post Count: 772 Status: Recently Active |
Jean-David,
Click on the status link, "valid", to see the Result Log that shows this. Any result returned shows the Result Log by clicking the Status link.
Paul.
|
||
|
BladeD
Ace Cruncher USA Joined: Nov 17, 2004 Post Count: 28976 Status: Offline |
I had a page of Valids vs 1 Invalid vs 5 pending validations. |
||
|
widdershins
Veteran Cruncher Scotland Joined: Apr 30, 2007 Post Count: 674 Status: Offline |
I think the problem with the WUs that errored out may be FAH2 related. The machine that picked up the betas has been on FAH2 duty, and from looking back through the message log it seems the betas have been getting stopped and restarted a lot by FAH shoving its way to the front, due to the quick return times those get sent out with.
Another issue might be how memory hungry the FAH2 units are. I saw a couple of FAH units voluntarily suspending themselves due to memory constraints and then restarting once memory freed up. So perhaps the betas have difficulty when other units compete for resources and force them to suspend and restart. For the record though, Zika units also have to deal with FAH2 elbowing them aside on that box, and none of those errored out. |
||
|
nanoprobe
Master Cruncher Classified Joined: Aug 29, 2008 Post Count: 2998 Status: Offline |
"The current error and invalid rate for this beta is low and looking good. We will not be adding any new work units until probably early next week."
Can you supply an approximate time next week when more beta work will be loaded? I'm getting ready to leave on vacation and would like to have machines doing work before I leave if possible. I was late to the party and missed all of the first batch. Thanks.
In 1969 I took an oath to defend and protect the U S Constitution against all enemies, both foreign and Domestic. There was no expiration date.
||
|
Crystal Pellet
Veteran Cruncher Joined: May 21, 2008 Post Count: 1323 Status: Offline |
I got a resend from the first WUs sent with a 4-day deadline.
BETA_ARP1_0000078_000_3-- rekendoos1 In Progress 6/2/19 13:34:13 6/3/19 23:10:12 = 33.6 hours
I'll not make that deadline. These BETAs on that machine have a run time from 47.75 hours up to 52.87 hours. |
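The 33.6-hour figure and the gap to the reported run times can be checked with a short sketch. The month/day/year reading of the timestamps and all names in the code are assumptions made for illustration; only the two timestamps and the two run times come from the post.

```python
from datetime import datetime

# Assumed format: month/day/two-digit year, as the status listing appears to use.
FMT = "%m/%d/%y %H:%M:%S"


def hours_available(sent: str, due: str) -> float:
    """Return the number of hours between the sent time and the due time."""
    delta = datetime.strptime(due, FMT) - datetime.strptime(sent, FMT)
    return delta.total_seconds() / 3600


# Sent and due times of the resend quoted in the post.
window = hours_available("6/2/19 13:34:13", "6/3/19 23:10:12")

# Run times (hours) reported for earlier betas on the same machine.
fastest_run, slowest_run = 47.75, 52.87

print(f"time until deadline: {window:.1f} h")
print(f"fastest run would finish in time: {fastest_run <= window}")
print(f"slowest run would finish in time: {slowest_run <= window}")
```

Under those assumptions, even the fastest reported run is longer than the window the resend was given.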
||
|
Jim1348
Veteran Cruncher USA Joined: Jul 13, 2009 Post Count: 1066 Status: Offline |
"I'll not make that deadline. These BETAs on that machine have a run time from 47.75 hours up to 52.87 hours."
What is "that machine"? WCG unfortunately does not allow us to see it. |
||
|
Crystal Pellet
Veteran Cruncher Joined: May 21, 2008 Post Count: 1323 Status: Offline |
"I'll not make that deadline. These BETAs on that machine have a run time from 47.75 hours up to 52.87 hours."
"What is "that machine"? WCG unfortunately does not allow us to see it."
It's an AMD Opteron 6172 with 48 cores in total, on which I'm running 2 non-throttled VMs. The BETA mentioned is running on the Windows one.
30 cores, Windows 7: Measured floating point speed 2578 million ops/sec; Measured integer speed 6524 million ops/sec
14 cores, Linux: Measured floating point speed 3342.81 million ops/sec; Measured integer speed 8770.31 million ops/sec |
||
|
Jean-David Beyer
Senior Cruncher USA Joined: Oct 2, 2007 Post Count: 338 Status: Offline |
"Click on the status link, "valid", to see the Result Log that shows this.
Any result returned shows the Result Log by clicking the Status link."
OK; I infer checkpoints were taken, but where in the 7734 did they get those times in April 2018?
<![CDATA[
<stderr_txt>
INFO: No state to restore. Start from the beginning.
[20:13:22] INFO: Checkpoint taken at 2018-04-01_06:00:00
[00:29:27] INFO: Checkpoint taken at 2018-04-01_12:00:00
[04:37:47] INFO: Checkpoint taken at 2018-04-01_18:00:00
[07:16:48] INFO: Checkpoint taken at 2018-04-02_00:00:00
[09:55:18] INFO: Checkpoint taken at 2018-04-02_06:00:00
[19:09:46] INFO: Checkpoint taken at 2018-04-02_12:00:00
[16:10:53] INFO: Checkpoint taken at 2018-04-02_18:00:00
[18:47:54] INFO: Checkpoint taken at 2018-04-03_00:00:00
18:49:32 (2774): called boinc_finish(0)
</stderr_txt>
]]>
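One way to look at the question is to pull the checkpoint labels out of the Result Log and compare how far apart they are. The sketch below does that for the log quoted above; the regular expression and the function name are hypothetical helpers written for this illustration, not part of BOINC or the beta application.

```python
import re
from datetime import datetime

# The Result Log exactly as quoted in the post, one entry per line.
LOG = """\
INFO: No state to restore. Start from the beginning.
[20:13:22] INFO: Checkpoint taken at 2018-04-01_06:00:00
[00:29:27] INFO: Checkpoint taken at 2018-04-01_12:00:00
[04:37:47] INFO: Checkpoint taken at 2018-04-01_18:00:00
[07:16:48] INFO: Checkpoint taken at 2018-04-02_00:00:00
[09:55:18] INFO: Checkpoint taken at 2018-04-02_06:00:00
[19:09:46] INFO: Checkpoint taken at 2018-04-02_12:00:00
[16:10:53] INFO: Checkpoint taken at 2018-04-02_18:00:00
[18:47:54] INFO: Checkpoint taken at 2018-04-03_00:00:00
18:49:32 (2774): called boinc_finish(0)
"""

# Hypothetical pattern: bracketed clock time, then the checkpoint label.
CHECKPOINT = re.compile(r"\[(\d{2}:\d{2}:\d{2})\] INFO: Checkpoint taken at (\S+)")


def checkpoint_labels(log: str) -> list:
    """Return the date-like label of each checkpoint line as a datetime."""
    return [
        datetime.strptime(label, "%Y-%m-%d_%H:%M:%S")
        for _clock, label in CHECKPOINT.findall(log)
    ]


labels = checkpoint_labels(LOG)

# Hours between consecutive checkpoint labels, collected as a set.
steps = {(b - a).total_seconds() / 3600 for a, b in zip(labels, labels[1:])}

print(f"checkpoints found: {len(labels)}")
print(f"distinct label spacing in hours: {sorted(steps)}")
```

The labels advance in identical steps while the bracketed clock times do not, which hints that the April 2018 values are not wall-clock times from the host. What they actually represent is not stated in this thread.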
||
|
|