World Community Grid Forums
Thread Status: Active Total posts in this thread: 11
Rickjb
Veteran Cruncher Australia Joined: Sep 17, 2006 Post Count: 666 Status: Offline |
This looks like a bad Type A WU: ts05_b001_ps0000

Details: The first 3 copies encountered Error 29 (0x1d) and quit after many hours. Copy _3 is still "In Progress".
[Edit #3]: Copy _3 has run to completion and is PV (!)

I suspended my copy after 6h07m at 27.1% progress. This would have prevented a new copy being despatched until the techs intervened, or until it timed out on 21 Apr.
[Edit #1]: I aborted the WU. uplinger (Changes to distribution of error work units) said that they are changing the number of errors after which no more copies of a WU will be sent, from 5 to 3. This one has had 3 errors so far, so I've tested the changes.
[Edit #2]: Copy _5 has now been sent out to some poor unsuspecting cruncher. The decreased max error count isn't working for WUs that were started before uplinger's changes.
[end edits]

The result logs from the 3 copies that completed seem normal up to the point where the exit happened, except for the error message in the header. For example, here are the head and tail of the log for copy _0:
------
Result Name: ts05_b001_ps0000_0--
<core_client_version>5.10.30</core_client_version>
<![CDATA[
<message>
The system cannot write to the specified device. (0x1d) - exit code 29 (0x1d)
</message>
<stderr_txt>
.416000
wcgStepsDone = 4100 wcgSteps1 = 5000 wcgCyclesDone = 20 wcgCycles = 50 pctComplete = 0.416400
... <omitted section here> ...
wcgStepsDone = 2000 wcgSteps1 = 5000 wcgCyclesDone = 34 wcgCycles = 50 pctComplete = 0.688000
Encountered error. Exiting.
</stderr_txt>
]]>
---------

Here are extracted details of copies _0, _1 and _2. Columns are:
Name | Extracts_from_result_log | CPU_Time | Claimed/Awarded
ts05_b001_ps0000_0 | exit code 29, wcgCycles = 50 pctComplete = 0.688000; Encountered error. Exiting. | 22.10 | 384.3 / 0.0
ts05_b001_ps0000_1 | exit code 29, wcgCycles = 50 pctComplete = 0.447600; Encountered error. Exiting. | 43.14 | 347.3 / 0.0
ts05_b001_ps0000_2 | exit code 29, wcgCycles = 50 pctComplete = 0.688000; Encountered error. Exiting. | 21.13 | 387.3 / 0.0

[OT] Wish this <adjective omitted> forum software could do tables or at least allowed table stops (aka tabs)!
[Edit 3 times, last edit by Rickjb at Apr 18, 2010 9:13:33 AM] |
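A side note on the counters in that log: the two progress lines quoted are consistent with pctComplete = (wcgCyclesDone + wcgStepsDone / wcgSteps1) / wcgCycles. That relation is inferred from these two lines only, so treat it as an assumption and not as documented behaviour of the science application. A minimal Python sketch that checks it (the names `implied_fraction` and `check_log` are made up for illustration):

```python
import re

# One progress record as it appears in the quoted stderr log.
PROGRESS_RE = re.compile(
    r"wcgStepsDone\s*=\s*(\d+)\s+wcgSteps1\s*=\s*(\d+)\s+"
    r"wcgCyclesDone\s*=\s*(\d+)\s+wcgCycles\s*=\s*(\d+)\s+"
    r"pctComplete\s*=\s*([0-9.]+)"
)


def implied_fraction(steps_done, steps_per_cycle, cycles_done, cycles_total):
    """Fraction complete implied by the counters (assumed relation, see note above)."""
    return (cycles_done + steps_done / steps_per_cycle) / cycles_total


def check_log(text):
    """Return (reported, implied) pairs for every progress record found in text."""
    pairs = []
    for match in PROGRESS_RE.finditer(text):
        steps_done, steps1, cycles_done, cycles = (int(g) for g in match.groups()[:4])
        reported = float(match.group(5))
        pairs.append((reported, implied_fraction(steps_done, steps1, cycles_done, cycles)))
    return pairs


# The two complete progress records quoted in the post.
LOG = """
wcgStepsDone = 4100 wcgSteps1 = 5000 wcgCyclesDone = 20 wcgCycles = 50 pctComplete = 0.416400
wcgStepsDone = 2000 wcgSteps1 = 5000 wcgCyclesDone = 34 wcgCycles = 50 pctComplete = 0.688000
"""

for reported, implied in check_log(LOG):
    print(f"reported={reported:.4f} implied={implied:.4f}")
```

If the relation holds in general, the last pctComplete before "Encountered error. Exiting." tells you which cycle and step the copy had reached when it quit.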
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Rickjb wrote:
[Edit #2]: Copy _5 has now been sent out to some poor unsuspecting cruncher. The decreased max error count isn't working for WUs that were started before uplinger's changes. [end edits]
[snip]
[OT] Wish this <adjective omitted> forum software could do tables or at least allowed table stops (aka tabs)!

To the former, see https://secure.worldcommunitygrid.org/forums/wcg/viewthread_thread,28881#276494 All of those were sent out after Uplinger's post.

To the latter, use the 'Code' tags - that preserves tabs. Of course, you have to paste them in. But click and drag on them... they're still tabs. :) Some board software makes it clear when code tags are used, offsetting it in a 'quote' type box or inset, and/or actually putting the word "Code" before it; mvnForum makes the use of 'code' tags practically seamless. |
||
|
I need a bath
Senior Cruncher USA Joined: Apr 12, 2007 Post Count: 347 Status: Offline |
We have already established that these exit code 29 WUs are not "bad". The so-called error is actually a scientifically useful result. Furthermore, credit will be granted, so don't abort them. |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I need a bath wrote:
We have already established that these exit code 29 WUs are not "bad". The so-called error is actually a scientifically useful result.

I don't think this is always valid if someone did something like what mweisensee has done here: https://secure.worldcommunitygrid.org/forums/wcg/viewthread_thread,28891#276541

Edit: Removed part of the quote to eliminate confusion.
[Edit 1 times, last edit by Former Member at Apr 18, 2010 5:11:56 AM] |
||
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
What mweisensee did may not have been the genuine article [the tolerances for CPDN are designed to do that stuff]. I've had several cases where the resume was from zero percent but the run time was not lost, and the result ended up showing nearly double the normal run time, with the credit hacked in half, and valid.

At least the one set I had with 4 error returns and 1 server abort showed 3 with the same error at 0.2384 in the log and dramatically different run times. The 4th could have been a restart and had 2.5 times the run time of the other 3 when going down at 0.5016. Anyway, the announced drought for 3 weeks is not for nothing. Good morning world.

edit: added progress value for 4th result.
WCG
----------------------------------------Please help to make the Forums an enjoyable experience for All! [Edit 2 times, last edit by Sekerob at Apr 18, 2010 10:48:38 AM] |
||
|
sk..
Master Cruncher Joined: Mar 22, 2007 Post Count: 2324 Status: Offline |
I suspect this project is taking up a disproportionately high amount of the WCG team's technical & scientific time.
3 weeks to think it over perhaps? It is now defined as an Intermittent project; no more semi status. So perhaps it has already been decided that it will not be allowed to interfere with other existing and planned projects? |
||
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
Think the thinking is over at UTMB... anyway, we've had your message for a long time, and extensively. You're not forced to contribute time towards this science, and the intermittent status was stated up front, even with the active label on it... because the scientists need to work on the output maybe? And if all this effort results in 1 or 2 strongly promising compounds, then the mission will have been a resounding success no matter how we got there.

Now, hold on to your boots... science takes time.
[Edit 1 times, last edit by Sekerob at Apr 18, 2010 10:56:00 AM] |
||
|
ThreadRipper
Veteran Cruncher Sweden Joined: Apr 26, 2007 Post Count: 1321 Status: Offline |
I don't know if the WU is bad or what happened, but the WU ts05_b179_ps0000 finished after about 30 hours, yet the results page says "Too late". I got it on the 14th of April and it said that the deadline was the 26th. Today is the 22nd.

Very strange. Also, when I look at the quorum, every other copy of this one was marked with "Error"... maybe someone should have a look at it?

Join The International Team: https://www.worldcommunitygrid.org/team/viewTeamInfo.do?teamId=CK9RP1BKX1
AMD TR2990WX @ PBO, 64GB Quad 3200MHz 14-17-17-17-1T, RX6900XT @ Stock AMD 3800X @ PBO AMD 2700X @ 4GHz |
||
|
boulmontjj
Senior Cruncher France Joined: Nov 17, 2004 Post Count: 317 Status: Offline |
ThreadRipper wrote:
I don't know if the WU is bad or what happened, but the WU ts05_b179_ps0000 finished after about 30 hours, yet the results page says "Too late". I got it on the 14th of April and it said that the deadline was the 26th. Today is the 22nd. Very strange. Also, when I look at the quorum, every other copy of this one was marked with "Error"... maybe someone should have a look at it?

Same for mine. It finished OK and was returned before the deadline, but has been marked "too late". And every other copy finished in error. |
||
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
From past observation, if the distribution is stopped [for reasons such as, here, too many errors], the results returned later, if they end normally, get marked Too Late [maybe lost in froglation http://www.worldcommunitygrid.org/forums/wcg/viewthread_thread,27775 ]. Yes, they will convert to getting credit for all the parts, same as if it were a regular unit. These are the "trop tard" (Too Late) results that boulmontjj posted in the other thread:

Result Name | App Version Number | Status | Sent Time | Time Due / Return Time | CPU Time (hours) | Claimed/Granted BOINC Credit
ts05_b039_ps0000_4-- | 612 | Error | 18/04/10 22:21:11 | 22/04/10 00:56:35 | 25.44 | 417.7 / 417.7
ts05_b039_ps0000_3-- | 612 | Error | 17/04/10 19:04:35 | 18/04/10 22:21:09 | 16.73 | 282.0 / 282.0
ts05_b039_ps0000_2-- | 612 | Too Late | 15/04/10 18:09:02 | 21/04/10 14:33:32 | 49.72 | 757.2 / 757.2 < This one
ts05_b039_ps0000_1-- | 612 | Error | 14/04/10 20:14:58 | 15/04/10 18:08:53 | 10.56 | 183.5 / 183.5
ts05_b039_ps0000_0-- | 612 | Error | 14/04/10 20:14:56 | 17/04/10 18:41:13 | 22.73 | 388.3 / 388.3
[Edit 1 times, last edit by Sekerob at Apr 22, 2010 7:21:27 PM] |
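For anyone who would like to check a quorum listing like the one above without eyeballing it, here is a minimal Python sketch. The rows are transcribed by hand from that listing (decimal commas written as points, statuses in English), and `summarize` is a hypothetical helper for illustration, not part of any WCG or BOINC tool.

```python
from collections import Counter

# Rows transcribed by hand from the listing above:
# (result name, status, CPU hours, claimed credit, granted credit).
ROWS = [
    ("ts05_b039_ps0000_4", "Error", 25.44, 417.7, 417.7),
    ("ts05_b039_ps0000_3", "Error", 16.73, 282.0, 282.0),
    ("ts05_b039_ps0000_2", "Too Late", 49.72, 757.2, 757.2),
    ("ts05_b039_ps0000_1", "Error", 10.56, 183.5, 183.5),
    ("ts05_b039_ps0000_0", "Error", 22.73, 388.3, 388.3),
]


def summarize(rows):
    """Count results per status and check that granted credit equals claimed credit."""
    counts = Counter(status for _, status, _, _, _ in rows)
    all_granted = all(claimed == granted for *_, claimed, granted in rows)
    return counts, all_granted


counts, all_granted = summarize(ROWS)
print(dict(counts), "claimed == granted for every row:", all_granted)

# Credit per CPU hour, to compare the Too Late copy with the Error copies.
for name, status, hours, _, granted in ROWS:
    print(f"{name} {status}: {granted / hours:.1f} credits per CPU hour")
```

On this listing the tally is four Error results and one Too Late, and every row was granted exactly what it claimed, which matches the point that these results still earn credit.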
||
|