World Community Grid Forums
Thread Status: Active Total posts in this thread: 65
Vester
Senior Cruncher USA Joined: Nov 18, 2004 Post Count: 325 Status: Offline
I have some that are pending validation although someone else has completed the same WU. In all pending cases with two completions, there is a large disparity in completion times.
adriverhoef
Master Cruncher The Netherlands Joined: Apr 3, 2009 Post Count: 2153 Status: Offline
Vester wrote: "I have some that are pending validation although someone else has completed the same WU."

Same here, having many ARP1 tasks Pending Validation while their wingmen do, too. None are Pending Verification.

Vester wrote: "In all pending cases with two completions, there is a large disparity in completion times."

I'd say not in all cases. Examples from my results, selected from the output of the command wcgstats -wrrr -sP -aARP1:

Anyhow, you get the idea …
alanb1951
Veteran Cruncher Joined: Jan 20, 2006 Post Count: 945 Status: Offline
A recent addition to my set of BOINC data collection scripts looks for validation issues; a sample item from it is shown below.
ARP1_0014435_129 (WU 155120082) was created 2022-08-25T14:27:14+0000

The script produces a summary thus (example is for 26th August returns):

Units validated: 6

Note that "Units In Progress" refers to items where my result is at Pending Validation but there's at least one other result marked as In Progress...

The total since 26th August (excluding work returned today) is as below (correct as at about 21:30 UTC on 1st September):

Units validated: 20

It appears that (as at the time of posting) there hasn't been a successful validation for any work unit created after about 13:45 UTC on the 25th. All the validated units were older than that (some of them considerably so because of the number of No Reply and Not Started by Deadline tasks ARP1 keeps getting...)

I wonder if there will be a sudden flushing out of one day's worth of these when they've been stalled for 6 days (the same interval as the deadline) -- that is used by the transitioner as an inactivity retry time so they may get a kick then. Whether it'll be sufficient to get them past the blockage is another matter :-(

Cheers - Al.

[Edit 2 times, last edit by alanb1951 at Sep 1, 2022 10:29:16 PM]
MJH333
Senior Cruncher England Joined: Apr 3, 2021 Post Count: 266 Status: Offline
Al,

I’ve also got some where my result is Pending Validation and one or more results have errored out (or no reply) and the next wingman is “Waiting to be sent”. See e.g. https://www.worldcommunitygrid.org/contribution/workunit/155119949

Do you think that is a related issue?

Cheers, Mark
alanb1951
Veteran Cruncher Joined: Jan 20, 2006 Post Count: 945 Status: Offline
MJH333 wrote: "Al, I’ve also got some where my result is Pending Validation and one or more results have errored out (or no reply) and the next wingman is “Waiting to be sent”. See e.g. https://www.worldcommunitygrid.org/contribution/workunit/155119949 Do you think that is a related issue? Cheers, Mark"

Mark,

I had the precursor to one of those show up overnight - my success return and one failure return and no retry! And yes, it is related, but tied to the transitioner rather than the validator.

This is not really the place for a [long] "how the transitioner works" piece, so I'll just cite an example of what can happen...

When MilkyWay@home had their disk crash and subsequent excess work unit generation, it was common to see work units needing retries that either had "Waiting to be sent" or didn't seem to have a retry readied at all. What was happening was that there was such a long queue of items waiting for transitioner access that requests to generate a retry and/or submit the retry to the feeder weren't being seen in the usual timely fashion.

The transitioner has a built-in defence mechanism it brings into play at the end of looking at a request; if the new "transitioner time" would be in the past (and yes, if there's a backlog of any type it can happen!) it alters that time to push it into the future; unfortunately, the further in the past the request would have been, the further it shifts it into the future! So if there's a genuine backlog, processed items could be pushed as far as a day into the future (which added to delays when MilkyWay was in difficulties...)

As far as I know, MilkyWay only run one transitioner, but I suspect WCG run multiple transitioners -- if any of those have crashed and failed to restart, a certain portion of work will not be able to advance, so that could be a reason for problems; otherwise, they may need more transitioners :-)

So it looks as if WCG may need some down time to either clear a transitioner backlog or reconfigure to run more transitioners; turning off various work unit generators instead might help, but it would only be a temporary fix...

And let's just hope that there isn't an overloaded database and/or file-store access problem at the back of this all - if there is, we may be in for work-unit rationing... :-(

Cheers - Al.

[Edit; to add remark about database/filestore overload...]

[Edit 1 times, last edit by alanb1951 at Sep 2, 2022 1:24:14 PM]
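For anyone who wants to see the "push it into the future" behaviour from the post above in concrete terms, here is a minimal sketch in Python. It is not the BOINC transitioner's actual code: the function name, the doubling factor and the 60-second floor are assumptions made up for illustration, and only the one-day ceiling comes from the post.

```python
from datetime import datetime, timedelta, timezone

# Illustrative constants; only MAX_DELAY is taken from the post.
MIN_DELAY = timedelta(seconds=60)   # assumed floor
MAX_DELAY = timedelta(days=1)       # "as far as a day into the future"
BACKOFF_FACTOR = 2                  # assumed multiplier


def next_transition_time(requested: datetime, now: datetime) -> datetime:
    """Return when a work unit should next be looked at (hypothetical helper).

    If the requested time is still ahead of us, keep it. If it is already
    in the past (a backlog), push it forward by an amount that grows with
    how far behind we are, clamped between MIN_DELAY and MAX_DELAY.
    """
    if requested >= now:
        return requested
    behind = now - requested
    delay = min(max(behind * BACKOFF_FACTOR, MIN_DELAY), MAX_DELAY)
    return now + delay


if __name__ == "__main__":
    now = datetime(2022, 9, 2, 12, 0, tzinfo=timezone.utc)
    for lag in (timedelta(seconds=10), timedelta(minutes=10), timedelta(days=2)):
        pushed = next_transition_time(now - lag, now)
        print(f"behind by {lag} -> next look in {pushed - now}")
```

Under these assumed constants, a request that is ten minutes late gets looked at again twenty minutes from now, while one that is two days late hits the one-day ceiling, which is the kind of compounding delay described for a backlogged transitioner.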
MJH333
Senior Cruncher England Joined: Apr 3, 2021 Post Count: 266 Status: Offline
Al,
Many thanks for the explanation.

Cheers, Mark
MyrCu
Cruncher Joined: Apr 9, 2020 Post Count: 43 Status: Offline
I received a few ARP tasks over the last few days, 32 in total. All those I sent back between 27 and 29 August are still "pending". Only one is "Valid" (1 Sept, 9:10); two others, which were sent back on 1 or 2 September, are pending.
In the last one or two days I haven't received any more ARPs.
sptrog1
Master Cruncher Joined: Dec 12, 2017 Post Count: 1574 Status: Offline
Do I understand that successful work is being held up by validation issues and it is not just a matter of credit being issued?
D_S_Spence
Advanced Cruncher Canada Joined: Jan 5, 2017 Post Count: 107 Status: Offline
I don't know if another example helps anything, but I have one here:
ARP1_0021214_128 https://www.worldcommunitygrid.org/contribution/workunit/155155979
geophi
Advanced Cruncher U.S. Joined: Sep 3, 2007 Post Count: 102 Status: Offline
sptrog1 wrote: "Do I understand that successful work is being held up by validation issues and it is not just a matter of credit being issued?"

Yes. You understand correctly. I have completed about 20 ARP tasks where two or more computers have finished the tasks in each work unit and all are pending validation.