World Community Grid Forums
Category: Completed Research
Forum: Human Proteome Folding - Phase 2
Thread: Validation of HPF2 work units stopped!?
Thread Status: Active Total posts in this thread: 11
|
TPCBF
Master Cruncher USA Joined: Jan 2, 2011 Post Count: 1842 Status: Offline Project Badges: |
Noticed that there haven't been any WUs being credited since yesterday, and the list of WUs with status "pending validation" keeps increasing...
Any info on whether this is just a weekend glitch or something more serious?
Ralf
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
> Noticed that there haven't been any WUs being credited since yesterday, and the list of WUs with status "pending validation" keeps increasing... Any info on whether this is just a weekend glitch or something more serious? Ralf

Maybe for you, but the midday stats show the normal Monday weekly peak, and yesterday's total validations were within the normal range for WCG overall. See the chart for daily detail through noon: http://bit.ly/WCGPF2
If your WUs are not validating, then most likely fewer than the minimum of 15 copies [each copy unique] have been returned for a task. If you click on the work unit name in the list on the Result Status page, it may also show a "try validation". Posting an example of the quorum detail might give us some insight.
--//--
||
|
TPCBF
Master Cruncher USA Joined: Jan 2, 2011 Post Count: 1842 Status: Offline Project Badges: |
>> Noticed that there haven't been any WUs being credited since yesterday, and the list of WUs with status "pending validation" keeps increasing... Any info on whether this is just a weekend glitch or something more serious? Ralf

> Maybe for you, but the midday stats show the normal Monday weekly peak, and yesterday's total validations were within the normal range for WCG overall.

The list of WUs with validation pending has grown at the same time, from roughly a dozen in the queue for me at any given time in the past to more than 30, most of them HPF2 WUs, hence my question...
Ralf
||
|
gb009761
Master Cruncher Scotland Joined: Apr 6, 2005 Post Count: 2955 Status: Offline Project Badges: |
If you've just switched over from HCC to HPF2, then the requirements for validation are different. With HCC, only you and your wingman need to return your WUs before a validation attempt can be made, whereas for HPF2, your copy and 14 other copies of that WU have to have been returned before a validation attempt is made.
Therefore, as SekeRob has stated, if you can demonstrate (i.e., by using 'cut & paste') an HPF2 WU that had at least 15 copies returned before the last stats update (at 12:00 UTC today) and which hasn't been validated, then we've got something to work with.
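
For anyone following along, the difference between the two projects can be put as a small sketch. It assumes only the numbers given in the post above (2 returned copies for HCC, 15 for HPF2); the names `MIN_QUORUM`, `ready_for_validation` and `copies_still_needed` are made up for illustration and are not part of any WCG or BOINC interface.

```python
# Minimum number of returned copies before a validation attempt is made,
# taken from the post above. The names here are hypothetical.
MIN_QUORUM = {"HCC": 2, "HPF2": 15}


def ready_for_validation(project: str, copies_returned: int) -> bool:
    """Return True once enough copies are back for a validation attempt."""
    return copies_returned >= MIN_QUORUM[project]


def copies_still_needed(project: str, copies_returned: int) -> int:
    """Return how many more copies must come back before an attempt."""
    return max(0, MIN_QUORUM[project] - copies_returned)


if __name__ == "__main__":
    # An HCC WU with both copies returned can be attempted right away.
    print(ready_for_validation("HCC", 2))
    # An HPF2 WU with 13 copies returned is still waiting on others.
    print(ready_for_validation("HPF2", 13), copies_still_needed("HPF2", 13))
```

So a long "pending validation" list for HPF2 does not by itself point to a stalled validator; each WU may simply be a copy or two short.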
||
|
TPCBF
Master Cruncher USA Joined: Jan 2, 2011 Post Count: 1842 Status: Offline Project Badges: |
> If you've just switched over from HCC to HPF2, then the requirements for validation are different. With HCC, only you and your wingman need to return your WUs before a validation attempt can be made, whereas for HPF2, your copy and 14 other copies of that WU have to have been returned before a validation attempt is made.

I did not switch anything.

> Therefore, as SekeRob has stated, if you can demonstrate (i.e., by using 'cut & paste') an HPF2 WU that had at least 15 copies returned before the last stats update (at 12:00 UTC today) and which hasn't been validated, then we've got something to work with.

I have been running HCC since January and added HPF2 about a month ago, after HFCC finished. This morning was the first time since I added HPF2 that no HPF2 WUs were credited at all (I am running WCG jobs on 15 computers of mine, with about 5 HPF2 WUs credited on average at each half-day update), and instead the list of PV WUs increased at the same time...
Ralf
||
|
gb009761
Master Cruncher Scotland Joined: Apr 6, 2005 Post Count: 2955 Status: Offline Project Badges: |
Ralf, as requested already, do you have an example of an HPF2 WU that meets/exceeds the 'ready to attempt validation' requirement? (i.e., a WU that has at least 15 copies in a Pending Validation state).
It IS possible for a lot (if not all) of your HPF2 WUs to only need 1 or 2 more copies to be returned, and thus none have actually reached the 'ready to attempt validation' state.
||
|
TPCBF
Master Cruncher USA Joined: Jan 2, 2011 Post Count: 1842 Status: Offline Project Badges: |
> Ralf, as requested already, do you have an example of an HPF2 WU that meets/exceeds the 'ready to attempt validation' requirement? (i.e., a WU that has at least 15 copies in a Pending Validation state).

I am not sitting in front of the computer all day long watching numbers tick. I have work to do, and I donate spare CPU time on a number of computers I operate to these kinds of projects. I usually just check twice a day, at breakfast and dinner, on what's going on...

> It IS possible for a lot (if not all) of your HPF2 WUs to only need 1 or 2 more copies to be returned, and thus none have actually reached the 'ready to attempt validation' state.

Just picked the last/first WU, the one that has been sitting in the PV queue the longest, "om816_00030", which has 14 copies showing "pending validation", 1 copy showing "Error" (another thing that has happened several times since last week) and 5 showing "in progress".
I understand how this all works, and there have always been WUs queued awaiting validation, sometimes for as long as a week. But it has never happened, with any of the WCG projects that I have been running, that nothing at all was credited at a stats update. Given that there has always been something going on with any of the distributed computing projects (don't get me started on Rosetta@Home or SETI@Home!), I was just curious whether there was a similar issue now with HPF2. It wouldn't be the first time that a server barfs, so why should it not be possible that the one validating HPF2 WUs took a "timeout" over the weekend and needs a "kick in the pants" on Monday morning...
Ralf
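
As a rough illustration, the counts quoted for "om816_00030" can be tallied like this. It is only a sketch under two assumptions: that the minimum quorum is 15, as stated earlier in the thread, and that only copies in the "Pending Validation" state count toward it (errored and in-progress copies do not). The function name `quorum_summary` and the status strings are hypothetical and do not come from the WCG Result Status page or any BOINC API.

```python
from collections import Counter

# Assumption from earlier in the thread: HPF2 needs 15 returned copies.
HPF2_MIN_QUORUM = 15


def quorum_summary(statuses, min_quorum=HPF2_MIN_QUORUM):
    """Tally copy statuses and report how far a WU is from a validation attempt.

    Assumes only "Pending Validation" copies count toward the quorum.
    """
    counts = Counter(statuses)
    pending = counts["Pending Validation"]
    return {
        "pending": pending,
        "missing": max(0, min_quorum - pending),
        "ready": pending >= min_quorum,
    }


# The copy counts reported for om816_00030 in the post above.
om816_00030 = (
    ["Pending Validation"] * 14
    + ["Error"]
    + ["In Progress"] * 5
)

summary = quorum_summary(om816_00030)
print(summary)
```

Under those assumptions the WU is one copy short, so it would only become eligible once one of the 5 in-progress copies is returned.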
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
There are strange mechanics with HPF2 that have to do with folks stopping their computers on Friday [with unfinished tasks on them] and starting them up on Monday. The large minimum quorum of 15 amplifies this effect. Usually there is a second race of validations on Monday afternoon and Tuesday, so plz relax... it will come, and eventually the systems will force them through the pipe. For the moment there is no wider sign of HPF2 validators stalling.
Thanks for contributing your spare cycles to research and the common good. Much appreciated.
--//--
[Edit 1 times, last edit by Former Member at Jun 20, 2011 5:27:34 PM]
||
|
TPCBF
Master Cruncher USA Joined: Jan 2, 2011 Post Count: 1842 Status: Offline Project Badges: |
> There are strange mechanics with HPF2 that have to do with folks stopping their computers on Friday [with unfinished tasks on them] and starting them up on Monday. The large minimum quorum of 15 amplifies this effect. Usually there is a second race of validations on Monday afternoon and Tuesday, so plz relax... it will come, and eventually the systems will force them through the pipe. For the moment there is no wider sign of HPF2 validators stalling.

My point and reason for asking is/was that there has never been such a "lag" before; there has always been a slow but steady stream of WUs being queued, validated and credited. And it's not as if I only started running HPF2 on Friday. So when this morning, after a weekend during which it is likely that no admin kept an eye on the server(s) running the project, I noticed a sudden, unusual stop in the pattern of queuing, validating and crediting, I dared to ask...
7 hours until the next update, let's see what happens...
Ralf
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Techs get auto alerts on their monitors and even on their cells if things are stalling or system loads go wacky for any length of time.
There are a couple of big players, and if they drew large swaths of work, it could land on you. I had CEP2 building up big time over the weekend, and then in the last 18 hours a bunch of them came through. I checked a few in the WU detail on the Result Status page, but none had a complete quorum, so I was at ease with that. Sorry if you know all this, but we also try to keep the exchange understandable for others less familiar with the mechanics, for the occasional silent ''oh'' ;>)
--//--
||
|
|