World Community Grid Forums
Thread Status: Active Total posts in this thread: 29
robertmiles
Senior Cruncher US Joined: Apr 16, 2008 Post Count: 443 Status: Offline
Something about one of these workunits seems to be causing problems with the progress calculations for both of them:

7/26/2008 3:03:24 PM|World Community Grid|Resuming task faah5007_1bwb_6fiv_00_1 using faah version 605
7/26/2008 12:36:22 PM|rosetta@home|Starting t498__BOINC_SYMMETRY_D2SYMM_FOLD_AND_DOCK_RELAX-t498_-_4244_11729_0
7/26/2008 12:36:24 PM|rosetta@home|Starting task t498__BOINC_SYMMETRY_D2SYMM_FOLD_AND_DOCK_RELAX-t498_-_4244_11729_0 using rosetta_beta version 598

Both of them have the CPU time approximately equal to the To completion time, yet one of them shows progress of 29.727% and the other shows a progress of 39.706%. For the faah workunit, both the CPU time and the To completion time are around 16 hours.

There doesn't seem to be any other problem with letting both of them continue for now, so that's what I'll try.

BOINC doesn't seem to give me any way to copy lines from the Tasks screen, or I'd show the results of the Progress calculation in more detail.

[Edit 1 times, last edit by robertmiles at Jul 26, 2008 11:22:11 PM]
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline
Hello robertmiles,
A number of members have recently commented that the new FAAH work units are much longer. Just let them run as long as they show progress.
Lawrence
Rickjb
Veteran Cruncher Australia Joined: Sep 17, 2006 Post Count: 666 Status: Offline
These longer work units also have names starting with "faah499x_" or "faah500x_", so they seem to be part of an experiment not yet listed on http://fightaidsathome.scripps.edu/status . That page was last updated 25/6/08. The last experiment described there is #22, with WUs 4417-4622.

I don't know about other crunchers, but I like to be kept informed on the progress of the science projects I crunch. Provided the scientists put forward a plausible case that the work is worthwhile, I expect that keeping us more up to date would help reduce the cruncher drop-out rate and help in recruiting new crunchers.
Grendel90
Advanced Cruncher Wales, UK Joined: Jun 8, 2007 Post Count: 54 Status: Offline
One of the units I have for FAAH is 5004_1hps_1mtr_00_1 - I've noticed that it seems to 'jump around' percentage wise... Last night it was on 74.523%, this morning 33.679%, now on 93.898%! Something wrong with it? All other work units are progressing as expected.

Guess I'll just leave it running and hope for the best... though I'd have hoped for it to be finished by the time it reached 30-odd hours - easily my longest unit so far!
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline
It is very hard to understand how a unit running uninterrupted, without a boot or hibernation in between, goes back from 74% to 33%... sure it was the same unit and not a new job fetched overnight? (See Result Status pages.) That said, some of these jobs, non-deterministic as they are, show very odd progress behaviour, and as long as you're now near 100%, things should be considered okay.

WCG
Please help to make the Forums an enjoyable experience for All!

[Edit 1 times, last edit by Sekerob at Jul 27, 2008 11:35:06 AM]
Grendel90
Advanced Cruncher Wales, UK Joined: Jun 8, 2007 Post Count: 54 Status: Offline
Well, it finally finished, claiming some 403.5 in credit. Easily the largest and longest unit I've ever had. I can't think why it jumped about: there were no reboots, and sitting watching it I noticed it kept rolling back the percentage - almost as if it was rechecking its calculations.

Oh well, onwards and upwards! And thanks for the help!
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline
Hello Grendel90,

"sitting watching it I noticed it kept rolling back the percentage - almost as if it was rechecking its calculations"

That is what all FAAH work units do, to some extent. They make a guess and put the n(th) atom in a position, then they let the other atoms move where seems best - until they suddenly discover that a later atom just HAS to go where another atom already is. Then they go all the way back to the original n(th) atom and put it in another position, since the first guess was illegal. They unwind all the progress accrued after the n(th) atom was originally placed and start over again.

That said, it is still unusual to see them lose more than 4% or so at one time. The calculations are non-deterministic, but they usually catch illegal moves fairly quickly.
Lawrence
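The guess-and-unwind behaviour described above can be illustrated with a toy backtracking search. This is only a sketch under loud assumptions: it is not the FAAH code, the "pieces on a grid" puzzle (the classic N-queens problem) merely stands in for atoms, and the name `place_all` and the progress formula (pieces placed divided by pieces needed) are made up for illustration. It shows how a progress figure can fall whenever an earlier guess has to be undone.

```python
def place_all(n):
    """Toy backtracking search: place n pieces, one per row, with no conflicts.

    Returns (solved, placed, progress_log). progress_log records the
    fraction of pieces placed after every placement AND every undo,
    so drops in the log correspond to the search unwinding a bad guess.
    """
    placed = []        # placed[row] = column chosen for that row
    progress_log = []  # hypothetical "percent done" trace, as fractions

    def conflicts(row, col):
        # A new piece conflicts if it shares a column or a diagonal
        # with any piece that is already placed.
        return any(c == col or abs(c - col) == abs(r - row)
                   for r, c in enumerate(placed))

    def solve(row):
        if row == n:
            return True
        for col in range(n):
            if not conflicts(row, col):
                placed.append(col)                      # make a guess
                progress_log.append(len(placed) / n)
                if solve(row + 1):
                    return True
                placed.pop()                            # guess failed: unwind it
                progress_log.append(len(placed) / n)
        return False

    solved = solve(0)
    return solved, placed, progress_log


if __name__ == "__main__":
    solved, placed, log = place_all(6)
    print("solved:", solved)
    print("progress trace (%):", [round(100 * p) for p in log])
```

Running it prints a trace that climbs, falls back, and climbs again before reaching 100, which is the same pattern Grendel90 saw on the Tasks screen, although the real application's search is far more complex than this puzzle.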
Grendel90
Advanced Cruncher Wales, UK Joined: Jun 8, 2007 Post Count: 54 Status: Offline
I used to just leave WCG totally minimised so I'd never noticed; it was only because this one unit was taking so long that I started checking it. Thanks for letting me know that it was just doing its thing and that there's nothing wrong with my system!

Thanks for letting me know! And to
robertmiles
Senior Cruncher US Joined: Apr 16, 2008 Post Count: 443 Status: Offline
The rosetta_beta workunit already finished, with a work time of 21,643.92 seconds, with no apparent problem other than a poor ratio of granted credit to requested credit.

The faah workunit is still running, with over 40 hours CPU time so far and the estimated time to completion only down to around 10 hours. If I remember correctly, the estimated time for this workunit started out as about 15 hours, so part of the problem could be that the estimated time was much lower than it should be.
robertmiles
Senior Cruncher US Joined: Apr 16, 2008 Post Count: 443 Status: Offline
The faah workunit is still going, with 51 hours CPU time, 96% progress, and 2 hours to completion.
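As a rough sanity check on the figures reported in this thread, one can compare the displayed progress with the progress implied by CPU time and time to completion. The sketch below assumes a simple linear model (progress = CPU time / (CPU time + time to completion)); that model and the helper name `implied_progress` are assumptions for illustration only, not a description of how BOINC actually computes its estimate.

```python
def implied_progress(cpu_hours, remaining_hours):
    """Percent complete implied by elapsed and remaining time.

    Assumes work proceeds at a constant rate, which the FAAH work
    units discussed in this thread evidently do not.
    """
    total = cpu_hours + remaining_hours
    if total <= 0:
        raise ValueError("total time must be positive")
    return 100.0 * cpu_hours / total


if __name__ == "__main__":
    # Figures quoted in the posts above (hours are approximate).
    print(implied_progress(16, 16))  # first report: displayed progress was 29.727%
    print(implied_progress(40, 10))  # second report
    print(implied_progress(51, 2))   # last report: displayed progress was 96%
```

Under this assumption the first report implies 50% while the client displayed about 30%, but by the last report the implied figure (about 96.2%) agrees with the displayed 96%, which suggests the estimate settled down as the unit neared completion.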