World Community Grid Forums
Thread Status: Active Total posts in this thread: 8
robertmiles
Senior Cruncher US Joined: Apr 16, 2008 Post Count: 443 Status: Offline
Is there a problem with this workunit?
2/1/2009 5:05:01 PM|World Community Grid|Restarting task mf189_00038_13 using hpf2 version 603
It's already run over 23 CPU hours on my machine, and BOINC manager says that it's expected to take over 14 CPU hours more. However, its progress appears stuck at 50.807%. I don't think I've seen any of the proteome workunits take more than perhaps 24 CPU hours before. Also, it seems to be running longer than usual without being suspended: I have BOINC set to decide which workunit gets the next timeslice every two hours, but it looks like this one's been running for at least 19 CPU hours without giving any other workunit a timeslice.
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline
robertmiles, just stop the BOINC service / exit BOINC and restart. It's nearly guaranteed to resume a few percentage points lower, discarding the time lost cycling in the loop. After that it should quickly pass that threshold of 50.807% and finish. It's an old problem that one HPF2 cruncher or another runs into now and then.
----------------------------------------
WCG
Please help to make the Forums an enjoyable experience for All!
robertmiles
Senior Cruncher US Joined: Apr 16, 2008 Post Count: 443 Status: Offline
I've manually suspended it for long enough for another workunit to start getting a timeslice. I don't know whether it makes a difference that I recently lowered the percentage of CPU time for BOINC from 100% to 98% in order to help look for a problem in Ralph@home workunits.
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline
That won't work if you have "Leave application in memory when pre-empted" on. The science application needs to unload from memory for the task to resume from the last good checkpoint.
FAQ for those who do not wish to follow the short instruction: http://worldcommunitygrid.org/forums/wcg/viewthread?thread=16378
cheers
robertmiles
Senior Cruncher US Joined: Apr 16, 2008 Post Count: 443 Status: Offline
Just suspending for a few minutes didn't help, so I suspended everything and rebooted. It's now at 50.765% progress, with only about 5 CPU hours reported as used so far, and less than 6 CPU hours estimated as needed for completion. I'll watch it to see if this problem happens again.
robertmiles
Senior Cruncher US Joined: Apr 16, 2008 Post Count: 443 Status: Offline
Sekerob wrote: That won't work if you have "Leave application in memory when pre-empted" on. The science needs to unload to get it to resume from the last good checkpoint. FAQ for those who do not wish to follow the short instruction: http://worldcommunitygrid.org/forums/wcg/viewthread?thread=16378 cheers
I have that option turned on. It looks like the reboot procedure I used is an even more thorough, and easier to remember, way of doing essentially the same thing to BOINC workunits.
rilian
Veteran Cruncher Ukraine - we rule! Joined: Jun 17, 2007 Post Count: 1453 Status: Offline
I think there is some info about such WUs in this topic:
http://www.worldcommunitygrid.org/forums/wcg/viewthread?thread=22340
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline
Good news on the early-warning front: the writer of BOINCTasks has put a progress monitor function on his to-do list. Once it is implemented, you will be able to set, for instance, 30 minutes of run time coupled to a 0.2% minimum progress, so that a task which advances less than that while its CPU time consumption remains normal gets flagged. BOINCTasks [v 0.42] already has a warning function for low CPU use, i.e. a job that is allowed 100% CPU but is using hardly any.
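For anyone who wants to see what such a rule might look like, here is a rough Python sketch of the idea. It is not how BOINCTasks does or will do it: the `Sample` class, the `is_stalled` function, and the `min_cpu_fraction` cut-off for "normal" CPU use are all made up for illustration. Only the 30-minute window and the 0.2% minimum come from the post above.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One observation of a running task (hypothetical structure)."""
    wall_seconds: float   # wall-clock time since monitoring began
    cpu_seconds: float    # CPU time the task has consumed so far
    progress_pct: float   # progress as shown in the manager, 0-100


def is_stalled(samples, window_seconds=30 * 60, min_progress_pct=0.2,
               min_cpu_fraction=0.5):
    """Return True if the task used CPU for a full window without progressing.

    The newest sample is compared with the most recent sample that is at
    least window_seconds older. min_cpu_fraction is an assumed definition
    of "normal" CPU use: CPU time must grow by at least that share of the
    elapsed wall-clock time.
    """
    if len(samples) < 2:
        return False
    latest = samples[-1]
    # Walk backwards to find the most recent sample at least one window older.
    baseline = None
    for s in reversed(samples[:-1]):
        if latest.wall_seconds - s.wall_seconds >= window_seconds:
            baseline = s
            break
    if baseline is None:
        return False  # not enough history yet
    wall = latest.wall_seconds - baseline.wall_seconds
    cpu_used = latest.cpu_seconds - baseline.cpu_seconds
    progress = latest.progress_pct - baseline.progress_pct
    cpu_is_normal = cpu_used >= min_cpu_fraction * wall
    return cpu_is_normal and progress < min_progress_pct


# A task sitting at 50.807% while still burning CPU, and a healthy one.
stuck = [Sample(0, 0, 50.807), Sample(900, 890, 50.807), Sample(1800, 1780, 50.807)]
healthy = [Sample(0, 0, 50.0), Sample(1800, 1780, 52.5)]
print(is_stalled(stuck))    # True
print(is_stalled(healthy))  # False
```

A task that is making no progress because it is getting almost no CPU is deliberately not flagged here, since that is the separate low-CPU warning mentioned above.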