World Community Grid Forums
Category: Completed Research Forum: The Clean Energy Project - Phase 2 Forum Thread: New work units resetting back to 0% complete after power down |
Thread Status: Active Total posts in this thread: 9
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Can anyone tell me why the new CEP2 work units reset to 0% work completed after the computer powers down?
This is something new I have not seen before. The MMC and FAAH units went back some, but the new CEP2 work units went from 87% completed all the way back to 0%... What's up with that? Robert |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
As long as no checkpoint has been made, yes, the task will resume from zero percent. The CEP2 sub-jobs were reordered to run the longest one at the start, so you know what you have to do: check the task properties to verify that a task has checkpointed, or switch on checkpoint_debug logging in the event log, to choose a good moment to power cycle. And when doing one, it is best to exit the BOINC core agent completely beforehand. If the OS shuts down too quickly, CEP2 might not have enough time to save the intermediate results properly.
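To make the "back to 0%" effect concrete, here is a minimal Python sketch of checkpoint-and-resume. It is not CEP2 or BOINC code: the function name, the job durations, and the rule that a checkpoint is written only when a sub-job finishes are all assumptions for illustration.

```python
def resume_point(job_hours, hours_run):
    """Return the hours of work that survive a restart.

    Assumption (hypothetical): a checkpoint is written only when a
    sub-job finishes, so work inside an unfinished sub-job is lost.
    """
    saved = 0.0
    for duration in job_hours:
        if saved + duration <= hours_run:
            saved += duration  # sub-job finished, checkpoint written
        else:
            break  # interrupted mid-job, nothing further is saved
    return saved


# Hypothetical package with the longest sub-job first.
jobs = [6.0, 3.0, 2.0, 1.0]

print(resume_point(jobs, 5.5))  # 0.0: the first sub-job never finished
print(resume_point(jobs, 9.5))  # 9.0: two sub-jobs were checkpointed
```

With the longest sub-job at the front, a power cycle during that first stretch loses everything, which would match what Robert saw.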
|
||
|
PMH_UK
Veteran Cruncher UK Joined: Apr 26, 2007 Post Count: 741 Status: Offline |
Robert,
Work resumes at the last checkpoint, but for some tasks checkpoints are infrequent. CEP2 can go hours before checkpointing, especially before the first one. I suggest using hibernate if you need to power off, or waiting for a checkpoint if you need to reboot.
Paul.
|
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Thanks for both responses.... had to power down since my RAM chips came in.
|
||
|
Mike.Gibson
Ace Cruncher England Joined: Aug 23, 2007 Post Count: 11791 Status: Offline |
87% seems high for you not to have reached a check-point. Was your system slow? What processor and RAM were you using and how much RAM have you added?
Mike |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
The CEP2 progress is prorated against the maximum runtime of 18 hours, not against how many jobs in the package were processed. This could have changed, although no new app version has been launched for the latest library yet.
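As a rough sketch of what prorating against runtime means, assuming the 18-hour limit above: the function name and the cap at 100% are my own assumptions, not the actual CEP2 or BOINC formula.

```python
MAX_RUNTIME_HOURS = 18.0  # the runtime limit mentioned in this post


def prorated_progress(elapsed_hours, max_hours=MAX_RUNTIME_HOURS):
    """Percent complete based only on elapsed time, capped at 100 (assumed)."""
    return min(elapsed_hours / max_hours, 1.0) * 100.0


print(prorated_progress(9.0))  # 50.0, regardless of how many jobs finished
```

So under this scheme a high percentage only says how long the task has been running, not that a checkpoint has been reached.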
|
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Okay, I can't quote everyone, so here goes.
Older 2001 Lenovo, dual CPU (not software split), 1.67GHz, had 2GB RAM, now at 3GB. I could have been wrong on the 87%; that could have been a FAAH unit. But I noticed later it didn't checkpoint until about 60%. The CEP2 I ran failed twice on other systems; mine is pending validation, but it went over 18 hours and kicked out at 98% done.... whatever that shaped me. Back to welding. Robert |
||
|
Speedy51
Veteran Cruncher New Zealand Joined: Nov 4, 2005 Post Count: 1220 Status: Offline |
The thing you were describing, the task hitting the 18-hour cutoff at 98% complete, has happened to a few people. I can understand how annoying it is; however, this is perfectly normal behavior. Purely out of interest, can you remember how many jobs the task had completed when it reached the hard limit? |
||
|
Mike.Gibson
Ace Cruncher England Joined: Aug 23, 2007 Post Count: 11791 Status: Offline |
Robert
Your system is fairly slow by modern standards, so I would suggest that you hold off any shutdown or restart until you finish a CEP2 unit. Your extra RAM should enable you to run a second unit simultaneously, but I would suggest that you don't enable that feature, because that would mean having 2 units at different stages, so you wouldn't know when you could close down. Mike |
||
|
|