World Community Grid Forums
Thread Status: Active Total posts in this thread: 18
Yarensc
Advanced Cruncher USA Joined: Sep 24, 2011 Post Count: 136 Status: Offline
In the <options> section in the cc_config file you could try adding or increasing the <start_delay> tag
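For illustration, a minimal cc_config.xml with that option might look like the fragment below. The value of 60 seconds is an arbitrary example, not a recommendation from this thread. Note that `<start_delay>` delays running applications after the client starts; whether it also helps on a resume from hibernation is not confirmed here.

```xml
<cc_config>
  <options>
    <!-- Seconds to wait after client startup before running applications.
         60 is an arbitrary example value. -->
    <start_delay>60</start_delay>
  </options>
</cc_config>
```

The client has to re-read its config files (or be restarted) before a change takes effect.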
|
||
|
Rickjb
Veteran Cruncher Australia Joined: Sep 17, 2006 Post Count: 666 Status: Offline
I've recently managed to set the oldest of my machines to hibernate automatically, using Windows Task Scheduler.
And yes, cleanenergy, it does run all night! It hibernates M-F 2pm-8pm, when my electricity costs over AU$0.55/kWh. It runs 2 x CEP2 WUs and 2 x FAAH. It seems to restart OK most of the time, but a few days ago the BOINC client itself somehow crashed and restarted during the Windows resume process. (No "no heartbeat" messages, so I'm a bit off-topic here.)

It has occurred to me that some startup/resume problems could be happening because, on multi-core/multi-processor systems, BOINC starts/resumes all tasks simultaneously. Apart from O/S resource and disk contention issues, this may also cause CPU core voltage undershoot on motherboards with poor CPU voltage regulation, such as the one in my machine in question. Simultaneously trying to restart multiple CEP2 WUs could be causing the "no heartbeat" exits that are the subject of this thread, especially if some of the WUs have not done their early first checkpoint, so that they are trying to create 6499 data files each.

I would like to be able to try a BOINC option to insert a delay between starting/restarting successive tasks: like <start_delay>, but <between_tasks_delay>. Such a feature would also have been extremely useful for GPU-HCC. (Yeh, belongs in the BOINC or Suggestions/Feedback section.)
||
|
Speedy51
Veteran Cruncher New Zealand Joined: Nov 4, 2005 Post Count: 1288 Status: Offline
I have noticed that some of the bigger jobs, for example E214699_051_C.36.C29H17N3OSSi2.01535098.4.set1d06f, ran until the hard stop point. Even if I put my computer into hibernation, when it comes out of hibernation the task starts from the last checkpoint. I lost several hours of computing time due to this the other day. Normally, after my computer resumes from hibernation, the task resumes from the point where the computer was put into hibernation.
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Dear Speedy51,
what *may* be happening is that your computer runs out of RAM while WCG is hibernating and that the sleeping CEP jobs are discarded to recover some space in RAM. That would at least be a possibility and consistent with bigger jobs, more RAM use, and thus a higher chance of needing to discard the sleeping state from it.
Best wishes
Your Harvard CEP team
||
|
Speedy51
Veteran Cruncher New Zealand Joined: Nov 4, 2005 Post Count: 1288 Status: Offline
Thanks for your response. It could be due to lack of memory when the computer is going into hibernation. I have a total of 12 GB of RAM, and I am running 1 CEP2 and 11 FightAIDS@Home tasks along with two GPU tasks. As I write this, with Firefox, Outlook and a few other things open, my computer is currently using 3.87 GB. The CEP2 job is 9.7% complete after 1 hour 16 minutes; however, it is only on job number 2, which started just under an hour ago.
||
|
knreed
Former World Community Grid Tech Joined: Nov 8, 2004 Post Count: 4504 Status: Offline
Rick - I have opened an enhancement request with Berkeley for a feature such as this. Please see: http://boinc.berkeley.edu/trac/ticket/1321
I cannot know at this time when it would be implemented. |
||
|
Rickjb
Veteran Cruncher Australia Joined: Sep 17, 2006 Post Count: 666 Status: Offline
Thanks, KNR
For those of you who are experimenting with hibernating/resuming multiple CEP2 tasks, you might try manually staggering the restart of these tasks:
- Before you hibernate:
.. Suspend all tasks in your cache that are ready to start or waiting to run.
.. Then suspend all running CEP2s.
- Hibernate.
- Resume.
- Resume the suspended CEP2s one at a time.
.. Ideally, you'd wait until the HDD LED goes off before you resume the next WU, but you may be able to shorten that.
- Resume all remaining suspended tasks (don't forget!)
Not useful if you want to automate things, though a boinccmd wiz might be able to come up with a script.
HTH - Rick
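The "resume the suspended CEP2s one at a time" step could be scripted along these lines. This is only a sketch under assumptions: `boinccmd` is on the PATH, the project URL matches what your client reports, the task names are copied by hand from `boinccmd --get_tasks`, and a fixed 120-second pause stands in for "wait until the HDD LED goes off". The helper names are made up for this example.

```python
import subprocess
import time

# Assumption: the project URL exactly as your client reports it
# (check with `boinccmd --get_project_status`).
PROJECT_URL = "http://www.worldcommunitygrid.org/"


def task_command(task_name, action, project_url=PROJECT_URL):
    """Build the argv for `boinccmd --task <url> <task_name> <action>`."""
    if action not in ("suspend", "resume"):
        raise ValueError("action must be 'suspend' or 'resume'")
    return ["boinccmd", "--task", project_url, task_name, action]


def resume_staggered(task_names, delay_s=120, run=subprocess.run, sleep=time.sleep):
    """Resume tasks one at a time, pausing delay_s seconds between them.

    `run` and `sleep` are injectable so the logic can be tried without
    touching a real BOINC client.
    """
    for i, name in enumerate(task_names):
        if i > 0:
            sleep(delay_s)  # crude stand-in for watching the HDD LED
        run(task_command(name, "resume"), check=True)


# Example call (hypothetical task names; use the real ones from your client):
# resume_staggered(["CEP2_task_name_1", "CEP2_task_name_2"], delay_s=120)
```

The same `task_command` helper with `"suspend"` would cover the "before you hibernate" step. The delay is a guess and would need tuning per machine.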
||
|
Crystal Pellet
Veteran Cruncher Joined: May 21, 2008 Post Count: 1320 Status: Offline
@Rick: A minor detail to add, but essential: "Leave applications in memory while suspended" selected in BOINC Manager.