Total posts in this thread: 2
This topic has been viewed 1482 times and has 1 reply
ca05065
Senior Cruncher
Joined: Dec 4, 2007
Post Count: 325
Problem with checkpoint restart

I had a blue screen of death (BSOD) on my Windows PC.
After restarting the PC, two OET workunits remembered the elapsed time but lost the CPU time accumulated before the failure and started again from the beginning of the workunit.
In other words, the workunits did not restart from their latest checkpoint.
Here is an extract from the stderr file:

Result Name: OET1_ 0000064_ xSDGP_ 0989_ 1--
<core_client_version>7.2.42</core_client_version>
<![CDATA[
<stderr_txt>
INFO: No state to restore. Start from the beginning.
[17:11:51] Number of tasks = 84
[17:11:51] Starting task 0,CPU time is 0.000000.
[17:11:51] ./ZINC17022860.pdbqt size = 32 8 ../../projects/www.worldcommunitygrid.org/oet1.xSDGP.pdbqt size = 2493 0
[17:15:52] Vina exited normal 0.
[17:15:52] Finished task #0 cpu time used 237.137120
[17:15:52] Starting task 1,CPU time is 237.137120.
.
.
.
[18:05:46] Finished task #32 cpu time used 123.802394
[18:05:46] Starting task 33,CPU time is 3185.524820.
[18:05:46] ./ZINC17022966.pdbqt size = 25 6 ../../projects/www.worldcommunitygrid.org/oet1.xSDGP.pdbqt size = 2493 0
INFO: No state to restore. Start from the beginning.
[18:12:02] Number of tasks = 84
[18:12:02] Starting task 0,CPU time is 0.000000.
[18:12:02] ./ZINC17022860.pdbqt size = 32 8 ../../projects/www.worldcommunitygrid.org/oet1.xSDGP.pdbqt size = 2493 0
[18:16:14] Vina exited normal 0.
[18:16:14] Finished task #0 cpu time used 238.416328
[18:16:14] Starting task 1,CPU time is 238.416328.
.
.
.
[20:27:52] Vina exited normal 0.
[20:27:52] Finished task #83 cpu time used 77.236095
20:27:52 (1956): called boinc_finish

</stderr_txt>
]]>
[Dec 5, 2014 11:17:39 PM]
armstrdj
Former World Community Grid Tech
Joined: Oct 21, 2004
Post Count: 695
Re: Problem with checkpoint restart

We are working on some updates to the checkpointing code and will take a look at this issue.

Thanks,
armstrdj
[Dec 8, 2014 3:28:44 PM]