kateiacy
Veteran Cruncher
USA
Joined: Jan 23, 2010
Post Count: 1027
wingman's weird error

Here's the result log from a wingman's WU:

Result Log

Result Name: E202868_ 572_ C.27.C23H13NOS2.00074626.0.set1d06_ 1--

<core_client_version>6.10.58</core_client_version>
<![CDATA[
<stderr_txt>
INFO: No state to restore. Start from the beginning.
[22:12:46] Number of jobs = 16
[22:12:46] Starting job 0,CPU time has been restored to 0.000000.
[22:12:46] Starting new Job
[22:12:46] Qink name = fldman
[22:12:47] Qink name = gesman
[22:12:47] Qink name = scfman
Killing job because cpu time limit has been exceeded. 0.000000||1266874890.181581||0.000000
[22:12:47] Finished Job #0
22:12:47 (8428): called boinc_finish

</stderr_txt>
]]>

CPU time limit exceeded in 1 second?? It's even odder because the matching Results Status entry shows 8.74 hrs CPU time:

E202868_ 572_ C.27.C23H13NOS2.00074626.0.set1d06_ 1-- 640 Error 8/5/11 18:37:29 8/6/11 03:22:04 8.74 66.1 / 0.0

I noticed this because my WU came up as inconclusive -- it had completed all 16 jobs, which obviously doesn't match this!
[Aug 6, 2011 11:02:12 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0

Hello kateiacy,
Logs like that make me think that a reboot would be a good idea for your wingman, in case some memory bit has flipped.

[Shrug]
Lawrence
[Aug 7, 2011 12:51:10 AM]
KWSN - A Shrubbery
Master Cruncher
Joined: Jan 8, 2006
Post Count: 1585

Just a possibility, but your wingperson may have changed their system time. BOINC doesn't always deal well with that happening.
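
One way to probe the system-time idea is to look at the three numbers on the "Killing job" line in the log above. The sketch below splits them apart and, purely as an assumption, reads the middle one as Unix epoch seconds. Nothing in this thread documents what those fields mean, so `split_fields` and `as_utc` are hypothetical helper names and the interpretation may be wrong.

```python
from datetime import datetime, timezone

# Line copied verbatim from the stderr log in the first post.
LOG_LINE = ("Killing job because cpu time limit has been exceeded. "
            "0.000000||1266874890.181581||0.000000")

def split_fields(line):
    """Return the '||'-separated numbers at the end of the line as floats."""
    tail = line.rsplit(" ", 1)[-1]
    return [float(part) for part in tail.split("||")]

def as_utc(seconds):
    """Read a number as Unix epoch seconds (ASSUMPTION, not confirmed by the log)."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)

fields = split_fields(LOG_LINE)
print(fields)
# Only meaningful if the middle field really is a wall-clock timestamp.
print(as_utc(fields[1]).strftime("%Y-%m-%d %H:%M:%S UTC"))
```

Under that assumption the middle value lands in February 2010, well over a year before the 8/5/11 send time shown in the Results Status entry. That would be consistent with a clock problem on the wingman's machine, but it could just as well be a different quantity that happens to look like a timestamp.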
----------------------------------------

Distributed computing volunteer since September 27, 2000
[Aug 7, 2011 1:34:09 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0

I don't know if this is related, but I still have a problem with CEP2 aborting jobs after a reboot. I don't know if there are issues resuming from a checkpoint or what, but it becomes a huge slowdown when 4 jobs try to start at the same time as Windows services, etc. It makes my system basically unusable for 10-15 minutes after the reboot when this happens. It's gotten to the point where I will suspend BOINC, reboot, then cut the number of CPUs to two, let those two start, then set the number of CPUs back to 4. This project is a real pain, and if I didn't think it was so important, I wouldn't be crunching for it. As it is, I may take a break for a while after 2 years of crunch time.
[Aug 15, 2011 12:01:11 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0

Hi pleskinen,

1. BOINC can be set to delay the start of computing, so bootup goes much quicker. This is done by adding or editing the option

<start_delay>60</start_delay>

in the cc_config.xml file; the value is in seconds.

2. Don't know what client you run. Some shut down too quickly on Vista/W7, damaging the WUs. If you have 6.10.58, that is not an issue. When shutting down, you can always stop the BOINC service first so the client has time to store the tasks in progress.

3. CEP2 is opt-in, with a default of 1 per machine, because the tasks are very demanding, certainly when you also want to use the computer without interference [you can always let BOINC pause automatically when there's user input]. You don't have to play with the number of CPUs. You can set the number of CEP2 tasks that are assigned to a machine and let the rest run on something lighter such as HCC, HCMD2, C4CW.
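
For point 1, here is a minimal sketch of adding or updating the option with a script instead of a text editor. It assumes the usual cc_config.xml layout, where `<start_delay>` sits inside an `<options>` element under the `<cc_config>` root; `set_start_delay` is a hypothetical helper name, and the sketch works on XML text only because the location of cc_config.xml (the BOINC data directory) varies by installation.

```python
import xml.etree.ElementTree as ET

def set_start_delay(xml_text, seconds):
    """Return cc_config XML text with <options><start_delay> set to `seconds`."""
    # Start from an empty <cc_config> if no existing text is supplied.
    root = ET.fromstring(xml_text) if xml_text.strip() else ET.Element("cc_config")
    options = root.find("options")
    if options is None:
        options = ET.SubElement(root, "options")
    delay = options.find("start_delay")
    if delay is None:
        delay = ET.SubElement(options, "start_delay")
    delay.text = str(seconds)  # value is in seconds, as noted above
    return ET.tostring(root, encoding="unicode")

print(set_start_delay("<cc_config><options/></cc_config>", 60))
```

To use it, read the existing file's text, pass it through the function, and write the result back. Any other options already in the file are left in place, though comments in the original XML are not preserved by this parser. The delay only applies at client startup, so it takes effect on the next restart.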

Let us know which route you'd like to take with the changes and we'll take it from there.

--//--

edit: bootup
----------------------------------------
[Edit 1 times, last edit by Former Member at Aug 15, 2011 1:29:24 PM]
[Aug 15, 2011 1:24:15 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0

I appreciate the tips. I have added the start_delay parameter to my cc_config. If I can remember to 'snooze' BOINC before reboots, I think things will be much better. I hate to run fewer than 4 WUs at a time since I have 4 physical cores (I could probably run 5 or 6 even), but the occasional dumping of all 4 WUs at the same time has prevented me from running more.

One other unavoidable problem occurs on a reboot when BOINC decides there are a few units in need of "High priority"--it will basically start 4 of those and leave the others suspended. While it will pick those up eventually, you still get the LONG start as a result of starting 4 new units at the same time.

I will keep an eye on things and see if the problem recurs even when I 'snooze' before reboot.
----------------------------------------
[Edit 1 times, last edit by Former Member at Aug 15, 2011 7:42:59 PM]
[Aug 15, 2011 7:42:18 PM]