Thread Status: Active
Total posts in this thread: 25
This topic has been viewed 1710 times and has 24 replies
Mike.Gibson
Ace Cruncher
England
Joined: Aug 23, 2007
Post Count: 11818
Status: Offline
CEP2 Crashing

I have had the following happen twice in 2 days:

02/07/2013 19:11:17 | World Community Grid | Task E214114_340_C.36.C29H13N3S4.00446664.3.set1d06_0 exited with zero status but no 'finished' file
02/07/2013 19:11:17 | World Community Grid | If this happens repeatedly you may need to reset the project.
02/07/2013 19:11:17 | World Community Grid | Task E214164_855_C.35.C27H16N4S2Si2.01505869.1.set1d06_0 exited with zero status but no 'finished' file
02/07/2013 19:11:17 | World Community Grid | If this happens repeatedly you may need to reset the project.
02/07/2013 19:11:17 | World Community Grid | Task E214162_747_C.37.C28H12N6OS2.01183399.2.set1d06_0 exited with zero status but no 'finished' file
02/07/2013 19:11:17 | World Community Grid | If this happens repeatedly you may need to reset the project.
02/07/2013 19:11:17 | World Community Grid | Task E214164_770_C.34.C29H18S3Si2.01160521.1.set1d06_0 exited with zero status but no 'finished' file
02/07/2013 19:11:17 | World Community Grid | If this happens repeatedly you may need to reset the project.
02/07/2013 19:11:17 | World Community Grid | Task E214163_998_C.36.C29H14N4S3.01369567.1.set1d06_0 exited with zero status but no 'finished' file
02/07/2013 19:11:17 | World Community Grid | If this happens repeatedly you may need to reset the project.
02/07/2013 19:11:17 | World Community Grid | Task E214164_567_C.36.C27H14N6S2Si.01351458.2.set1d06_0 exited with zero status but no 'finished' file
02/07/2013 19:11:17 | World Community Grid | If this happens repeatedly you may need to reset the project.
02/07/2013 19:11:17 | World Community Grid | Task E214163_853_C.36.C30H14N2OS3.01541700.1.set1d06_0 exited with zero status but no 'finished' file
02/07/2013 19:11:17 | World Community Grid | If this happens repeatedly you may need to reset the project.
02/07/2013 19:11:17 | World Community Grid | Computation for task E214114_167_C.35.C26H13N5S2SeSi.00998821.2.set1d06_0 finished
02/07/2013 19:11:17 | World Community Grid | Restarting task E214114_340_C.36.C29H13N3S4.00446664.3.set1d06_0 using cep2 version 640 in slot 5
02/07/2013 19:11:48 | World Community Grid | Restarting task E214164_855_C.35.C27H16N4S2Si2.01505869.1.set1d06_0 using cep2 version 640 in slot 7
02/07/2013 19:11:48 | World Community Grid | Restarting task E214162_747_C.37.C28H12N6OS2.01183399.2.set1d06_0 using cep2 version 640 in slot 3
02/07/2013 19:11:48 | World Community Grid | Restarting task E214164_770_C.34.C29H18S3Si2.01160521.1.set1d06_0 using cep2 version 640 in slot 4
02/07/2013 19:11:48 | World Community Grid | Restarting task E214163_998_C.36.C29H14N4S3.01369567.1.set1d06_0 using cep2 version 640 in slot 1
02/07/2013 19:11:48 | World Community Grid | Restarting task E214164_567_C.36.C27H14N6S2Si.01351458.2.set1d06_0 using cep2 version 640 in slot 2
02/07/2013 19:11:48 | World Community Grid | Restarting task E214163_853_C.36.C30H14N2OS3.01541700.1.set1d06_0 using cep2 version 640 in slot 0
02/07/2013 19:11:48 | World Community Grid | Starting task E214164_221_C.36.C30H14N2OS3.01297820.1.set1d06_0 using cep2 version 640 in slot 6
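A log excerpt like the one above can be summarised with a short script to see which tasks were hit. This is only a rough sketch: it assumes the `date time | project | message` layout shown in these lines, and the function name `tasks_without_finished_file` is made up for illustration, not part of BOINC.

```python
from collections import Counter

# Marker text copied from the log lines above.
MARKER = " exited with zero status but no 'finished' file"

def tasks_without_finished_file(log_text):
    """Count, per task name, how often the 'no finished file' message appears."""
    counts = Counter()
    for raw in log_text.splitlines():
        # Assumed layout: "<date time> | <project> | <message>"
        parts = [p.strip() for p in raw.split("|", 2)]
        if len(parts) != 3:
            continue
        message = parts[2]
        if message.startswith("Task ") and message.endswith(MARKER):
            task_name = message[len("Task "):-len(MARKER)]
            counts[task_name] += 1
    return counts

sample = """\
02/07/2013 19:11:17 | World Community Grid | Task E214114_340_C.36.C29H13N3S4.00446664.3.set1d06_0 exited with zero status but no 'finished' file
02/07/2013 19:11:17 | World Community Grid | If this happens repeatedly you may need to reset the project.
02/07/2013 19:11:17 | World Community Grid | Computation for task E214114_167_C.35.C26H13N5S2SeSi.00998821.2.set1d06_0 finished
"""

for task, n in tasks_without_finished_file(sample).items():
    print(task, n)
```

Running it over the log from both days would show whether the same tasks keep restarting or whether every running task is affected at once.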

On both occasions, more than half the work had been completed for most of the units when it happened.

Other units have finished correctly as did one in the listing above.

Does anyone have any suggestions to stop it, please?

I have an Intel i7-3770 CPU @ 3.40GHz with hyperthreading.

Mike
[Jul 2, 2013 10:09:55 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: CEP2 Crashing

That is a lot of CEP2 tasks to run simultaneously. What are your system specs, as shown during BOINC startup? Are your profile settings big enough to keep your system running? You can always check that by cutting your thread count in half.

Lawrence
[Jul 3, 2013 12:39:17 AM]
Mike.Gibson
Ace Cruncher
England
Joined: Aug 23, 2007
Post Count: 11818
Status: Offline
Re: CEP2 Crashing

Lawrence

02/07/2013 03:43:00 | | Processor: 8 GenuineIntel Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz [Family 6 Model 58 Stepping 9]
02/07/2013 03:43:00 | | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 cx16 sse4_1 sse4_2 popcnt aes syscall nx lm vmx smx tm2 pbe
02/07/2013 03:43:00 | | OS: Microsoft Windows 7: Professional x64 Edition, Service Pack 1, (06.01.7601.00)
02/07/2013 03:43:00 | | Memory: 3.81 GB physical, 7.62 GB virtual
02/07/2013 03:43:00 | | Disk: 916.37 GB total, 852.04 GB free

Thanks

Mike
[Jul 3, 2013 2:36:49 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: CEP2 Crashing

You could try reading this
[Jul 3, 2013 4:58:07 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: CEP2 Crashing

Also look at the system requirements page at https://secure.worldcommunitygrid.org/help/viewTopic.do?shortName=minimumreq which asks for 1 GB RAM per CEP2 thread. As I suspected, your system has a lot less than 8 GB memory. CEP2 is very memory-hungry. Try reducing the number of CEP2 threads.
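The arithmetic behind that can be sketched in a few lines. This is only an illustration: the 1 GB per thread figure is the guideline quoted above, the memory and thread numbers come from the startup log posted earlier, and `max_cep2_threads` is a made-up helper name. It also ignores whatever memory the OS and other programs need, so the real headroom is smaller.

```python
def max_cep2_threads(physical_ram_gb, ram_per_thread_gb=1.0):
    """How many CEP2 threads fit under a per-thread RAM guideline (rounded down)."""
    return int(physical_ram_gb // ram_per_thread_gb)

ram_gb = 3.81   # "Memory: 3.81 GB physical" from the startup log
threads = 8     # "Processor: 8" from the startup log

print(max_cep2_threads(ram_gb))   # 3
print(threads * 1.0 > ram_gb)     # True: 8 threads would want about 8 GB
```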

Lawrence
[Jul 3, 2013 6:17:14 AM]
candid
Cruncher
Joined: Apr 3, 2013
Post Count: 2
Status: Offline
Re: CEP2 Crashing

I have had several CEP2 crashes too.
But in my case the message I got is different:
Output file E214253_630_C.36.C31H16N2O2S.01246556.1.set1d06_0_0 for task E214253_630_C.36.C31H16N2O2S.01246556.1.set1d06_0 is absent.

I also have this Result Log

Result Name: E214253_630_C.36.C31H16N2O2S.01246556.1.set1d06_0--



<core_client_version>7.0.27</core_client_version>
<![CDATA[
<message>
process got signal 11
</message>
<stderr_txt>
INFO: No state to restore. Start from the beginning.
[07:33:39] Number of jobs = 16
[07:33:39] Starting job 0,CPU time has been restored to 0.000000.
[07:34:13] Starting new Job
[07:34:14] Qink name = fldman
[07:34:15] Qink name = gesman
[07:34:15] Qink name = scfman
[07:57:02] Qink name = anlman
[07:58:34] End of Job
[07:58:35] Finished Job #0
[07:58:35] Starting job 1,CPU time has been restored to 1233.709102.
[07:58:40] Starting new Job
[07:58:40] Qink name = fldman
[07:59:24] Qink name = gesman
[07:59:26] Qink name = scfman
[09:23:52] Qink name = anlman

</stderr_txt>
]]>

I am running two WUs simultaneously and have 2 GB RAM and 3 GB of hard drive space available for BOINC.

Could someone help?
Thanks
----------------------------------------
[Edit 1 times, last edit by candid at Jul 3, 2013 11:50:26 AM]
[Jul 3, 2013 11:47:43 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: CEP2 Crashing

With each WU requiring 1 gig of RAM and you having only 2 gig, it may be cutting it fine. Either give more RAM to BOINC or run only one CEP2 task and see what happens.
[Jul 3, 2013 12:02:44 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: CEP2 Crashing

I also found this
----------------------------------------
[Edit 1 times, last edit by Former Member at Jul 3, 2013 12:08:03 PM]
[Jul 3, 2013 12:07:06 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: CEP2 Crashing

Hello candid,
This does not ring any bells with me. Here is a typical SIGNAL 11 error: https://secure.worldcommunitygrid.org/forums/...ead,30167_offset,0#300160
As you can see on the FAQ at http://boincfaq.mundayweb.com/index.php?view=165 SIGNAL 11 can be caused by many different interrupt signals.

I am interested in why you only have 3 GB of hard disk available for BOINC. As you can see, the system requirements at https://secure.worldcommunitygrid.org/help/viewTopic.do?shortName=minimumreq want 2 GB RAM and 4 GB hard disk for 2 threads of CEP2. These requirements are generally much bigger than are actually required, but if you start going below them you should carefully check your profile settings to make sure that everything is at 100% and not artificially lowering already small allotments of memory and hard disk.
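To make the point about percentages concrete, here is a minimal sketch of how the allotments combine. It assumes the guideline above scales linearly (1 GB RAM and 2 GB disk per CEP2 thread) and treats the profile settings as simple percentages of what is available. The names `cep2_threads_supported`, `ram_pct`, and `disk_pct` are invented for this example and are not BOINC settings. Since the guideline figures are generous, the result is a conservative estimate.

```python
RAM_PER_THREAD_GB = 1.0    # guideline: 2 GB RAM for 2 threads
DISK_PER_THREAD_GB = 2.0   # guideline: 4 GB disk for 2 threads

def cep2_threads_supported(ram_gb, disk_gb, ram_pct=100.0, disk_pct=100.0):
    """Threads that fit once the (hypothetical) percentage limits are applied."""
    usable_ram = ram_gb * ram_pct / 100.0
    usable_disk = disk_gb * disk_pct / 100.0
    by_ram = int(usable_ram // RAM_PER_THREAD_GB)
    by_disk = int(usable_disk // DISK_PER_THREAD_GB)
    # The scarcer resource decides how many threads fit.
    return min(by_ram, by_disk)

print(cep2_threads_supported(2.0, 3.0))                # 1: disk is the limit
print(cep2_threads_supported(2.0, 3.0, ram_pct=50.0))  # 1: RAM and disk both allow one
print(cep2_threads_supported(2.0, 4.0))                # 2: meets the guideline
```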

Lawrence
[Jul 3, 2013 12:14:23 PM]
candid
Cruncher
Joined: Apr 3, 2013
Post Count: 2
Status: Offline
Re: CEP2 Crashing

Many thanks everyone for the quick replies.
I am in fact running low on HD space, but in my experience the 2 GB per WU were more than necessary on my i7 with 8 GB RAM.
But in this case I am running an Atom from a USB stick with no real HD. That is why I have only 3 GB of 'HD' space for BOINC.
The software I used (LiLi USB Creator) does not allow more than 4 GB, and some has to be left for the system (Ubuntu 12).
I am now going to try running only one CEP2 task alongside some other type of WU (faah). I lowered the usage limit to 35%. I always check the option to leave applications in memory while suspended.

While writing this I got an error message from the OS: compiz failed. The OS is asking me whether to continue or relaunch.
I remember that I had seen this before and I never relaunched.
Could this be (part of) the problem?
Other WUs such as faah and dsl did not fail because of this failure in compiz.

Thanks
----------------------------------------
[Edit 2 times, last edit by candid at Jul 3, 2013 12:47:07 PM]
[Jul 3, 2013 12:40:06 PM]