Thread Status: Active
Total posts in this thread: 28
This topic has been viewed 18246 times and has 27 replies
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Stuck unit?

I've got about half 7.24 and half 7.26 jobs. I've yet to get a stuck unit here. The 7.24's seem to start and finish normally. I will keep an eye on the 7.24's and post any details here if I do get one that sticks.

Edit: The 7.24's I have end in 2 or 3.
----------------------------------------
[Edit 1 times, last edit by Former Member at Nov 22, 2013 6:23:16 PM]
[Nov 22, 2013 6:19:13 PM]
Warped@RSA
Senior Cruncher
South Africa
Joined: Jan 15, 2006
Post Count: 422
Status: Offline
Re: Stuck unit?

This work unit seems to be a problem:
https://secure.worldcommunitygrid.org/ms/devi...s.do?workunitId=886577611

After more than 8 hours of running, I am going to abort my attempt.
I arrived home to find it on 100%, with no checkpoint after more than 7 hours. Tried suspend/restart and no change.

System details as follows:
BOINC client version 7.0.64 for windows_x86_64
Processor: Intel(R) Core(TM) i7-3770S CPU @ 3.10GHz
OS: Microsoft Windows 7: Home Premium x64 Edition, Service Pack 1, (06.01.7601.00)
Memory: 7.90 GB physical, 15.79 GB virtual
HT off
----------------------------------------
Dave


[Nov 22, 2013 6:31:07 PM]
nanoprobe
Master Cruncher
Classified
Joined: Aug 29, 2008
Post Count: 2998
Status: Offline
Re: Stuck unit?

Here is another one that is still running @100% complete with no checkpoints
https://secure.worldcommunitygrid.org/ms/devi...s.do?workunitId=886322286

1. Win7 64 bit SP1
2. BOINC 7.2.11 x64
3. Yes. Run time continues to go up; remaining time @ 0:00
4. Commandline = projects/www.worldcommunitygrid.org/wcgrid_mcm1_7.24_windows_x86_64 -SettingsFile MCM1_0000058_6406.txt -DatabaseFile dataset-17_72_SDG_v1.txt
Initializing
wcg_learn_limit = 500000
Running
5. Only message is: 22-Nov-2013 14:56:55 [World Community Grid] Starting task MCM1_0000058_6406_3 using mcm1 version 724 in slot 7
6. Client settings: all CPUs/100%, LAIM off
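
For anyone who wants to pull the task name, app version and slot number out of a message like the one in item 5, here is a minimal Python sketch. The regular expression is inferred only from the single "Starting task" line quoted above, so the pattern is an assumption rather than a documented BOINC log format, and `parse_start_line` is just a hypothetical helper name.

```python
import re

# Pattern inferred from the one "Starting task" line quoted in this post;
# it is an assumption, not a documented BOINC log format.
START_RE = re.compile(
    r"Starting task (?P<task>\S+) using (?P<app>\S+) "
    r"version (?P<version>\d+) in slot (?P<slot>\d+)"
)


def parse_start_line(line):
    """Return task, app, version and slot from a 'Starting task' line, or None."""
    match = START_RE.search(line)
    if match is None:
        return None
    return {
        "task": match.group("task"),
        "app": match.group("app"),
        "version": int(match.group("version")),
        "slot": int(match.group("slot")),
    }


line = (
    "22-Nov-2013 14:56:55 [World Community Grid] Starting task "
    "MCM1_0000058_6406_3 using mcm1 version 724 in slot 7"
)
info = parse_start_line(line)
print(info)
```

The slot number points at the matching `slots/<n>` folder in the BOINC data directory, which is where the task's stderr.txt lives.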

Since I'm #3 with no others reporting I believe this is one of those duds I mentioned earlier that has no chance of finishing. Suspending for now, will abort if the techs have no solution.
----------------------------------------
In 1969 I took an oath to defend and protect the U S Constitution against all enemies, both foreign and Domestic. There was no expiration date.


----------------------------------------
[Edit 2 times, last edit by nanoprobe at Nov 22, 2013 11:21:45 PM]
[Nov 22, 2013 10:09:21 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Stuck unit?

I notice that the dodgy jobs in this thread all seem to be running on 64-bit OSes. I wonder if that is relevant? Maybe, maybe not.
----------------------------------------
[Edit 1 times, last edit by Former Member at Nov 22, 2013 10:48:44 PM]
[Nov 22, 2013 10:48:25 PM]
Sgt.Joe
Ace Cruncher
USA
Joined: Jul 4, 2006
Post Count: 7666
Status: Offline
Re: Stuck unit?

I notice that the dodgy jobs in this thread all seem to be running on 64-bit OSes. I wonder if that is relevant? Maybe, maybe not.

Interesting point. I have two systems running a 32-bit version of Linux, and they have not yet experienced the problem. All three of my 64-bit systems, while having only one stuck unit, have a number of invalids. No invalids on the 32-bit systems. Strange.
Cheers
----------------------------------------
Sgt. Joe
*Minnesota Crunchers*
[Nov 23, 2013 2:27:26 AM]
nanoprobe
Master Cruncher
Classified
Joined: Aug 29, 2008
Post Count: 2998
Status: Offline
Re: Stuck unit?

I notice that the dodgy jobs in this thread all seem to be running on 64-bit OSes. I wonder if that is relevant? Maybe, maybe not.

Interesting point. I have two systems running a 32-bit version of Linux, and they have not yet experienced the problem. All three of my 64-bit systems, while having only one stuck unit, have a number of invalids. No invalids on the 32-bit systems. Strange.
Cheers

I've had 3 of these tasks that reach 100% but keep running: 1 on XP 32-bit and 2 on Win7 64-bit, and all 3 were 7.24 WUs. No invalids so far on the 7.26 WUs. I've had a couple on the 7.24.
----------------------------------------
In 1969 I took an oath to defend and protect the U S Constitution against all enemies, both foreign and Domestic. There was no expiration date.


[Nov 23, 2013 3:36:30 AM]
Nevada Bob
Senior Cruncher
USA
Joined: Dec 8, 2008
Post Count: 174
Status: Offline
Re: Stuck unit?

MCM1_0000042_0250_2

The above unit ran for 51 hours 50 minutes. It says it is 100% completed but keeps showing an increasing time total, as if it is still crunching. Two other computers show "No Reply" and one other computer is still computing. I don't know what else I can dig up that would help you guys look into it.
But this one needs to be reported as a problem.
[Nov 23, 2013 8:11:30 AM]
sartaonline
Advanced Cruncher
USA
Joined: Sep 24, 2013
Post Count: 96
Status: Offline
Re: Stuck unit?

Same issue, it seems: a unit has been stuck running at 100% for 28 hours so far.

1. Operating System and Version: Windows 7 Professional 64-bit
BOINC Client version: 7.2.28
2. Does the process continue incrementing CPU time (Windows use task manager, Unix use top or ps): Yes
3. Locate the slot dir it is running in and check the contents of stderr.txt post its contents:
Commandline = projects/www.worldcommunitygrid.org/wcgrid_mcm1_7.24_windows_x86_64 -SettingsFile MCM1_0000060_9013.txt -DatabaseFile dataset-17_72_SDG_v1.txt
Initializing
wcg_learn_limit = 500000
Running

4. Search the log file stdoutdae.txt in the boinc data directory for any entries with the workunit id and post any messages:
Line 22667: 21-Nov-2013 20:17:52 [World Community Grid] Started download of MCM1_0000060_9013_MCM1_0000060_9013.txt
Line 22669: 21-Nov-2013 20:17:53 [World Community Grid] Finished download of MCM1_0000060_9013_MCM1_0000060_9013.txt
Line 22740: 22-Nov-2013 02:36:08 [World Community Grid] Starting task MCM1_0000060_9013_3 using mcm1 version 724 in slot 1

5. What are the BOINC client settings for CPU throttle and LAIM
CPU is set to 100% after 1 minute of no activity, using 100% of all CPUs. LAIM (leave applications in memory while suspended) was OFF; after reading up on what it is, I have enabled it.
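
In case it helps others with step 4, here is a small Python sketch of the log search. It only assumes that stdoutdae.txt is a plain text file in the BOINC data directory, as the checklist says; the helper name `find_task_lines` and the relative path in the example are my own placeholders, so point it at wherever your data directory actually is.

```python
from pathlib import Path


def find_task_lines(log_path, task_name):
    """Return (line_number, text) pairs for log lines that mention task_name."""
    hits = []
    # errors="replace" keeps the scan going if the log has odd bytes in it.
    with open(log_path, encoding="utf-8", errors="replace") as handle:
        for number, text in enumerate(handle, start=1):
            if task_name in text:
                hits.append((number, text.rstrip("\n")))
    return hits


if __name__ == "__main__":
    # Hypothetical location: the BOINC data directory differs per install.
    log = Path("stdoutdae.txt")
    if log.exists():
        for number, text in find_task_lines(log, "MCM1_0000060_9013"):
            print(f"Line {number}: {text}")
```

Each matching line is reported once, even when the work unit name appears twice in it, as it does in the download messages.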
----------------------------------------
----------------------------------------
[Edit 3 times, last edit by sartaonline at Nov 23, 2013 4:13:01 PM]
[Nov 23, 2013 4:01:22 PM]