World Community Grid Forums
Thread Status: Active Total posts in this thread: 98
BKraayev
Cruncher Joined: Mar 23, 2005 Post Count: 46 Status: Offline |
Found the same thing: updated BOINC, and tasks that had been stuck at 0% after 9 hours started to run normally.
||
|
yoro42
Ace Cruncher United States Joined: Feb 19, 2011 Post Count: 8979 Status: Offline |
||
|
Mumak
Senior Cruncher Joined: Dec 7, 2012 Post Count: 477 Status: Offline |
I had about 50 tasks stuck at various levels (from 0% to 99.999-100.0%). Aborted them all.
||
|
ca05065
Senior Cruncher Joined: Dec 4, 2007 Post Count: 325 Status: Offline |
BETA_ugm1_ugm1_00011_0331_1
BETA_ugm1_ugm1_00011_0360_0
BETA_ugm1_ugm1_00011_0367_0
BETA_ugm1_ugm1_00011_0333_1
BETA_ugm1_ugm1_00011_0334_1
I had suspended two after 6 hours but let the others run on. They did reach 100% but kept running, increasing CPU time to over 10 hours. I have aborted them all. |
||
|
OldChap
Veteran Cruncher UK Joined: Jun 5, 2009 Post Count: 978 Status: Offline |
No checkpointing and no time-remaining indication. 1 stuck at 3.93% after 11 hours, 2 at 80% in 2 hours, 1 at 99.95% after 11 hours, and 33 at 100% after 11 hrs 30 mins but still using CPU and not finishing. This is across Windows and Linux Mint, with both the latest BOINC and a slightly older BOINC.
||
|
Ian_UK
Senior Cruncher England Joined: Oct 15, 2006 Post Count: 153 Status: Offline |
BETA_ugm1_ugm1_00010_0849_0 on Linux: aborted at 0% after 11 hrs 07 min.
||
|
hendermd
Cruncher United States Joined: Apr 30, 2010 Post Count: 29 Status: Offline |
I have two work units, BETA_ugm1_ugm1_0010_0010 and _ugm1_ugm1_0010_0012, stuck at 3.963%. I restarted one and it worked its way back up to 3.963%. The following work units show no time remaining and still seem to have been working slowly in the 99% range for the last two hours:
BETA_ugm1_ugm1_0011_0443
BETA_ugm1_ugm1_0011_0454
BETA_ugm1_ugm1_0011_0445
I aborted all 3 at 11.5 hours after they reached 100% without completing, so all 5 betas received had to be aborted. |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Everybody will be wondering 'what happened between alpha and beta?'. More interruption due to this security feature that was added to the other science apps just before?
Took the wink wink by uplinger and tweaked the profiles to get all cores occupied with beta19, and let 1 run for 5 hours. The progress percent and remaining time never converged, like trying to compute the umpteenth fraction to find the perfect pi. http://www.cbsnews.com/news/pi-calculated-to-its-ten-trillionth-digit/
For the time being, suspended all. Aborting would most likely lead to other volunteer devices ending up doing the same. Will wait on tech instruction, and unsuspend them now and then to fetch more work of the regular operation.
Btw, someone commented on the boinc_lockfile being an indicator of the why. Looked, and all production job slots have this (mcm, faah and fahv), in accordance with a standard function. |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
BETA_ugm1_ugm1_00010_1073 completed in the 0010 series' estimated time of 30 mins. (No idea whether it checkpointed, as I didn't even notice it had run until after it was done and validated.)
These have been running for over 10 hours with no checkpoint yet:
BETA_ugm1_ugm1_00012_1090
BETA_ugm1_ugm1_00012_0466 |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Same thing: 0 progress after 8 and 10 h run time, user-aborted. Both were on an old Linux lappy that had high RAM usage from running 1 vlhc task on vbox and 1 suspended Einstein@h task.
Result Name: BETA_ugm1_ugm1_00011_0276_1--
<core_client_version>7.0.27</core_client_version>
<![CDATA[
<message>
aborted by user
</message>
<stderr_txt>
Unable to open checkpoint file starting from 0
</stderr_txt>
]]> |
||
|