Total posts in this thread: 98
Posts: 98   Pages: 10   [ Previous Page | 1 2 3 4 5 6 7 8 9 10 | Next Page ]
This topic has been viewed 10638 times and has 97 replies
BKraayev
Cruncher
Joined: Mar 23, 2005
Post Count: 46
Re: New BETA test - Sept 18, 2014 [ Issues Thread ]

Found the same thing: after updating BOINC, tasks that had been stuck at 0% after 9 hours started to run normally.
[Sep 19, 2014 4:43:58 AM]
yoro42
Ace Cruncher
United States
Joined: Feb 19, 2011
Post Count: 8979
Re: New BETA test - Sept 18, 2014 [ Issues Thread ]


[Sep 19, 2014 6:02:28 AM]
Mumak
Senior Cruncher
Joined: Dec 7, 2012
Post Count: 477
Re: New BETA test - Sept 18, 2014 [ Issues Thread ]

I had about 50 tasks stuck at various levels (from 0% to 99.999-100.0%). Aborted them all.
[Sep 19, 2014 6:18:34 AM]
ca05065
Senior Cruncher
Joined: Dec 4, 2007
Post Count: 325
Re: New BETA test - Sept 18, 2014 [ Issues Thread ]

BETA_ugm1_ugm1_00011_0331_1
BETA_ugm1_ugm1_00011_0360_0
BETA_ugm1_ugm1_00011_0367_0
BETA_ugm1_ugm1_00011_0333_1
BETA_ugm1_ugm1_00011_0334_1

I suspended two after 6 hours but let the others run on. They did reach 100%, but then kept running, with CPU time climbing to over 10 hours. I have aborted them all.
[Sep 19, 2014 6:42:39 AM]
OldChap
Veteran Cruncher
UK
Joined: Jun 5, 2009
Post Count: 978
Re: New BETA test - Sept 18, 2014 [ Issues Thread ]

No checkpointing and no time-left indication. 1 is stuck at 3.93% after 11 hours, 2 are at 80% after 2 hours, 1 is at 99.95% after 11 hours, and 33 are at 100% after 11 hrs 30 mins but still using CPU and not finishing. This is across Windows and Linux Mint, with both the latest BOINC and a slightly older BOINC.
[Sep 19, 2014 6:45:38 AM]
Ian_UK
Senior Cruncher
England
Joined: Oct 15, 2006
Post Count: 153
Re: New BETA test - Sept 18, 2014 [ Issues Thread ]

BETA_ugm1_ugm1_00010_0849_0 on Linux: aborted at 0% after 11 hrs 07 mins.
[Sep 19, 2014 7:27:41 AM]
hendermd
Cruncher
United States
Joined: Apr 30, 2010
Post Count: 29
Re: New BETA test - Sept 18, 2014 [ Issues Thread ]

I have two work units, BETA_ugm1_ugm1_0010_0010 and _ugm1_ugm1_0010_0012, stuck at 3.963%. I restarted one and it worked its way back up to 3.963%.

The following work units show no time remaining and have seemed to be working slowly in the 99% range for the last two hours.

BETA_ugm1_ugm1_0011_0443
BETA_ugm1_ugm1_0011_0454
BETA_ugm1_ugm1_0011_0445

Aborted all 3 at 11.5 hours, after they reached 100% without completing, so all 5 beta tasks received had to be aborted.
[Sep 19, 2014 7:36:18 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Re: New BETA test - Sept 18, 2014 [ Issues Thread ]

Everybody will be wondering, "What happened between alpha and beta?" More interruption due to the security feature that was added to the other science apps just before?

Took the hint from uplinger and tweaked the profiles to get all cores occupied with beta19, then let 1 run for 5 hours. The progress percentage and remaining time never converged, like trying to compute the umpteenth fraction to find the perfect pi. http://www.cbsnews.com/news/pi-calculated-to-its-ten-trillionth-digit/

For the time being, I have suspended them all. Aborting would most likely lead to other volunteers' devices ending up doing the same. I will wait for instructions from the techs, and unsuspend them now and then to fetch more work from the regular projects.

Btw, someone commented that the boinc_lockfile might indicate the cause. I looked, and all production job slots have it (mcm, faah and fahv); in other words, it is a standard function.
[Sep 19, 2014 8:18:56 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Re: New BETA test - Sept 18, 2014 [ Issues Thread ]

BETA_ugm1_ugm1_00010_1073 completed within the 30-minute estimated time for the 0010 series. (No idea whether it checkpointed, as I didn't even notice it had run until after it was done and validated.)

These have been running for over 10 hours with no checkpoint yet:
BETA_ugm1_ugm1_00012_1090
BETA_ugm1_ugm1_00012_0466
[Sep 19, 2014 8:19:22 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Re: New BETA test - Sept 18, 2014 [ Issues Thread ]

Same thing: 0 progress after 8 and 10 hours of run time, so I aborted them. Both were on an old Linux laptop, which had high RAM usage from running 1 vLHC task in VirtualBox and holding 1 suspended Einstein@Home task.

Result Name: BETA_ugm1_ugm1_00011_0276_1--
<core_client_version>7.0.27</core_client_version>
<![CDATA[
<message>
aborted by user
</message>
<stderr_txt>
Unable to open checkpoint file starting from 0

</stderr_txt>
]]>
[Sep 19, 2014 8:35:26 AM]