World Community Grid Forums
Thread Status: Active Total posts in this thread: 148
ThreadRipper
Veteran Cruncher Sweden Joined: Apr 26, 2007 Post Count: 1322 Status: Offline
All tasks received where BOINC-data is on a RAM-disk, crash directly after the start. CEP2's are running fine from that folder.
ERROR: could not initialize graphics pointer in shared memory.
BETA_ betaugm1_ ugm1_ 00036_ 0303_ 0-- 3166874 Error 10/12/14 20:12:16 10/12/14 20:13:37 0.00 / 0.00 0.0 / 0.0
BETA_ betaugm1_ ugm1_ 00036_ 0291_ 0-- 3166874 Error 10/12/14 20:12:16 10/12/14 20:13:37 0.00 / 0.00 0.0 / 0.0
BETA_ betaugm1_ ugm1_ 00036_ 0219_ 0-- 3166874 Error 10/12/14 20:12:16 10/12/14 20:13:37 0.00 / 0.00 0.0 / 0.0
BETA_ betaugm1_ ugm1_ 00036_ 0147_ 1-- 3166874 Error 10/12/14 20:12:16 10/12/14 20:13:37 0.00 / 0.00 0.0 / 0.0
BETA_ betaugm1_ ugm1_ 00032_ 0601_ 1-- 3166874 Error 10/12/14 19:51:07 10/12/14 19:56:34 0.00 / 0.00 0.0 / 0.0

Ok, now all my beta tasks have finished, running on a RAM-disk, and they were all valid (one still pending validation). I know that CEP tasks error out if the RAM-disk does not have enough space (since CEP uses a lot of storage when running). I don't know how these WUs handle that, but perhaps you did not run out of RAM-disk space...

Join The International Team: https://www.worldcommunitygrid.org/team/viewTeamInfo.do?teamId=CK9RP1BKX1
AMD TR2990WX @ PBO, 64GB Quad 3200MHz 14-17-17-17-1T, RX6900XT @ Stock
AMD 3800X @ PBO
AMD 2700X @ 4GHz
Crystal Pellet
Veteran Cruncher Joined: May 21, 2008 Post Count: 1323 Status: Offline
All tasks received where BOINC-data is on a RAM-disk, crash directly after the start. CEP2's are running fine from that folder.
ERROR: could not initialize graphics pointer in shared memory.
BETA_ betaugm1_ ugm1_ 00036_ 0303_ 0-- 3166874 Error 10/12/14 20:12:16 10/12/14 20:13:37 0.00 / 0.00 0.0 / 0.0
BETA_ betaugm1_ ugm1_ 00036_ 0291_ 0-- 3166874 Error 10/12/14 20:12:16 10/12/14 20:13:37 0.00 / 0.00 0.0 / 0.0
BETA_ betaugm1_ ugm1_ 00036_ 0219_ 0-- 3166874 Error 10/12/14 20:12:16 10/12/14 20:13:37 0.00 / 0.00 0.0 / 0.0
BETA_ betaugm1_ ugm1_ 00036_ 0147_ 1-- 3166874 Error 10/12/14 20:12:16 10/12/14 20:13:37 0.00 / 0.00 0.0 / 0.0
BETA_ betaugm1_ ugm1_ 00032_ 0601_ 1-- 3166874 Error 10/12/14 19:51:07 10/12/14 19:56:34 0.00 / 0.00 0.0 / 0.0

ThreadRipper wrote:
Ok, now all my beta tasks have finished, running on a RAM-disk, and they were all valid (one still pending validation). I know that CEP tasks error out if the RAM-disk does not have enough space (since CEP uses a lot of storage when running). I don't know how these WUs handle that, but perhaps you did not run out of RAM-disk space...

Thanks for reporting. Good to know it's not a RAM-disk issue. No, there was enough RAM-space left, so probably a memory issue. I will try UGM1 again on RAM-disk when the project is operational.

lavaflow wrote:
Do these actually complete when hitting 100 percent? Suspended 1 of these with LAIM off, and let it sit a little. Then resumed it. No progress or time was lost, no regression to last checkpoint. Is it really doing anything?

It seems that batches 11 and 12 are bad/corrupt. The tasks are consuming 100% CPU, but do not write anything into stderr.txt except the first line, which always appears:
Unable to open checkpoint file starting from 0
Suspending with LAIM off will not restart the task, because there is not a single checkpoint, so I restarted BOINC, but to no avail.
In newer BOINC versions, suspending with LAIM off only clears the task out of memory once a checkpoint has been made; otherwise it stays in memory despite LAIM being off, so as not to waste too much CPU time. I aborted the tasks from batches 11 and 12 after wasting 50 hours of CPU time.
[Edit 3 times, last edit by Crystal Pellet at Oct 13, 2014 7:26:48 AM]
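The suspend rule described in the post above can be written down as a small truth table. This is only a sketch of the behaviour as the post reports it, not the BOINC client's actual code; the function name and both parameters are made up for illustration.

```python
def stays_in_memory_on_suspend(laim_enabled: bool, has_checkpointed: bool) -> bool:
    """Return True if a suspended task is kept in memory.

    Hypothetical helper: it models only the behaviour reported in this
    thread for newer BOINC versions. The real client logic may differ.
    """
    if laim_enabled:
        # LAIM ("leave applications in memory") on: suspended tasks stay resident.
        return True
    # LAIM off: the task is only cleared from memory once it has written
    # a checkpoint it can resume from; without one it stays resident.
    return not has_checkpointed


if __name__ == "__main__":
    for laim in (True, False):
        for checkpointed in (True, False):
            kept = stays_in_memory_on_suspend(laim, checkpointed)
            print(f"LAIM={laim!s:5} checkpoint={checkpointed!s:5} kept in memory={kept}")
```

Under this reading, the stuck batch 11 and 12 tasks fall into the "LAIM off, no checkpoint" case, which is why suspending and resuming them lost no progress.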
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline
Got ya. I never scanned back through the event log to see whether a checkpoint was ever written for these. Yes: with no checkpoint, the task is held in memory until at least the first one is written, or until the agent is restarted.
I'll suspend them and wait for instructions, or the technicians can send a forced server abort; I think they can for bad tasks. I've got enough MCM in the buffer to keep the cores busy for now.
PMH_UK
Veteran Cruncher UK Joined: Apr 26, 2007 Post Count: 774 Status: Offline
I also have some batch 11 tasks that reached 100% in BoincTasks 1.58 but have not completed:
BETA_ betaugm1_ ugm1_ 00011_ 0720_ 1-- USB3-A In Progress 12/10/14 19:25:41 16/10/14 19:25:41 0.00 / 0.00 0.0 / 0.0
BETA_ betaugm1_ ugm1_ 00011_ 0216_ 1-- USB3-A In Progress 12/10/14 19:25:41 16/10/14 19:25:41 0.00 / 0.00 0.0 / 0.0
10.5 hours so far. Suspended for now (LAIM on), will review later.
Edit: Had to shut down for now.
I also have this one on a slower PC, 66% after 10.3 hours.
Edit: BoincTasks 1.58 showed 66% (now 83%) but BoincMGR showed 0%, so suspended.
BETA_ betaugm1_ ugm1_ 00011_ 0202_ 1-- Intel4 In Progress 12/10/14 19:26:05 16/10/14 19:26:05 0.00 / 0.00 0.0 / 0.0
Wingmen not reported yet on any of these.
Paul.
[Edit 1 times, last edit by PMH_UK at Oct 13, 2014 11:14:58 AM]
widdershins
Veteran Cruncher Scotland Joined: Apr 30, 2007 Post Count: 674 Status: Offline
I've had units from various batches running on my Linux machines over the last couple of days. They've generally taken about 10 hours, plus or minus a couple of hours. Various versions of BOINC as well: ye olde machine runs Ubuntu 8 and BOINC 5.
One lot even ran in an Ubuntu 10 VM on my FreeBSD-based NAS. (I gave up trying to figure out how to get BOINC to run on FreeBSD.) So far none have had any problems; they're either turning valid, sitting in PV or still to complete. So well done everyone.
P.S. I wouldn't say no to another Beta though, I've a looong way to go to get my next Beta badge.
[Edit 1 times, last edit by widdershins at Oct 13, 2014 7:41:45 AM]
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline
I've got 76 hours invested in these, and running on past 100 percent means they'll continue until a "maximum time exceeded" error occurs.
Too many of them. It's either a bad batch, for which the technicians can determine the cause, or a show stopper, regardless of the weekend rush suggesting that a move to production is close. Wasn't there a similar never-ending problem in earlier betas? It's a play on words, but seippel's 'launch date' at WCG is when the IBM PR fanfare rolls, not when the project goes to production. There can be weeks in between, and time may now be running out to gather some supporting 'it's stable' production data before the official 'launch'. Apparently having it run and discussed on the forums is not a PR spoiler, so what's up with the secrecy? Uniform gene mapping it is.
[AF>Belgique]Mamouth
Cruncher Joined: Dec 12, 2008 Post Count: 6 Status: Offline
BETA_betaugm1_ugm1_00025_0132
Still got the maximum disk usage error:
<core_client_version>7.2.42</core_client_version>
<![CDATA[
<message>
Maximum disk usage exceeded
</message>
<stderr_txt>
MStenholm
Advanced Cruncher Denmark Joined: Jan 7, 2010 Post Count: 97 Status: Offline
BETA_ betaugm1_ ugm1_ 00011_ 0713_ 0 ran for 2½ hours with 0 % progress. It has been detached, sorry.
Edit / update:
BETA_ betaugm1_ ugm1_ 00011_ 0755_ 1 ran for 1:50 hours with 0 % progress
BETA_ betaugm1_ ugm1_ 00011_ 0692 1 ran for 1:00 hours with 0 % progress
[Edit 2 times, last edit by MStenholm at Oct 13, 2014 11:29:22 AM]
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline
BETA_betaugm1_ugm1_00025_0132
Still got the maximum disk usage error
<core_client_version>7.2.42</core_client_version>
<![CDATA[
<message>
Maximum disk usage exceeded
</message>
<stderr_txt>

Is it the task, or is it the agent preferences overall? My setting is 10 GB allowed, when there's 70 GB free.
[Edit 1 times, last edit by Former Member at Oct 13, 2014 11:23:04 AM]
rbotterb
Senior Cruncher United States Joined: Jul 21, 2005 Post Count: 401 Status: Offline
Finished up my four beta WUs. Looks like all four completed OK on my laptop. Three are waiting for wingman to complete.
BETA_ betaugm1_ ugm1_ 00046_ 0419_ 0-- Pavilion-dv7 Valid 10/12/14 20:26:30 10/13/14 11:27:00 5.18 / 5.19 94.8 / 71.7
BETA_ betaugm1_ ugm1_ 00046_ 0374_ 1-- Pavilion-dv7 Pending Validation 10/12/14 20:26:30 10/13/14 11:27:00 5.14 / 5.15 94.1 / 0.0
BETA_ betaugm1_ ugm1_ 00044_ 0157_ 0-- Pavilion-dv7 Pending Validation 10/12/14 20:24:17 10/13/14 11:27:00 6.12 / 6.14 112.0 / 0.0
BETA_ betaugm1_ ugm1_ 00027_ 0125_ 1-- Pavilion-dv7 Pending Validation 10/12/14 19:44:35 10/13/14 01:30:40 4.21 / 4.21 76.9 / 0.0