Total posts in this thread: 752
This topic has been viewed 1648689 times and has 751 replies
Crystal Pellet
Veteran Cruncher
Joined: May 21, 2008
Post Count: 1323
Status: Offline
Re: Limited workunits for Help Stop TB

New results should be flowing soon. We had an issue with script that loads work. It has been fixed and we will soon be loading work for HST again.

Are these the new results that are flowing?
HST1_ 005564_ 000076_ MT0008_ T300_ F00008_ S00007_ 0-- AH1 Error 7/16/16 16:36:51 7/16/16 16:43:01 0.00 / 0.00 0.1 / 0.0
HST1_ 005564_ 000001_ MT0007_ T325_ F00080_ S00007_ 0-- AH1 Error 7/16/16 16:36:51 7/16/16 16:43:01 0.00 / 0.00 0.1 / 0.0

My system was not flowing but starving when those two started.
The system was on its knees and unresponsive until those 2 tasks died.

Error:
<core_client_version>7.6.33</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code -1073740940 (0xc0000374)
</message>
<stderr_txt>
INFO: result number = 0
INFO: No state to restore. Start from the beginning.
[18:37:01] INFO: Running initial simulation

</stderr_txt>

Mixed up something with 32 and 64bit libraries?
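For readers puzzling over the two forms of the exit code above: the negative decimal and the hex value are the same 32-bit pattern, read as signed and unsigned respectively. A minimal sketch of the conversion follows; the function name `to_unsigned_hex` is made up for illustration. On Windows, 0xC0000374 is documented as STATUS_HEAP_CORRUPTION, which would suggest a heap memory fault in the application, though nothing in this thread confirms the root cause.

```python
def to_unsigned_hex(exit_code: int) -> str:
    """Reinterpret a signed 32-bit exit code as unsigned and format it as hex."""
    # Masking with 0xFFFFFFFF keeps the low 32 bits, which maps a negative
    # two's-complement value onto its unsigned equivalent.
    return f"0x{exit_code & 0xFFFFFFFF:08x}"


# Exit code as reported in the task's error output above.
hex_code = to_unsigned_hex(-1073740940)
print(hex_code)  # -> 0xc0000374
```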
[Jul 16, 2016 4:54:31 PM]
uplinger
Former World Community Grid Tech
Joined: May 23, 2005
Post Count: 3952
Status: Offline
Re: Limited workunits for Help Stop TB

Posting from my cell phone. There may be some confusion between Mac and other OSes. I had to assign the work units set for Mac to go to any OS. But it should have assigned a fresh OS, and not confused 32 and 64 bit. I can't be certain what caused that issue, as I am not near my computer.

Thanks,
Uplinger
[Jul 16, 2016 5:57:11 PM]
PMH_UK
Veteran Cruncher
UK
Joined: Apr 26, 2007
Post Count: 774
Status: Recently Active
Re: Limited workunits for Help Stop TB

Still something not right.
This is the only HST task I have had in about 36 hours, and it has been waiting to send another copy for a couple of hours.

Project Name: Help Stop TB
Created: 07/16/2016 17:33:11
Name: HST1_005636_000087_AT0007_T350_F00088_S00007
Minimum Quorum: 2
Replication: 2

Result Name OS type OS version App Version Number Status Sent Time Time Due / Return Time CPU Time / Elapsed Time (hours) Claimed / Granted BOINC Credit
HST1_ 005636_ 000087_ AT0007_ T350_ F00088_ S00007_ 0-- Linux 2.6.32-504.8.1.el6.x86_64 721 Error 16/07/16 17:38:29 17/07/16 07:19:08 0.00 0.1 / 0.0
HST1_ 005636_ 000087_ AT0007_ T350_ F00088_ S00007_ 1-- Linux 3.19.0-25-generic - In Progress 16/07/16 17:38:27 26/07/16 17:38:27 0.00 0.0 / 0.0
HST1_ 005636_ 000087_ AT0007_ T350_ F00088_ S00007_ 2-- - Waiting to be sent — — 0.00 0.0 / 0.0

Edit: Mine errored with code 193 (0xc1, -63), like the _0 copy, and now 2 copies are waiting to be sent.

HST1_ 005636_ 000087_ AT0007_ T350_ F00088_ S00007_ 0-- Linux 2.6.32-504.8.1.el6.x86_64 721 Error 16/07/16 17:38:29 17/07/16 07:19:08 0.00 0.1 / 0.0
HST1_ 005636_ 000087_ AT0007_ T350_ F00088_ S00007_ 1-- Linux 3.19.0-25-generic 721 Error 16/07/16 17:38:27 17/07/16 10:12:37 0.00 0.0 / 0.0
HST1_ 005636_ 000087_ AT0007_ T350_ F00088_ S00007_ 2-- - Waiting to be sent — — 0.00 0.0 / 0.0
HST1_ 005636_ 000087_ AT0007_ T350_ F00088_ S00007_ 3-- - Waiting to be sent — — 0.00 0.0 / 0.0
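The three numbers quoted for the error in the edit above (193, 0xc1, -63) are one byte written three ways: decimal, hex, and signed 8-bit. A small sketch of that arithmetic, with the helper name `as_signed_byte` chosen only for illustration; it says nothing about what the application means by the code.

```python
import struct


def as_signed_byte(code: int) -> int:
    """Reinterpret the low 8 bits of an exit code as a signed byte."""
    # Pack as an unsigned byte ("B"), then unpack the same byte as signed ("b").
    (value,) = struct.unpack("b", struct.pack("B", code & 0xFF))
    return value


print(hex(193), as_signed_byte(193))  # -> 0xc1 -63
```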

Paul.
----------------------------------------
[Edit 1 times, last edit by PMH_UK at Jul 17, 2016 1:51:46 PM]
[Jul 17, 2016 8:48:12 AM]
widdershins
Veteran Cruncher
Scotland
Joined: Apr 30, 2007
Post Count: 674
Status: Offline
Re: Limited workunits for Help Stop TB

Yup, something needs looking at.

Of the few units I've been getting, about 50% in the last couple of days have either errored out or clogged up the box by hogging resources, and needed to be killed with fire due to memory faults. Unless it's now normal for these WU's to use 25GB of memory and complain it still isn't enough!

Other WU's for Cancer and Zika run fine, so it's not a machine issue. The machines run different BOINC versions and different OS versions (though all 64-bit Linux), so I'm 99.9% certain it's a WU problem within certain batches.
[Jul 17, 2016 12:04:43 PM]
nanoprobe
Master Cruncher
Classified
Joined: Aug 29, 2008
Post Count: 2998
Status: Offline
Re: Limited workunits for Help Stop TB

I continue to get the message below and no tasks. Linux 64 bit.

7200 World Community Grid 7/17/2016 9:41:03 AM Requesting new tasks for CPU
7201 World Community Grid 7/17/2016 9:41:05 AM Scheduler request completed: got 0 new tasks
7202 World Community Grid 7/17/2016 9:41:05 AM No tasks sent
7203 World Community Grid 7/17/2016 9:41:05 AM No tasks are available for the applications you have selected.
7204 World Community Grid 7/17/2016 9:41:05 AM Tasks are committed to other platforms
----------------------------------------
In 1969 I took an oath to defend and protect the U S Constitution against all enemies, both foreign and Domestic. There was no expiration date.


[Jul 17, 2016 1:44:40 PM]
ca05065
Senior Cruncher
Joined: Dec 4, 2007
Post Count: 328
Status: Offline
Re: Limited workunits for Help Stop TB

I have been receiving the same messages for 2 days now. The techs should be in by this time tomorrow so hopefully something will be corrected.
[Jul 17, 2016 2:06:25 PM]
roundup
Veteran Cruncher
Switzerland
Joined: Jul 25, 2006
Post Count: 838
Status: Offline
Re: Limited workunits for Help Stop TB

I continue to get the message below and no tasks. Linux 64 bit.
...
7204 World Community Grid 7/17/2016 9:41:05 AM Tasks are committed to other platforms

Same here, also on Linux 64. :(
[Jul 17, 2016 4:33:36 PM]
Eric_Kaiser
Veteran Cruncher
Germany (Hessen)
Joined: May 7, 2013
Post Count: 1047
Status: Offline
Re: Limited workunits for Help Stop TB

Another weird observation on my Linux servers.
I only have HST1 WUs available on my servers, but when a WU finishes, none of the available HST1 WUs is started, causing the servers to idle...
[Jul 17, 2016 5:33:00 PM]
SekeRob
Master Cruncher
Joined: Jan 7, 2013
Post Count: 2741
Status: Offline
Re: Limited workunits for Help Stop TB

That's probably because of the outrageous memory usage. It's puzzling how a report of 15GB memory use comes about when tasks are supposed to pause when any of 3 'bounds' is exceeded, e.g. these are the UGM limits:

<rsc_fpops_bound>684357252641820.000000</rsc_fpops_bound>
<rsc_memory_bound>162144000.000000</rsc_memory_bound>
<rsc_disk_bound>524288000.000000</rsc_disk_bound>
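To put the limits quoted above in perspective, here is a rough sketch that parses the three bounds and converts the memory and disk values to MiB. BOINC expresses `rsc_memory_bound` and `rsc_disk_bound` in bytes. The wrapper element `<workunit>` is added here only so the fragment parses, and reading the reported "15GB" as 15 GiB is an assumption.

```python
import xml.etree.ElementTree as ET

# The three bounds exactly as quoted in the post above.
BOUNDS_XML = """
<rsc_fpops_bound>684357252641820.000000</rsc_fpops_bound>
<rsc_memory_bound>162144000.000000</rsc_memory_bound>
<rsc_disk_bound>524288000.000000</rsc_disk_bound>
"""

MIB = 1024 ** 2
GIB = 1024 ** 3


def parse_bounds(fragment: str) -> dict:
    """Return {tag: value} for each bound element in the fragment."""
    # Wrap in a single root element (an assumption, for parsing only).
    root = ET.fromstring(f"<workunit>{fragment}</workunit>")
    return {child.tag: float(child.text) for child in root}


bounds = parse_bounds(BOUNDS_XML)
memory_bound_mib = bounds["rsc_memory_bound"] / MIB
disk_bound_mib = bounds["rsc_disk_bound"] / MIB

# Assumption: the "15GB" report is taken to mean 15 GiB.
reported_bytes = 15 * GIB
overshoot = reported_bytes / bounds["rsc_memory_bound"]

print(f"memory bound: {memory_bound_mib:.1f} MiB")
print(f"disk bound:   {disk_bound_mib:.1f} MiB")
print(f"15 GiB is about {overshoot:.0f}x the memory bound")
```

Under that reading, the memory bound is roughly 155 MiB, so a 15GB report would be about two orders of magnitude over the limit.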
[Jul 17, 2016 5:46:51 PM]
widdershins
Veteran Cruncher
Scotland
Joined: Apr 30, 2007
Post Count: 674
Status: Offline
Re: Limited workunits for Help Stop TB

Did you check the processes to see how much memory it's trying to use? That was a problem I had with a few WU's in the last couple of days. The WU would start, and the memory it required would balloon until it hit the BOINC limit, and then it would sit there saying it needed more memory.

As other WU's finished that rogue unit would hoover up the memory that had just been released by the completing units as well. But that prevented other units from running as there was then no memory at all left for them. Eventually the number of simultaneous WU's would drop off until there was just the one rogue unit sitting there using all the memory and still not running.
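One way to do the per-process memory check described above on Linux is to read the `VmRSS` field from `/proc/<pid>/status`. The sketch below lists the processes with the largest resident memory; the function names are invented for illustration, and which process name an HST task appears under is not stated in this thread, so that part is left to the reader.

```python
import re
from pathlib import Path


def parse_vm_rss_kib(status_text: str):
    """Extract resident memory (kB) from /proc/<pid>/status text, or None."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def top_memory_processes(limit: int = 5):
    """Return up to `limit` (rss_kib, pid, name) tuples, largest first."""
    usage = []
    for status_file in Path("/proc").glob("[0-9]*/status"):
        try:
            text = status_file.read_text()
        except OSError:
            continue  # process exited or is not readable
        rss = parse_vm_rss_kib(text)
        if rss is None:
            continue  # kernel threads have no VmRSS line
        name = re.search(r"^Name:\s+(.*)$", text, re.MULTILINE)
        pid = int(status_file.parent.name)
        usage.append((rss, pid, name.group(1) if name else "?"))
    return sorted(usage, reverse=True)[:limit]


if __name__ == "__main__":
    for rss_kib, pid, name in top_memory_processes():
        print(f"{pid:>7}  {rss_kib / 1024 ** 2:6.2f} GiB  {name}")
```

A rogue unit of the kind described would show up at the top of this list with a resident size far above the other tasks.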

Edit: Sek beat me to it! :D
----------------------------------------
[Edit 1 times, last edit by widdershins at Jul 17, 2016 5:54:35 PM]
[Jul 17, 2016 5:53:06 PM]