Thread Status: Active
Total posts in this thread: 46
This topic has been viewed 120252 times and has 45 replies
sk..
Master Cruncher
Joined: Mar 22, 2007
Post Count: 2324
Status: Offline
Re: 50 hour WU?

For reference, from My Grid, Result Status, you can see the work unit name and check whether the wingman completed it.
I'm just saying that when you report an error or problem, it's worth providing some more details.
If you had asked online first, one of the CAs might have given you some advice, such as: click on the running task, select Properties and report the details; try closing BOINC, restart the system and see if the task runs properly; check whether a wingman has had any success...
Occasionally this leads to the early identification of a problem and allows the techs or scientists to make improvements.
----------------------------------------
[Edit 1 times, last edit by skgiven at Aug 22, 2010 10:40:30 AM]
[Aug 22, 2010 10:39:39 AM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: 50 hour WU?

The "Problem Resolution" section of the Start Here FAQ Index starts with a couple of 'duh' items... really duh and held to the highest KISS level. Rebooting removes so much of the crud that builds up in memory. Who remembers what was done 3 weeks ago when the system has been up for 4 weeks, or what bad AV update got stuck in? Funny thing is, since installing Linux Lucid Lynx, I'm told to reboot more often than Windows ever would. These kernel and security patches just keep coming in. And you have to restart manually (just what I want)... not like Woz just doing it if you don't watch your step. Particularly when running CEP2 you want to choose the right moment (which is difficult on a hexcore :O)

Today, passing 500,000 validated results for the DDDT2 research (495,339 last night)... half a million... who'd have thought it possible 10 days ago :P
----------------------------------------
WCG Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All!
[Aug 22, 2010 11:15:53 AM]
nanoprobe
Master Cruncher
Classified
Joined: Aug 29, 2008
Post Count: 2998
Status: Offline
Re: 50 hour WU?

My bad. I just assumed that since the other units were moving right along there was something wrong with that one. I'll be more patient if it happens again but that's the first time I had ever seen anything like that.
----------------------------------------
In 1969 I took an oath to defend and protect the U S Constitution against all enemies, both foreign and Domestic. There was no expiration date.


[Aug 23, 2010 10:38:49 PM]
sk..
Master Cruncher
Joined: Mar 22, 2007
Post Count: 2324
Status: Offline
Re: 50 hour WU?

DDDT2 6.17
ts05_d424_sdb004
Received 02/09/2010 18:58:59
Report deadline 12/09/2010 18:59:21
CPU time at last checkpoint 07:32:32
CPU time 07:42:41
Elapsed time 07:50:58
Estimated time remaining 20:50:03
Fraction done 6.00%
Virtual memory size 490.65MB
Working set size 4.28MB
Directory slots/13
Process ID 2712

XP x86
i7-920 stock

6% after almost 8h on an i7 ???
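
As a rough check on the figures above, a linear extrapolation from the fraction done and the elapsed time gives a very different picture from the client's own "Estimated time remaining". This is only a sketch: the function names are made up for illustration, and it assumes progress is linear, which a real work unit need not be.

```python
def hms_to_seconds(hms: str) -> int:
    """Convert an 'HH:MM:SS' string, as shown in the task properties, to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def projected_total_hours(elapsed_hms: str, fraction_done: float) -> float:
    """Linear extrapolation: total time = elapsed time / fraction done."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    return hms_to_seconds(elapsed_hms) / fraction_done / 3600


# Values taken from the task properties listed above.
total = projected_total_hours("07:50:58", 0.06)
remaining = total - hms_to_seconds("07:50:58") / 3600
print(f"projected total: {total:.1f} h, remaining: {remaining:.1f} h")
```

On those numbers the projection is roughly 131 hours in total, far beyond the 20:50:03 the client reports as remaining.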
[Sep 3, 2010 6:23:26 AM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: 50 hour WU?

Like HPF2, it could be stuck in a loop. See the FAQs for how to get out, which is effectively unloading the task from memory: switch off LAIM, suspend the task and keep it suspended for 30 seconds. Confirm in Task Manager that it has unloaded, then unsuspend it. It will kick back to the last checkpoint and is likely to finish normally. I've yet to see this with any job but the rare case of HPF2. Now, since it uses loads of memory, that is the area to visit. The top regions of RAM are hardly used, thus potential bad bits could be lurking there. A full memtest86 will tell. It's listed by default in the Linux boot menu.

BOINCtasks has the most perfect monitoring function... the checkpoint column, which shows the running time since the last checkpoint and how many checkpoints there have been since the job was (re)started. The only job that has really long checkpoint intervals near the end is CEP2, where 3-4-5 hours are observed. It's superior to the task properties function in the BM. Soon you will be able to set the time before the field turns red; currently it is twice the duration of the last checkpoint or after 10 minutes.
[Sep 3, 2010 6:42:32 AM]
sk..
Master Cruncher
Joined: Mar 22, 2007
Post Count: 2324
Status: Offline
Re: 50 hour WU?

Turned Off "Leave application in memory when suspended" (LAIM),
Suspended Task (and waited 30sec+),
Watched Task Manager RAM in use graph fall,
Resumed Task,
It fell back about 10min and started running again.
Watched Task Manager RAM in use graph rise, but not as much as it was.
The elapsed time is rising, and the estimated time to complete is rising (I would expect that given it is based on the time so far).
Percent complete is now 6.166%; further than in my last post.
All good so far.
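
For anyone who prefers the command line, the suspend/wait/resume part of the steps above could be scripted around `boinccmd`. This is a minimal sketch under several assumptions: LAIM has already been switched off in the preferences, the project URL and task name below match what the local client reports (both are illustrative here), and the client accepts `--task URL task_name suspend|resume` (older clients may use `--result` instead; check `boinccmd --help`). The helper names are hypothetical, and by default nothing is executed.

```python
import subprocess
import time

# Assumption: the project URL exactly as registered in the local BOINC client.
PROJECT_URL = "http://www.worldcommunitygrid.org/"


def task_command(task_name: str, operation: str, url: str = PROJECT_URL) -> list[str]:
    """Build a boinccmd invocation for one task operation (suspend or resume)."""
    if operation not in ("suspend", "resume"):
        raise ValueError(f"unsupported operation: {operation}")
    return ["boinccmd", "--task", url, task_name, operation]


def kick_task(task_name: str, wait_seconds: int = 30, dry_run: bool = True) -> list[list[str]]:
    """Suspend a task, wait for it to unload, then resume it.

    With dry_run=True the commands are only returned, not executed.
    """
    commands = [task_command(task_name, "suspend"), task_command(task_name, "resume")]
    if not dry_run:
        subprocess.run(commands[0], check=True)
        time.sleep(wait_seconds)  # give the client time to unload the task
        subprocess.run(commands[1], check=True)
    return commands


# Illustrative task name; use the exact name shown by the client.
for cmd in kick_task("ts05_d424_sdb004_1"):
    print(" ".join(cmd))
```

Run as is, it only prints the two commands; pass `dry_run=False` to actually issue them, and still confirm in Task Manager that the task has unloaded.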

Much better than a restart.
Many Thanks,

- Hmmm, it's now at 7.166% complete after 7.11h.
Well, it took 1h 12min to increase from 6.166% to 7.166%.
I have another similar task that is at 67% complete after 36min.
I let it run until it reached 7.5% and then restarted the system.
We will see how it fares now (only using 7 threads, but no GPU tasks running). It was at 7.5% after 9h 33min. It's now at 9h 47min and still at 7.5%, so I think it is fair to say the task will either take about 100 hours or it has a bug.
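
The "about 100 hours" guess can be checked from the most recent progress step rather than from the whole run. A small sketch, assuming the rate seen between 6.166% and 7.166% holds for the rest of the task (the function name is made up for illustration):

```python
def hours_to_finish(fraction_now: float, delta_fraction: float, delta_hours: float) -> float:
    """Hours still needed if progress continues at the most recently observed rate."""
    if delta_fraction <= 0 or delta_hours <= 0:
        raise ValueError("need a positive progress step and time step")
    rate_per_hour = delta_fraction / delta_hours
    return (1.0 - fraction_now) / rate_per_hour


# 6.166% -> 7.166% took 1h 12min, i.e. 1% per 1.2 hours.
remaining = hours_to_finish(0.07166, 0.01, 1.2)
print(f"about {remaining:.0f} h still to go at the recent rate")
```

That works out to roughly 111 hours still to run, so "about 100 hours" is the right order of magnitude if the task is merely slow rather than stuck.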

Project Name: Discovering Dengue Drugs - Together - Phase 2
Created: 01/09/10
Name: ts05_d424_sdb004
Minimum Quorum: 2
Replication: 2

Result Name | App Version Number | Status | Sent Time | Time Due / Return Time | CPU Time (hours) | Claimed / Granted BOINC Credit
ts05_ d424_ sdb004_ 1-- | - | In Progress | 02/09/10 17:59:21 | 12/09/10 17:59:21 | 0.00 | 0.0 / 0.0
ts05_ d424_ sdb004_ 0-- | - | In Progress | 02/09/10 17:58:53 | 12/09/10 17:58:53 | 0.00 | 0.0 / 0.0
----------------------------------------
[Edit 5 times, last edit by skgiven at Sep 3, 2010 11:43:46 AM]
[Sep 3, 2010 9:50:38 AM]
GB033533
Senior Cruncher
UK
Joined: Dec 8, 2004
Post Count: 201
Status: Offline
Re: 50 hour WU?

i think i may also have one of these long-running wus. usually these sr type c complete in a little over an hour, but this one has only reached 0.833% after an hour. the wu is...

ts05_ d424_ sr45b0_ 0-- In Progress 9/2/10 18:46:17 9/12/10 18:46:17

skgiven's is d424 as well.

i've done the 'turn off laim/suspend....' thing, but it hasn't made any difference. i'm tempted to abort it as i can't let it run for 120 hours....
----------------------------------------

[Sep 3, 2010 1:25:46 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: 50 hour WU?

OK, I've dropped a note to the techs. I suggest suspending these babies for now in case specific log info or a dump is needed.

Thanks for your patience.
[Sep 3, 2010 1:51:27 PM]
sk..
Master Cruncher
Joined: Mar 22, 2007
Post Count: 2324
Status: Offline
Re: 50 hour WU?

Task suspended; I will look in now and again to see if any more details are wanted. It's still 8 days until it expires/gets re-issued.
[Sep 4, 2010 12:26:53 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: 50 hour WU?

Please do not forget that suspended tasks stop work fetching for WCG, so unsuspend now and then to backfill the buffer. You might also witness an oddity... the TTC will lower as you complete normal-duration tasks.
[Sep 4, 2010 1:49:48 PM]