Thread Status: Active
Total posts in this thread: 129
dondee
Advanced Cruncher
Joined: Jan 16, 2006
Post Count: 100
Re: No work?

I have quite a few SCC tasks that are listed as sent to my machines, but I can't find them.
Is this normal?
[Aug 22, 2023 3:17:39 AM]
dondee
Advanced Cruncher
Joined: Jan 16, 2006
Post Count: 100
Re: No work?

The missing WUs have downloaded now.
I guess there is a delay of a few hours before they become available.
[Aug 22, 2023 6:05:15 AM]
TPCBF
Master Cruncher
USA
Joined: Jan 2, 2011
Post Count: 1951
Re: No work?

Is the SCC1 well running dry, now that we got an ample supply over the last weekend?

It seems all I get since early this morning is MCM1 instead... :-(


Ralf
[Aug 23, 2023 8:52:08 PM]
MJH333
Senior Cruncher
England
Joined: Apr 3, 2021
Post Count: 268
Re: No work?

Hi TigerLily,
I stopped getting new SCC work (other than resends) around 16:30 UTC yesterday (August 23). Have the current batches all been sent out, or is there an issue with the work generator?
Cheers,
Mark
[Aug 24, 2023 8:22:12 AM]
adriverhoef
Master Cruncher
The Netherlands
Joined: Apr 3, 2009
Post Count: 2167
Re: No work?

Mark, you wrote:
"I stopped getting new SCC work (other than resends) around 16:30 UTC yesterday (August 23)."

Same here.
I am seeing a large number of redistributed tasks from old workunits that still need to be run and validated. Judging by their original sent dates, some of them are really old, and it looks like their redistributed task sat at "Waiting to be sent", so it had not been assigned to a wingman yet. They have been waiting in a queue, feeling lonely and abandoned. :'-(

Take this one for example, my latest downloaded task (as part of a workunit) [*1]:
SCC1_0004169_MyoD1-C_86043_0  Linux Ubuntu  Pending Verification  2023-08-09T14:57:40  2023-08-15T14:20:57
SCC1_0004169_MyoD1-C_86043_1  Fedora Linux  In Progress           2023-08-24T10:51:43  2023-08-27T10:51:43

The original was returned on 2023-08-15, that's 9 days ago, and the resend has been waiting to get assigned to a wingman until today (2023-08-24).
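For anyone who wants to check the arithmetic, here is a minimal Python sketch that computes the gap between the two timestamps quoted above. The function name `retry_wait` is just an illustrative helper, not anything from WCG or BOINC, and the sketch assumes both timestamps are in the same time zone.

```python
from datetime import datetime

# Timestamp layout used in the task listing above.
FMT = "%Y-%m-%dT%H:%M:%S"


def retry_wait(returned: str, resent: str):
    """Return the time between the original result coming back and the retry being sent."""
    return datetime.strptime(resent, FMT) - datetime.strptime(returned, FMT)


# Values copied from the two SCC1_0004169_MyoD1-C_86043 lines.
delay = retry_wait("2023-08-15T14:20:57", "2023-08-24T10:51:43")
print(f"Retry waited {delay.days} days, {delay.seconds // 3600} hours")
# prints "Retry waited 8 days, 20 hours"
```

So "9 days" is a rounded figure; the exact gap is a little under that.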

Adri
[*1] (This is refreshed each hour, so it is only a current snapshot.)
[Aug 24, 2023 11:08:35 AM]
alanb1951
Veteran Cruncher
Joined: Jan 20, 2006
Post Count: 972
Re: No work?

I've been handed a few retries over the last few days where the Validator requested a retry (rather than the Transitioner; read on...) and those seemed to get out quite quickly.

I haven't [recently] seen any SCC1 tasks that crashed (e.g. SIGSEGV) so I don't know whether retries for those have been stalling -- however, No Reply (and the "Not Started by Deadline" errors WCG doesn't label as such) retries have [as far as I can tell] all stalled. All those retry cases are transitioner-driven.

A common theme in many of the WUs I've seen that spent a long time waiting to be sent is that the result which would have triggered the retry request first went past the deadline (so No Reply, so retry) and then returned an Error (Not started by deadline) some while after the deadline [sometimes as much as 3 days later!]. This probably doesn't cause the stall, but it is worth noting when trying to work out when the retries might finally escape...

I was hoping that perhaps when the existing new work ran down these marooned retries might get shaken loose because they weren't hidden by newer work, but I'm beginning to wonder if it'll take the transitioner saying "something should have happened by now - I'll give this a kick" before anything happens -- I note that quite a few of the retries that do escape seem to be about 6 days after one of the triggers that might have requested them...

[Adri - when I followed your link it took me to workunit 352298964 where a Windows task took over 10 days to get round to returning an error, and you seemed to get the retry about 6.5 days after the initial task would have gone No Reply]

STOP PRESS: I seem to have recently been handed a selection of initial retries for WUs where the original (singleton) WU missed the deadline. The Error state times were between 6 and 8 days after issue. Most of the retries seem to have been issued about 10 days after the initial deadline :-) so some more of the tasks with the longest delays are finally moving, akin to the one Adri mentioned...

Cheers - Al.

[Edited to add the STOP PRESS...]

[*1] As there are so many different ways a feeder can be configured, it's impossible to know whether some change there might have any influence on this (so no guesses as to why the tasks get marooned - the creation seems to be reasonably timely...)
----------------------------------------
[Edit 2 times, last edit by alanb1951 at Aug 24, 2023 5:17:53 PM]
[Aug 24, 2023 4:58:31 PM]
adriverhoef
Master Cruncher
The Netherlands
Joined: Apr 3, 2009
Post Count: 2167
Re: No work?

Al, you wrote:
I haven't [recently] seen any SCC1 tasks that crashed (e.g. SIGSEGV) so I don't know whether retries for those have been stalling -- however, No Reply (and the "Not Started by Deadline" errors WCG doesn't label as such) retries have [as far as I can tell] all stalled. All those retry cases are transitioner-driven.

I can confirm that (at least) several of them stalled. Two days ago I deliberately held on to 3 tasks so that they would pass their deadlines, and their duplicates (or 'retries', perhaps the preferable term, which I've seen Al use) are still Waiting to be sent:
<3> * SCC1_0004195_MyoD1-C_74969_0  Fedora Linux  Pending Verification  2023-08-16T21:48:36  2023-08-22T22:09:43
<3>   SCC1_0004195_MyoD1-C_74969_1  Waiting to be sent

<4> * SCC1_0004192_MyoD1-C_75077_0  Fedora Linux  Pending Verification  2023-08-16T21:48:36  2023-08-22T22:21:50
<4>   SCC1_0004192_MyoD1-C_75077_1  Waiting to be sent

<5> * SCC1_0004195_MyoD1-C_75347_0  Fedora Linux  Pending Verification  2023-08-16T21:48:36  2023-08-22T22:08:30
<5>   SCC1_0004195_MyoD1-C_75347_1  Waiting to be sent
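A listing like the one above can be summarised with a short script. The following is only a rough Python sketch: the `summarize` helper and the two regular expressions are my own assumptions about the line layout, not an official WCG or BOINC format. It reports how long each returned task was held between its two timestamps and which retries are still waiting. The deadline length is not stated in this thread, so the sketch reports elapsed time only.

```python
import re
from datetime import datetime

# Listing copied from the post above.
LISTING = """\
<3> * SCC1_0004195_MyoD1-C_74969_0  Fedora Linux  Pending Verification  2023-08-16T21:48:36  2023-08-22T22:09:43
<3>   SCC1_0004195_MyoD1-C_74969_1  Waiting to be sent
<4> * SCC1_0004192_MyoD1-C_75077_0  Fedora Linux  Pending Verification  2023-08-16T21:48:36  2023-08-22T22:21:50
<4>   SCC1_0004192_MyoD1-C_75077_1  Waiting to be sent
<5> * SCC1_0004195_MyoD1-C_75347_0  Fedora Linux  Pending Verification  2023-08-16T21:48:36  2023-08-22T22:08:30
<5>   SCC1_0004195_MyoD1-C_75347_1  Waiting to be sent
"""

# Assumed patterns: a task name starting with "SCC1_" and ISO-style timestamps.
NAME = re.compile(r"SCC1_\S+")
STAMP = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}")


def summarize(listing: str):
    """Return ({task: elapsed time between its two timestamps}, [tasks still waiting])."""
    elapsed = {}
    waiting = []
    for line in listing.splitlines():
        match = NAME.search(line)
        if not match:
            continue  # skip blank or unrelated lines
        stamps = [datetime.fromisoformat(s) for s in STAMP.findall(line)]
        if len(stamps) == 2:
            elapsed[match.group()] = stamps[1] - stamps[0]
        elif "Waiting to be sent" in line:
            waiting.append(match.group())
    return elapsed, waiting


elapsed, waiting = summarize(LISTING)
for task, held in elapsed.items():
    print(f"{task}: held for {held}")
print(f"{len(waiting)} retries still waiting to be sent")
```

For these three tasks the elapsed times come out at just over six days each.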


Adri
PS Well done, Femke Bol!
[Aug 24, 2023 9:36:40 PM]
Mike.Gibson
Ace Cruncher
England
Joined: Aug 23, 2007
Post Count: 12410
Re: No work?

I haven't seen any SCC units for 3 hours and before then only re-sends.

Mike
[Aug 24, 2023 10:40:39 PM]
Sgt.Joe
Ace Cruncher
USA
Joined: Jul 4, 2006
Post Count: 7675
Re: No work?

Well, there is something going on. One of my Linux machines got a whole rash of retries. Normally there is a smattering of them, but now over 50% of the SCC work units are retries. Whatever the log jam was, it seems to have broken sometime after about 12:00 UTC, as near as I can figure.
Cheers
----------------------------------------
Sgt. Joe
*Minnesota Crunchers*
[Aug 24, 2023 10:42:43 PM]
TPCBF
Master Cruncher
USA
Joined: Jan 2, 2011
Post Count: 1951
Re: No work?

Sgt.Joe wrote:
"Well, there is something going on. One of my Linux machines got a whole rash of retries. Normally there is a smattering of them, but now over 50% of the SCC work units are retries. Whatever the log jam was, it seems to have broken sometime after about 12:00 UTC, as near as I can figure.
Cheers"

Well, I posted (or tried to) yesterday around lunchtime (PST) that the SCC WU supply was running dry. I got a bunch of retries (on some hosts more than on others), but otherwise it has been only MCM for more than 27h now as far as new WUs go...

Just more of the same old same old... :-(


Ralf
[Aug 24, 2023 11:17:30 PM]