Total posts in this thread: 65
alanb1951
Veteran Cruncher
Joined: Jan 20, 2006
Post Count: 952
Status: Offline
Re: Validator not running...or something else?

@Maverick12
"I'm not too versed in the Result Logs, but why are the Checkpoint dates in March of 2019?"

The times quoted for the checkpoints are times in model space, not real-world times; the example you quoted was simulating the 24th and 25th of March for its section of ground.

Each generation of ARP1 tasks simulates two days, and your example is from generation 133. As an example, one of my systems has just processed ARP1_0014612_133_2, which has the same 8 checkpoints.
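
The generation-to-date arithmetic above can be sketched in a few lines. This is only an illustration: the start date is back-computed from the single data point in this post (generation 133 covering 24-25 March 2019) and assumes generations are numbered from 0, the field names are guesses at what each part of the task name means, and `parse_task_name` and `model_days` are hypothetical helpers, not anything from the project.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Assumption: model date on which generation 0 starts. Back-computed from
# "generation 133 simulates 24-25 March 2019" with zero-based numbering;
# not an official project constant.
ASSUMED_MODEL_START = date(2018, 7, 1)
DAYS_PER_GENERATION = 2  # stated in the post


@dataclass(frozen=True)
class ArpTask:
    unit: str         # assumed to identify the section of ground
    generation: int   # each generation simulates two model days
    copy_index: int   # trailing _N; meaning assumed (copy/retry number)


def parse_task_name(name: str) -> ArpTask:
    """Split a name such as ARP1_0014612_133_2 into its parts."""
    prefix, unit, generation, copy_index = name.split("_")
    if prefix != "ARP1":
        raise ValueError(f"not an ARP1 task name: {name!r}")
    return ArpTask(unit, int(generation), int(copy_index))


def model_days(task: ArpTask) -> list[date]:
    """Return the model-space days a task simulates, under the assumptions above."""
    first = ASSUMED_MODEL_START + timedelta(days=DAYS_PER_GENERATION * task.generation)
    return [first + timedelta(days=i) for i in range(DAYS_PER_GENERATION)]


if __name__ == "__main__":
    task = parse_task_name("ARP1_0014612_133_2")
    print(task.generation, [d.isoformat() for d in model_days(task)])
```

Under those assumptions, the example task comes out as generation 133 simulating 2019-03-24 and 2019-03-25, which matches the checkpoint dates being asked about.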

Cheers - Al.
[Sep 16, 2022 3:19:31 AM]
spRocket
Senior Cruncher
Joined: Mar 25, 2020
Post Count: 274
Status: Offline

Signs of movement: I just checked in on my ARP results, and after days of having nothing but "Pending Validation" units stacked up, there are two units showing up as valid now, and three new in-progress units. Hopefully the backlog will start getting cleared.
[Sep 16, 2022 1:41:50 PM]
CurtisNewton
Cruncher
Joined: Feb 24, 2008
Post Count: 25
Status: Offline

Can confirm. My WUs from Monday are validated now.
[Sep 17, 2022 7:00:56 AM]
adriverhoef
Master Cruncher
The Netherlands
Joined: Apr 3, 2009
Post Count: 2155
Status: Offline

spRocket wrote: "Signs of movement: I just checked in on my ARP results, and after days of having nothing but "Pending Validation" units stacked up, there are two units showing up as valid now, and three new in-progress units. Hopefully the backlog will start getting cleared."

I haven't seen any new in-progress units since the 13th of September; here's the latest one:
workunit 161991181
ARP1_0030687_134_0  Linux Ubuntu  In Progress         2022-09-13T01:29:33  2022-09-19T01:29:33
ARP1_0030687_134_1  Fedora Linux  Pending Validation  2022-09-13T00:22:49  2022-09-16T18:06:14


Since then I have kept receiving resends at quite a low pace, and I'm even seeing tasks that are still Waiting to be sent, 23 in total.
Two examples of that, the latest and the oldest one:

<9>   ARP1_0017181_133_0  Linux Ubuntu  No Reply            2022-09-10T16:05:42  2022-09-16T16:05:42
<9> * ARP1_0017181_133_1  Fedora Linux  Pending Validation  2022-09-10T15:14:44  2022-09-13T06:55:59
<9>   ARP1_0017181_133_2                Waiting to be sent

<3> * ARP1_0008309_132_0  Fedora Linux  Pending Validation  2022-09-09T01:26:59  2022-09-12T19:44:05
<3>   ARP1_0008309_132_1  Linux Ubuntu  Error               2022-09-09T02:13:05  2022-09-09T03:25:03
<3>   ARP1_0008309_132_2  Linux Ubuntu  Error               2022-09-09T05:31:17  2022-09-10T07:06:53
<3>   ARP1_0008309_132_3  Linux Ubuntu  Error               2022-09-10T07:22:08  2022-09-13T07:22:22
<3>   ARP1_0008309_132_4                Waiting to be sent

The asterisks mark my own results.

The number of Pending Validation ARP1 tasks for my account is 163 at the moment and it is slowly decreasing. So validations are taking place, just with a delay.
[Sep 17, 2022 2:42:09 PM]
Unixchick
Veteran Cruncher
Joined: Apr 16, 2020
Post Count: 951
Status: Recently Active

I wish they would load the hopper up with some new ARPs. I am seeing some progress on the validation side. It just validated a WU with 2 results from Sept 17 (yesterday). I still have 2 from the 18th with 2 or more results waiting to be looked at.

I'm getting the odd resend here and there, but not enough to keep my small machine happy. I have a second machine I'll fire up once they can send us more WUs.
[Sep 18, 2022 11:34:37 PM]
Maverick12
Cruncher
Joined: May 1, 2007
Post Count: 7
Status: Offline

Still have 191 WUs Pending Validation. I don't get it. Some of them have a quorum of 3 and 4.
[Sep 19, 2022 4:12:31 AM]
alanb1951
Veteran Cruncher
Joined: Jan 20, 2006
Post Count: 952
Status: Offline

Unixchick wrote: "I wish they would load the hopper up with some new ARPs. I am seeing some progress on the validation side. It just validated a WU with 2 results from Sept 17 (yesterday). I still have 2 from the 18th with 2 or more results waiting to be looked at.

"I'm getting the odd resend here and there, but not enough to keep my small machine happy. I have a second machine I'll fire up once they can send us more WUs."

Unixchick - I sympathize; something definitely isn't quite right with ARP1. I wonder if they've deliberately turned off ARP1 work unit generation for now, as part of an effort to clear up whatever has caused large numbers of results to get stuck with every non-error return marked as Pending Validation (PVal jail!...).

All my returns up to 2022-09-07 have validated (eventually). However, I returned 11 results on 2022-09-08 and they are all still in PVal jail (most of them for 8 or more days!) -- the situation is similar for the 69 results returned since then, only 8 having validated during that time. I dread to think what the results pages of some of the heavy hitters might look like :-)

Without sysadmin access(!) and/or details of troubleshooting efforts, we can only speculate. And at present any time spent giving us a detailed enough "what the problem might be" report would be better spent solving the problem ;-)

Cheers - Al.

P.S. The PVal jail problem isn't necessarily a Validator issue (despite what the results pages might suggest). The validator picks up work units to look at based on a flag set by the transitioner, whereas the results pages just look at the various state conditions of individual results and don't know whether that flag has been set or not!... So if the transitioner has got a bit blocked up with respect to ARP1 tasks, this sort of issue could appear (and probably won't resolve without intervention).
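
The interaction described in the P.S. can be illustrated with a toy model. This is only a sketch of the idea, not the server's actual code or schema: the class names, the `needs_validation` flag name, the default quorum of 2 and the "validator" that simply accepts every returned result are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Result:
    name: str
    returned_ok: bool = False   # came back without an error
    validated: bool = False


@dataclass
class WorkUnit:
    name: str
    results: list = field(default_factory=list)
    needs_validation: bool = False  # hypothetical name for the transitioner's flag


def results_page_status(result: Result) -> str:
    """What a results page would show: it only looks at the result's own state."""
    if result.validated:
        return "Valid"
    if result.returned_ok:
        return "Pending Validation"
    return "In Progress"


def transitioner_pass(work_units, quorum: int = 2) -> None:
    """Flag work units that have enough unvalidated returns (quorum is assumed)."""
    for wu in work_units:
        ready = sum(r.returned_ok and not r.validated for r in wu.results)
        if ready >= quorum:
            wu.needs_validation = True


def validator_pass(work_units) -> None:
    """Only examine flagged work units; unflagged ones are never looked at."""
    for wu in work_units:
        if not wu.needs_validation:
            continue
        for r in wu.results:
            if r.returned_ok:
                r.validated = True  # toy stand-in for the real comparison
        wu.needs_validation = False
```

In this model, a work unit with two good returns stays at "Pending Validation" for as long as `transitioner_pass` never reaches it, however often `validator_pass` runs, which is the pattern being speculated about here.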
[Sep 19, 2022 4:17:52 AM]
Unixchick
Veteran Cruncher
Joined: Apr 16, 2020
Post Count: 951
Status: Recently Active

Thanks, Al, for the explanation. Hard to say which bit, transitioner or validator, is not working properly.

I now have a couple of resends sent to me to chew on, so my machine will be happy for today.
[Sep 19, 2022 2:16:56 PM]
Mike.Gibson
Ace Cruncher
England
Joined: Aug 23, 2007
Post Count: 12359
Status: Offline

ARP is not turned off because I have just received 8. However, I still have a stack Pending Validation including MCM.

Mike
[Sep 20, 2022 1:08:33 AM]
alanb1951
Veteran Cruncher
Joined: Jan 20, 2006
Post Count: 952
Status: Offline

Mike.Gibson wrote: "ARP is not turned off because I have just received 8. However, I still have a stack Pending Validation including MCM."

Mike, I think you left the word 'now' out of that statement -- a lot can change in 24 hours[1] :-)

Yes, it appears to be sending out tasks for new work units again - I got 5 new ones at about 22:00 UTC on the 19th, the first _0 or _1 tasks I've received in nearly a week. I'm not sure whether it's a blessing or a curse whilst we still seem to have a PVal jail problem for ARP1 results... (I also got a retry because of yet another task that failed to download the files...)

Hopefully, once they can get more servers active, they'll be able to adjust the number of validators and transitioners they run if that will help with the PVal jail issue; all speculation, of course...

Cheers - Al.

[1] Without access to the system logs we can't tell what is and is not running, so we have to settle for what we can deduce from what we can see, and evidence suggests that we were only getting retries for the last week or so...
----------------------------------------
[Edit 1 times, last edit by alanb1951 at Sep 20, 2022 1:51:44 AM]
[Sep 20, 2022 1:49:48 AM]