Total posts in this thread: 52
This topic has been viewed 9959 times and has 51 replies
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: ERROR: exit code 95 (0x5f)

Dataman,

No 10 of the 19 'inconclusive' results meet the following criteria:

1. Have all the same control hash/checknumber
2. Do not all fill the 10 different parts that make up the whole.

It's quite possible that several copies of the same 1/10th segment agree, but the validation program does not look at that at this time. It needs all 10 parts to be complete before it is able to dismiss the invalids.

Somehow, the way I understand it, the algorithm is not able to say, e.g., that 7 have the same overall hash and 3 don't, and therefore only those 3 need a backup calculation. Probabilities would make it an extreme outside chance for that not to be true. I think knreed explained it somewhere in US-English.
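
To make that concrete, here is a rough sketch of one possible reading of the rule described above. It is not the actual WCG validator: the function name `find_complete_group` and the `(part_index, control_hash)` layout are invented for illustration. The point it shows is that a large agreeing majority still stays 'inconclusive' until one control hash covers all 10 parts.

```python
from collections import defaultdict

NUM_PARTS = 10  # the post says 10 parts make up the whole


def find_complete_group(results):
    """Return the control hash whose results cover all 10 parts, else None.

    `results` is a list of (part_index, control_hash) pairs. Both the
    function name and this data layout are hypothetical.
    """
    parts_by_hash = defaultdict(set)
    for part_index, control_hash in results:
        parts_by_hash[control_hash].add(part_index)

    needed = set(range(NUM_PARTS))
    for control_hash, parts in parts_by_hash.items():
        if parts == needed:
            return control_hash
    return None  # stays 'inconclusive', however many results agree


# 19 results: 18 agree on hash "aa" but cover only parts 0-8.
results = [(part, "aa") for part in range(9)] * 2 + [(9, "bb")]
print(find_complete_group(results))                # None
print(find_complete_group(results + [(9, "aa")]))  # aa
```

In this toy model the 18 agreeing results are not enough on their own; one more copy that fills the missing part is what lets the group validate.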

Added 2 comments:

A. As for the task switching, it's odd within the same project (WCG), but I presume that after it determined the deadline would be met (?), it went back to 'which job needs the least time to complete' (another piece of BOINC logic). What was the remaining time on the DDDT when that happened? Note that if you receive a batch with the exact same deadline (the maximum is 10), BOINC will do the jobs with the shortest projected completion times first.

B. The initial replication number is, I think, not correct. It's 10. There's a known bug where each extra copy adds to that figure.
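
The ordering described in comment A can be pictured with a toy model. This is only a sketch of the behaviour as described in this post, not BOINC's real scheduler code; the `Task` class, its fields, and the job names are made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    deadline_hours: float   # time left until the deadline
    remaining_hours: float  # projected time left to finish


def run_order(tasks):
    """Earliest deadline first; ties go to the shortest projected job."""
    return sorted(tasks, key=lambda t: (t.deadline_hours, t.remaining_hours))


batch = [
    Task("job_a", 96.0, 7.5),
    Task("job_b", 96.0, 2.0),
    Task("job_c", 96.0, 4.0),
]
print([t.name for t in run_order(batch)])  # ['job_b', 'job_c', 'job_a']
```

With identical deadlines the sort falls through to the projected remaining time, which is why a nearly finished job can jump ahead of one that was started earlier.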

Il Consigliere della Comunità
----------------------------------------
WCG Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All!
----------------------------------------
[Edit 3 times, last edit by Sekerob at Oct 29, 2007 4:56:51 PM]
[Oct 29, 2007 4:34:35 PM]
Dataman
Ace Cruncher
Joined: Nov 16, 2004
Post Count: 4865
Status: Offline
Re: ERROR: exit code 95 (0x5f)

Dataman,

No 10 of the 19 'inconclusive' results meet the following criteria:

1. Have all the same control hash/checknumber
2. Do not all fill the 10 different parts that make up the whole.

It's quite possible that several copies of the same 1/10th segment agree, but the validation program does not look at that at this time. It needs all 10 parts to be complete before it is able to dismiss the invalids.

Somehow, the way I understand it, the algorithm is not able to say, e.g., that 7 have the same overall hash and 3 don't, and therefore only those 3 need a backup calculation. Probabilities would make it an extreme outside chance for that not to be true. I think knreed explained it somewhere in US-English.

Il Consigliere della Comunità


Thanks, as usual, Sekerob. That makes sense. It is an "interesting" way of doing things. I'll keep running them.

[Oct 29, 2007 4:40:18 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: ERROR: exit code 95 (0x5f)

And another:

ach1_ 5_ 35_ 1--: 10.38 hours
[Nov 2, 2007 7:32:03 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: ERROR: exit code 95 (0x5f)

Two more. :(

ach1_ 3_ 5_ 10--: 10.64 hours
ach1_ 6_ 68_ 2--: 9.32 hours
[Nov 5, 2007 2:54:01 PM]
Dataman
Ace Cruncher
Joined: Nov 16, 2004
Post Count: 4865
Status: Offline
Re: ERROR: exit code 95 (0x5f)

What was the remaining time on the DDDT when that happened? Note that if you receive a batch with the exact same deadline (the maximum is 10), BOINC will do the jobs with the shortest projected completion times first.

Il Consigliere della Comunità


Sorry Sek, I missed your addendum. I did not record the time but I am rather sure the DDDT had >75% time remaining. Since then I have run a lot of them and have had no problems with them at all.

Cheers!

[Nov 5, 2007 3:11:11 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: ERROR: exit code 95 (0x5f)

A couple more:

ach1_ 9_ 67_ 10--: 9.50 hours
ach1_ 6_ 26_ 15--: 10.72 hours
[Nov 28, 2007 4:56:07 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: ERROR: exit code 95 (0x5f)

Here are a few more:

ach1_ 9_ 31_ 6--
ach1_ 16_ 79_ 10--
ach1_ 14_ 50_ 28--
ach1_ 12_ 56_ 0--
ach1_ 9_ 15_ 17--
ach1_ 9_ 67_ 10--
[Jan 22, 2008 2:46:33 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: ERROR: exit code 95 (0x5f)

Here is one more:

WU ach1_14_72

48 copies

45 Too Late
2 Error
1 No Reply
----------------------------------------
[Edit 1 times, last edit by Former Member at Feb 16, 2008 10:46:55 AM]
[Jan 22, 2008 8:28:12 AM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: ERROR: exit code 95 (0x5f)

Hi h.hett,

This thread is/was for reporting jobs that went out with a bang and errors. So was yours one of the 'Too Late' cases or one of the errors?

Got 7 in a row after months of infinite repeats of the message that none were available and being sent alternate work. 5 have validated so far, 1 is pending and 1 is still crunching, so I don't quite fathom why they crash so frequently for esoteric17. Do note, though, that like the cancer jobs they generate massive numbers of page faults (6.5 billion on 1 job) and run very slowly on what I thought was still a fair system (P4 with 512 KB L2 cache). It only improved somewhat when I allowed more RAM (1.3 GB when in use) and increased the "write to disk" interval to 15 minutes.

Found a few oddities in the distribution algorithm, which I'll report to the technicians. For one, the deadlines for backup work are not consistent. The last rush job finished about 30 minutes before its deadline and had to be uploaded manually, as the client did not volunteer this.

ttyl
[Jan 24, 2008 5:07:25 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: ERROR: exit code 95 (0x5f)

Got 7 in a row after months of infinite repeats of the message that none were available and being sent alternate work. 5 have validated so far, 1 is pending and 1 is still crunching, so I don't quite fathom why they crash so frequently for esoteric17.


Not just me! (See the earlier post by uplinger saying the same thing happened to him.) And I may be giving the wrong impression here - I do have a number of hosts and I have many AC@H results which are valid. I'm just noticing the errors more because statistically I receive more WUs than the average user, so I'll see more errors (I am one of the few with an AC@H badge ;) )

116 AC@H WUs crunched since 11/13:

9 error (6 of which are the exit code 95/line 296 of wrf_io.f error)
96 valid
1 in progress
9 invalid
1 inconclusive

So this error affects only about 5% of my WUs (6 of 116), which is why many folks wouldn't see it - if they don't get many of these in the first place, it's not likely one of theirs will fall into the 5%.
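
For anyone who wants to check the arithmetic, here is a quick sketch. The counts come from the list above; the helper `chance_of_seeing` and the assumption that every WU fails independently at the same rate are mine, so treat the second part as a rough estimate only.

```python
total_wus = 116        # AC@H WUs crunched since 11/13
exit_95_errors = 6     # the exit code 95 / wrf_io.f line 296 errors

error_rate = exit_95_errors / total_wus
print(f"exit code 95 rate: {error_rate:.1%}")


def chance_of_seeing(rate, n_wus):
    """Chance of at least one such error in n_wus WUs.

    Assumes each WU fails independently with the same rate, which is a
    simplification and not something measured in this thread.
    """
    return 1 - (1 - rate) ** n_wus


for n in (5, 20, 116):
    print(f"{n:>3} WUs: {chance_of_seeing(error_rate, n):.0%} chance of at least one")
```

Under that assumption, someone who only ever gets a handful of AC@H WUs will most likely never hit the error, while someone crunching over a hundred almost certainly will.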

Not the hugest issue and it doesn't bother me - I'm just reporting the WUs as they come along :)
[Jan 24, 2008 7:56:09 PM]