World Community Grid Forums
Thread Status: Active Total posts in this thread: 109
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
Unrelated: although my 64-bit client again produced a perfect result, the funny business with the negative credits apparently has something to do with mostly 64-bit clients generating corrupt output files. As the techs noted before, the 2 minute issue has their attention, but the focus remains first on getting 2 projects launched.
Personally, I'd go for a project mix, also because HPF2 is actually among the more CPU-heavy projects. [ot]I currently have up to 3 simultaneous badge upgrades lined up (now within 10 days; 10, 8 and 4 days to be exact) and am trying to time them so they fall on the same day (there's an outside chance I even get a fourth, virtual, for an anticipated level, 11 CPU days out). Why do it like that... because I can :D [/ot]
WCG
Please help to make the Forums an enjoyable experience for All! |
||
|
Hypernova
Master Cruncher Audaces Fortuna Juvat ! Vaud - Switzerland Joined: Dec 16, 2008 Post Count: 1908 Status: Offline |
As I had foreseen, now that I am crunching HPFP2 again as in the past, I am getting errors in large quantities. All these errors have CPU times of 0.02 hours with credit claims of 0.6 points or similar.
No big impact, except that these error WUs occupy bandwidth, memory and a little processing time for nothing. Considered across all machines that crunch this project on WCG, it probably does drag the overall efficiency down. But I understand there is no solution. So keep crunching until Sapphire.
||
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
Hypernova,
If the 2 minute failures grow too numerous within 24 hours (UTC timekeeping), you'll be facing quota dry-up, so be sure to keep returning valid results in between.
WCG
Please help to make the Forums an enjoyable experience for All! |
||
|
Hypernova
Master Cruncher Audaces Fortuna Juvat ! Vaud - Switzerland Joined: Dec 16, 2008 Post Count: 1908 Status: Offline |
Here is the status as seen with Result Status for HPFP2:
Total WU = 738, split as: In Progress 414, Valid 9, PV (Pending Validation) 149, Error 166. Hope this will be OK. I would hate to stop this project without getting to Sapphire.
||
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
Tonnes of errors indeed, but what you have there looks like it is already one-third of the way towards Ruby.
The rule works like this: each error reduces the quota by 1, and even though good results will eventually double the quota again, there is still the hard daily limit of 80 per core, or whatever the number is. You're at an error rate of over 50%. Having a buffer in this particular case helps to ensure you keep crunching.
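A rough sketch of the quota rule as described in this post, not the actual server logic: each error lowers the daily quota by 1, each valid result doubles it, and the quota never exceeds a hard cap. The cap of 80 per core is the figure quoted here ("or whatever the number is"); the floor of 1, the starting value and the function names are assumptions made up for illustration.

```python
def update_quota(quota, outcome, cap=80, floor=1):
    """Return the new daily quota after one returned result.

    Assumed rule (taken from the description in the post, not from
    server code): an error subtracts 1, a valid result doubles the
    quota, and the value is clamped to the range [floor, cap].
    """
    if outcome == "error":
        quota -= 1
    elif outcome == "valid":
        quota *= 2
    else:
        raise ValueError(f"unknown outcome: {outcome!r}")
    return max(floor, min(cap, quota))


def simulate(outcomes, start=80, cap=80):
    """Apply update_quota to a sequence of outcomes, starting at `start`."""
    quota = start
    for outcome in outcomes:
        quota = update_quota(quota, outcome, cap=cap)
    return quota
```

Under these assumptions, ten errors in a row take a quota of 80 down to 70, and a single valid result afterwards brings it straight back to the cap, which is why returning valid results in between matters.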
WCG
Please help to make the Forums an enjoyable experience for All! |
||
|
Hypernova
Master Cruncher Audaces Fortuna Juvat ! Vaud - Switzerland Joined: Dec 16, 2008 Post Count: 1908 Status: Offline |
Here is an update of Result Status for HPF2:
Total WU: 1384, Errors: 422, Valid: 116, PV: 374, In Progress: 472. The Valid/Error ratio is 27.4%, improving compared to yesterday's 5.4%. I suppose this is due to the lessening effect of the very high quorum on these WUs. But still, 422 errors (over two days) of 2 minutes CPU each add up to 14 hours of CPU lost. That is equivalent to 7 hours per day, and across 10 machines that is 42 minutes/machine/day. This should not worsen, as the ratios should stabilize over the next two days. This CPU time loss is still an acceptable price to pay for beautiful gemstones like Ruby, Emerald and Sapphire.
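For anyone who wants to redo this arithmetic with their own Result Status numbers, here is a small sketch. The inputs are the figures from this post (422 errors, 116 valid, about 2 minutes of CPU per error, two days, 10 machines); the function name and parameter names are just illustrative.

```python
def error_cost(errors, valid, cpu_minutes_per_error=2, days=2, machines=10):
    """Return (valid/error ratio in %, CPU hours lost, minutes lost per machine per day)."""
    valid_to_error_pct = 100 * valid / errors
    lost_minutes = errors * cpu_minutes_per_error
    lost_hours = lost_minutes / 60
    lost_minutes_per_machine_day = lost_minutes / days / machines
    return valid_to_error_pct, lost_hours, lost_minutes_per_machine_day


ratio, hours, per_machine = error_cost(errors=422, valid=116)
print(f"{ratio:.1f}% valid/error, {hours:.1f} h lost, {per_machine:.1f} min/machine/day")
```

With these inputs the result lands on roughly 14 hours lost in total and about 42 minutes per machine per day, in line with the figures above.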
||
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
Hypernova, count the PV as valid, near guaranteed. Then 490:422 makes it presently a ratio of 54:46, better than 50% valid. Good enough to keep running.
Watch out for those notorious and rare HPF2 jobs that loop endlessly. If you see a task with an abnormal run time / % progress ratio, suspend the client with LAIM (Leave Application In Memory) OFF and resume after 30 seconds, so the job unloads from memory and restarts from its last checkpoint. 99.9999999% of the time it then finishes properly and faultlessly. BOINCview used to have a warning mechanism for low-progress jobs, and I think BOINCTasks has one too, but I have not toyed with it.
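One way to spot such a looping job by hand is to compare how much run time each task needs per percent of progress. The sketch below flags tasks whose ratio is far above the median of the batch. The task tuples, the factor of 5 and the function name are all made up for illustration, and this is not a description of how BOINCview or BOINCTasks do it.

```python
from statistics import median


def find_slow_tasks(tasks, factor=5.0):
    """Return names of tasks whose run time per % progress looks abnormal.

    tasks: iterable of (name, run_time_hours, progress_percent).
    A task is flagged when its ratio exceeds `factor` times the median
    ratio of all tasks that have reported some progress.
    """
    rates = {
        name: run_time / progress
        for name, run_time, progress in tasks
        if progress > 0  # skip tasks that have not reported progress yet
    }
    if not rates:
        return []
    typical = median(rates.values())
    return sorted(name for name, rate in rates.items() if rate > factor * typical)


# Hypothetical batch: task "d" has run 30 hours but is only at 40 %.
tasks = [("a", 2.0, 50.0), ("b", 3.0, 70.0), ("c", 1.0, 25.0), ("d", 30.0, 40.0)]
suspects = find_slow_tasks(tasks)
```

A flagged task is only a candidate for the suspend/resume treatment described above; a legitimately long job can also show a high ratio.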
WCG
Please help to make the Forums an enjoyable experience for All! [Edit 1 times, last edit by Sekerob at Mar 3, 2010 9:19:49 AM] |
||
|
Hypernova
Master Cruncher Audaces Fortuna Juvat ! Vaud - Switzerland Joined: Dec 16, 2008 Post Count: 1908 Status: Offline |
"count the PV as valid, near guaranteed"
Excellent!!! La vita è bella (life is beautiful).
||
|
JmBoullier
Former Community Advisor Normandy - France Joined: Jan 26, 2007 Post Count: 3715 Status: Offline |
I am wondering whether we have ever seen such errors on Linux or Mac machines. Indeed, my question hints at my answer: I have the feeling that all these errors have been reported by Windows (all flavors) users.
I have run about 50 CPU days of HPF2 WUs recently in order to get the emerald badge, most under Ubuntu 64 and the rest (<20 %) under XP 32, and the only error I have seen was under XP 32 on the quad. Could this be just another general Windows flaw, or a not-so-good library used only by Windows jobs?
||
|
Hypernova
Master Cruncher Audaces Fortuna Juvat ! Vaud - Switzerland Joined: Dec 16, 2008 Post Count: 1908 Status: Offline |
When I check across my machines this is what I see:
All machines are W7 64 bit. All motherboards are the same, and memory is also the same. CPUs differ and are i7 920, 950 or 975EE. All CPUs are overclocked at similar levels. Errors are produced mostly by the i7 920 and 950. The i7 975EE runs at 3.6 GHz and generates one error per day. The 920 produces 26 errors/day and the 950 between 21 and 53 errors/day, depending on the machine. So it is really impossible to see a specific pattern. It's a maddening story.
||
|