World Community Grid Forums
Thread Status: Active Total posts in this thread: 3313
Speedy51
Veteran Cruncher New Zealand Joined: Nov 4, 2005 Post Count: 1297 Status: Offline |
ARP1_0012554_141_2 https://www.worldcommunitygrid.org/contribution/workunit/285183449 has a validated CPU time of 11.5 hours.

ARP1_0028260_141_3 https://www.worldcommunitygrid.org/contribution/workunit/285182768 has a validated CPU runtime of 11.32 hours.

I will have no way of knowing, but I hope this allows at least 2 more work units to be created.

[Edit 1 times, last edit by Speedy51 at Apr 25, 2023 7:31:20 AM] |
||
|
Mike.Gibson
Ace Cruncher England Joined: Aug 23, 2007 Post Count: 12435 Status: Offline |
I suspect that newbies are not sent out automatically. Re-sends are.

It is a pity, because sending newbies out automatically would mean them dribbling out instead of arriving in a major flood each week or so. That would also make http problems less likely.

Mike |
||
|
Mike.Gibson
Ace Cruncher England Joined: Aug 23, 2007 Post Count: 12435 Status: Offline |
Sunday Report
There have been no extremes or accelerated units validated this week.

2,419 units have been validated this week.

Assuming that a full generation 182 will be the last, there are 1,638,067 units still outstanding.

I will re-start my forecasting once output stabilises.

The definition of normals, accelerated and extreme remained generations 144, 133 & 128, respectively.

There are still 35 Extremes and 57 Accelerated units listed, as none have moved. The numbers in their generations remain 2,200 & 3,841.

The extremes and especially the three ultras are getting further behind. Could we at least get those 3 ultras moving, please?

Mike |
||
|
Sgt.Joe
Ace Cruncher USA Joined: Jul 4, 2006 Post Count: 7695 Status: Offline |
They must be doing something to restrict the supply. For April 30 they are down to 27 work units returned. This is down from 79 the day before. From the looks of it, the May 1 number will be even smaller.
Cheers
Sgt. Joe
*Minnesota Crunchers* |
||
|
alanb1951
Veteran Cruncher Joined: Jan 20, 2006 Post Count: 979 Status: Recently Active |
"They must be doing something to restrict the supply. For April 30 they are down to 27 work units returned. This is down from 79 the day before. From the looks of it, the May 1 number will be even smaller. Cheers"

Note that the average completion time tells us these are all tasks that had at least one retry because of a missed deadline; indeed, some of the retries probably missed their deadlines too :-)

This suggests that at this point WCG are waiting for the current batch(es) to complete before triggering the next batch(es); however, it's unclear whether that is by choice or a "feature" of the relevant automation.

It would be interesting to know what tools they have available for managing ARP1 work generation, batching and release. I can't believe Kevin and his colleagues automated the process with no management tools -- at a minimum they'd have needed a way to resurrect failed WUs (which could possibly also be used to help the ultra-extremes on their way?).

Cheers - Al. |
||
|
Mike.Gibson
Ace Cruncher England Joined: Aug 23, 2007 Post Count: 12435 Status: Offline |
Guys

Based on my interpretation of the last few weeks' flow, and on my belief that Delft are not accepting any new results but Krembil gained a small amount of file storage as a result of the recent hiatus, I am assuming that they are limiting the flow by issuing a batch, then getting those results back or re-sending those units until they have all been returned and validated. Then they issue the next batch.

They probably have re-sends automatically enabled, with manual creation of new units.

They seem only to be issuing normals when they should be issuing extremes (especially the ultras) to enable them to catch up instead of falling further behind.

The other problem that they are causing with the large batches is the http errors on download/upload.

Mike

[Edit 1 times, last edit by Mike.Gibson at May 1, 2023 6:51:20 PM] |
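The release pattern guessed at above (issue a batch, re-send automatically, and only issue the next batch once everything has validated) could be sketched roughly as follows. This is only a toy model of the hypothesis in this post, not WCG's actual automation; the class name `BatchGatedIssuer`, its methods and the unit names `u1`..`u3` are all made up for illustration.

```python
from collections import deque


class BatchGatedIssuer:
    """Toy model (hypothetical): a new batch is released only when no unit is outstanding."""

    def __init__(self, batches):
        # Batches waiting to be issued, in release order.
        self.pending = deque(list(batch) for batch in batches)
        # Units that have been issued but not yet validated.
        self.outstanding = set()

    def release_next(self):
        """Release the next batch, or nothing if earlier units are still outstanding."""
        if self.outstanding or not self.pending:
            return []
        batch = self.pending.popleft()
        self.outstanding.update(batch)
        return batch

    def missed_deadline(self, unit):
        """A re-send is automatic: the unit stays outstanding and keeps blocking new batches."""
        return unit in self.outstanding

    def validated(self, unit):
        """A validated unit no longer blocks the next batch."""
        self.outstanding.discard(unit)


issuer = BatchGatedIssuer([["u1", "u2"], ["u3"]])
first = issuer.release_next()    # first batch goes out
issuer.validated("u1")
blocked = issuer.release_next()  # nothing released: u2 is still outstanding
issuer.missed_deadline("u2")     # u2 is re-sent, still outstanding
issuer.validated("u2")
second = issuer.release_next()   # only now does the next batch go out
```

In this model a single slow unit holds up the whole next batch, which would fit the pattern of a flood followed by a long dribble of re-sends.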
||
|
alanb1951
Veteran Cruncher Joined: Jan 20, 2006 Post Count: 979 Status: Recently Active |
Mike,

Yes, I noticed the lack of any "how long it takes" information in the news post where they talked about moving data off to tape. It would be nice to know such information, as it would clarify how much (or how little!) work could be returned to Delft on a daily (or weekly?) basis...

As you say, resends are always handled without reference to the workload generator, as necessary files should already exist :-) -- retries for missed deadlines are handled by the transitioner, whilst retries for verification also [of course!] require the involvement of the validator. Whilst it may be possible to instruct the feeder to not issue any work for an application at all, for ARP1 that would be foolish in the extreme :-)

As for why they're only issuing Normal units, that may be a function of the existing automation, which is why I mentioned the tools available. My suspicion is that there was a very large batch and it had exhausted its non-Normal content[1] quite early, partly because of the tighter deadlines.

Whatever the case may be, the work sent out since early April certainly isn't doing Extremes and Accelerated tasks any favours, as less than 1% of the total units processed were for those categories :-( -- as Normal units only make up 85% or so of the population, something isn't working as it might!

Cheers - Al.

[1] It's quite interesting trying to work out how many units constitute a "batch" (as distinct from what gets released at a time); for instance, since the restart there have been more Normal unit movements than there are normal units, which must imply at least two batches (though some could be left over unissued from before the crash...) It certainly appears that within a batch the units are (more or less) dealt out in ascending generation order, but possibly in quite large chunks (which would partly disguise that...) |
||
|
Unixchick
Veteran Cruncher Joined: Apr 16, 2020 Post Count: 992 Status: Offline |
I got some new ARP WUs. I hope it continues, and Mike can give us updates with some encouraging progress.
|
||
|
alanb1951
Veteran Cruncher Joined: Jan 20, 2006 Post Count: 979 Status: Recently Active |
Until they eliminate whatever is causing the apparent failure to generate new work whilst there are still WUs "live" at various stages, I fear that progress will continue to be glacial.

As it didn't seem to behave like that in "olden times" I have to believe they'll eventually work out what the issue is -- the combination of various issues both at Delft and at WCG over the last months can't have helped any, and I wonder if something they did to limit what was being sent to Delft or to ease download problems has had an unfortunate side-effect :-(

For now, I just hope I can keep a steady trickle of work without depending on the leavings of the folks (with caches that are probably too large?) who generate a bountiful supply of No Reply and Not Started by Deadline retries :-) -- It is frustrating to be able to guarantee sub-24 hour turnaround but not be able to get work (whilst seeing the effect of multiple deadline misses on the completion time statistics), especially with ARP1 being so dependent on fast turn-around to run properly!...

Cheers - Al.

P.S. The last few tasks of the last set took an average of 15+ days to complete -- that implies that the initial post-deadline retries were themselves deadline failures two or three times (unless something went wrong with the three-day setting for post-deadline retries...) |
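The arithmetic behind the P.S. can be sketched as below. The three-day retry deadline comes from the post; the initial deadline is not stated anywhere in this thread, so the values 7 and 5 days used here are purely hypothetical, and the function name `missed_retry_deadlines` is made up for illustration.

```python
def missed_retry_deadlines(total_days, initial_deadline_days, retry_deadline_days=3.0):
    """Count how many retry deadlines must have fully elapsed before the task completed.

    Assumes each retry is issued as soon as the previous deadline passes.
    """
    # Time spent beyond the original deadline.
    overrun = total_days - initial_deadline_days
    if overrun <= 0:
        return 0
    # Each full retry window inside the overrun is one missed retry deadline.
    return int(overrun // retry_deadline_days)


# Hypothetical initial deadlines; only the 15 days and the 3-day retry come from the post.
with_seven_day_deadline = missed_retry_deadlines(15, 7)
with_five_day_deadline = missed_retry_deadlines(15, 5)
```

Under those assumed initial deadlines, a 15-day completion works out to two or three missed retry deadlines, in line with the estimate in the P.S.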
||
|
pwhidden
Cruncher USA Joined: Nov 17, 2004 Post Count: 32 Status: Offline |
OMG.... 3 ARP work units arrived. All 137 zero or 1.
|
||