World Community Grid Forums
Thread Status: Active Total posts in this thread: 3315
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Just got ARP1_0001465_80_1
|
||
|
Mike.Gibson
Ace Cruncher England Joined: Aug 23, 2007 Post Count: 12435 Status: Offline |
Thank you, Robert.
Another early one and the results being returned are still going up. 080 indicates we are at about 43.7%, but for this month I am assuming 2 generations behind to allow for the stragglers, so 42.6%. The latest interval is just 2.43597 days and the 10-interval average is 4.96797 days. This is slightly up due to a blip 10 intervals earlier. The end date forecast is still December 2022; however, if this rate continues, the forecast would be April 2022. I would expect the next generation to start about 26 July. Mike |
||
|
knreed
Former World Community Grid Tech Joined: Nov 8, 2004 Post Count: 4504 Status: Offline |
Thanks to everyone who has upped their contribution to the project. Following the article last week you have reduced the backlog of results ready to send from over 15,000 down to about 2,500. This has been a very nice improvement and at the moment we are running at a pace to complete a generation every 5.1 days. Thanks to everyone running the project!
Due to the decrease in backlog, we will see a bit of spread in the distribution of jobs across the leading generations. This means that over the next week or two we will see the next generation emerge faster. For example, the first generation 79 workunit was created at 07/18/2021 14:21:26 UTC and the first generation 80 workunit was created at 07/20/2021 23:37:16 (about 2.5 days later). It looks like we will see the first generation 81 workunit get created in the next 24 hours, which will only be 3 to 3.5 days since gen 80 (note that it will take a little bit for the first copies of it to be sent out since the front generation always has the lowest priority). This should stabilize at some point (where it stabilizes will depend on the size of the backlog).
As noted, there are a total of 35,609 units being analysed (which yield about 2.2 results per day for users to process) and we are currently completing about 7,000 units/day (15,000 results/day) on average. 35,609/7,000 = 5.087 days per generation, which is what my estimate is based on.
----------------------------------------
[Edit 1 times, last edit by knreed at Jul 24, 2021 5:34:18 PM] |
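The arithmetic in the post above can be checked with a short sketch. It uses only the figures quoted in the post; the function names are illustrative, and the second timestamp is assumed to be in UTC like the first.

```python
from datetime import datetime

# Figures quoted in the post above.
UNITS_PER_GENERATION = 35_609
UNITS_COMPLETED_PER_DAY = 7_000


def days_per_generation(units: int, units_per_day: float) -> float:
    """Days needed to advance every unit by one generation."""
    return units / units_per_day


def gap_in_days(earlier: str, later: str, fmt: str = "%m/%d/%Y %H:%M:%S") -> float:
    """Elapsed days between two timestamps (assumed to share a timezone)."""
    delta = datetime.strptime(later, fmt) - datetime.strptime(earlier, fmt)
    return delta.total_seconds() / 86_400


pace = days_per_generation(UNITS_PER_GENERATION, UNITS_COMPLETED_PER_DAY)
gap = gap_in_days("07/18/2021 14:21:26", "07/20/2021 23:37:16")

print(f"{pace:.3f} days per generation")            # 5.087
print(f"{gap:.2f} days between gen 79 and gen 80")  # 2.39
```

The measured gap of roughly 2.4 days is consistent with the "about 2.5 days" quoted in the post.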
||
|
Mike.Gibson
Ace Cruncher England Joined: Aug 23, 2007 Post Count: 12435 Status: Offline |
Kevan
It might be better to say that there are 2 x 35609 copies available for each generation because the project statistics page shows over 16,000 results being returned each day at present. As shown in my regular postings, which are based on when crunchers report seeing their first units of a new generation, the last 2 intervals have been about 2.5 days, which matches your creation interval. It would seem from your comments that we are likely to run low on units, maybe because crunchers have switched away from MIP. Could we have an update, please, on the current state of the stragglers that you have posted about previously? Your message about that was very interesting. Have you analysed why they got left behind? Regards Mike |
||
|
knreed
Former World Community Grid Tech Joined: Nov 8, 2004 Post Count: 4504 Status: Offline |
Mike,
Your estimates and predictions have been very accurate - I've been enjoying watching you track the progress and pace. Here is the latest distribution.
[Table with columns: count, generation]
The stragglers were due to an error that we had to do some work to resolve and get moving again. This is resolved now and the stragglers are catching up. 6 of them unfortunately have to be re-run from the very start due to a combination of failures (we have resolved those failures and we are making sure that this does not happen again). Outside of the 6, the lagging generation has moved from generation 29 on July 17 to generation 32, so the laggards are catching up (i.e. the tail moved forward 3 generations while the front moved forward 2 generations). We expect this to continue and over time we hope to see only a spread of 6-8 generations between the front and the tail (although the 6 will take longer to catch up).
The backlog had been steadily decreasing from the 15k down to the 2.5k I mentioned above. It dropped down to as low as 2k but it has started to creep back up over the last 18 hours and we are back up to 2.5k. Over the past 48 hours workunits have been completing at a pace that would let us get down to completing a generation every 4.5 days. Given that there is still a backlog, we should be able to accelerate this even faster. If we continue to see the backlog grow, we will attempt to nudge a few more people to increase their allowed jobs on their devices up from 1.
We had originally planned to post the news article and then send an email to volunteers already participating in ARP1, but the response to the article itself was bigger than we expected, so we have held off on sending the email until we see the impact.
Based on what we see the backlog at on Monday or Tuesday, we might send the email to 10-20% of current ARP1 participants just to give a little bit more of a nudge to speed up the project and see if we can get the backlog down to around 500 - 1000 results ready to send. Our goal is to have a steady supply and minimal backlog and we are almost there.
----------------------------------------
[Edit 1 times, last edit by knreed at Jul 24, 2021 5:53:19 PM] |
||
|
Mike.Gibson
Ace Cruncher England Joined: Aug 23, 2007 Post Count: 12435 Status: Offline |
Thank you, Kevan
Your previous post of data had an earliest generation of 017 so we have caught up 15 generations while moving on 10 (ignoring those 6 needing a restart). That rate would suggest we will be up-to-date by generation 114, about the end of this year. However, that does not fully allow for the additional crunching power inherited from MIP. If you do send out an email, I would suggest that you incorporate some of the suggestions that I have been making: to restrict to a maximum of 50% of threads and to keep a low cache, which improves turnaround and minimises 'No Replies'. Regards Mike |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Kevin, please post your findings on Monday or Tuesday. If needed, I can throw about 200 threads at the problem.
|
||
|
Mike.Gibson
Ace Cruncher England Joined: Aug 23, 2007 Post Count: 12435 Status: Offline |
Kevan
I followed your simplistic view of comparing the current generation with the earliest laggard in agreeing the catch-up. However, if you multiply the numbers in each generation by the number of generations that the generation is behind, which calculates the outstanding work to be done, we are now further behind by about 15%. There were 106,364 x 2 units to be crunched which has increased to 121,394 x 2. Entity, I believe they need all the help that they can get. Mike |
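The weighted measure described in the post above (units in each generation multiplied by how many generations behind they are) can be sketched as follows. The per-generation distribution is not reproduced in this thread, so the `example` dictionary is made up purely for illustration, and measuring "behind" relative to the front generation is an assumption. Only the two totals, 106,364 and 121,394, come from the post.

```python
def outstanding_work(counts_by_generation: dict, front: int) -> int:
    """Sum of (units in a generation) x (generations behind the front)."""
    return sum(
        count * (front - generation)
        for generation, count in counts_by_generation.items()
        if generation < front
    )


def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100


# Hypothetical distribution {generation: unit count}; NOT the real data.
example = {32: 10, 60: 40, 79: 500, 80: 1_000}
backlog = outstanding_work(example, front=80)

# Totals quoted in the post above.
growth = percent_change(106_364, 121_394)

print(backlog)           # 1780 for the made-up distribution
print(f"{growth:.1f}%")  # 14.1%
```

The quoted totals give an increase of roughly 14%, in line with the "about 15%" stated in the post.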
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I didn't wait. I already moved 80 cores over and another 128 are coming shortly.
Problem is, I seem to be getting mostly gens 78, 79, and 80, which tells me the older generations are already out in the wild since they have priority. Adding more cores looks like it might actually extend the range between the front and back generation.
----------------------------------------
[Edit 1 times, last edit by Former Member at Jul 24, 2021 9:25:15 PM] |
||
|
Mike.Gibson
Ace Cruncher England Joined: Aug 23, 2007 Post Count: 12435 Status: Offline |
entity
That might be so at the moment, but they are on 8-day deadlines whereas the earlier ones are on 4.5-day deadlines so should reappear more frequently. The earlier ones also have a higher priority so should be turned around faster. There are 5302 pre-077 out there compared with 30307 077-081 but they have many more generations to get through (35227 compared with 86167) which redresses the balance to some extent. A generation is 35609, but a generation can be completed in 4 days at 18000 results per day. The earliest stragglers have 80 generations to get through so are likely to take 3 months even if they get through 1 generation per day. That is most unlikely as both copies have to validate before moving on. Your extra machines will help the project to finish sooner, but might cause a local heatwave! Mike |
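The figures in the post above can be cross-checked with a small sketch. It assumes exactly two results (copies) per unit, as the post's "both copies" wording suggests; the variable names are illustrative.

```python
# Figures quoted in the post above.
UNITS_PER_GENERATION = 35_609
RESULTS_PER_DAY = 18_000
COPIES_PER_UNIT = 2  # assumption: exactly two results per unit

# Days to complete one generation at the quoted return rate.
days_per_gen = UNITS_PER_GENERATION * COPIES_PER_UNIT / RESULTS_PER_DAY

# The two unit counts should add up to one full generation.
pre_077_units = 5_302
units_077_to_081 = 30_307
total_units = pre_077_units + units_077_to_081

# The two weighted figures from the post, added together.
weighted_total = 35_227 + 86_167

# Best case for the earliest stragglers: 80 generations at 1 per day.
straggler_days = 80 / 1

print(f"{days_per_gen:.2f} days per generation")  # 3.96
print(total_units)                                # 35609
print(weighted_total)                             # 121394
print(straggler_days)                             # 80.0
```

The unit counts sum to the 35,609 units in a generation, and the weighted figures sum to 121,394, the same outstanding-work total quoted earlier in the thread.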
||