World Community Grid Forums
Thread Status: Active Total posts in this thread: 3219
Jim1348
Veteran Cruncher USA Joined: Jul 13, 2009 Post Count: 1066 Status: Offline |
And now MIP is going down, or migrating in-house to be more correct.
https://www.worldcommunitygrid.org/forums/wcg/viewthread_thread,43484 |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
In the last day I got a 046 and a 055 work unit, so they are still sending out older work units. Has anybody gotten a 071 work unit yet?
|
||
|
Mike.Gibson
Ace Cruncher England Joined: Aug 23, 2007 Post Count: 12367 Status: Offline |
The 070s appeared early. On previous occasions when that happened, the next one appeared when it would have been due if the previous one had not appeared early. Expect 071 about 12 June.
Mike |
||
|
knreed
Former World Community Grid Tech Joined: Nov 8, 2004 Post Count: 4504 Status: Offline |
The project has 35,609 "lines" of work. Only one workunit can be in progress at a time for each line. Once that workunit finishes, we are able to take that result and send out the next "generation" of work for the line.
We had a couple of issues with the project. There were about 650 lines that were stopped due to an issue that we have now resolved. Some of these lines had been stopped as far back as generation 15. All but 7 of the lines are back running again and moving forward. You are seeing the older generations because these lines have been put back in motion. Here is the current distribution of how far along each line is:
As noted in this thread, lines that are behind get a boost to catch up by being marked as "need reliable". This results in them being sent to devices that have proved to be reliable in the past and are able to return the result quickly. They also have shortened deadlines. We had been marking any line that was more than 2 generations behind as "need reliable" to catch it up. However, this created too many jobs that needed to be sent to reliable hosts, and it was actually slowing things down: there weren't enough reliable hosts relative to the number of jobs that needed them, so regular hosts were blocked at times from getting regular jobs. We have changed this now so that any line that is more than 3 generations behind gets the boost. Based on the current distribution of generations, this means that any job that belongs to generation 67 or earlier will be marked "need reliable".
Finally, there was about a 3-day backlog of jobs to be assigned. This means that once a workunit finished, there was a 3-day delay before the next generation for the line was sent out. We have increased the weight of the project so that more jobs are going out. At this time there is about a 15-hour backlog. Over the next 3-4 days this backlog should be reduced to 0.
All of this is to say that we have been doing some work to get the project moving faster, and this work should explain most of the comments that I saw above.
At this time, on average, about 19% of the lines complete a generation each day. This means that we should be advancing to the next generation about every 5.5 days. However, with the higher than normal number of "catch up" jobs there has been a delay in assigning generation 71 jobs. Right now the generation 71 jobs represent about 70% of the 15-hour backlog. I expect that we will start to see those go out in the next day or two.
We will continue to advance to the next generation at a slower than normal pace for the next 7-10 days and then we should see the expected pacing resume. |
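For readers who want to check the numbers in the post above, here is a rough sketch of the "need reliable" rule and the pacing estimate as described there. It is not the actual scheduler code. The function names are hypothetical, and taking generation 71 as the leading generation is an assumption chosen because it reproduces the "generation 67 or earlier" cutoff.

```python
def needs_reliable(line_generation: int, leading_generation: int, max_lag: int = 3) -> bool:
    """Return True when a line trails the leading generation by more than max_lag."""
    return (leading_generation - line_generation) > max_lag


def days_per_generation(fraction_of_lines_per_day: float) -> float:
    """Naive pacing estimate: days needed for every line to finish one generation."""
    return 1.0 / fraction_of_lines_per_day


# Assumption: generation 71 is the leading generation referred to in the post.
LEADING_GENERATION = 71

for generation in (66, 67, 68):
    print(generation, needs_reliable(generation, LEADING_GENERATION))

print(round(days_per_generation(0.19), 1))
```

With the 3-generation threshold, generation 67 gets the boost and generation 68 does not. The plain reciprocal of 19% comes out near 5.3 days, in the same range as the "about every 5.5 days" quoted in the post.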
||
|
Mike.Gibson
Ace Cruncher England Joined: Aug 23, 2007 Post Count: 12367 Status: Offline |
Thank you for the illuminating explanation. We now know why these 'stragglers' have suddenly appeared and what is being done to clear up the problem.
As I understand it, each 'line' corresponds to a specific 3 km x 3 km patch of sub-Saharan Africa. Each unit is duplicated, so each generation has 71,218 units to be completed. There are 106,364 * 2 = 212,728 units for us to catch all up, so it would take us about 2 weeks. However, for instance, the one from generation 017 would have to go out and be returned 45 times to 2 people, each having 3.5 days in which to return it, and the slower of each pair would determine how long each iteration will take to complete. I reckon about 3 months could be needed to catch up that patch. The others would need progressively less time.
Mike |
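The arithmetic in the post above can be reproduced with a few lines. This is only a sketch of my reading of it: the variable names are made up, the 5.5 days per generation comes from knreed's post, and treating "about 3 months" as 90 days is an assumption.

```python
LINES = 35_609                      # lines of work, from knreed's post
COPIES_PER_UNIT = 2                 # each unit is sent to two hosts
BACKLOG_LINE_GENERATIONS = 106_364  # outstanding catch-up work, from the post above
DAYS_PER_GENERATION = 5.5           # normal pacing quoted by knreed

units_per_generation = LINES * COPIES_PER_UNIT
backlog_units = BACKLOG_LINE_GENERATIONS * COPIES_PER_UNIT

# How many full generations of work the backlog is worth, and how long
# that would take at normal pacing.
backlog_generations = backlog_units / units_per_generation
backlog_days = backlog_generations * DAYS_PER_GENERATION

# The generation 017 straggler: 45 sequential iterations, 3.5 day deadline each.
ITERATIONS = 45
DEADLINE_DAYS = 3.5
worst_case_days = ITERATIONS * DEADLINE_DAYS

# Assumption: "about 3 months" is taken as 90 days.
implied_days_per_iteration = 90 / ITERATIONS

print(units_per_generation, backlog_units)
print(round(backlog_generations, 2), round(backlog_days, 1))
print(worst_case_days, implied_days_per_iteration)
```

The backlog works out to just under 3 generations of work, or roughly 16 days at normal pacing, which matches "about 2 weeks". If every iteration used its full deadline the straggler would need about 157 days, so the 3-month estimate implies an average turnaround of about 2 days per iteration.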
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
"There are 106,364 * 2 = 212,728 units for us to catch all up, so it would take us about 2 weeks. However, for instance, the one from generation 017 would have to go out and be returned 45 times to 2 people each having 3.5 days in which to return it and the slower of each pair would determine how long each iteration will take to complete."
Hopefully, the good news is that reliable hosts return the work well within the 3.5 day deadline, which would shorten the catch-up period. My machines execute a WU in about 12 hours, give or take a couple of hours. I have re-enabled the project. It's looking like I will return about 30 per day running 3 concurrently per host. |
||
|
JonU235
Cruncher Joined: Jan 17, 2020 Post Count: 8 Status: Offline |
71s in my queue
ARP1_0011404_071_0 ARP1_0007198_071_1 ARP1_0019713_071_0 |
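For anyone sorting their own queue by generation, the names above can be split on underscores. This is a sketch under an assumed naming scheme: the thread only confirms that the three-digit field is the generation, while reading the other fields as project, line number, and copy index is my interpretation. `WorkUnit` and `parse_workunit` are hypothetical helpers.

```python
from typing import NamedTuple


class WorkUnit(NamedTuple):
    project: str     # e.g. "ARP1"
    line: int        # assumed: the line the unit belongs to
    generation: int  # the generation, e.g. 71
    copy: int        # assumed: which of the duplicated copies this is


def parse_workunit(name: str) -> WorkUnit:
    """Split a name such as 'ARP1_0011404_071_0' into its assumed fields."""
    project, line, generation, copy = name.split("_")
    return WorkUnit(project, int(line), int(generation), int(copy))


queue = ["ARP1_0011404_071_0", "ARP1_0007198_071_1", "ARP1_0019713_071_0"]
for name in queue:
    print(parse_workunit(name))
```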
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I can confirm the 071s are out, as I also just got one of the 071 work units.
|
||
|
spRocket
Senior Cruncher Joined: Mar 25, 2020 Post Count: 274 Status: Offline |
"In the last day I got a 046 and a 055 work unit, so they are still sending out older work units. Has anybody gotten a 071 work unit yet?"
Seeing an 040 in progress here is what brought me to this thread. Is this something that had to be re-run? I have one wingman, also in progress.
EDIT: Nice explanation of what's going on. I kind of suspected something like that was going on.
[Edit 1 times, last edit by spRocket at Jun 11, 2021 11:47:46 PM] |
||
|
Mike.Gibson
Ace Cruncher England Joined: Aug 23, 2007 Post Count: 12367 Status: Offline |
With 071s flowing, I would normally say that we are at about 38.8% of the way through the project, but the figures for stragglers are the equivalent of 3 generations, so 37.2%.
Mike |
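The two percentages above can be reproduced if the project has 183 generations in total. That total is not stated anywhere in this thread; it is an assumption, chosen here because it matches both figures. `progress_percent` is a hypothetical helper.

```python
# Assumption: 183 total generations (not stated in the thread, but it
# reproduces both percentages quoted above).
TOTAL_GENERATIONS = 183


def progress_percent(generations_done: float, total: int = TOTAL_GENERATIONS) -> float:
    """Share of the project completed, as a percentage rounded to one decimal."""
    return round(100 * generations_done / total, 1)


nominal = progress_percent(71)       # counting generation 71 as the current point
adjusted = progress_percent(71 - 3)  # minus about 3 generations of straggler work

print(nominal, adjusted)
```

Under that assumption the nominal figure is 38.8% and the straggler-adjusted figure is 37.2%.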
||