World Community Grid Forums
Thread Status: Active Total posts in this thread: 3181
catchercradle
Advanced Cruncher Joined: Jan 16, 2009 Post Count: 125 Status: Offline
Well, I finally got 2 tasks. The only way I am keeping something going all the time is by running both a Windows and a Linux VM on my Linux host. Even with three clients running on the physical machine, I am often down to just one task running.
||
|
Mike.Gibson
Ace Cruncher England Joined: Aug 23, 2007 Post Count: 12345 Status: Offline
I am keeping going by having only half my threads on ARP1 and the other half on MCM1.
As MCM1 units are much shorter than ARP1 units, I am requesting work more often and so am more likely to hit a release of ARP1 work. It also means I crunch ARP1 units faster.
Mike
||
|
TPCBF
Master Cruncher USA Joined: Jan 2, 2011 Post Count: 1947 Status: Offline
The supply of ARP1 WUs this morning is less than half of what it was in the last few days. Is there a known problem, or is something just brewing up ready for the coming weekend?
Ralf
||
|
catchercradle
Advanced Cruncher Joined: Jan 16, 2009 Post Count: 125 Status: Offline
> I am keeping going by having only half my threads on ARP1 and the other half on MCM1. As MCM1 units are much shorter than ARP1 units it means I am requesting work more often so more likely to hit a release of ARP1 work. Also it means I crunch ARP1 units faster. Mike

I have not managed to get above 7 threads running in the past few days, despite having three clients running on the physical machine. I guess running MCM1 with an app_config to limit the number of threads running it might help.
||
|
Mike.Gibson
Ace Cruncher England Joined: Aug 23, 2007 Post Count: 12345 Status: Offline
I have 8 threads on each machine and use app_config to restrict ARP1 to 4 threads on each machine. That way, if either machine has fewer than 4 ARP1 units, it will fill up with MCM1.
As MCM1 units complete in about a tenth of the time of ARP1 units, the machines request new work about 10 times as often, and each request can pick up any ARP1 work that may be available.
Mike
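For anyone who wants to try the same cap, here is a minimal sketch that writes out an app_config.xml limiting one app to a fixed number of concurrent tasks. The `<app_config>`, `<app>`, `<name>` and `<max_concurrent>` elements are the standard BOINC ones; the short app name `arp1` is an assumption (check the name your own client reports), and the helper name `build_app_config` is made up for illustration.

```python
def build_app_config(limits):
    """Return app_config.xml text that caps concurrent tasks per app.

    limits maps a BOINC app short name to its maximum concurrent tasks.
    The app names passed in are assumptions; verify them on your client.
    """
    lines = ["<app_config>"]
    for app_name, max_tasks in limits.items():
        lines += [
            "  <app>",
            f"    <name>{app_name}</name>",
            f"    <max_concurrent>{int(max_tasks)}</max_concurrent>",
            "  </app>",
        ]
    lines.append("</app_config>")
    return "\n".join(lines) + "\n"


# Cap ARP1 at 4 of the 8 threads; other projects fill the remaining threads.
config_text = build_app_config({"arp1": 4})
print(config_text)
```

The printed text would be saved as app_config.xml in the World Community Grid project folder inside the BOINC data directory, after which the client needs to be told to re-read its config files (or be restarted).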
||
|
Mike.Gibson
Ace Cruncher England Joined: Aug 23, 2007 Post Count: 12345 Status: Offline
This is a big guess of mine.
I suspect that Krembil are releasing units on the following basis:
1. As units are validated, they are converted into the next generation and buffered.
2. They close the buffer to new entrants and start a new buffer.
3. They send out the units from that first buffer in generation order until the last they have buffered in generation 141.
4. They start again with the next buffer, and so on.

This gradually closes up the extremes to the leaders. However, there are currently 82 extremes (including the 3 ultras) which are not appearing in state.txt although they are in generation.txt. The corresponding figure is 418 for accelerated units, while normal units show an extra 52. These seem to be stuck units which are not closing up.
I could very well be wrong, so if anyone has any better ideas, please post them. Better still would be for Krembil to say what is actually happening.
Mike
[Edit 1 times, last edit by Mike.Gibson at Feb 14, 2025 9:39:09 PM]
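To make the guess above concrete, here is a small sketch of the buffering scheme as described. It is only an illustration of the hypothesis, not anything known about the real server; the class name, method names and unit IDs are all invented, and it does not model a cap at any particular generation.

```python
class ReleaseQueue:
    """Toy model of the guessed scheme: buffer, close, release in generation order."""

    def __init__(self):
        self.closed = []  # closed buffers waiting to be sent, oldest first
        self.open = []    # buffer still accepting newly validated units

    def on_validated(self, unit_id, generation):
        # A validated result becomes the same unit's next generation.
        self.open.append((generation + 1, unit_id))

    def close_buffer(self):
        # Stop accepting new entrants; later validations go to a fresh buffer.
        if self.open:
            self.closed.append(self.open)
            self.open = []

    def release_next_buffer(self):
        # Send the oldest closed buffer, lowest generation first.
        if not self.closed:
            return []
        return sorted(self.closed.pop(0))


q = ReleaseQueue()
q.on_validated("u1", 140)
q.on_validated("u2", 120)
q.on_validated("u3", 135)
q.close_buffer()
q.on_validated("u4", 110)  # arrives too late, waits for the next buffer
batch = q.release_next_buffer()
print(batch)  # [(121, 'u2'), (136, 'u3'), (141, 'u1')]
```

Under this model the lowest generations in each buffer go out first, which is how the extremes would gradually close up on the leaders, while a unit that never validates never re-enters any buffer.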
||
|
alanb1951
Veteran Cruncher Joined: Jan 20, 2006 Post Count: 943 Status: Offline
Mike,
Looks like a reasonable [and feasible] summary; to me, the interesting part would be the fine detail of the stages between work-unit assimilation and new work-unit number allocation, and whether some sort of [standard] BOINC priority tagging is used to give precedence to tasks independent of WU number order!

It would indeed be nice to know the entire process, though I do wonder if the WCG Tech Team is still at the point of "We just use what IBM provided" for some of the bits and pieces that make use of non-BOINC mechanisms, in which case we are unlikely to get any enlightenment :-(

By the way, regarding the 448 units that don't show up in the state.txt count: there were only two missing units when the system went into the "no storage" 2023-2024 hiatus, but when they started work on getting ARP1 restarted after that the count went to 446 almost at once [2024-07-18] (presumably because of whatever WCG did to get things ready). When work actually started to flow again [2024-10-31?] there was a period of a week where Normal work-units completing were rapidly re-categorized as Extreme or Accelerated, not really surprising given what appeared to me to be a pre-hiatus state.txt imbalance toward Normal (implying some units hadn't run for a long time). Within a week [2024-11-04] a further two WUs "disappeared" and the count has been stuck at 448 ever since! It would be interesting to know whether those items are all [deliberately?] blocked or whether they have dropped into a data hole that has something to do with how the ARP1 restart happened...

[Edited to add the below]
There was a similar (but much larger) mass disappearance of 6423 units from state.txt on 2022-11-01 -- apart from a very small change that appeared to coincide with them restarting the three ultras early in 2023, things remained unchanged until 2023-05-19 when suddenly there were only 2 missing from state.txt and that stayed the same until the next hiatus (as mentioned above). According to the Project Statistics there was a massive increase in work returned around that time, so some sort of blockage was definitely dealt with, presumably about a week earlier given the deadline on Normal tasks... Again, was the blockage because all those units had genuinely ended up in an Error state or was it a non-BOINC database issue (or was it a mixture of the two!)?

Cheers - Al.
[Edit 1 times, last edit by alanb1951 at Feb 15, 2025 1:33:41 AM]
||
|
Mike.Gibson
Ace Cruncher England Joined: Aug 23, 2007 Post Count: 12345 Status: Offline
Al
It can only be pure speculation on our part as we don't have sufficient information.
However, it seems to be getting clearer where the 82 missing extremes lie. It looks as though most of the remaining generations up to and including generation 120 (20 units) are involved, with more becoming visible daily.
Mike
[Edit 1 times, last edit by Mike.Gibson at Feb 15, 2025 3:41:24 PM]
||
|
Mike.Gibson
Ace Cruncher England Joined: Aug 23, 2007 Post Count: 12345 Status: Offline
Sunday Report
ARP seems to be running smoothly but with only a few units coming out. I haven't seen any for the last 30 hours.
This report has been rebased since the January restart. Any forecasts will be based on average throughput for the weeks available until 5 weeks are available, after which they will be based on the last 5 weeks. Next week will be the 5th week.
All classifications have stayed put.
- There are still 3 ultras in generations 21 & 22.
- There are 18 units seemingly stuck in generations 104 to 121.
- There are 61 other extremes missing from state.txt, presumed stuck.
- There are now 337 extremes in generations 115 to 131.
- There are now 550 accelerated units in generations 132 to 136, of which 417 are missing, presumed stuck.
- There are 34,719 normal units in generations 137 to 146, with 51 extras listed.
- The highest generation to have had validations is still 141.
- 20,810 units have validated this week.

Based on the 4 weeks, we would complete ARP1 in late 2026, but there are fewer units currently being issued.
Mike
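The forecasting rule described in the report (average the weeks available, capped at the most recent 5) can be sketched roughly as below. The function name and the sample numbers are placeholders for illustration only, not real project figures, and the remaining-unit count is something the caller would have to work out separately.

```python
def forecast_weeks(weekly_validated, units_remaining, window=5):
    """Estimate weeks to completion from recent average weekly throughput.

    Uses every week available until there are more than `window` weeks,
    then only the most recent `window` weeks.
    """
    recent = weekly_validated[-window:]
    if not recent:
        raise ValueError("need at least one week of data")
    average = sum(recent) / len(recent)
    if average <= 0:
        raise ValueError("average throughput must be positive")
    return units_remaining / average


# Placeholder numbers: three weeks of data, so all three weeks are averaged.
print(forecast_weeks([10, 20, 30], 100))  # 5.0
```

Once a sixth week is recorded, the oldest week drops out of the average, so a recent slowdown in issued units will lengthen the forecast fairly quickly.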
||
|
Unixchick
Veteran Cruncher Joined: Apr 16, 2020 Post Count: 940 Status: Recently Active
I am only getting resends in ARP right now. I like your theory, Mike, on how they are sending out ARP WUs. I think it is a good plan which keeps the extreme and accelerated units moving, but I think they need to throw more of the 141s into the mix to keep us at 7000 results returned, as has recently been the "norm" (just my opinion from eyeballing the stats, nothing mathematical).
|
||
|