World Community Grid Forums
Thread Status: Active Total posts in this thread: 3312
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
A download limit of 22 for a machine with 32 threads would not be sufficient. It would need to be at least 1.5 per thread to keep running during up- and down-loading.

The current 64 limit would be more than enough, so scrap the "unlimited".

Not on a 128 thread machine it isn't. I vote 'NO' on limits. There should be as few limits as possible imposed on those who wish to contribute. The only limits should be those that prevent a runaway situation like downloading 1000s of work units only to have them error out immediately. WCG controls the infrastructure and the researchers control the science and if they aren't complaining, let the contributors have all they want.

I can run 128 ARP concurrently and they all return in less than 40 hours which is reasonable. Just because it isn't "optimal" in someone's opinion doesn't make it unusable to science. Each member can decide to run each project on their hardware as they see fit but they shouldn't dictate how others contribute on their hardware. Leave the 'UNLIMITED' option.

On 128 or 256 thread machines, even the cache setting is becoming irrelevant. A one or two day cache results in the client hitting the 1000 work unit limit long before hitting the cache limit.

[Edit 1 times, last edit by Former Member at Nov 21, 2020 3:35:03 PM] |
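The arithmetic behind the "1.5 per thread" figure can be checked with a minimal sketch. The function name and the rounding up are assumptions made for illustration; only the 1.5 factor and the limits of 22 and 64 come from the posts above.

```python
import math


def min_download_limit(threads: int, units_per_thread: float = 1.5) -> int:
    """Smallest whole number of work units that gives each thread
    `units_per_thread` units (1.5 is the figure quoted in the post)."""
    return math.ceil(threads * units_per_thread)


# Compare the suggested minimum with the limits mentioned in the thread.
for threads, limit in ((32, 22), (128, 64)):
    needed = min_download_limit(threads)
    verdict = "enough" if limit >= needed else "too low"
    print(f"{threads} threads: need {needed}, limit {limit} is {verdict}")
```

Under that rule of thumb a 32-thread machine would want 48 units and a 128-thread machine 192, so both a limit of 22 and a limit of 64 fall short on the respective machines.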
||
|
PMH_UK
Veteran Cruncher UK Joined: Apr 26, 2007 Post Count: 773 Status: Recently Active |
As there is a wide range of core counts, it would be better if the limit took account of that.
Values could be something like: cores/2, cores-1, cores, cores+1, cores*2, etc., as well as the absolutes we have now. Also, the project limit should be placed near the list of projects available, and in the same order. If limit selection allowed 0, only that section would be required.
----------------------------------------
Paul.
|
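One way to picture the core-relative options suggested above is a small sketch like the following. It is not how WCG device profiles are implemented; the function name is hypothetical, and clamping to a minimum of 1 is an assumption so that a single-core device never ends up with a limit of 0.

```python
def core_relative_limits(cores: int) -> dict[str, int]:
    """Map each suggested option label to a concrete work-unit limit.

    Assumption: results are clamped to at least 1, which the post
    does not specify.
    """
    return {
        "cores/2": max(1, cores // 2),
        "cores-1": max(1, cores - 1),
        "cores": cores,
        "cores+1": cores + 1,
        "cores*2": cores * 2,
    }


# Show how the same option scales across very different machines.
for cores in (4, 32, 128):
    print(cores, core_relative_limits(cores))
```

The point of the design is that one saved setting such as "cores*2" would scale from a 4-core laptop to a 128-thread server without the member having to pick a new absolute number.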
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
1) The default in the device profile is unlimited at the science level.
2) There is an overarching rule not to assign more than 70 or 75 work units per core to a device, so effectively about 560-600 is the limit an 8-core device can have, if all cores are allowed to crunch.
3) Over-overarching is a limit of 1000 WUs per device, no matter how large the core count. A 64-thread machine doing SCC will run out fast.
In short, total runaway is not possible, except for the occasional case where, due to the timing of reporting and requesting, a device manages to get a little more than 1000. |
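The three rules above can be combined into a rough model. This is only a sketch of the description in the post, not the actual server logic; the function and constant names are hypothetical, and the per-core figure is passed in because the post gives it as "70 or 75".

```python
DEVICE_CAP = 1000  # overall per-device ceiling described in the post


def effective_cap(cores: int, per_core_limit: int = 70,
                  device_cap: int = DEVICE_CAP) -> int:
    """Work units a device could hold if the profile is left at 'unlimited':
    the per-core rule applies first, then the overall device ceiling."""
    return min(cores * per_core_limit, device_cap)


print(effective_cap(8, 70))    # 8-core device at 70 per core
print(effective_cap(8, 75))    # 8-core device at 75 per core
print(effective_cap(128, 70))  # large machine hits the device ceiling
print(effective_cap(128, 70) / 128)  # units per thread on 128 threads
```

For an 8-core device this reproduces the 560-600 range, while anything from 15 cores upward is bounded by the 1000-unit ceiling instead, which works out to a little under 8 units per thread on a 128-thread machine.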
||
|
Mike.Gibson
Ace Cruncher England Joined: Aug 23, 2007 Post Count: 12434 Status: Offline |
Even a 128-core machine can still hold nearly 8 units per thread (1000 / 128 ≈ 7.8), which is plenty with the current availability of units.
I still reckon that any restriction should be expressed as a number of days of cache and not as an actual number of units. Mike |
||
|
BladeD
Ace Cruncher USA Joined: Nov 17, 2004 Post Count: 28976 Status: Offline |
This could easily be fixed by first eliminating the Unlimited setting in Device Profiles

The problem is not with the "unlimited" but with the cache setting. If some are using cache settings in excess of 5 days with ARP and they are not running 24/7 there could be a problem. The problem would disappear with a cache setting of 1 day which should be sufficient for most people as the supply of most projects is quite consistent on WCG. Outages are few and far between. Cheers

You are seeing the outages at the server end. I would guess that outages (at least the offline ones) are greater at the client end. |
||
|
Mike.Gibson
Ace Cruncher England Joined: Aug 23, 2007 Post Count: 12434 Status: Offline |
Reducing cache limits would result in fewer re-sends being required, reducing the numbers of units in PV status and reducing the numbers getting out of sync with the current batch.
Mike |
||
|
Mike.Gibson
Ace Cruncher England Joined: Aug 23, 2007 Post Count: 12434 Status: Offline |
Now that 037s are flowing, we will be over 20%, at 20.2% complete (based on the current assumption of 183 generations), once we have returned more 037s than there are earlier-generation units still incomplete.
Mike |
||
|
Mike.Gibson
Ace Cruncher England Joined: Aug 23, 2007 Post Count: 12434 Status: Offline |
Now it is the turn of 038s, so we will be at 20.8% complete (based on the current assumption of 183 generations), once we have returned more 038s than there are earlier-generation units still incomplete.
Mike |
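The percentages in these progress posts follow from dividing the generation number by the assumed total of 183. A minimal sketch, with a hypothetical function name, reproduces them:

```python
ASSUMED_GENERATIONS = 183  # the assumption stated in the posts


def percent_complete(generation: int,
                     total: int = ASSUMED_GENERATIONS) -> float:
    """Share of generations reached, as a percentage to one decimal place."""
    return round(100 * generation / total, 1)


# Generations 037 and 038, as discussed in the thread.
for generation in (37, 38):
    print(f"{generation:03d}: {percent_complete(generation)}%")
```

If the total number of generations turns out to differ from 183, only the constant needs changing.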
||
|
Mike.Gibson
Ace Cruncher England Joined: Aug 23, 2007 Post Count: 12434 Status: Offline |
Assuming that there will be 183 iterations and averaging the last 10 intervals, completion of the project could be about March 2023.
Mike |
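The kind of projection described here, averaging the most recent intervals between generations and extrapolating over those still to come, could be sketched as below. The post does not give the underlying dates or the exact method, so the function name, the input format, and the choice to count the current generation as still needing one full interval are all assumptions.

```python
from datetime import date, timedelta


def project_completion(starts: list[date], current_gen: int,
                       total_gens: int = 183, window: int = 10) -> date:
    """Extrapolate a completion date from generation start dates.

    starts      -- start dates of consecutive generations, oldest first,
                   ending with the start of `current_gen`
    window      -- how many of the most recent intervals to average
    Assumption: every remaining generation, including the current one,
    takes the average of those recent intervals.
    """
    if len(starts) < 2:
        raise ValueError("need at least two start dates to form an interval")
    recent = starts[-(window + 1):]
    intervals = [(b - a).days for a, b in zip(recent, recent[1:])]
    avg_days = sum(intervals) / len(intervals)
    remaining = total_gens - current_gen + 1
    return starts[-1] + timedelta(days=avg_days * remaining)


if __name__ == "__main__":
    # Hypothetical, evenly spaced dates purely for illustration;
    # these are NOT the real ARP generation dates.
    demo = [date(2020, 1, 1) + timedelta(days=5 * i) for i in range(11)]
    print(project_completion(demo, current_gen=38))
```

Because the estimate depends entirely on the recent average, a single slow or fast generation shifts the projected date noticeably, which would explain a forecast moving by a month from one post to the next.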
||
|
Mike.Gibson
Ace Cruncher England Joined: Aug 23, 2007 Post Count: 12434 Status: Offline |
039s started flowing yesterday so I now expect completion in April 2023.
Mike |
||