World Community Grid Forums
Thread Status: Active Total posts in this thread: 2370
nanoprobe
Master Cruncher Classified Joined: Aug 29, 2008 Post Count: 2998 Status: Offline |
As for hyperthreading, IMHO it will make each unit run slightly longer, but you will get more done overall. With HT enabled it will also run up to 12 °C hotter.
In 1969 I took an oath to defend and protect the U S Constitution against all enemies, both foreign and Domestic. There was no expiration date.
||
|
Ingleside
Veteran Cruncher Norway Joined: Nov 19, 2005 Post Count: 974 Status: Offline |
> Well ... I think I'm on the safe side. The application has a 12-second "redial" limit; one can see this as "Communication deferred 00:00:11" when pressing the "update" button in the BOINC manager. I use a "security factor" of 5: my script asks for updates every 60 seconds. Of course I know that this will put some stress on the WCG servers, but ....
>
> Yes, I can use any other number to wait for updates, but it would be nice if SekeRob or anyone else could give a hint. The BOINC application also asks for work every 60 seconds ... at least for the first tries. Then the time between the updates seems to get longer and longer ... and my crunchers will miss the raindrops from DDDT2. So the question is: how is it done right?
>
> Stephan

Well, SekeRob has already given his "hint"; for this part I will only add that with 500k results/day, you can on average expect 2-3 scheduler-requests/day per computer, while your suggested method is 1440 scheduler-requests/day per computer, or roughly 500x higher load. To handle this higher load, WCG would probably have to increase the enforced delay from 11 seconds up to 1 hour or something...

Looking at the BOINC-client: with v6.2.xx and earlier clients, the exponential backoff was between 1 minute and 4 hours, and for scheduler-requests the backoff was reset after 10 tries. This is much too aggressive, and the BOINC-client is basically DDoS-ing the project-servers after any outage. v6.6.xx and v6.10.xx did improve things somewhat, with the project-wide backoff on uploads or downloads in case of multiple upload/download-errors, and the new 1-minute to 24-hour deferral on failed work-requests.

These changes were still not enough, and for this reason v6.12.xx, "soon" to be released, includes some more changes. The most notable is that on connection-failures the exponential deferral is between 10 minutes and 12 hours, while for failed work-requests the max starts at 10 minutes. Also, if I'm not mistaken, for work-requests the actual deferral is now between 0.5x max and 2x max...

So, if one assumes a computer uses 4 hours/task and runs v6.12.xx, you'll have:

Dual-core: Up to 4 deferrals under the "'max' 10 minutes, doubling on successive errors" rule; at minimum these will come after 5 minutes, 10 minutes, 20 minutes and 40 minutes. After 1 hour, the deferrals are reset, because a task has finished. This means the BOINC-client will make at most 5 scheduler-requests per task in this scenario.

Your "trick", on the other hand, with 1 scheduler-request per minute, will make... 120 scheduler-requests per task on the dual-core...

Using the same 4 hours/task, depending on #cores, one can make this small table (scheduler-requests per task):

| #cores | Normal | Per minute | Excess load |
|--------|--------|------------|-------------|
| 2      | 5      | 120        | 24x         |
| 4      | 3      | 60         | 20x         |
| 6      | 3      | 40         | 13.3x       |
| 8      | 3      | 30         | 10x         |
| 12     | 2      | 20         | 10x         |

So, depending on how many cores your computer(s) have, your "trick" gives an unnecessary increase in server-load of between 10x and 24x...

"I make so many mistakes. But then just think of all the mistakes I don't make, although I might."
[Edit 1 times, last edit by Ingleside at Jan 23, 2011 3:20:26 PM] |
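The arithmetic behind the table in the post above can be checked with a small Python sketch. The "Normal" column is taken as given from the post rather than derived; the function and constant names are made up for this illustration, and the only assumption is that tasks finish evenly spaced, so a machine with N cores completes one task every (4 hours / N).

```python
# "Normal" scheduler-requests per task, copied from the post's table
# (taken as given, not derived here). Keys are core counts.
NORMAL_REQUESTS = {2: 5, 4: 3, 6: 3, 8: 3, 12: 2}

TASK_MINUTES = 4 * 60  # the post's assumption of 4 hours per task


def scripted_requests_per_task(cores, task_minutes=TASK_MINUTES, interval_minutes=1):
    """Requests a fixed-interval script sends between two task completions.

    Assumes tasks finish evenly spaced, i.e. one every task_minutes / cores.
    """
    minutes_between_completions = task_minutes / cores
    return minutes_between_completions / interval_minutes


def excess_load(cores):
    """Ratio of scripted requests to the post's 'Normal' figure."""
    return scripted_requests_per_task(cores) / NORMAL_REQUESTS[cores]


for cores in sorted(NORMAL_REQUESTS):
    print(f"{cores:>2} cores: {scripted_requests_per_task(cores):.0f} scripted, "
          f"{NORMAL_REQUESTS[cores]} normal, {excess_load(cores):.1f}x excess")
```

Changing `interval_minutes` shows how quickly the excess shrinks if a script polls less often than once per minute.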
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I live in a loft (not super big), so I am using laptops so the place doesn't always have to look like a machine room (when entertaining the few non-nerdy friends I have).
cheers, -j |
||
|
gb077492
Advanced Cruncher Joined: Dec 24, 2004 Post Count: 96 Status: Offline |
> So, if one assumes a computer uses 4 hours/task

Not a good assumption, as this is DDDT2... I just checked my result log and I have 4 validated WUs that each completed in less than 15 minutes. On an 8-way box that's a little higher request rate than you're allowing for.

Indeed, I have an 8-way server that I can only control through the web pages and it ran out of work twice in the last few days while I had it set to run DDDT2 only and a 2 day cache (it couldn't keep the cache full as soon as anything went wrong since the DDDT2 WUs trickle out so slowly).

With boxes which are not always-on I usually set the cache to less than one day. What you're suggesting says that I shouldn't do that. Methinks you're going to run very short of "reliable" boxes if this is implemented.

Mike |
||
|
Ingleside
Veteran Cruncher Norway Joined: Nov 19, 2005 Post Count: 974 Status: Offline |
> Not a good assumption, as this is DDDT2... I just checked my result log and I have 4 validated WUs that each completed in less than 15 minutes. On an 8-way box that's a little higher request rate than you're allowing for.

Well, I didn't expressly state it, but my assumption was for a client that didn't manage to fill up its cache with DDDT2, so instead of sitting idle part of the time it was filling up with a mix of other work, and 4 hours/task will at least be a good starting-point...

But, if we take the extremes, a slow dual-core that uses 20 hours/task and can't get any work will make 600 scheduler-requests per task if it asks once per minute, while with v6.12.xx the max goes like 10 minutes, 20 minutes, 40, 80, 160, 320... Exactly how long the deferral will be for each failed request is more-or-less random, but chances are there will only be 5 or 6 scheduler-requests before a task finishes, for a total of 6 or 7. So, asking once per minute will give 100x higher load.

For the 15-minute tasks, a dual-core will possibly get a single extra scheduler-request off, so 2 requests/task, while once per minute is 7 requests/task. This is only 3.5x more frequent. For an 8-way and 15 minutes/task, there'll be a scheduler-request every 1-2 minutes regardless, so using a script is a waste of time.

> Indeed, I have an 8-way server that I can only control through the web pages and it ran out of work twice in the last few days while I had it set to run DDDT2 only and a 2 day cache (it couldn't keep the cache full as soon as anything went wrong since the DDDT2 WUs trickle out so slowly). With boxes which are not always-on I usually set the cache to less than one day. What you're suggesting says that I shouldn't do that.

The cache-size has little to do with this. Either you accept non-DDDT2-work, and if so you'll normally use on average 1 scheduler-request per task regardless of whether the cache-size is 0.01 days or 5 days. Or your computer will sit idle since it can't get enough DDDT2-work to keep it busy, and this again will to a large extent be regardless of cache-size, since DDDT2-work isn't added fast enough to fulfill the demand.

> Methinks you're going to run very short of "reliable" boxes if this is implemented.

A minimum deferral of 1 minute or 10 minutes on connection-errors should have little or no effect on a computer's reliable-rating...

"I make so many mistakes. But then just think of all the mistakes I don't make, although I might." |
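The request counts quoted in the post above (6 or 7 per task for the slow dual-core, 2 per task for 15-minute tasks) can be reproduced with a rough Python sketch. It is a simplified, hypothetical model and not the BOINC client's actual code: every work-request is assumed to fail, the "max" deferral starts at 10 minutes and doubles after each failure, and the actual deferral is a fixed fraction `scale` of that max, whereas the post describes it as more-or-less random between 0.5x and 2x. The function name, parameters and the 24-hour cap are assumptions made for the illustration.

```python
def backoff_requests_per_task(minutes_between_completions, first_max=10.0,
                              scale=1.0, cap_minutes=24 * 60):
    """Count scheduler-requests between two task completions.

    Hypothetical model: each failed request is followed by a deferral of
    scale * max, where max starts at first_max minutes and doubles after
    every failure (capped at cap_minutes). The counter resets when a task
    finishes, which itself triggers one more request.
    """
    elapsed = 0.0
    current_max = first_max
    retries = 0
    while True:
        elapsed += scale * current_max
        if elapsed > minutes_between_completions:
            break  # the next retry would fall after the task has finished
        retries += 1
        current_max = min(current_max * 2, cap_minutes)
    return retries + 1  # retries plus the request made on task completion


# Slow dual-core, 20 hours/task: one task finishes every 600 minutes.
slow_dual_core = 20 * 60 / 2
print(backoff_requests_per_task(slow_dual_core))             # deferral = max
print(backoff_requests_per_task(slow_dual_core, scale=0.5))  # deferral = 0.5x max
```

With one completion every 600 minutes, a once-per-minute script sends 600 requests in the same window, which is where the 100x figure comes from when the backoff model yields 6.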
||
|
gb077492
Advanced Cruncher Joined: Dec 24, 2004 Post Count: 96 Status: Offline |
> A minimum deferral of 1 minute or 10 minutes on connection-errors should have little or no effect on a computer's reliable-rating...

True, but you're talking about a maximum deferral of 24 hours. There's the rub. I would be mad to run with a cache less than one day under those conditions. Since a cache of 2 days or more will mean that a machine is unlikely to be marked as reliable (effectively can't be, save for errors in the correction factor), that is why I made my statement.

As to your other comments, I'm talking about set-and-forget users, not finger pokers. (But even these folks like to collect badges.)

Mike
[Edit 1 times, last edit by gb077492 at Jan 23, 2011 11:04:56 PM] |
||
|
deltavee
Ace Cruncher Texas Hill Country Joined: Nov 17, 2004 Post Count: 4891 Status: Offline |
I have started getting ts02_b's.
ts02_ b005_ pr56a0_ 1-- Does this mean we are 25% through this batch? |
||
|
My 2 cents worth
Cruncher Joined: Sep 12, 2008 Post Count: 19 Status: Offline |
I am getting my fair share
And by the way, I just decided to solve my issue with the random WU jumping on the Linux box: I will only allow it to crunch a day's worth at a time, by date, and suspend the rest until it has the others done.
[Edit 1 times, last edit by My 2 cents worth at Jan 24, 2011 6:14:19 AM] |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Time for a new thread!!!!
|
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
> I have started getting ts02_b's. ts02_ b005_ pr56a0_ 1-- Does this mean we are 25% through this batch?

Probably. It could be that we're beyond 25% in work having been sent out, which by no means is a sign of completion [the clients' cache stockpiles are expected to be huge]. The last ''a'' was g50 that I can see in the message logs. Per seippel the ''d'' goes up to g22 (the names of the downloaded files contain this indicator).

We had not processed everything from the previous rain when we started to receive this last set, so it's not accurate, but we've had 76,000 validate so far since the 18th, 18,781 yesterday per the project stats page, and probably thousands are waiting on a matching wingman. A total of 542,682 (the number before quorum was 271,341) suggests we've got a long way to go.

I expect to complete something like a gold equivalent by the end of this set... 21 days have validated since start and 1.8 days' worth are in PV jail, which then gets the total about halfway to emerald.

--//-- |
||
|