Thread Status: Active
Total posts in this thread: 49
This topic has been viewed 2666 times and has 48 replies.
keithhenry
Ace Cruncher
Senile old farts of the world ....uh.....uh..... nevermind
Joined: Nov 18, 2004
Post Count: 18665
Status: Offline
Re: BOINC 5.8.8 won't get new work???

Yes, it appears that the work fetch policy got a complete overhaul in 5.8. I think Ingleside is looking at a worst-case scenario though. If all the work in your queue was downloaded at the same time, then it would indeed all be N days old. Work in your queue should have some distribution of age over the value you have for your buffer. The differing TCs for the multiple WCG projects complicate that distribution as well: you could complete one FAAH and get three FCG or vice versa. Of course, it also assumes that the TC times you have are reasonably accurate. That has improved considerably at WCG in the last month or so.

When raising your "Connect to server" value, you want to avoid drastic changes. Instead, increment it by, say, half a day at a time. If you go from 1 day to 3 days, you'll get 2 days' worth of work at the same time, all with the same deadline. Raising in increments will avoid this "deadline clumping". That shouldn't be an issue when lowering the value. At any rate, anyone with a non-default "Connect to server" value should probably revisit it now, as there's a good chance you have it set longer than is really needed. With the recently improved TC times, you shouldn't normally need a value beyond 3.

Given the apparent significant changes in 5.8.x, I've been playing some and BOINC Manager may be getting slightly senile. I've managed to end up with two WUs in Running status when in fact neither was (according to Task Manager), and this is NOT a multi-core machine :D. I've also seen it start a WU in Ready to Run status that has a later deadline than a WU with Waiting to Run status (née Pre-empted). This all seems to get cleared up by recycling BOINC. This shouldn't normally be a concern for users unless you need to crunch a particular WU for some reason. I don't have a clear scenario of how to recreate this yet. If and when I do, I'll post about that.
----------------------------------------
Join/Website/IMODB



[Feb 8, 2007 10:04:35 PM]
cio_redulla
Advanced Cruncher
Philippines
Joined: Apr 24, 2006
Post Count: 130
Status: Offline
Re: BOINC 5.8.8 won't get new work???

For that Cio, you need to study the new Work Fetch Policy or WFP in depth. It's much smarter, but may still need some tweaks so it doesn't out-smart itself. Ingleside posted some logic on why it gets worse the larger the buffer, which has a built-in safety %.

http://www.worldcommunitygrid.org/forums/wcg/viewthread?thread=11582#86205

I've got it on 2.5 days which even in the worst scenario has never been that long on the WCG downtime front.



Thanks Sek! I'm using 5.4.11 for the meantime :). And until I fully grasp the idea behind the new WFP, I will not switch to the new versions that BOINC releases.


Cio
----------------------------------------

[Feb 8, 2007 11:22:36 PM]
keithhenry
Ace Cruncher
Senile old farts of the world ....uh.....uh..... nevermind
Joined: Nov 18, 2004
Post Count: 18665
Status: Offline
Re: BOINC 5.8.8 won't get new work???

Ah, my apology to Ingleside. Their post over in http://www.worldcommunitygrid.org/forums/wcg/...582&offset=0#lastpost turned the light on for me. This is going to be an important factor to consider when setting your "Connect to server" value. You can set it higher, but from a practical viewpoint, going beyond 3.0 won't really gain you anything. Given that you are starting off with an empty or minimal queue, setting your "Connect to server" beyond 3.0 would in fact get you as much work as you asked for. However, it would be a good while before you ever got any more.

From Ingleside's other post, this formula is key:

Computational-deadline = report deadline - (cache-size + 1 day + "switch between projects every N hours")


Once you have any WU beyond this computational deadline, you will not get any more work until that WU is completed. WCG uses 7 days for the report deadline. Once your cache size, or "Connect to server" value, goes beyond roughly one-half the report deadline, the computational deadline drops below the cache size itself, so the amount of work you can actually keep on hand stops increasing and begins decreasing again. That is why "Connect to server" values beyond 3.0 actually hurt you in terms of dealing with an outage, unless you've just recently increased it from a low value and have just filled up your queue with new work.
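
To put numbers on the formula above, here is a minimal Python sketch. The function names are made up for illustration, and `usable_buffer_days` encodes an assumption drawn from this discussion (that the work you can hold is capped by both the cache setting and the computational deadline); it is not taken from the BOINC client itself.

```python
def computational_deadline_days(report_deadline_days, cache_days, switch_hours):
    """Days after download by which a WU must be finished, per the formula above."""
    return report_deadline_days - (cache_days + 1.0 + switch_hours / 24.0)


def usable_buffer_days(report_deadline_days, cache_days, switch_hours):
    """Assumption: the work on hand is limited by the cache setting AND by the
    computational deadline, since a WU past that point blocks new work."""
    limit = computational_deadline_days(report_deadline_days, cache_days, switch_hours)
    return max(0.0, min(cache_days, limit))


if __name__ == "__main__":
    # WCG: 7-day report deadline; client default switch interval of 1 hour.
    for cache in (1.0, 2.0, 3.0, 4.0, 5.0):
        print(cache, round(usable_buffer_days(7.0, cache, 1.0), 2))
```

Under these assumptions the usable buffer grows with the cache setting up to about 3 days and shrinks beyond that, which matches the rule of thumb in this post.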

It would seem that for folks that crunch WCG only, we'll want to set the "switch between projects" value to either 0 or the smallest value it accepts near that. That will squeeze the last little drop out of this computational deadline with respect to maxing the outage we can handle. For folks that crunch WCG AND other projects, maxing your "connect to server" value to 3.0 for WCG could actually hurt you. Since any ONE WU that goes beyond the computational deadline will mean you won't get any new work until it is completed, it is that much more important that you crunch your oldest WU first.

Thanks Ingleside, I *think* I understand this better now. :D
----------------------------------------
Join/Website/IMODB



----------------------------------------
[Edit 1 times, last edit by keithhenry at Feb 8, 2007 11:48:10 PM]
[Feb 8, 2007 11:46:00 PM]
Ingleside
Veteran Cruncher
Norway
Joined: Nov 19, 2005
Post Count: 974
Status: Offline
Re: BOINC 5.8.8 won't get new work???

Yes, it appears that the work fetch policy got a complete overhaul in 5.8. I think Ingleside is looking at a worst-case scenario though. If all the work in your queue was downloaded at the same time, then it would indeed all be N days old. Work in your queue should have some distribution of age over the value you have for your buffer.

Well, you did see the formula, slightly re-worded from the original as posted by John McLeod VII. He's coded most of the client-side CPU/work scheduler starting with v4.35, so I guess he knows what he's talking about. 8-)


Still, it can be useful to post what happens in older clients, and what happens as long as you're not using too large a cache-size in v5.8.xx. :)


If you look at v5.4.xx and earlier, running 24/7, single project, all parameters stabilized, all wu's taking exactly 5.5 hours to crunch, and a deadline of 7 days, what happens?

Let's say cache-size = 5 days = 120 hours.

This means you start out with 22 wu's = 121 hours cached.
After crunching 1 hour and 1 second, you've got:
119h59m59s of work < 120 hours => the client asks for 1 second more work and downloads 1 more wu, bringing the total up to 23 wu's, 125.5 hours cached.

After, for example, a week, you'll basically have this scenario:
1; Crunches wu-31, and finishes it.
2; Crunches 1 hour on wu-32, downloads wu-54.
3; 4.5 hours later, finishes wu-32, wu-54 is now 4.5 hours old.
4; 1 hour later, wu-54 is 5.5 hours old, downloads wu-55.
5; 4.5 hours later, finishes wu-33, wu-54 is now... 10 hours old, wu-55 is 4.5 hours old.
6; 1 hour later, wu-54 is... 11 hours old, wu-55 is... 5.5 hours old, downloads wu-56.
...
46; wu-53 finishes, wu-54 is... 120 hours old, wu-55 is... 114.5 hours old, wu-56 is 109 hours old, wu-57 is 103.5 hours old, wu-58....

Meaning, by the time it's wu-54's turn to be crunched, wu-54 is already 120 hours old, and there are only 2 days left until the deadline. This again means that if an outage starts exactly when wu-54 starts to crunch and lasts more than 2 days, wu-54 will be reported after its deadline, and this generally means the crunching was wasted.
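
The walk-through above can be checked with a small simulation. This is only a sketch under the same idealized assumptions as the example (one project, every wu exactly 5.5 hours, first-in first-out, more work fetched the moment the cache drops below 120 hours). The function name and the one-minute time step are illustrative choices, not anything from the BOINC client.

```python
def ages_when_started(cache_hours=120, wu_minutes=330, n_wus=60):
    """Return the age in hours of each wu at the moment it starts crunching."""
    cache_min = cache_hours * 60
    download_time = []   # download_time[k] = minute at which wu k+1 arrived
    cached = 0           # minutes of unfinished work currently on the host
    ages = []            # ages[k] = age in hours of wu k+1 when it starts
    now = 0
    while len(ages) < n_wus:
        # Work fetch: top up whenever less than the cache size is on hand.
        while cached < cache_min:
            download_time.append(now)
            cached += wu_minutes
        # A new wu starts every wu_minutes, oldest first (FIFO).
        if now % wu_minutes == 0:
            k = now // wu_minutes
            ages.append((now - download_time[k]) / 60)
        now += 1
        cached -= 1
    return ages


if __name__ == "__main__":
    ages = ages_when_started()
    print("wu-54 age at start (hours):", round(ages[53], 2))
    print("hours left of a 7-day deadline:", round(168 - ages[53], 2))
```

With these defaults the first 22 wu's start at age 0 to 115.5 hours, and every wu downloaded after the initial fill is just under 120 hours old when it starts, leaving roughly 48 hours before a 7-day deadline.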


But, what about the rest of the wu's?
wu-55 has longer to deadline, so this wouldn't be a problem?

Unfortunately, wu-55 won't start before wu-54 is finished, meaning wu-55 will be 5.5 hours closer to its deadline, and again only 2 days away from its deadline. This happens for all wu's: within a project, the BOINC client normally crunches in first-in, first-out mode, meaning even if you did just manage to download a "new" wu the second before the project went down, by the time you reach the "new" wu, you've crunched through 5 days of the other wu's. And this again means that if the outage has lasted for 5 days, 3 days of crunching was a waste of CPU resources.


And, if you're really unlucky, the outage happens 1 second before wu-54 finishes, meaning wu-54 has only 42.5 hours until its deadline, and it can become worthless.

So, the "worst case" is actually a little worse than my "1/2 deadline" suggested.



Oh, and as for decreasing the switch interval, the lowest you can set is 1 minute. If you set it to zero, it will default back to 1 hour.
----------------------------------------


"I make so many mistakes. But then just think of all the mistakes I don't make, although I might."
----------------------------------------
[Edit 2 times, last edit by Ingleside at Feb 9, 2007 12:12:43 AM]
[Feb 9, 2007 12:10:21 AM]
keithhenry
Ace Cruncher
Senile old farts of the world ....uh.....uh..... nevermind
Joined: Nov 18, 2004
Post Count: 18665
Status: Offline
Re: BOINC 5.8.8 won't get new work???

Oh, and as for decreasing the switch interval, the lowest you can set is 1 minute. If you set it to zero, it will default back to 1 hour.


DRAT! I found that the profile will accept a zero value but wondered what "special" meaning that would have. Was over looking in the BOINC Wiki and couldn't find anything. Okay, so WCG-only crunchers can set this to 1 minute to max their queue for covering outages. Now I have to wonder about the performance impact of a 1 minute setting. Does BOINC "stop" every minute, look to see what other projects you have, see none and go back to crunching? Does it know that you only have one project attached and basically ignore this, without actually looking at whether it needs to switch projects or not? Do we have a tradeoff between performance and maxing our cache? I was sorta hoping that a zero value would be treated as an "off" switch and BOINC wouldn't switch projects regardless of whether you were attached to any others or not (hmmm....might that not be better than treating zero as equal to the default?).
----------------------------------------
Join/Website/IMODB



[Feb 9, 2007 12:23:11 AM]
Ingleside
Veteran Cruncher
Norway
Joined: Nov 19, 2005
Post Count: 974
Status: Offline
Re: BOINC 5.8.8 won't get new work???

DRAT! I found that the profile will accept a zero value but wondered what "special" meaning that would have.

Hmm, quick test: if you try setting it to zero, it should be set to 1 minute. But WCG isn't doing the normal checking as done in other BOINC projects, so I'm not sure what will happen here... and there's currently an outage, so...

Was over looking in the BOINC Wiki and couldn't find anything. Okay, so WCG-only crunchers can set this to 1 minute to max their queue for covering outages. Now I have to wonder about the performance impact of a 1 minute setting. Does BOINC "stop" every minute, look to see what other projects you have, see none and go back to crunching? Does it know that you only have one project attached and basically ignore this, without actually looking at whether it needs to switch projects or not? Do we have a tradeoff between performance and maxing our cache? I was sorta hoping that a zero value would be treated as an "off" switch and BOINC wouldn't switch projects regardless of whether you were attached to any others or not (hmmm....might that not be better than treating zero as equal to the default?).

For a single-project cruncher there will AFAIK not be any performance impact, or at least it will be negligible, so it wouldn't be a problem. If I'm not mistaken, the client already re-calculates various things every minute, but won't normally switch that often.

For multi-project, on the other hand, such a short switch interval wouldn't be so good, especially if you don't leave applications in memory or you've got limited memory, so 2 projects will constantly be swapping in and out of memory...

Oh, and as for zero: when the preference was introduced back around v4.10, it was possible to set it to zero, and if I'm not mistaken the default for old users was zero... This had the effect of constantly switching projects, with the BOINC client using most of the CPU resources...
So, as a bug-fix, if it's zero or less, it's AFAIK treated as the client default of 60 minutes.
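
As a rough illustration of the rule just described (zero or less falls back to the 60-minute default), here is a tiny Python sketch. The function and constant names are hypothetical and this is not the client's actual code, only the behaviour as recalled in this post.

```python
# Client default as stated above; the name is made up for this sketch.
DEFAULT_SWITCH_MINUTES = 60


def effective_switch_minutes(pref_minutes):
    """Zero or negative preferences are treated as the client default."""
    if pref_minutes <= 0:
        return DEFAULT_SWITCH_MINUTES
    return pref_minutes
```

So a WCG-only cruncher who wants the shortest interval would enter 1, not 0.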
----------------------------------------


"I make so many mistakes. But then just think of all the mistakes I don't make, although I might."
[Feb 9, 2007 12:38:19 AM]
keithhenry
Ace Cruncher
Senile old farts of the world ....uh.....uh..... nevermind
Joined: Nov 18, 2004
Post Count: 18665
Status: Offline
Re: BOINC 5.8.8 won't get new work???

Well, I checked my profile after changing it to 0 and saving, and it displayed 0. Sounds like for WCG-only crunchers (single project, 7 day report deadline), setting "Connect to server" to 3.0 and "switch applications" to 1 will maximize our buffer, and with it our ability to weather a server outage, with minimal performance impact. (3.0 gives the best value, 72 hours less your switch interval, even if it's actually 90% of that that is used.) Plus, we NEVER want to allow any WU to go without being crunched on for more than three days, or we will not receive any new work until it completes. Well, actually, we don't want any WU to still be incomplete after three days in our queue: even if it's actively crunching at the three-day boundary, we won't get new work until it completes. OK, a multi-core machine may be a different story, but it still sounds like a good "rule of thumb" to me.

Okay, NOW I really have a headache! ;)
----------------------------------------
Join/Website/IMODB



[Feb 9, 2007 1:00:31 AM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: BOINC 5.8.8 won't get new work???

For the interested, a paper by Dr. Anderson and John McLeod VII on some hot BOINC topics: "Local Scheduling for Volunteer Computing". An excerpt:
5. Conclusions and future work
We have described the issues involved in local scheduling for volunteer computing, and have presented policies that have proven to work well in the real world. Key aspects of these policies are: 1) the notion of debt, in its two forms; 2) the use of deadline scheduling (but only when necessary); and 3) careful attention to job completion estimation.
Currently, client and server scheduling are not well integrated. The client asks for N seconds of work, and the server sends it jobs that can run in available memory, and that, if started immediately, would finish by their deadline. However, since the server has no information about work queued or in progress on the client, it can send jobs that will cause deadlines to be missed. To remedy this, we plan to have the client send information about queued and in-progress work, including completion time estimates. The server will use this information to do a deadline-scheduling based simulation to decide what jobs, if any, can safely be sent.
Network connection interval is currently a user preference. Many users are unaware of this preference or don't set it correctly. We plan to eliminate it by having the BOINC client record statistics about periods when the host is powered off or not connected, and base scheduling decisions on these statistics.
The local scheduling policies currently reflect memory constraints only after the fact. For example, if a host has 2 CPUs but only enough RAM to run one job at a time, BOINC will fetch work on the assumption that both CPUs will be used, and deadlines will be missed. This can be fixed – for example, by modifying the round-robin simulator to reflect memory-aware scheduling.
We plan to develop a simulation-based framework in which we can evaluate, compare and study scheduling policies. Currently we rely on “thought experiments” and empirical evidence – we make a change to the scheduler, run it in-house and with alpha testers, then release it to the 400,000+ BOINC participants. It is difficult to know if a change has had the intended effect, and if a change causes a major problem, it can waste lots of computing power. A simulation-based testbed would avoid these problems.

A new acronym (to me) is introduced: PRRS (Potentially Runnable Resource Share) @_@

Enjoy
----------------------------------------
WCG Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All!
[Feb 9, 2007 11:34:10 AM]
watzkej
Cruncher
Joined: Mar 12, 2005
Post Count: 12
Status: Offline
Re: BOINC 5.8.8 won't get new work???


Given the apparent significant changes in 5.8.x, I've been playing some and BOINC Manager may be getting slightly senile. I've managed to end up with two WUs in Running status when in fact neither was (according to Task Manager), and this is NOT a multi-core machine :D. I've also seen it start a WU in Ready to Run status that has a later deadline than a WU with Waiting to Run status (née Pre-empted). This all seems to get cleared up by recycling BOINC. This shouldn't normally be a concern for users unless you need to crunch a particular WU for some reason. I don't have a clear scenario of how to recreate this yet. If and when I do, I'll post about that.


No, you're not senile. Remember that 5.8.9 is still not "officially" released and it's in test. Just last night, I found a bug in the scheduler and David fixed it and got 5.8.11 out this morning. I don't want to send everyone off jumping up to 5.8.11 but I would say if you have problems with 5.8.8 keeping work and problems with 5.8.9 refusing to actually run work then you could *try* 5.8.11. But, it has *just* come out and there could be bugs. But, I guess you could go back to 5.4.11 as well.

- John Watzke
[Feb 9, 2007 4:51:51 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: BOINC 5.8.8 won't get new work???

... or try this option by inserting it into cc_config.xml:
<work_request_factor>2</work_request_factor>

The amount of work requested from projects will be multiplied by this number. Use a number larger than one if your computer often runs out of work.
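
For anyone unsure about the closing tag, here is a hedged Python sketch that builds a well-formed fragment with the standard library. The assumption here, not confirmed in this thread, is that the option sits inside an `<options>` element under `<cc_config>`; the helper name `build_cc_config` is made up for illustration.

```python
import xml.etree.ElementTree as ET


def build_cc_config(work_request_factor):
    """Build cc_config.xml text containing only the work_request_factor option.

    Assumption: the option belongs under <cc_config><options>.
    """
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    ET.SubElement(options, "work_request_factor").text = str(work_request_factor)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_cc_config(2))
```

Letting a library write the tags avoids a missing slash in the closing tag, which would make the file fail to parse.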

----------------------------------------
WCG Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All!
[Feb 9, 2007 7:12:40 PM]