Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: I am desperate for one of these workunits

There are thousands of machines who said 'Yes' to AC@H and are able to process timely.


A number of the results in the unit I've got were received 3-4 days after they were sent out. Clearly, many machines are not processing these units within 24 hours.

Some people run with considerable buffer sizes. BOINC manages this: if a deadline is in jeopardy, it will run those AC@H jobs at high priority (did / does your job do so?). What is much more important is the run time it actually took. Normally rush jobs are only sent to machines that are deemed reliable and are known for quick return, e.g. 16 hours, so if your machine is not able to meet that requirement, one rather needs to consider a bug in the distribution algorithm. Maybe knreed can comment on this.

@ Jean-David Beyer, BOINC is known to not properly establish upload / download speeds until it actually transmits a larger file (a manual tweak was posted in the Beta forum). Berkeley was asked to tune the algorithms for better recording of the throughputs.
----------------------------------------
WCG Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All!
----------------------------------------
[Edit 1 times, last edit by Sekerob at Oct 16, 2007 8:03:01 PM]
[Oct 16, 2007 7:57:58 PM]
knreed
Former World Community Grid Tech
Joined: Nov 8, 2004
Post Count: 4504
Status: Offline
Re: I am desperate for one of these workunits

Kremmen,

I will have to take a look at this. The mechanism that determines if a computer is 'reliable' checks the following:

1) Average turnaround time for a workunit is 24 hours
2) Max error rate is less than 1%

Error rate is a measurement of the recent history of valid results versus invalid results.

The goal here is to make sure that the workunit is assigned to a computer that will return the result quickly and correctly.
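
As a rough illustration of the two checks above, here is a minimal Python sketch. The function name, the "at most 24 hours" reading of the turnaround check, and the definition of error rate as invalid results over all recent results are assumptions made for illustration; the actual server code is not shown in this thread.

```python
def is_reliable(avg_turnaround_hours, valid_results, invalid_results):
    """Hypothetical reliability check; not the actual WCG server code."""
    total = valid_results + invalid_results
    if total == 0:
        # Assumption: a host with no recent history is not treated as reliable.
        return False
    # Assumption: error rate = share of invalid results in the recent history.
    error_rate = invalid_results / total
    # Assumption: "is 24 hours" is read as "at most 24 hours".
    return avg_turnaround_hours <= 24.0 and error_rate < 0.01
```

Under these assumptions a host with a 16 hour average turnaround and 1 invalid result out of 201 would pass, while a host averaging 30 hours would not.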


Once the computer has been determined to be reliable and it is matched with a workunit that needs a reliable computer to run it, the deadline for the workunit is set at 20% of the original deadline or twice the estimated wall clock time it will take for the workunit to complete. The estimated wall clock time is based on a function that uses the measurements of the % of time BOINC is running, the % of time the client is on, and the estimated CPU duration (including its modification by the duration correction factor).

This check for estimated wall clock time is intended to prevent the workunit from being assigned an unreasonable deadline on your computer.
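
The deadline rule described above might look roughly like the following Python sketch. The exact function is not given in this thread, so several things here are assumptions: that the "or" means the larger of the two values is used, that the wall clock estimate simply divides the corrected CPU time by the product of the two availability fractions, and all function and parameter names (`on_frac` and `active_frac` are borrowed from the BOINC `time_stats` fields).

```python
RELIABLE_FRACTION = 0.20  # 20% of the original deadline, as stated in the post


def estimated_wall_clock(cpu_seconds, dcf, on_frac, active_frac):
    """Hypothetical estimate: corrected CPU time scaled by host availability."""
    # Assumption: availability is the product of the two measured fractions.
    return cpu_seconds * dcf / (on_frac * active_frac)


def reliable_deadline(original_deadline, est_wall):
    """Hypothetical shortened deadline (seconds) for a reliable host."""
    # Assumption: the larger value wins, so the deadline is never tighter
    # than twice the time the host is expected to need.
    return max(RELIABLE_FRACTION * original_deadline, 2.0 * est_wall)
```

For example, with a 5 day original deadline (432000 s) and a 10 hour CPU estimate on a host that is on half the time, the estimate is 72000 s and this sketch would assign 144000 s rather than the 86400 s that 20% alone would give.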
[Oct 17, 2007 1:18:48 PM]
knreed
Former World Community Grid Tech
Joined: Nov 8, 2004
Post Count: 4504
Status: Offline
Re: I am desperate for one of these workunits

@ Jean-David Beyer

What version of the BOINC client are you running? If it is not 5.10.22 then can you upgrade and let us know the results? It will take a couple of server contacts to update the download/upload values.

Here is the link for the download: https://secure.worldcommunitygrid.org/reg/ms/viewDownloadBoinc.do

thanks,
Kevin
[Oct 17, 2007 1:21:33 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: I am desperate for one of these workunits

Once the computer has been determined to be reliable and it is matched with a workunit that needs a reliable computer to run it then the deadline for the workunit is set at 20% of the original deadline or twice the estimated wall clock time it will take for the workunit to complete. ...
This check for estimated wall clock time is intended to prevent the workunit from being assigned an unreasonable deadline on your computer.


Well, it didn't work this time. The machine is extremely reliable and, when accepting work from WCG, usually does nothing else. At the outset, the estimate by boinc_cmd of how long the WU would take was 32 hours. However, there's something screwy about the estimates on this job, since that kept dropping rapidly and the real figure would probably have been about 24 hours. (... had an unfortunate circumstance on the machine, whereby insufficient disk space was available, which prevented the checkpoint file from being written and caused a subsequent error. I've put another drive in it with 10GB just for WCG, so it won't run into contention again.)
[Oct 17, 2007 3:25:49 PM]
knreed
Former World Community Grid Tech
Joined: Nov 8, 2004
Post Count: 4504
Status: Offline
Re: I am desperate for one of these workunits

Kremmen,

The field on_frac for your computer has a stored value of 0.0, which is interpreted to mean that your computer is never on. Because of this value, the estimated wall clock time it would take your computer to return the workunit came out longer than the original deadline for the result. The code assumes in this case that a measurement is in error (which in this case it is) and defaults to 20% of the original deadline, so it assigned a deadline of 1 day.
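
To make the failure mode concrete, here is a small Python sketch of how an on_frac of 0.0 could end up as a 1 day deadline. It is only an illustration: the function names are hypothetical, treating a zero availability as an infinite estimate is an assumption, and the 5 day original deadline is inferred from "20% of the original deadline" being 1 day.

```python
import math

DAY = 86400.0  # seconds


def estimated_wall_clock(cpu_seconds, on_frac, active_frac):
    """Hypothetical estimate; zero availability means the work never finishes."""
    availability = on_frac * active_frac
    if availability <= 0.0:
        return math.inf
    return cpu_seconds / availability


def assign_deadline(original_deadline, est_wall):
    """Hypothetical deadline rule with the fallback described in the post."""
    fallback = 0.20 * original_deadline
    if est_wall > original_deadline:
        # The estimate is assumed to be a measurement error: use 20%.
        return fallback
    # Assumption: otherwise the larger of the two candidate values is used.
    return max(fallback, 2.0 * est_wall)


# on_frac = 0.0 as stored on the server; active_frac as reported in the thread.
estimate = estimated_wall_clock(24 * 3600.0, 0.0, 0.999849)
deadline = assign_deadline(5 * DAY, estimate)
```

With these inputs the estimate is infinite, which is longer than the original deadline, so the sketch falls back to about 86400 seconds, i.e. 1 day.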

What version of BOINC are you running?

In the client_state.xml file what value do you show for the field

time_stats - on_frac?

thanks,
Kevin
[Oct 17, 2007 6:07:30 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: I am desperate for one of these workunits

What version of BOINC are you running?
In the client_state.xml file what value do you show for the field
time_stats - on_frac?


Right you are. on_frac is 0. However, active_frac=0.999849 and cpu_efficiency=0.971315. Maybe the server could use active_frac when on_frac is somehow zero?

Looking at my backups, on_frac is always 0 on that machine. Every other machine has a sensible value for it. They are all running 5.8.15.

edit: last_update was an enormous number. I don't know how it got that way, but maybe that was part of the problem.
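
For anyone wanting to check the same fields, here is a minimal Python sketch that reads the time_stats values and applies the fallback suggested above. The XML fragment is a made-up sample (a real client_state.xml is much larger and its exact layout may differ between BOINC versions), and the fallback to active_frac is only the suggestion from this post, not something the server is known to do.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment using the values quoted in this post.
SAMPLE = """<client_state>
  <time_stats>
    <on_frac>0.000000</on_frac>
    <active_frac>0.999849</active_frac>
    <cpu_efficiency>0.971315</cpu_efficiency>
  </time_stats>
</client_state>"""


def read_time_stats(xml_text):
    """Return the numeric children of <time_stats> as a dict."""
    root = ET.fromstring(xml_text)
    stats = root.find("time_stats")
    return {child.tag: float(child.text) for child in stats}


def effective_on_frac(stats):
    """Suggested fallback: use active_frac when on_frac is zero."""
    on_frac = stats.get("on_frac", 0.0)
    if on_frac > 0.0:
        return on_frac
    return stats.get("active_frac", 0.0)


stats = read_time_stats(SAMPLE)
fallback_value = effective_on_frac(stats)
```

To try it on a real file, read the file contents into a string and pass that to `read_time_stats` instead of `SAMPLE`.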
----------------------------------------
[Edit 1 times, last edit by Former Member at Oct 18, 2007 3:34:07 AM]
[Oct 17, 2007 7:13:31 PM]
Jean-David Beyer
Senior Cruncher
USA
Joined: Oct 2, 2007
Post Count: 337
Status: Offline
Re: I am desperate for one of these workunits

"What version of the BOINC client are you running? If it is not 5.10.22 then can you upgrade and let us know the results? It will take a couple of server contacts to update the download/upload values."

5.8.16 i686-pc-linux-gnu, which is the latest that the BOINC server has to offer Linux systems.

Do I dare run your version of the client? Will it work for all the other applications?
Will it even work on my distribution (Red Hat Enterprise Linux 5 for *86 machines) currently running kernel 2.6.18-8.1.14.el5PAE and glibc-2.5-12?
----------------------------------------

[Oct 18, 2007 2:10:50 AM]
Jean-David Beyer
Senior Cruncher
USA
Joined: Oct 2, 2007
Post Count: 337
Status: Offline
Re: I am desperate for one of these workunits

P.S. When I go to that web site
https://secure.worldcommunitygrid.org/reg/ms/viewDownloadBoinc.do
they only offer the version I already have for Linux.
----------------------------------------

[Oct 18, 2007 2:14:43 AM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: I am desperate for one of these workunits

As yet there is no 5.10.x version recommended by WCG, nor one by Berkeley. If you intend to at least be able to get AC@H work, when available, see the tweak instructions to set, one time, the <bwdown>100000.000000</bwdown> value in client_state.xml (BOINC must be exited first).
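
The edit itself can be done in any text editor. Purely as an illustration, a Python sketch of the same change is shown below. It works on a made-up XML fragment (the `net_stats` wrapper and `bwup` element are assumptions about the file layout), the function name is hypothetical, and it is not the tweak instruction from the Beta forum. Keep a backup of client_state.xml before changing the real file, since rewriting it through an XML library may alter its formatting.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; a real client_state.xml contains much more.
SAMPLE = """<client_state>
  <net_stats>
    <bwup>0.000000</bwup>
    <bwdown>0.000000</bwdown>
  </net_stats>
</client_state>"""


def set_bwdown(xml_text, value=100000.0):
    """Set every <bwdown> element to `value`; return (new_xml, count)."""
    root = ET.fromstring(xml_text)
    changed = 0
    # iter() searches the whole tree, so the parent element does not matter.
    for node in root.iter("bwdown"):
        node.text = f"{value:.6f}"
        changed += 1
    return ET.tostring(root, encoding="unicode"), changed


new_xml, count = set_bwdown(SAMPLE)
```

The returned count makes it easy to notice when no bwdown element was found and nothing was changed.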
[Oct 18, 2007 8:48:53 AM]