World Community Grid Forums
Thread Status: Active Total posts in this thread: 14
adriverhoef
Master Cruncher | The Netherlands | Joined: Apr 3, 2009 | Post Count: 2166 | Status: Offline
Bryn Mawr:
So both machines are now downloading and successfully running tasks in just over 11 hours. However, about 15-20% of the tasks fail at 7.88 hours with time limit exceeded. Is there any reason why some tasks have a shorter time limit - more importantly, is there any way of predicting which jobs will be affected?

Read through this thread and you might be able to understand and correct the problem on your machine.
Adri

No, purely based on the runtime.

I'm sorry, Bryn, then I must have misunderstood your problem.

Anyway, the runtime relies on the number of jobs inside each OPNG-task and most of the time that number revolves around the number 80:

OPNG_0176011_00090, jobs: 81
OPNG_0175999_00121, jobs: 81

Easy to see this for yourself after checking the variables are OK, although:

# https://boinc.berkeley.edu/wiki/client_state
# "You shouldn't rely on the client_state.xml format staying unchanged between BOINC versions.
# If you're writing a program or script that needs to get information from BOINC, use GUI RPCs instead."
# Files and directories used by BOINC:

Adri
[Edit 1 times, last edit by adriverhoef at Apr 22, 2023 12:27:30 PM]
Bryn Mawr
Senior Cruncher | Joined: Dec 26, 2018 | Post Count: 345 | Status: Offline
Sorry, I was on a 3 hour bus journey round the twisties and trying to read through the thread you linked hit my stomach. I’m now at my destination, fed and watered.

Basically my problem is that, whilst 80+% of the tasks will happily run to completion in 11+ hours, the other 15-20% of tasks abort with time limit exceeded after 7.88 hours.

I want to be able to identify the tasks with short time limits so that I can either kill them early or adjust the time limit to 12+ hours.

It’s not the length of time the job will run, that’s fairly constant; it’s the time limit that BOINC sets before it aborts the task.
[Edit 1 times, last edit by Bryn Mawr at Apr 22, 2023 6:49:10 PM]
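For anyone wanting to look at the limits directly: as far as I know, the BOINC client aborts a task once its elapsed time exceeds the workunit's <rsc_fpops_bound> divided by the estimated speed (flops) of the app version, so tasks with a smaller bound hit the limit sooner. Below is a minimal sketch that lists each workunit name next to its bound. The function name list_wu_bounds is made up for this example, and it assumes one XML element per line with <name> inside each <workunit> block. Keep in mind the warning quoted earlier in the thread that the client_state.xml format is not guaranteed to stay the same between BOINC versions.

```shell
#!/bin/sh
# Sketch only: print "<workunit name> <rsc_fpops_bound>" for every
# <workunit> block in a client_state.xml-style file.
# Assumptions (not confirmed by the thread): one element per line,
# and <name> appears inside each <workunit> block.
# list_wu_bounds is a hypothetical helper name.

list_wu_bounds() {
    # $1 = path to client_state.xml (or a copy of it)
    awk '
        /<workunit>/ { in_wu = 1; name = ""; bound = "" }
        in_wu && /<name>/ {
            name = $0
            gsub(/.*<name>|<\/name>.*/, "", name)
        }
        in_wu && /<rsc_fpops_bound>/ {
            bound = $0
            gsub(/.*<rsc_fpops_bound>|<\/rsc_fpops_bound>.*/, "", bound)
        }
        /<\/workunit>/ {
            if (in_wu) print name, bound
            in_wu = 0
        }
    ' "$1"
}
```

Running list_wu_bounds on a copy of client_state.xml and sorting on the second column would show whether the tasks that abort early really carry a smaller bound than the rest.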
adriverhoef
Master Cruncher | The Netherlands | Joined: Apr 3, 2009 | Post Count: 2166 | Status: Offline
Bryn,
Are you comfortable using a script to circumvent the problem? This is what I do:

First I determine the value of <rsc_fpops_est> for OPNG-tasks in BOINC's client_state.xml file. I'm wondering if this is the same for everyone. For me, it's 25320265880000 (or 25320265880000.000000, if you must). It seems to be a fairly constant value, because it hasn't changed in the past half year.

Then I run a script from crontab at some chosen intervals to change a few values in client_state.xml. This can only work if BOINC is temporarily stopped from running. As soon as the change has been made, BOINC can be resumed. The script does this for you.

Please find my script below. Put it in a file on your machine so that you can execute it. The script should be run from root's crontab.

BOINCDIR=~boinc; DIR=/var/lib/boinc; [ -d $DIR ] && BOINCDIR=$DIR

See if this solves your problem, else experiment with the multiplication factor (its current value is 4, see above).

Good luck!
Adri
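The full script is not reproduced in this copy of the thread, so what follows is not Adri's script. It is only a rough sketch of the general idea, under the assumption that the values being changed include <rsc_fpops_bound> (the field BOINC uses to decide when a task has run too long). The helper name scale_opng_bound and its arguments are invented for the example. It writes the modified XML to standard output and never touches the original file; stopping BOINC, swapping the file in, and resuming BOINC are left out on purpose.

```shell
#!/bin/sh
# Sketch only, NOT the script from the post above.
# Multiplies <rsc_fpops_bound> by a factor for workunits whose name
# starts with OPNG_, and prints the resulting XML to stdout.
# Assumptions: one element per line, <name> comes before
# <rsc_fpops_bound> inside each <workunit> block.
# scale_opng_bound is a hypothetical helper name.

scale_opng_bound() {
    # $1 = input file (ideally a copy of client_state.xml)
    # $2 = multiplication factor, e.g. 4
    awk -v factor="$2" '
        /<workunit>/   { in_wu = 1; is_opng = 0 }
        /<\/workunit>/ { in_wu = 0 }
        in_wu && /<name>OPNG_/ { is_opng = 1 }
        in_wu && is_opng && /<rsc_fpops_bound>/ {
            line = $0
            # Locate the numeric text between the tags.
            if (match(line, />[0-9.]+</)) {
                value = substr(line, RSTART + 1, RLENGTH - 2)
                printf "%s%.6f%s\n", substr(line, 1, RSTART), value * factor, substr(line, RSTART + RLENGTH - 1)
                next
            }
        }
        { print }
    ' "$1"
}
```

A cautious way to try it is to run scale_opng_bound on a copy of client_state.xml with factor 4, send the output to a separate file, and compare the two with diff before deciding whether to use the result. As the post says, BOINC has to be stopped while the real file is replaced.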
Bryn Mawr
Senior Cruncher | Joined: Dec 26, 2018 | Post Count: 345 | Status: Offline
Many thanks, I’ll try this as soon as I’m back in front of my computers.