World Community Grid Forums
Thread Status: Active Total posts in this thread: 95
armstrdj
Former World Community Grid Tech Joined: Oct 21, 2004 Post Count: 695 Status: Offline
That is correct. Additional information is needed from longer into the simulation.
Thanks, armstrdj
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline
FAH2_ 9999983_ avx38789_ 000001_ 000007_ 039_ 1-- Microsoft Windows 10 Core x64 Edition, (10.00.15063.00) - In Progress 11/27/17 07:04:54 11/28/17 07:04:54 <-- resend
FAH2_ 9999983_ avx38789_ 000001_ 000007_ 039_ 0-- Microsoft Windows 10 Core x86 Edition, (10.00.14393.00) 718 Valid 11/26/17 07:04:49 11/27/17 07:08:38 23.95 <-- mine

That's relatively quick. Some of mine break through 40 hours and then there are some machines taking twice that.

FAH2_ 9999986_ avx38789_ 000001_ 000405_ 047_ 2-- Linux 4.7.6-040706-generic 718 Valid 12/12/17 06:06:10 12/13/17 09:26:01 6.18 250.3 / 232.5
FAH2_ 9999986_ avx38789_ 000001_ 000405_ 047_ 1-- Linux 2.6.18-6-686 718 Valid 12/11/17 06:06:03 12/12/17 23:23:46 40.44 232.5 / 232.5 <-- mine
FAH2_ 9999986_ avx38789_ 000001_ 000405_ 047_ 0-- Linux 4.8.0-59-generic 718 Valid 12/10/17 06:05:56 12/13/17 11:42:35 76.69 354.0 / 232.5
||
|
adriverhoef
Master Cruncher The Netherlands Joined: Apr 3, 2009 Post Count: 2166 Status: Recently Active
Result Name: FAH2_ 9999984_ avx38789_ 000001_ 000906_ 130_ 0--
Result Name | OS | AVN | Status | Sent Time | Due / Return Time | CPUh | Claimed/Grant.
[Edit 20 times, last edit by adriverhoef at Mar 2, 2018 1:22:21 AM]
||
|
OldChap
Veteran Cruncher UK Joined: Jun 5, 2009 Post Count: 978 Status: Offline
I've started running some low-power machines that currently take longer than 24h to complete Betas.
That's not an issue for normal work, I think, but what I guess are resends only get 24h. My question, then: should I abort them, or will the server learn not to send them?
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline
We've had a few very brief power cuts this afternoon. After the most recent (at about 18:12) an in-progress and checkpointed beta unit seems to have restarted from the beginning, rather than the checkpoint which it apparently successfully used on each previous occasion. I looked at the WU properties as soon as I noticed and saw:
FAH2_ 9999985_ avx38789_ 000001_ 000166_ 098_ 0
CPU time at last checkpoint: 05:21:42
CPU time: 05:24:10
Elapsed time: 05:28:00
Estimated time remaining: 92d 02:46:59 [!!!]
Fraction done: 0.247%
Progress rate: 6.840% per hour

For now I'm letting it run.

[Edited to add:] I just checked on this, and it went Invalid. The result log was as follows:

<core_client_version>7.6.22</core_client_version>
<![CDATA[
<stderr_txt>
INFO: result number = 0
%IMPACT-I: Requested file to open for appending md.out Does not exist. Opening it as a new file.
%IMPACT-I: Softcore binding energy with umax = 1000.00000
%IMPACT-I: Using AGBNP2: Analytical Generalized Born Model + Analytic Non-Polar Hydration Model
%IMPACT-I: Hybrid potential for binding with lambda = 0.32354
agbnpf_assign_parameters(): info: attempting to load from SQL tables.
[12:11:08] INFO: Checkpointed. Progress 1000 of 30000 steps complete CPU time 1649.578125
[12:38:11] INFO: Checkpointed. Progress 2000 of 30000 steps complete CPU time 3259.625000
[13:05:08] INFO: Checkpointed. Progress 3000 of 30000 steps complete CPU time 4862.687500
[13:32:26] INFO: Checkpointed. Progress 4000 of 30000 steps complete CPU time 6486.890625
[13:59:30] INFO: Checkpointed. Progress 5000 of 30000 steps complete CPU time 8095.765625
[14:26:29] INFO: Checkpointed. Progress 6000 of 30000 steps complete CPU time 9700.250000
[14:53:28] INFO: Checkpointed. Progress 7000 of 30000 steps complete CPU time 11304.578125
[15:20:33] INFO: Checkpointed. Progress 8000 of 30000 steps complete CPU time 12913.843750
[15:47:26] INFO: Checkpointed. Progress 9000 of 30000 steps complete CPU time 14511.156250
[16:14:10] INFO: Checkpointed. Progress 10000 of 30000 steps complete CPU time 16098.375000
[16:40:58] INFO: Checkpointed. Progress 11000 of 30000 steps complete CPU time 17689.546875
[17:08:08] INFO: Checkpointed. Progress 12000 of 30000 steps complete CPU time 19301.859375
INFO: result number = 0
%IMPACT-I: Softcore binding energy with umax = 1000.00000
%IMPACT-I: Using AGBNP2: Analytical Generalized Born Model + Analytic Non-Polar Hydration Model
%IMPACT-I: Hybrid potential for binding with lambda = 0.32354
agbnpf_assign_parameters(): info: attempting to load from SQL tables.
INFO: result number = 0
%IMPACT-I: Softcore binding energy with umax = 1000.00000
%IMPACT-I: Using AGBNP2: Analytical Generalized Born Model + Analytic Non-Polar Hydration Model
%IMPACT-I: Hybrid potential for binding with lambda = 0.32354
agbnpf_assign_parameters(): info: attempting to load from SQL tables.
INFO: result number = 0
%IMPACT-I: Softcore binding energy with umax = 1000.00000
%IMPACT-I: Using AGBNP2: Analytical Generalized Born Model + Analytic Non-Polar Hydration Model
%IMPACT-I: Hybrid potential for binding with lambda = 0.32354
agbnpf_assign_parameters(): info: attempting to load from SQL tables.
[18:43:27] INFO: Checkpointed. Progress 1000 of 30000 steps complete CPU time 20943.953750
[19:10:29] INFO: Checkpointed. Progress 2000 of 30000 steps complete CPU time 22534.610000
[19:37:23] INFO: Checkpointed. Progress 3000 of 30000 steps complete CPU time 24125.141250
[20:04:03] INFO: Checkpointed. Progress 4000 of 30000 steps complete CPU time 25700.625625
[20:30:43] INFO: Checkpointed. Progress 5000 of 30000 steps complete CPU time 27275.063125
[20:57:34] INFO: Checkpointed. Progress 6000 of 30000 steps complete CPU time 28866.844375
[21:24:15] INFO: Checkpointed. Progress 7000 of 30000 steps complete CPU time 30445.422500
[21:50:54] INFO: Checkpointed. Progress 8000 of 30000 steps complete CPU time 32015.781875
[22:17:44] INFO: Checkpointed. Progress 9000 of 30000 steps complete CPU time 33600.750625
[22:44:34] INFO: Checkpointed. Progress 10000 of 30000 steps complete CPU time 35183.016250
[23:11:48] INFO: Checkpointed. Progress 11000 of 30000 steps complete CPU time 36791.235000
[23:38:55] INFO: Checkpointed. Progress 12000 of 30000 steps complete CPU time 38392.469375
[00:05:41] INFO: Checkpointed. Progress 13000 of 30000 steps complete CPU time 39969.969375
[00:32:24] INFO: Checkpointed. Progress 14000 of 30000 steps complete CPU time 41549.844375
[00:59:03] INFO: Checkpointed. Progress 15000 of 30000 steps complete CPU time 43122.516250
[01:25:48] INFO: Checkpointed. Progress 16000 of 30000 steps complete CPU time 44701.516250
[01:54:38] INFO: Checkpointed. Progress 17000 of 30000 steps complete CPU time 46299.813125
[02:21:50] INFO: Checkpointed. Progress 18000 of 30000 steps complete CPU time 47904.625625
[02:49:11] INFO: Checkpointed. Progress 19000 of 30000 steps complete CPU time 49518.047500
[03:16:37] INFO: Checkpointed. Progress 20000 of 30000 steps complete CPU time 51136.750625
[03:43:51] INFO: Checkpointed. Progress 21000 of 30000 steps complete CPU time 52742.516250
[04:11:02] INFO: Checkpointed. Progress 22000 of 30000 steps complete CPU time 54345.750625
[04:38:30] INFO: Checkpointed. Progress 23000 of 30000 steps complete CPU time 55963.531875
[05:05:44] INFO: Checkpointed. Progress 24000 of 30000 steps complete CPU time 57566.735000
[05:32:58] INFO: Checkpointed. Progress 25000 of 30000 steps complete CPU time 59172.485000
[06:00:07] INFO: Checkpointed. Progress 26000 of 30000 steps complete CPU time 60775.360000
[06:27:30] INFO: Checkpointed. Progress 27000 of 30000 steps complete CPU time 62388.828750
[06:54:38] INFO: Checkpointed. Progress 28000 of 30000 steps complete CPU time 63988.906875
[07:21:53] INFO: Checkpointed. Progress 29000 of 30000 steps complete CPU time 65594.516250
[07:49:12] INFO: Checkpointed. Progress 30000 of 30000 steps complete CPU time 67204.172500
%IMPACT-I: Species 1 written to SQL file md-out1.dms
%IMPACT-I: Species 2 written to SQL file md-out2.dms
07:49:14 (2308): called boinc_finish(0)
</stderr_txt>
]]>

You can see that it restarted three times, though without time stamps it's impossible to see how far it might have got each time. It's also impossible to tell why the last time was different, or in what way it was considered invalid. I'm not sure it's worth spending any time over. I'll put it down as JOOTT.
[Edit 1 times, last edit by Former Member at Jan 10, 2018 10:48:54 AM]
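For anyone who wants to check a result log like the one above without reading it line by line, here is a minimal Python sketch. It assumes the checkpoint lines and the "INFO: result number = 0" start marker look exactly as they do in this log; the function names are made up for illustration and are not part of BOINC or the science application.

```python
import re

# Matches checkpoint lines in the format seen in this thread's log, e.g.
# [12:11:08] INFO: Checkpointed. Progress 1000 of 30000 steps complete CPU time 1649.578125
CHECKPOINT_RE = re.compile(
    r"\[(\d{2}:\d{2}:\d{2})\]\s+INFO:\s+Checkpointed\.\s+"
    r"Progress\s+(\d+)\s+of\s+(\d+)\s+steps\s+complete\s+CPU\s+time\s+([\d.]+)"
)

# Assumption: the application prints this marker once every time it starts.
START_MARKER = "INFO: result number = 0"


def count_restarts(log_text):
    """Number of starts after the first one (hypothetical helper)."""
    return max(log_text.count(START_MARKER) - 1, 0)


def find_progress_resets(log_text):
    """Return (timestamp, previous_step, new_step) for every checkpoint
    whose step count is lower than the checkpoint before it."""
    resets = []
    prev_step = None
    for match in CHECKPOINT_RE.finditer(log_text):
        stamp = match.group(1)
        step = int(match.group(2))
        if prev_step is not None and step < prev_step:
            resets.append((stamp, prev_step, step))
        prev_step = step
    return resets


# Short excerpt of the log posted above (most checkpoint lines omitted).
sample_log = """
INFO: result number = 0
[16:40:58] INFO: Checkpointed. Progress 11000 of 30000 steps complete CPU time 17689.546875
[17:08:08] INFO: Checkpointed. Progress 12000 of 30000 steps complete CPU time 19301.859375
INFO: result number = 0
INFO: result number = 0
INFO: result number = 0
[18:43:27] INFO: Checkpointed. Progress 1000 of 30000 steps complete CPU time 20943.953750
[19:10:29] INFO: Checkpointed. Progress 2000 of 30000 steps complete CPU time 22534.610000
"""

restarts = count_restarts(sample_log)
resets = find_progress_resets(sample_log)
print("restarts:", restarts)
for stamp, before, after in resets:
    print(f"{stamp}: progress went from step {before} back to step {after}")
```

On this excerpt it reports three restarts and one point where the step count went backwards, from 12000 to 1000 at 18:43:27, while the CPU time kept accumulating. It says nothing about why the result was marked Invalid.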
||
|
NixChix
Veteran Cruncher United States Joined: Apr 29, 2007 Post Count: 1187 Status: Offline
I seem to get FAHB beta work only when I have my FAHB-enabled profile selected, even though all profiles are enabled for beta work.
Cheers
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline
I suspect some other factors are causing that effect, such as when a machine on that profile happens to request work. For the time being, you could try a profile that has just FAAH selected (along with Beta enabled) and a slightly larger cache than usual. If you run that for most of a day, say, and only occasionally load up with another profile with a smaller cache, you should obtain more Beta units.
||
|
Crystal Pellet
Veteran Cruncher Joined: May 21, 2008 Post Count: 1322 Status: Offline
NixChix wrote: "I seem to get FAHB beta work only when I have my FAHB-enabled profile selected, even though all profiles are enabled for beta work. Cheers"

Betas are sent with every profile when work is requested while betas are in the queue, but the chance of getting betas with only FAH2 selected is much bigger. When you have a 1-day buffer with only FAH2s processing, your buffer still needs more work, because you only get as many FAH2s as you have cores and no more.
||
|
Seoulpowergrid
Veteran Cruncher Joined: Apr 12, 2013 Post Count: 817 Status: Offline
I opened the BOINC software and saw that the WU FAH2_ 9999985_ avx38789_ 000001_ 000105_ 124_ 0-- was near the end of its calculations, maybe 90% or so. I'm not sure, because I then saw the completed percentage jump back to around 60%, with the Remaining Time showing ---, which means the file is basically done. Another five seconds later it said the file was completed and uploaded, and the Results Status page shows it as a valid WU. As it is marked valid I have no reason to doubt it, but this "twitch" near the end is something I am not used to.

Edit: Unsure if it is important, but the CPU time/Elapsed time is 3.70 / 5.82. The same machine also crunched WU FAH2_ 9999985_ avx38789_ 000001_ 001000_ 139_ 0--, and the CPU time/Elapsed time was 3.72 / 4.85. The CPU/elapsed times of all other Beta WUs are basically 1:1, but these two are exceptions.
[Edit 1 times, last edit by Seoulpowergrid at Jan 30, 2018 8:06:14 AM]
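To put a number on "basically 1:1" versus these two exceptions, the CPU/elapsed ratio can be worked out as below. This is only a rough Python sketch: it assumes both figures are in hours, and the 0.9 cut-off is an arbitrary illustrative value, not anything World Community Grid uses.

```python
def cpu_efficiency(cpu_hours, elapsed_hours):
    """Fraction of wall-clock time the task actually spent on the CPU."""
    return cpu_hours / elapsed_hours


# (CPU time, elapsed time) for the two work units quoted in the post above.
tasks = {
    "FAH2_ 9999985_ avx38789_ 000001_ 000105_ 124_ 0--": (3.70, 5.82),
    "FAH2_ 9999985_ avx38789_ 000001_ 001000_ 139_ 0--": (3.72, 4.85),
}

# Hypothetical cut-off: anything below this is treated as "not 1:1".
THRESHOLD = 0.9

ratios = {
    name: round(cpu_efficiency(cpu, elapsed), 2)
    for name, (cpu, elapsed) in tasks.items()
}
below_threshold = [name for name, ratio in ratios.items() if ratio < THRESHOLD]

for name, ratio in ratios.items():
    print(f"{name}: CPU/elapsed = {ratio}")
```

Both work units come out well under 1.0 (about 0.64 and 0.77), so each spent a noticeable share of its elapsed time not using the CPU. The ratio alone does not say what it was waiting on.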
||
|
NixChix
Veteran Cruncher United States Joined: Apr 29, 2007 Post Count: 1187 Status: Offline
NixChix wrote: "I seem to get FAHB beta work only when I have my FAHB-enabled profile selected, even though all profiles are enabled for beta work. Cheers"

Crystal Pellet wrote: "Beta's are sent with every profile when requesting work at the moment beta's are in the queue, but the chance to get beta's with only FAH2 selected is much bigger. When you have a 1 day buffer with only FAH2's processing, your buffer still needs more work, cause you only get as many FAH2's as you have cores and not more."

If anyone is getting FAHB beta work without having FAHB selected, please post a reply.
Cheers