World Community Grid Forums
Thread Status: Active Total posts in this thread: 7
|
Author |
|
knreed
Former World Community Grid Tech Joined: Nov 8, 2004 Post Count: 4504 Status: Offline |
We have released version 7.06 of FightAIDS@Home on VINA. This release should fix the issues that have been reported about the application repeatedly restarting. You will automatically receive the new version for new tasks that are assigned to your devices.
|
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Kevin,
1) Is this also addressing the reported zombie processing, i.e. BOINC losing control of the science app, which continues to run [per task manager] when BOINC Manager says the task is suspended [and apparently showing the time incrementing, but not sure of that part]? IIRC, reports say the science app continues when the agent is exited.
2) Is this fix going to be ported to the other sciences running on VINA as well? There are the incidental reports of tasks reverting to start. That said, that's [dangerously] assuming the other sciences use the same version of VINA. |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Almost 24 hours later no answer, but remotely logging in, I found my duo running FAHV 7.06 tasks in high priority with one 7.03 and a CEP2 job in "waiting to run". The 7.06 tasks were running at 66.5% efficiency, which was odd for a device always running at 100% and crunching the only thing it does 95% of the time. The 7.03 task showed a few seconds Elapsed and incrementing, and 1:35 hours in CPU time. Checking the process manager, I found the reason for the 66% ... the 7.03 task had gone rogue, the worker process having continued to run. I killed it, since nothing done in the BM would actually suspend it. Aborting it via the BM did send it south though, no aborting via the process manager necessary. Maybe half a zombie, only just been bitten.
The question thus stands and is confirmed: is this zombie/orphan issue addressed with 7.06?

Other issue: one of those 7.06's has a 7.03 wingman which is in PVal, and 2 others who erred with 'too many exits', presumably having suffered the looping condition. Any reason to think that validation might fail [potentially an inconclusive in waiting, a 3rd copy going out, likely 7.06 too, and the 7.03 given the invalid state]? Manure happens, but then flushing mostly gets rid of the stench, that is, only if the pipes are clean.

As to the task, the log does not suggest it was making any progress at all. The second from the top is mine, showing 1.39 hours CPU, 0.01 Elapsed.

FAHV_ x3VQ4_ IN_ FBP_ 0048200_ 0726_ 3-- - In Progress 7/26/13 14:46:51 7/30/13 14:46:51 0.00 0.0 / 0.0
FAHV_ x3VQ4_ IN_ FBP_ 0048200_ 0726_ 2-- 703 Error 7/24/13 06:45:59 7/26/13 14:46:49 1.39 0.1 / 0.0
FAHV_ x3VQ4_ IN_ FBP_ 0048200_ 0726_ 1-- 703 Error 7/24/13 06:44:30 7/24/13 06:45:42 0.00 0.0 / 0.0
FAHV_ x3VQ4_ IN_ FBP_ 0048200_ 0726_ 0-- - In Progress 7/24/13 06:44:24 8/3/13 06:44:24 0.00 0.0 / 0.0

Result log, from a WCG 7.0.67 test client [been on there for some time], which demonstrates there was no looping.

Result Name: FAHV_ x3VQ4_ IN_ FBP_ 0048200_ 0726_ 2--
<core_client_version>7.0.67</core_client_version>
<![CDATA[
<message>
aborted by user
</message>
<stderr_txt>
INFO: No state to restore. Start from the beginning.
[14:32:23] Number of tasks = 5
[14:32:23] Starting task 0,CPU time is 0.000000.
[14:32:24] ./ZINC12906422.pdbqt size = 30 10 ../../projects/www.worldcommunitygrid.org/fahv.x3VQ4_IN_FBP.pdbqt size = 2655 0
[15:12:52] Vina exited normal 0.
[15:12:52] Finished task #0 cpu time used 1507.546875
[15:12:52] Starting task 1,CPU time is 1507.546875.
[15:12:52] ./ZINC12906424.pdbqt size = 25 6 ../../projects/www.worldcommunitygrid.org/fahv.x3VQ4_IN_FBP.pdbqt size = 2655 0
[15:32:31] Vina exited normal 0.
[15:32:31] Finished task #1 cpu time used 688.125000
[15:32:31] Starting task 2,CPU time is 2195.671875.
[15:32:31] ./ZINC12906425.pdbqt size = 30 10 ../../projects/www.worldcommunitygrid.org/fahv.x3VQ4_IN_FBP.pdbqt size = 2655 0
[16:07:53] Vina exited normal 0.
[16:07:53] Finished task #2 cpu time used 1392.796875
[16:07:53] Starting task 3,CPU time is 3588.468750.
[16:07:53] ./ZINC12906428.pdbqt size = 32 7 ../../projects/www.worldcommunitygrid.org/fahv.x3VQ4_IN_FBP.pdbqt size = 2655 0
[16:41:55] Vina exited normal 0.
[16:41:55] Finished task #3 cpu time used 1276.156250
[16:41:55] Starting task 4,CPU time is 4864.625000.
[16:41:55] ./ZINC12906429.pdbqt size = 33 10 ../../projects/www.worldcommunitygrid.org/fahv.x3VQ4_IN_FBP.pdbqt size = 2655 0
</stderr_txt>
]]> |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Do we know when the 7.03's will stop being sent?
|
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Until a matching wingman is found. But knreed used some push to get the "lagging workunits", as he referred to them [meaning I and many others got a pile of 7.03 in PVal waiting on a 'good' wingman], processed by forcing copies to go out with 7.06. As I noted above, I hope this does not cause a non-matching result issue.
----------------------------------------[Edit 1 times, last edit by Former Member at Jul 26, 2013 5:30:10 PM] |
||
|
Crystal Pellet
Veteran Cruncher Joined: May 21, 2008 Post Count: 1321 Status: Offline |
Not reproducible every time, but once, when I suspended all 8 running VINA tasks together, it crashed my boinc.exe.
After BOINC restart the tasks picked up properly from the last checkpoint. |
||
|
NixChix
Veteran Cruncher United States Joined: Apr 29, 2007 Post Count: 1187 Status: Offline |
Version 7.06 does seem to correct an issue I experienced with the graphics exiting immediately in version 7.03.
----------------------------------------Cheers |
||
|
|