World Community Grid Forums
Thread Status: Active Total posts in this thread: 234
verheyde
Cruncher Belgium Joined: Dec 7, 2004 Post Count: 25 Status: Recently Active |
There is still a series of older work units floating around. I just "finished"
FAHV_ x3ZCM_ A_ IN_ Y3b_ rig_ 0226151_ 0030_ 4-- and FAHV_ x3ZCM_ A_ IN_ Y3a_ rig_ 0226032_ 0058_ 3-- with "Maximum elapsed time exceeded" after more than 22 hours of CPU time (on a Win7 / i7 laptop). There is one task still running, which I guess will error out shortly (FAHV_x3ZSO_B_IN_Y3a_rig_0227274_0054_3--), and I aborted one _5-- for which the wingmen had already errored out on elapsed time. Purging those tasks from the system is taking a long time. I seem to remember that about 10 days ago one of the admins announced they had fixed the issue for new WUs. |
||
|
andgra
Senior Cruncher Sweden Joined: Mar 15, 2014 Post Count: 184 Status: Offline |
Hi,
We are experiencing the same thing: WUs that run forever and earn very few points. They need to look into this! Several of my team members have unticked the FightAIDS@Home project because of this. A response from someone responsible, please!
/andgra
----------------------------------------
[Edit 1 times, last edit by andgra at Oct 1, 2014 12:56:09 PM] |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
FAHV_ x3ZCM_ A_ IN_ Y3a_ rig_ 0226025_ 0069_ 5-- - In Progress 9/29/14 19:02:27 10/3/14 07:02:27 0.00 0.0 / 0.0
FAHV_ x3ZCM_ A_ IN_ Y3a_ rig_ 0226025_ 0069_ 4-- 732 Valid 9/27/14 13:26:05 9/30/14 03:34:56 24.99 66.2 / 66.2
FAHV_ x3ZCM_ A_ IN_ Y3a_ rig_ 0226025_ 0069_ 3-- 732 Valid 9/24/14 01:25:05 10/1/14 06:10:13 36.74 66.2 / 66.2
FAHV_ x3ZCM_ A_ IN_ Y3a_ rig_ 0226025_ 0069_ 2-- 732 Error 9/21/14 13:13:45 9/24/14 01:24:48 20.25 232.9 / 232.9
FAHV_ x3ZCM_ A_ IN_ Y3a_ rig_ 0226025_ 0069_ 1-- 732 Error 9/19/14 19:03:10 9/21/14 13:12:22 30.35 66.2 / 66.2
FAHV_ x3ZCM_ A_ IN_ Y3a_ rig_ 0226025_ 0069_ 0-- - No Reply 9/19/14 19:02:03 9/29/14 19:02:03 0.00 0.0 / 0.0
The top one is on my node, which is bufferless: the moment a task arrives, it starts. It has clocked 1d:18h:09m:16s, and in between the 'no reply' still came in, of course it did. Henceforth, I am deleting anything that has a suffix of _3 or above. I am categorically not interested in spending computing time on superfluous work. Commencing now: I aborted the top one, as I am not interested in credit for nothing. In fact, I just threw the 'no new work' switch, as electricity is too expensive here to compute for nothing. The top one now looks like this:
FAHV_ x3ZCM_ A_ IN_ Y3a_ rig_ 0226025_ 0069_ 5-- 732 User Aborted 9/29/14 19:02:27 10/1/14 13:19:49 41.68 82.1 / 0.0
FAHV_ x3ZCM_ A_ IN_ Y3a_ rig_ 0226025_ 0069_ 4-- 732 Valid 9/27/14 13:26:05 9/30/14 03:34:56 24.99 66.2 / 66.2
FAHV_ x3ZCM_ A_ IN_ Y3a_ rig_ 0226025_ 0069_ 3-- 732 Valid 9/24/14 01:25:05 10/1/14 06:10:13 36.74 66.2 / 66.2
FAHV_ x3ZCM_ A_ IN_ Y3a_ rig_ 0226025_ 0069_ 2-- 732 Error 9/21/14 13:13:45 9/24/14 01:24:48 20.25 232.9 / 232.9
FAHV_ x3ZCM_ A_ IN_ Y3a_ rig_ 0226025_ 0069_ 1-- 732 Error 9/19/14 19:03:10 9/21/14 13:12:22 30.35 66.2 / 66.2
FAHV_ x3ZCM_ A_ IN_ Y3a_ rig_ 0226025_ 0069_ 0-- - No Reply 9/19/14 19:02:03 9/29/14 19:02:03 0.00 0.0 / 0.0
Even more preposterous: the one that went into error got 232.9 credit, while the ones that succeeded got 66.2. The joke is clearly on the volunteer! |
||
|
deltavee
Ace Cruncher Texas Hill Country Joined: Nov 17, 2004 Post Count: 4891 Status: Offline |
My wingmen and I have 448.78 hours rt invested in this wu so far. Maybe _5 can complete it in less than 4 days.
FAHV_ x3VQ8_ B_ IN_ LEDGFa_ rig_ 0221725_ 0034_ 5-- - In Progress 9/29/14 15:52:37 10/3/14 03:52:37 0.00 0.0 / 0.0
FAHV_ x3VQ8_ B_ IN_ LEDGFa_ rig_ 0221725_ 0034_ 4-- 732 Pending Verification 9/19/14 15:52:14 9/25/14 21:45:05 45.43 64.8 / 0.0
FAHV_ x3VQ8_ B_ IN_ LEDGFa_ rig_ 0221725_ 0034_ 3-- - No Reply 9/19/14 15:52:13 9/29/14 15:52:13 0.00 0.0 / 0.0
FAHV_ x3VQ8_ B_ IN_ LEDGFa_ rig_ 0221725_ 0034_ 2-- 716 Pending Verification 9/15/14 07:54:27 9/27/14 08:14:48 118.47 119.9 / 0.0
FAHV_ x3VQ8_ B_ IN_ LEDGFa_ rig_ 0221725_ 0034_ 1-- 716 Pending Verification 9/15/14 07:53:04 9/27/14 02:52:46 104.08 117.4 / 0.0
FAHV_ x3VQ8_ B_ IN_ LEDGFa_ rig_ 0221725_ 0034_ 0-- 716 Pending Verification 9/5/14 07:52:08 9/19/14 12:46:01 180.80 227.7 / 0.0 <-me |
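For anyone wanting to check the 448.78 figure, here is a minimal sketch that adds up the hours column of the four returned copies listed in the post. It assumes that column is run time in hours; the dictionary name and keys are only illustrative.

```python
# Hours reported for each returned copy of the work unit, copied from the
# status listing in the post (suffix -> hours). Assumption: this column is
# run time in hours. The _3 and _5 copies report 0.00 and are left out.
run_hours = {
    "_0": 180.80,
    "_1": 104.08,
    "_2": 118.47,
    "_4": 45.43,
}

# Round to two decimals to match the precision shown on the status page.
total_hours = round(sum(run_hours.values()), 2)
print(f"Total run time invested: {total_hours} h")
```

The total matches the number quoted in the post, so the figure covers only the copies that have already reported back.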
||
|
Eric_Kaiser
Veteran Cruncher Germany (Hessen) Joined: May 7, 2013 Post Count: 1047 Status: Offline |
@lavaflow: Speaking of credits for work units, I recommend FAHV over cep2 and mcm1.
When I look at the credits granted on these 3 projects, I can see a huge difference:
odroid u3 with 1.7 GHz: FAHV_ x2Q3K-A-AS_ 0880459_ 2037_ 0-- Valid 4.11 / 4.12 225.7 / 225.7
Athlon 5350 with 2.05 GHz: E225591_ 905_ S.266.C20H8N6S6.RCHFHLODPRWINB-UHFFFAOYSA-N.7_ s1_ 14_ 0-- Valid 7.76 / 8.00 105.3 / 105.3
i7-3930k with 3.5 GHz: MCM1_ 0007898_ 2716_ 0-- Valid 4.15 / 4.15 113.9 / 103.9
But what you've mentioned is true. How can it be that one gets more credit for an errored WU than for a valid one? I think some of the problems will be gone once the long-running WUs are completely done. So I don't bother about credits. I need runtime for badges... |
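To put the three results on a common scale, here is a rough sketch that divides granted credit by CPU hours. It assumes the first number pair in each line is CPU / elapsed hours and the second is claimed / granted credit, and that the E225591 task belongs to cep2; the labels are shorthand, not official names.

```python
# (label, CPU hours, granted credit) as quoted in the post.
# Assumptions: first pair = CPU / elapsed hours, second pair = claimed /
# granted credit; the E225591 task is taken to be the cep2 one.
results = [
    ("FAHV (odroid u3)", 4.11, 225.7),
    ("CEP2 (Athlon 5350)", 7.76, 105.3),
    ("MCM1 (i7-3930k)", 4.15, 103.9),
]

# Granted credit per CPU hour, rounded to one decimal place.
credit_per_hour = {
    label: round(credit / hours, 1) for label, hours, credit in results
}

for label, rate in credit_per_hour.items():
    print(f"{label}: {rate} credits per CPU hour")
```

Each result ran on different hardware, so this is only a rough comparison of what was granted, not a benchmark of the projects.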
||
|
Teglen
Cruncher Joined: Nov 26, 2011 Post Count: 7 Status: Offline |
I just had some WUs with an estimated runtime of 175h... My computer would have to run 24/7 to get them done in time, so I aborted them all and quit FAAH for now until that is resolved. My computing time and electricity are too valuable to be wasted like that.
I'm now focusing on mapping cancer markers. |
||
|
Sgt.Joe
Ace Cruncher USA Joined: Jul 4, 2006 Post Count: 7697 Status: Offline |
I have not seen any of the long-running WUs for several days now, so there can't be too many of them still floating around. The current FAHV WUs I'm seeing are all completing in less than an hour, sometimes with PPH in excess of 100.
Cheers
Sgt. Joe
----------------------------------------
*Minnesota Crunchers* |
||
|
acpartsman
Veteran Cruncher Martinsville VA, USA Joined: May 6, 2007 Post Count: 943 Status: Offline |
I'm still receiving a lot of the long-running WUs. In fact, they are so long that they do not finish by the deadline. Some are taking a week to run on a 3.2 GHz machine with 8 GB of RAM. Most take at least 4 days. This is ridiculous. I'm burning a lot of power and not getting any credit for my work. And don't even get me started on what it does to my phone.
They either need to fix their issue or suspend this project completely until someone can figure out who changed what.
----------------------------------------
One drop raises the sea. |
||
|
nittany85
Cruncher United States of America Joined: Apr 29, 2007 Post Count: 17 Status: Offline |
I am willing to let my machines be part of the "clean up crew" to get these long work units out of the system, but I need to monitor all of the machines to make sure that I start the tasks as soon as I receive them, in order to try to complete them before the deadline. In most cases there is not enough time given to complete these jobs, as it looks like the remaining jobs are the ones that have had the most trouble getting enough valid results. I just received a job today that had a valid result on September 15th, and my machine is the sixth or seventh trying to get the second valid result.
I thought we would have seen the last of these by now, but I think there are some very difficult jobs out there that may take a while to clear. Is there any way to identify the jobs that have gone through multiple tries without the required valid results, so that they can be removed from the system? |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I have had several of the FAHV WUs which run around 90 hours. The current one in progress is FAHV_ x3ZCM_ A_ IN_ Y3a_ rig_ 0225957_ 0007_ 2-- and the last one to complete took 96 hours (elapsed time). It looks like the amount of CPU time needed to complete these tasks runs beyond the deadline for completion, but I let each task run to completion instead of aborting it.
----------------------------------------
[Edit 1 times, last edit by Former Member at Oct 2, 2014 3:26:33 PM] |
||
|