World Community Grid Forums
Thread Status: Active Total posts in this thread: 129
dpfender
Cruncher Joined: Dec 30, 2004 Post Count: 3 Status: Offline |
I would be glad to provide info about the stuck HPF2 wu (0% after 40+ hours, multiple times) but I see no way to determine what the job info is. In the i screen, the graphic image is rotating, but the score values are all ----- values and the progress is 0.0%.
I have a Windows XP Pro system with AMD Athlon 64 X2 4200+, 3.5 GB RAM that runs continuously (power is never off) and is always connected to the internet. The CPU time for the wcg_hpf2_rosetta.exe process is 40:09:20 with 66,188K memory usage and 3 threads. In the upper right corner is v5.0.5.3 |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
ud agent version 3.0 (2844)
device id: 326952
work unit: ud_7816614.exe ? been stuck at 0 for 139 hrs
wcg_hpf2_rosetta using 98% cpu and 108k mem ? |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Hello dpfender,
"I would be glad to provide info about the stuck HPF2 wu (0% after 40+ hours, multiple times) but I see no way to determine what the job info is."
It does seem harder with the UD client than with the BOINC client, doesn't it? But all you have to do is report the problem, give your Device ID, and report the time at which your device downloaded the problem work unit. This information is available at My Grid - Device Manager - Device Statistics, which shows the last time you uploaded a result from that device -- which is also the time you downloaded the problem work unit. This is all the information that knreed needs to identify the work unit. Then terminate the process and draw a new work unit.
Lawrence |
||
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
Well, there is an Overnite Sensation to be reported on the first of the 2 that had already accumulated 4 errors, with 2 open, Not Really.
Following for posterity:
- Device ID: tpdc-mxirnto2ft (a.k.a. 34409 per BOINCview)
- WU: za086_ 00086
- Held after 1:48:37 CPU time, after the point where it skipped to 25.000% (the 9th segment; segments were 2.7778% each according to BOINC).
- It tried a few more times, then reported 'Calculation error', subsequently skipped to 100% and sent it off.
- The log, produced in CET (summer time):
06-07-16 15:00:17|World Community Grid|Unrecoverable error for result za086_00086_4 ( - exit code -1073741819 (0xc0000005))
06-07-16 15:00:17|World Community Grid|Deferring scheduler requests for 1 minutes and 0 seconds
06-07-16 15:00:17|World Community Grid|Computation for task za086_00086_4 finished
06-07-16 15:00:18|World Community Grid|Starting task za115_00833_0 using hpf2 version 507
06-07-16 15:01:21|World Community Grid|Sending scheduler request: To fetch work
06-07-16 15:01:21|World Community Grid|Requesting 5916 seconds of new work, and reporting 1 completed tasks
Very strange was that, as opposed to the previous 40 HPF2 this computer did, this one alerted the firewall, which froze the situation and allowed the recording of some data that would otherwise have escaped to oblivion. BOINC was trying to contact Remote Point: 207.46.248.241, port http [80] and also 2 more IPs that had something like ssl.berkeley....... on it. This suggests the calculation error goes where it's not supposed to go, setting off alarm bells for otherwise 'approved' contact events.
After this, looking at the results status page, the now common view was presented:
za086_ 00086 tpdc-mxirnto2ft Error 07/14/2006 13:21:03 07/16/2006 12:59:20 1.81 15 / 0
za086_ 00086 Error 07/15/2006 00:03:08 07/15/2006 08:12:03 2.40 13 / 0
za086_ 00086 Error 07/14/2006 13:21:03 07/16/2006 12:59:20 1.81 15 / 0
za086_ 00086 In Progress 07/14/2006 10:23:16 07/21/2006 10:23:16 0.00 0 / 0
za086_ 00086 Error 07/14/2006 08:06:46 07/15/2006 00:00:02 2.20 24 / 0
za086_ 00086 Error 07/14/2006 08:04:57 07/14/2006 10:19:05 1.55 16 / 0
za086_ 00086 Error 07/14/2006 08:03:22 07/14/2006 13:02:22 2.24 14 / 0
The familiar (Long) Result log had no exceptions compared to a previous log in another thread. Think I'm going to hit the abort on the other one, as the outcome is pretty certain...RickH...no surprises
----------------------------------------
WCG
Please help to make the Forums an enjoyable experience for All! |
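For readers puzzling over the exit code in the log above: the two values printed (-1073741819 and 0xc0000005) are the same 32-bit number, once read as signed and once as unsigned. The small Python sketch below shows the conversion. The function name and the one-entry lookup table are made up for illustration; 0xC0000005 is the Windows status code for an access violation, which fits a crash rather than a normal finish, though the thread itself does not confirm the cause.

```python
# Sketch: reinterpret a signed 32-bit process exit code as unsigned hex.
# exit_code_to_hex and KNOWN_CODES are illustrative names, not part of BOINC.

KNOWN_CODES = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",  # Windows NTSTATUS for an access violation
}


def exit_code_to_hex(code):
    """Return the unsigned 32-bit hex form of a (possibly negative) exit code."""
    return "0x{:08x}".format(code & 0xFFFFFFFF)


def describe_exit_code(code):
    """Return the hex form plus a name if the code is in the small table above."""
    unsigned = code & 0xFFFFFFFF
    name = KNOWN_CODES.get(unsigned, "unknown")
    return "{} ({})".format(exit_code_to_hex(code), name)


# Exit code taken from the log line for za086_00086_4
print(describe_exit_code(-1073741819))
```

Masking with 0xFFFFFFFF is enough here because Python integers are unbounded, so the mask simply keeps the low 32 bits.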
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Hi Sekerob,
Yes, my McAfee firewall stopped an error report from Rosetta a couple of days ago. It surprised me. I allowed it through, after a little thought. I think it was back in May that Rosetta@home added some new error reporting logic to their version of Rosetta. We may be doing something similar now. I have not been told. All that I know is that the staff are putting in time debugging. Lawrence |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I have been getting the same results as others. (Does that mean 48 hours and 31 hours?) It has run for 48:00+ and is still at 0%. Tried a reboot and now at 31:00+ it is still at the same 0%. Agent Version 3.0 (2844), Device ID 209699. Any thoughts? Yes, 48:00 and 31:00 mean hours. |
||
|
dpfender
Cruncher Joined: Dec 30, 2004 Post Count: 3 Status: Offline |
My device (ID 67294) last returned a result at 07/11/2006 12:47:33.
I have terminated the process several times since then, both with the Exit menu choice and with the Task Manager. It always seems to be in the "stuck mode". How do I force a new work unit? There seems to be no option for doing this. |
||
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
Hi dpfender.....the proper killing has been covered many times in this thread and elsewhere which will force the retrieval of a new Work Unit (if your machine still meets the minimum specifications). To quote:
"Hello Mark099,
Right click at the bottom of your screen, select Task Manager, then select WCGrid_Rosetta in the processes, then Kill it.
Lawrence"
----------------------------------------
WCG
Please help to make the Forums an enjoyable experience for All! |
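For anyone who prefers a script to clicking through Task Manager, the kill step quoted above could be sketched roughly as follows. This is only an illustration under assumptions: it assumes the Windows `taskkill` command is available on the machine, and it uses the process name `wcg_hpf2_rosetta.exe` reported earlier in this thread, which may differ on other setups. The function names are hypothetical.

```python
import subprocess
import sys

# Assumption: image name as reported by dpfender earlier in the thread.
PROCESS_NAME = "wcg_hpf2_rosetta.exe"


def build_kill_command(image_name):
    """Build the taskkill argument list: /F forces termination, /IM selects by image name."""
    return ["taskkill", "/F", "/IM", image_name]


def kill_stuck_process(image_name=PROCESS_NAME):
    """Try to end the stuck process. Returns True if taskkill reported success."""
    if sys.platform != "win32":
        # taskkill only exists on Windows; do nothing elsewhere.
        return False
    result = subprocess.run(build_kill_command(image_name))
    return result.returncode == 0


# Show the command that would be run, without actually running it.
print(" ".join(build_kill_command(PROCESS_NAME)))
```

Calling `kill_stuck_process()` does the same job as selecting the process in Task Manager and ending it; per the advice above, the agent should then fetch a new work unit.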
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Another one to debug:
I've been running 147 hours 22 mins. Last result returned 7-11-2006. 0.0% complete. |
||
|
davidhobbs
Senior Cruncher England Joined: Dec 30, 2004 Post Count: 151 Status: Offline |
Perhaps it would be a good idea for the precise method to be featured on the Start Here - FAQs forum.
I'm sure some users will need step by step details including, for example, clicking on the Processes tab after firing up task manager, and ignoring the warning prompt when killing the process. David. |
||