World Community Grid - View Thread - A few unusual HPF2 work units

World Community Grid Forums

Category: Completed Research

Forum: Human Proteome Folding - Phase 2

Thread: A few unusual HPF2 work units

Thread Status: Active
Total posts in this thread: 129

This topic has been viewed 8658 times and has 128 replies

dpfender
Cruncher
Joined: Dec 30, 2004
Post Count: 3
Status: Offline


Re: A few unusual HPF2 work units

I would be glad to provide info about the stuck HPF2 wu (0% after 40+ hours, multiple times) but I see no way to determine what the job info is. In the i screen, the graphic image is rotating, but the score values are all ----- values and the progress is 0.0%.

I have a Windows XP Pro system with AMD Athlon 64 X2 4200+, 3.5 GB RAM that runs continuously (power is never off) and is always connected to the internet. The CPU time for the wcg_hpf2_rosetta.exe process is 40:09:20 with 66,188K memory usage and 3 threads.

In the upper right corner is v5.0.5.3

[Jul 15, 2006 10:42:41 PM]

Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline


Re: A few unusual HPF2 work units

ud agent version 3.0 (2844)
device id: 326952
work unit: ud_7816614.exe ?
been stuck at 0% for 139 hrs

wcg_hpf2_rosetta using 98% CPU and 108K mem?

[Jul 16, 2006 3:33:20 AM]

Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline


Re: A few unusual HPF2 work units

Hello dpfender,

"I would be glad to provide info about the stuck HPF2 wu (0% after 40+ hours, multiple times) but I see no way to determine what the job info is."

It does seem harder with the UD client than with the BOINC client, doesn't it? But all you have to do is report the problem, give your Device ID, and report the time at which your device downloaded the problem work unit. This information is available at My Grid - Device Manager - Device Statistics, which shows the last time you uploaded a result from that device -- which is also the time you downloaded the problem work unit. This is all the information that knreed needs to identify the work unit.

Then terminate the process and draw a new work unit.

Lawrence

[Jul 16, 2006 3:52:22 AM]

Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline


Re: A few unusual HPF2 work units

Well, there is an Overnite Sensation to be reported on the first of the 2 that had already accumulated 4 errors, with 2 open, Not Really.

Following for posterity:

- Device ID: tpdc-mxirnto2ft (a.k.a. 34409 per BOINCview)
- WU: za086_00086
- Held after 1:48:37 CPU time, after the point where it skipped to 25.000% - the 9th segment, each segment being 2.7778% according to BOINC.
- It tried a few more times, then reported 'Calculation error', subsequently skipped to 100%, and sent it off
- The log produced, in CET summer time:

06-07-16 15:00:17|World Community Grid|Unrecoverable error for result za086_00086_4 ( - exit code -1073741819 (0xc0000005))
06-07-16 15:00:17|World Community Grid|Deferring scheduler requests for 1 minutes and 0 seconds
06-07-16 15:00:17|World Community Grid|Computation for task za086_00086_4 finished
06-07-16 15:00:18|World Community Grid|Starting task za115_00833_0 using hpf2 version 507
06-07-16 15:01:21|World Community Grid|Sending scheduler request: To fetch work
06-07-16 15:01:21|World Community Grid|Requesting 5916 seconds of new work, and reporting 1 completed tasks
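For anyone reading the log above: the exit code appears both as a signed decimal and as hex. A minimal Python sketch of how the two forms relate, assuming the code is a 32-bit value; the function names and the small lookup table are made up for illustration. 0xc0000005 is the Windows status code for an access violation.

```python
def to_status_hex(exit_code: int) -> str:
    """Reinterpret a signed 32-bit exit code as unsigned and format it as hex."""
    return f"0x{exit_code & 0xFFFFFFFF:08x}"


# Illustrative lookup only; not an exhaustive list of Windows status codes.
KNOWN_CODES = {
    "0xc0000005": "STATUS_ACCESS_VIOLATION",
}


def describe(exit_code: int) -> str:
    """Return a one-line description of the exit code in both notations."""
    hex_code = to_status_hex(exit_code)
    name = KNOWN_CODES.get(hex_code, "unknown")
    return f"{exit_code} = {hex_code} ({name})"


if __name__ == "__main__":
    print(describe(-1073741819))
```

So the two numbers in the log line are the same value written two ways.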

Very strange was that, as opposed to the previous 40 HPF2 units this computer did, this one alerted the firewall, which froze the situation and allowed the recording of some data that would otherwise have escaped to oblivion. BOINC was trying to contact Remote Point: 207.46.248.241, port http [80] and also 2 more IPs that had something like ssl.berkeley....... on them. This suggests the calculation error goes where it's not supposed to go, setting off alarm bells for otherwise 'approved' contact events.

After this, looking at the results status page, the now common view was presented:

za086_00086 tpdc-mxirnto2ft Error 07/14/2006 13:21:03 07/16/2006 12:59:20 1.81 15 / 0

za086_00086 Error 07/15/2006 00:03:08 07/15/2006 08:12:03 2.40 13 / 0
za086_00086 Error 07/14/2006 13:21:03 07/16/2006 12:59:20 1.81 15 / 0
za086_00086 In Progress 07/14/2006 10:23:16 07/21/2006 10:23:16 0.00 0 / 0
za086_00086 Error 07/14/2006 08:06:46 07/15/2006 00:00:02 2.20 24 / 0
za086_00086 Error 07/14/2006 08:04:57 07/14/2006 10:19:05 1.55 16 / 0
za086_00086 Error 07/14/2006 08:03:22 07/14/2006 13:02:22 2.24 14 / 0
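As a cross-check on the figures in this post, here is a small Python sketch of the arithmetic. It assumes the 1.81 in the status rows is CPU time in decimal hours and that all segments are the same size; the helper names are hypothetical.

```python
def hms_to_hours(hms: str) -> float:
    """Convert an H:MM:SS string into decimal hours."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours + minutes / 60 + seconds / 3600


def progress_after(segments_done: int, percent_per_segment: float) -> float:
    """Progress in percent after a number of equally sized segments."""
    return segments_done * percent_per_segment


if __name__ == "__main__":
    # CPU time at which the work unit was held.
    print(round(hms_to_hours("1:48:37"), 2))
    # Nine segments at 2.7778% each.
    print(round(progress_after(9, 2.7778), 3))
```

Under those assumptions, 1:48:37 rounds to 1.81 hours, matching the status row for this device, and nine segments of 2.7778% come to roughly 25%.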

The familiar (Long) Result log showed no differences from a previous log in another thread.

Think I'm going to hit the abort on the other one, as the outcome is pretty certain...RickH...no surprises

----------------------------------------

WCG

Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All!

[Jul 16, 2006 4:11:38 PM]

Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline


Re: A few unusual HPF2 work units

Hi Sekerob,
Yes, my McAfee firewall stopped an error report from Rosetta a couple of days ago. It surprised me. I allowed it through, after a little thought. I think it was back in May that Rosetta@home added some new error reporting logic to their version of Rosetta. We may be doing something similar now. I have not been told. All that I know is that the staff are putting in time debugging.

Lawrence

[Jul 16, 2006 4:42:10 PM]

Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline


Re: A few unusual HPF2 work units

I have been getting the same results as others.
It has run for 48:00+ and still 0%. Tried a reboot and now at 31:00+ still the same 0%.
Agent Version 3.0 (2844)
Device ID 209699
Any thoughts?

"Does that mean 48 hours and 31 hours?"

Yes, 48:00 and 31:00 mean hours.

[Jul 16, 2006 7:49:09 PM]

dpfender
Cruncher
Joined: Dec 30, 2004
Post Count: 3
Status: Offline


Re: A few unusual HPF2 work units

My device (ID 67294) last returned a result at 07/11/2006 12:47:33.
I have terminated the process several times since then, both with the Exit menu choice and with the Task Manager. It always seems to be in the "stuck mode". How do I force a new work unit? There seems to be no option for doing this.

[Jul 17, 2006 1:22:17 PM]

Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline


Re: A few unusual HPF2 work units

Hi dpfender... the proper way to kill the process, which will force the retrieval of a new Work Unit (if your machine still meets the minimum specifications), has been covered many times in this thread and elsewhere. To quote:

Hello Mark099,
Right click at the bottom of your screen, select Task Manager, then select WCGrid_Rosetta in the processes, then Kill it.
Lawrence
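For those who prefer a command to Task Manager, the same kill can be sketched in Python around the Windows `taskkill` tool. This is only a sketch under assumptions: the image name `wcg_hpf2_rosetta.exe` is taken from an earlier post and may differ on your machine, `taskkill` has to be present on your Windows edition, and the function names are hypothetical.

```python
import platform
import subprocess


def build_taskkill_command(image_name: str) -> list[str]:
    """Build a taskkill command line: /F forces termination, /IM selects by image name."""
    return ["taskkill", "/F", "/IM", image_name]


def kill_process(image_name: str) -> int:
    """Run taskkill for the given image name and return its exit status."""
    if platform.system() != "Windows":
        raise RuntimeError("taskkill is only available on Windows")
    completed = subprocess.run(build_taskkill_command(image_name), check=False)
    return completed.returncode


if __name__ == "__main__":
    # Assumed image name; check the Processes tab for the actual one.
    print(" ".join(build_taskkill_command("wcg_hpf2_rosetta.exe")))
```

Calling `kill_process("wcg_hpf2_rosetta.exe")` on the affected machine would have the same effect as ending the process from the Processes tab.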


[Jul 17, 2006 1:44:12 PM]

Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline


Re: A few unusual HPF2 work units

Another one to debug:
I've been running 147 hours 22 mins. Last result returned 7-11-2006.
0.0% complete.

[Jul 17, 2006 4:16:42 PM]

davidhobbs
Senior Cruncher
England
Joined: Dec 30, 2004
Post Count: 151
Status: Offline


Re: A few unusual HPF2 work units

Perhaps it would be a good idea for the precise method to be featured on the Start Here - FAQs forum.

I'm sure some users will need step-by-step details including, for example, clicking on the Processes tab after firing up Task Manager, and ignoring the warning prompt when killing the process.

David.

[Jul 17, 2006 5:20:15 PM]
