Thread Status: Active
Total posts in this thread: 822
This topic has been viewed 960706 times and has 821 replies
BladeD
Ace Cruncher
USA
Joined: Nov 17, 2004
Post Count: 28976
Re: Work unit availability

Someone released some work for the night shift. ;-)
[Apr 8, 2021 5:40:16 AM]
Speedy51
Veteran Cruncher
New Zealand
Joined: Nov 4, 2005
Post Count: 1297
Re: Work unit availability

I'd like to see all GPU work units use replication=2 or higher, if possible. It looks like we have lots of GPUs waiting for GPU tasks.

It is sometimes slightly annoying to see that most of my OPNG work units are replication=1 while my GPU is idling 80% of the time.

My understanding is that if you see work higher than _1, this is not a good sign: it means either there is an issue with the work unit or the work unit has failed on multiple GPUs.
[Apr 8, 2021 6:12:35 AM]
goben_2003
Advanced Cruncher
Joined: Jun 16, 2006
Post Count: 145
Re: Work unit availability

I'd like to see all GPU work units use replication=2 or higher, if possible. It looks like we have lots of GPUs waiting for GPU tasks.

It is sometimes slightly annoying to see that most of my OPNG work units are replication=1 while my GPU is idling 80% of the time.

My understanding is that if you see work higher than _1, this is not a good sign: it means either there is an issue with the work unit or the work unit has failed on multiple GPUs.

I think they are referring to cases where only _0 is required, such as:
Name: OPNG_0000485_00204
Minimum Quorum: 1
Replication: 1

The only work units of mine with a replication of 2 are those where I am assigned a _1 as a wingman for a device that is not yet a reliable host.

If the science does not need a replication of 2, then setting it to replication of 2 would be sending out busy work. Sending out busy work is bad, because contributing is not actually free for volunteers (electricity, etc.).
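
For anyone unsure how to read the _0 / _1 suffixes, here is a minimal Python sketch. It assumes a task is named as the work unit name plus an underscore and a copy index, and that the initial copies are numbered from 0 up to replication minus 1, so any higher index is a later resend. The function names and the full task names are made up for illustration; this is not WCG server code.

```python
def split_task_name(task_name):
    """Split a task name into (work unit name, copy index)."""
    workunit, _, index = task_name.rpartition("_")
    return workunit, int(index)


def is_resend(task_name, replication):
    """True if this copy was not part of the initial batch.

    Assumption: initial copies are numbered 0 .. replication-1, so a
    higher index means the copy was issued later (error, invalid, timeout).
    """
    _, index = split_task_name(task_name)
    return index >= replication


# Hypothetical task names built from the work unit name quoted above.
print(split_task_name("OPNG_0000485_00204_0"))
print(is_resend("OPNG_0000485_00204_1", replication=1))  # a _1 on a replication=1 unit
print(is_resend("OPNG_0000485_00204_1", replication=2))  # a _1 wingman copy
```

Under these assumptions, a _1 only signals a problem when the work unit started with replication=1; on a replication=2 work unit it is just the wingman copy.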
[Apr 8, 2021 8:07:17 AM]
kittyman
Advanced Cruncher
Joined: May 14, 2020
Post Count: 140
Re: Work unit availability

Well, the kitties have not seen another OPNG since the single one they got and munched yesterday. So, we'll try again............
Kitties haz OPNG, pleeze?
[Apr 8, 2021 10:31:04 AM]
sam6861
Advanced Cruncher
Joined: Mar 31, 2020
Post Count: 107
Re: Work unit availability

If the science does not need a replication of 2, then setting it to replication of 2 would be sending out busy work.
Replication of 2 allows results to be compared to verify that they match. With a replication of 1, some bad or broken data can randomly and silently get past and show up as "Valid", as there is no wingman computer to verify the result. GPU calculation errors are sometimes caused by overclocking, undervolting, automatic updates to Windows drivers, a blown capacitor, or just failing hardware.

Looking at some results with replication=2 and an invalid result, there appears to be a bug? When the data don't match, 2 more tasks are sent instead of 1 more task, then later 1 of them gets a server abort. I counted only work units with replication=2: of 50 OPNG work units total, 5 had an invalid result and were resent. Here are 3 of them.

https://www.worldcommunitygrid.org/ms/device/...s.do?workunitId=606928354
OPNG_0000133_00275
#  OS Type     Status        Sent Time        Return Time      Time  Claimed / Granted
0  Win10 Core  Invalid       4/6/21 18:32:30  4/6/21 18:43:32  0.05  72.8 / 72.8
1  Win10 Core  Error         4/6/21 18:32:32  4/7/21 16:39:07  0.00  72.8 / 0.0
2  Win10 Core  Valid         4/7/21 16:39:11  4/8/21 04:54:51  0.04  0.5 / 1,029.8
3  Win10 Pro   Valid         4/8/21 04:55:00  4/8/21 04:59:11  0.02  0.5 / 1,045.5
4  Win10 Core  Valid         4/8/21 04:55:02  4/8/21 05:03:24  0.05  0.3 / 1,029.8

https://www.worldcommunitygrid.org/ms/device/...s.do?workunitId=608243626
OPNG_0000350_00045
#  OS Type     Status        Sent Time        Return Time      Time  Claimed / Granted
0  Win10 Core  Invalid       4/7/21 09:57:42  4/7/21 10:47:47  0.03  0.3 / 0.3
1  Win10 Core  Valid         4/7/21 09:57:49  4/7/21 15:26:11  0.03  0.4 / 1,046.2
2  Win10 Core  Valid         4/7/21 15:26:24  4/7/21 15:49:14  0.03  0.4 / 963.4
3  Win10 Core  Server Abort  4/7/21 15:26:25  4/7/21 16:16:03  0.00  72.8 / 0.0

https://www.worldcommunitygrid.org/ms/device/...s.do?workunitId=607653744
OPNG_0000164_00083
#  OS Type          Status        Sent Time        Return Time      Time  Claimed / Granted
0  Linux Linuxmint  Invalid       4/7/21 01:29:27  4/7/21 10:54:52  0.02  0.7 / 0.7
1  Linux Ubuntu     Valid         4/7/21 01:29:52  4/7/21 03:13:44  0.02  0.5 / 856.0
2  Linux Debian     Server Abort  4/7/21 10:55:04  4/7/21 11:22:12  0.00  0.0 / 0.0
3  Linux Debian     Valid         4/7/21 10:56:06  4/7/21 11:19:32  0.03  0.7 / 763.4
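
The three listings above can be tallied with a short Python sketch. The row layout is an assumption inferred from the pasted lines (copy index, OS, status, sent time, return time, time, claimed / granted credit); the regular expression and the names used here are illustrative only, not a WCG export format.

```python
import re
from collections import Counter

# Rows copied from the three work units listed above.
ROWS = """\
0 Win10 Core Invalid 4/6/21 18:32:30 4/6/21 18:43:32 0.05 72.8 / 72.8
1 Win10 Core Error 4/6/21 18:32:32 4/7/21 16:39:07 0.00 72.8 / 0.0
2 Win10 Core Valid 4/7/21 16:39:11 4/8/21 04:54:51 0.04 0.5 / 1,029.8
3 Win10 Pro Valid 4/8/21 04:55:00 4/8/21 04:59:11 0.02 0.5 / 1,045.5
4 Win10 Core Valid 4/8/21 04:55:02 4/8/21 05:03:24 0.05 0.3 / 1,029.8
0 Win10 Core Invalid 4/7/21 09:57:42 4/7/21 10:47:47 0.03 0.3 / 0.3
1 Win10 Core Valid 4/7/21 09:57:49 4/7/21 15:26:11 0.03 0.4 / 1,046.2
2 Win10 Core Valid 4/7/21 15:26:24 4/7/21 15:49:14 0.03 0.4 / 963.4
3 Win10 Core Server Abort 4/7/21 15:26:25 4/7/21 16:16:03 0.00 72.8 / 0.0
0 Linux Linuxmint Invalid 4/7/21 01:29:27 4/7/21 10:54:52 0.02 0.7 / 0.7
1 Linux Ubuntu Valid 4/7/21 01:29:52 4/7/21 03:13:44 0.02 0.5 / 856.0
2 Linux Debian Server Abort 4/7/21 10:55:04 4/7/21 11:22:12 0.00 0.0 / 0.0
3 Linux Debian Valid 4/7/21 10:56:06 4/7/21 11:19:32 0.03 0.7 / 763.4
"""

# Assumed row layout: index, OS words, status, two timestamps, time, credit.
ROW_RE = re.compile(
    r"^(?P<index>\d+)\s+(?P<os>.+?)\s+"
    r"(?P<status>Invalid|Error|Valid|Server Abort)\s+"
    r"(?P<sent>\d+/\d+/\d+ \d+:\d+:\d+)\s+"
    r"(?P<returned>\d+/\d+/\d+ \d+:\d+:\d+)\s+"
    r"(?P<time>[\d.]+)\s+(?P<claimed>[\d.,]+)\s*/\s*(?P<granted>[\d.,]+)$"
)


def tally(text):
    """Count how many rows ended in each status."""
    counts = Counter()
    for line in text.splitlines():
        match = ROW_RE.match(line.strip())
        if match:
            counts[match.group("status")] += 1
    return counts


counts = tally(ROWS)
print(dict(counts))

# Share of the replication=2 work units counted above that had an invalid.
print(f"work units with an invalid result: {5 / 50:.0%}")
```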

[Apr 8, 2021 11:15:03 AM]
Dennis-TW
Cruncher
Joined: Apr 28, 2010
Post Count: 13
Re: Work unit availability

Replication of 2 allows results to be compared to verify that they match. With a replication of 1, some bad or broken data can randomly and silently get past and show up as "Valid", as there is no wingman computer to verify the result. GPU calculation errors are sometimes caused by overclocking, undervolting, automatic updates to Windows drivers, a blown capacitor, or just failing hardware.

Please check this statement from Uplinger with regard to validation.

FWIW, I think it is safer to trust the WCG Tech Team and the way they handle things than average users like us with no insight at all.
[Apr 8, 2021 11:50:50 AM]
sam6861
Advanced Cruncher
Joined: Mar 31, 2020
Post Count: 107
Re: Work unit availability

Please check this statement from Uplinger with regard to validation. FWIW, I think it is safer to trust the WCG Tech Team and the way they handle things than average users like us with no insight at all.
OK, so that partly explains having just replication=1.

But are those cases (a single invalid followed by a double resend, replication=3, and a server abort or a triple valid) a bug? This can go from "yay, got a few GPU tasks" to "no, server aborted". So far, an OPNG server abort hasn't happened on my computer.
[Apr 8, 2021 12:44:28 PM]
nanoprobe
Master Cruncher
Classified
Joined: Aug 29, 2008
Post Count: 2998
Re: Work unit availability

But are those cases (a single invalid followed by a double resend, replication=3, and a server abort or a triple valid) a bug?

It's not a bug. It's the way the server is set up. An invalid sends out 2 more copies of the task for comparison. The first one returned that matches is valid. There is no need for the second, so the server aborts it. In some cases both tasks have started before one is returned; in that case both resends can be valid. Hope this clears up the subject.
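
The behaviour described above can be sketched as a toy model in Python. This is only an illustration of the description in this post, not the actual server logic: the Resend class, the settle function and the timings are all made up, and the model assumes every resend computes a matching result and that a copy is aborted only if its host has not started it when the first match comes back.

```python
from dataclasses import dataclass


@dataclass
class Resend:
    name: str
    start: float   # hours after issue when the host begins the task
    finish: float  # hours after issue when the host would return it


def settle(resends):
    """Return a {name: status} map for the extra copies.

    Toy-model assumptions: every copy matches, and a copy is aborted
    only if it has not started by the time the first result returns.
    """
    first = min(resends, key=lambda r: r.finish)
    statuses = {}
    for r in resends:
        if r is first or r.start < first.finish:
            statuses[r.name] = "Valid"
        else:
            statuses[r.name] = "Server Abort"
    return statuses


# One copy returns before the other has started: the late one is aborted.
print(settle([Resend("_2", 0.1, 0.4), Resend("_3", 0.5, 0.9)]))

# Both copies are already running when the first returns: both can be valid.
print(settle([Resend("_2", 0.1, 0.4), Resend("_3", 0.2, 0.6)]))
```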
----------------------------------------
In 1969 I took an oath to defend and protect the U S Constitution against all enemies, both foreign and Domestic. There was no expiration date.


----------------------------------------
[Edit 3 times, last edit by nanoprobe at Apr 8, 2021 1:02:20 PM]
[Apr 8, 2021 12:49:52 PM]
sam6861
Advanced Cruncher
Joined: Mar 31, 2020
Post Count: 107
Re: Work unit availability

An invalid sends out 2 more copies of the task for comparison.
I guess OPNG is different from ARP1, where 1 invalid just sends 1 more copy.

Now on to work unit availability: I do get a few OPNG tasks, about 10 per day, by using report_results_immediately in app_config.xml while running CPU tasks, so that my client asks for more work more frequently.
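
The actual file is not shown in this post, so the following is only a guess at a minimal app_config.xml that turns the option on. It is wrapped in a small Python sketch that checks the XML is well formed; the placement of the element at the top level of the file, and the helper name, are assumptions.

```python
import xml.etree.ElementTree as ET

# Assumed minimal app_config.xml; the real file may contain more settings.
APP_CONFIG = """\
<app_config>
    <report_results_immediately/>
</app_config>
"""


def has_immediate_reporting(xml_text):
    """Check that the text parses and contains the reporting flag."""
    root = ET.fromstring(xml_text)
    return (
        root.tag == "app_config"
        and root.find("report_results_immediately") is not None
    )


print(has_immediate_reporting(APP_CONFIG))
```

The file goes in the project's folder inside the BOINC data directory, and the client has to re-read its config files (or be restarted) before the setting takes effect.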
----------------------------------------
[Edit 1 times, last edit by sam6861 at Apr 8, 2021 1:45:27 PM]
[Apr 8, 2021 1:45:04 PM]
spRocket
Senior Cruncher
Joined: Mar 25, 2020
Post Count: 274
Re: Work unit availability

Just noticed a few more OPNG units sailed by while I was asleep... not as many as yesterday. I'm glad to see more coming, even if they're few and far between.
[Apr 8, 2021 1:48:56 PM]