World Community Grid Forums
Thread Status: Active | Total posts in this thread: 164
Former Member
Cruncher | Joined: May 22, 2018 | Post Count: 0 | Status: Offline
> Yeah, ARP1 really should have an active core cap. We're here to progress science, not slow it down, which is the consequence... a task sitting in queue for at least 30 hours... a task holding up the next step 30 hours. I feel this. I've got plenty of Zen+/Zen2 cores over here, ready to crank on this problem. I learned my lesson about oversubscribing machines with MIP1 (what up, cache thrashing), and I was part of the ARP beta, so I feel pretty comfortable with my settings and I know how long these WUs will run on my hardware. It's frustrating to be excited about a new, super important project, and then discover that you're not going to be able to help very much because all the WUs went to people who happened to be awake before you were on day 1.

I agree 100%. All the hype associated with this project, and then it is so restricted that one can barely participate. It looks like another HSTB project.

The memory requirements aren't that much different from FAHB or MIP1. The client's network bandwidth information is available to the server, so why not send more work to the clients that have the bandwidth to handle it? I was going to dedicate 64 EPYC cores and 128 GB of memory to this project, but it doesn't look like I will be able to get more than about 5 work units.

ClimatePrediction doesn't have this type of restriction, and its work units use the same amount of memory or more and have transfer files of almost 100 MB. What's the difference? If the file transfers cause issues for members, let the members restrict the number of WUs to fit their situation instead of a blanket restriction that penalizes everybody. I don't remember a restriction during beta testing...
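For anyone who would rather self-limit than wait for a server-side cap, BOINC's app_config.xml supports a per-application concurrency limit. A minimal sketch, assuming the ARP application's short name is "arp1" (verify the exact name in client_state.xml); it goes in the World Community Grid project directory and is picked up with "Read config files" in the BOINC Manager:

```xml
<!-- Minimal app_config.xml sketch. Assumption: the ARP short name is "arp1";
     check client_state.xml on your host for the real name. -->
<app_config>
    <app>
        <name>arp1</name>
        <max_concurrent>4</max_concurrent> <!-- run at most 4 ARP tasks at once -->
    </app>
</app_config>
```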
Former Member
Cruncher | Joined: May 22, 2018 | Post Count: 0 | Status: Offline
> WCG needs to fix that if they want this project to run quickly. Server abort should occur if ARP1 WU cannot be returned in 48 hours.

A task that has started is left alone to finish, even if it's overdue. If there were some kind of trickle signal back to the server, at least the server would not send out a wasted extra copy. At one point the client was smart enough to recognize "this will not finish in time" and cancel the task even before the deadline, but there are several capabilities that WCG has chosen not to employ.

I'm surprised the "trickle" method was not set up for this, like at CPDN: every checkpoint is uploaded, so if the task then crashes, another client can pick up and finish the remaining steps, and the one who did the initial steps still gets credit for their piece of the time. A trickle also serves to let the project know "it's alive and being worked on." Maybe the limited scale did not justify that effort, only the raw, blunt, easy-to-maintain model: crash at 99%, bye bye.
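Purely to illustrate the trickle idea described above, here is a conceptual sketch; it is not the actual BOINC, CPDN, or WCG code, and every function name in it is made up:

```python
# Conceptual sketch of a trickle-style heartbeat; all names are illustrative.
import time

def advance_one_timestep(state):
    return state + 1                             # stand-in for one model step

def write_checkpoint(state, step):
    print(f"checkpoint written at step {step}")  # stand-in for saving state to disk

def report_trickle(status):
    print(f"trickle-up: {status}")               # stand-in for a tiny status upload

def run_model(steps=100, checkpoint_every=10):
    state = 0
    for step in range(1, steps + 1):
        state = advance_one_timestep(state)
        if step % checkpoint_every == 0:
            write_checkpoint(state, step)
            # A few bytes per checkpoint tell the server the task is alive and how
            # far it has gotten, so it need not issue a wasted duplicate copy.
            report_trickle({"step": step, "cpu_seconds": round(time.process_time(), 1)})

if __name__ == "__main__":
    run_model()
```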
Ingleside
Veteran Cruncher | Norway | Joined: Nov 19, 2005 | Post Count: 974 | Status: Offline
> The client network bandwidth information is available to the server so why not send more work to the clients that have the network bandwidth to handle it?

Transferring small files gives a very low measured bandwidth in BOINC, meaning the actual bandwidth can be 10x-100x the bandwidth reported by BOINC. Basing distribution of work on such unreliable measurements is not a good idea.

"I make so many mistakes. But then just think of all the mistakes I don't make, although I might."
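A rough worked example of Ingleside's point (the numbers are invented for illustration, not WCG measurements): because every transfer pays a roughly fixed latency/setup cost, a tiny file makes a fast link look slow, while a large file comes close to the real rate.

```python
# Illustrative only: why throughput measured from small files understates a link.
def measured_mbps(file_mb, link_mbps, overhead_s=0.5):
    """Effective rate when each transfer pays a fixed setup/latency cost (assumed 0.5 s)."""
    seconds = (file_mb * 8) / link_mbps + overhead_s
    return (file_mb * 8) / seconds

print(measured_mbps(0.1, 100))   # ~1.6 Mbps measured from a 100 KB file on a 100 Mbps link
print(measured_mbps(100, 100))   # ~94 Mbps measured from a 100 MB file on the same link
```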
vaughan-AMD
Cruncher | Australia | Joined: Nov 19, 2004 | Post Count: 25 | Status: Offline
Aurum420: How do you get such a cheap electric bill? Mine is over $5000 a quarter.
Aurum
Master Cruncher | The Great Basin | Joined: Dec 24, 2017 | Post Count: 2386 | Status: Offline
vaughan-AMD, our electric rate is $0.07/kWh. Lucky for me, my wife never sees the utility bill.
l_mckeon
Senior Cruncher | Joined: Oct 20, 2007 | Post Count: 439 | Status: Offline
How much RAM is a single task actually consuming?
Is this another project that runs at vastly different speeds on Windows and Linux? |
hchc
Veteran Cruncher | USA | Joined: Aug 15, 2006 | Post Count: 802 | Status: Offline
I'm getting about this much per task:
- ~800 MB RAM consumption
- ~1-1.5 GB disk space
- ~85-90 MB upload

And yeah, much faster on Linux than Windows for me. My Ivy Bridge (3rd Gen) outperforms my Coffee Lake (8th Gen).
hchc
Veteran Cruncher | USA | Joined: Aug 15, 2006 | Post Count: 802 | Status: Offline
uplinger said:
> Also, we do get machines that are set to run results from other projects, but those hosts have already been limited to 1 result per day... They start with, say, 5 results per day, but once they return errors, it drops down to 1.

What I meant was: since this project is opt-in, the owner of all those enterprise devices deliberately opted into it, knowing that their fleet of devices on 7.2.47 (from 2014) would error out.
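For context, the daily limit uplinger describes behaves roughly like an adaptive quota: errors shrink a host's allowance toward 1 result per day, and valid results grow it back. A simplified sketch of that idea only; this is not WCG's actual scheduler code, and the starting value of 5 and the doubling/halving steps are assumptions:

```python
# Simplified sketch of an adaptive per-host daily quota; not the real scheduler code.
def update_daily_quota(quota, result_ok, cap=5):
    if result_ok:
        return min(cap, quota * 2)   # valid results let the host recover quickly
    return max(1, quota // 2)        # erroring hosts are throttled toward 1/day

quota = 5
for ok in (False, False, True, True):
    quota = update_daily_quota(quota, ok)
    print(quota)                     # prints 2, 1, 2, 4
```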
catchercradle
Advanced Cruncher | Joined: Jan 16, 2009 | Post Count: 128 | Status: Offline
No great surprise there for those of us who crunch for CPDN. A GPU's strength is in doing lots of computations simultaneously, but the nature of weather tasks is that each computation is based on the results of the previous one, so there is little if anything to be gained by running on a GPU. The same logic means they wouldn't scale well as multi-core tasks.
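A toy illustration of that dependency (not the actual model code): every timestep consumes the state produced by the previous one, so the steps themselves cannot be farmed out in parallel.

```python
# Toy illustration of a time-stepping model's sequential dependency; not real model code.
def step(state, dt=0.1):
    # stand-in for one model timestep; a real model solves the physics here
    return [x + dt * x for x in state]

state = [1.0, 2.0, 3.0]   # stand-in for the model's grid values
for n in range(1000):
    # strictly sequential: step n cannot start until step n-1 has produced its state
    state = step(state)
```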
Eric_Kaiser
Veteran Cruncher | Germany (Hessen) | Joined: May 7, 2013 | Post Count: 1047 | Status: Offline
> WCG needs to fix that if they want this project to run quickly. Server abort should occur if ARP1 WU cannot be returned in 48 hours.

I don't second that. I have ARP1 tasks that have now been running for 44 hours and are still not finished. If the server aborts them, the energy and resources already spent are wasted. In the end, I am not going to support projects that waste resources.