Total posts in this thread: 30
Cyclops
Senior Cruncher
Joined: Jun 13, 2022
Post Count: 295
2022-11-04 Update (ARP units & Device Manager issues)

Hi everyone,

Since testing and system updates have resulted in a steady flow of workunits, we may be able to start expanding the projects. As reported earlier, the SCC and HSTB projects are busy with validation and preparing for their restart. We are happy to report that the ARP project is finalizing its storage and network setup to enable a restart. We will provide a more detailed account of the situation directly from the ARP team soon.

On the backend side, we have been addressing a device manager issue some volunteers have run into. Due to a communication error between our BOINC and website databases, some devices are listed in a volunteer’s Results list while being absent from their Devices. We’ve added this to our Comprehensive Bug List and discussed it in this forum thread.

Please leave any questions or proposals in this thread instead of making a new thread. Thank you for your support, patience and understanding.

WCG team at Krembil Research Institute

(Edit) A brief addendum on the ARP workunits from the tech team:
The ARP1 team is in the middle of a large-scale backup of existing results to tape and, as a result, has not been able to download additional results from our servers. There is a "maximum unsent results" threshold in our ARP1 workunit-management system that prevents the system from downloading more work if too many unsent results accumulate on our side. Unsent results piled up past that threshold, preventing new downloads of ARP1 work. WCG systems ordinarily keep enough ARP1 work in reserve to last 5 days, but our BOINC server has since consumed it all, distributing it to WCG members' devices. Following a discussion with the ARP1 team, today we increased that threshold enough to allow WCG servers to download some new work, which started flowing out to members earlier today.
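
For readers who want a concrete picture of the gate described above, here is a minimal Python sketch. It is only an illustration and not the WCG or ARP1 implementation: the class name `WorkunitFeed`, its fields, and the numbers are all hypothetical, since the post gives neither the actual threshold nor the size of the backlog.

```python
from dataclasses import dataclass


@dataclass
class WorkunitFeed:
    """Hypothetical model of a 'maximum unsent results' gate (not WCG code)."""

    max_unsent_results: int  # threshold; the real value is not stated in the post
    unsent_results: int = 0  # finished results not yet collected by the research team

    def can_download_work(self) -> bool:
        # New work is downloaded only while the backlog has not passed the threshold.
        return self.unsent_results <= self.max_unsent_results


# Illustrative numbers only.
feed = WorkunitFeed(max_unsent_results=1000, unsent_results=1500)
blocked_before = not feed.can_download_work()  # backlog is past the threshold

feed.max_unsent_results = 2000  # raising the threshold, as the tech team describes
allowed_after = feed.can_download_work()
```

In this sketch, raising the threshold unblocks downloads without shrinking the backlog, which matches the situation described: the unsent results remain until the tape backup finishes, but new work can flow again.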
----------------------------------------
[Edit 1 times, last edit by Cyclops at Nov 4, 2022 5:51:47 PM]
[Nov 4, 2022 3:13:45 PM]
Kirel2
Advanced Cruncher
United States
Joined: Sep 24, 2014
Post Count: 99
Re: 2022-11-04 Update (ARP units & Device Manager issues)

Thanks for the update!
----------------------------------------

[Nov 4, 2022 3:41:42 PM]
TPCBF
Master Cruncher
USA
Joined: Jan 2, 2011
Post Count: 1951
Re: 2022-11-04 Update (ARP units & Device Manager issues)

Thanks for the update. I would still hope that they come more often, especially when you guys are testing something, so it is easier to give more qualified feedback from our end.

That said, the flow of WUs doesn't seem to me to be THAT steady, in particular OPNG seems to come more in small squirts, though I am not sure if this is due to some pointwh***** immediately pouncing on any available GPU work.

I also occasionally see some upload errors, where the BOINC client sits for more than a minute in "upload active" before going into retry mode. A manual retry, however, generally seems to succeed, though it is still a bit slow considering the rather small file size...

EDIT: Just noticed about a dozen MCM1 WUs being stuck on download, but they all went through on selecting [Retry now], once...

Ralf
----------------------------------------
[Edit 1 times, last edit by TPCBF at Nov 4, 2022 5:02:33 PM]
[Nov 4, 2022 3:57:12 PM]
mdxi
Advanced Cruncher
Joined: Dec 6, 2017
Post Count: 109
Re: 2022-11-04 Update (ARP units & Device Manager issues)

TPCBF wrote:
"the flow of WUs doesn't seem to me to be THAT steady"

As a contrasting datapoint, today is the 8th straight day that all my machines have had full queues while crunching WUs nonstop.

That said, I'm not currently crunching on GPU for WCG, so I'm only getting OPN1 and MCM1 WUs. But there seem to be plenty of those.

Very glad to hear that SCC is coming back!
----------------------------------------
[Edit 1 times, last edit by mdxi at Nov 4, 2022 5:55:01 PM]
[Nov 4, 2022 5:54:29 PM]
TPCBF
Master Cruncher
USA
Joined: Jan 2, 2011
Post Count: 1951
Re: 2022-11-04 Update (ARP units & Device Manager issues)

Well, whadda you know!

Just in time for the weekend, OPNG download errors are back.

Just came back from a meeting and noticed that my programming laptop has a new OPNG WU crunching. I checked what came in and noticed that out of a batch of 20 new jobs, only 1 came in to the point that it could be worked on; the other 95 or so files are stuck in download retries again. And unlike those few MCM1 retries I noticed earlier this morning, the current ones seem to need more than one kick in the pants to go through... :(

Ralf
----------------------------------------

[Nov 4, 2022 6:17:30 PM]
PMH_UK
Veteran Cruncher
UK
Joined: Apr 26, 2007
Post Count: 772
Re: 2022-11-04 Update (ARP units & Device Manager issues)

Download issues ramped up when ARP units were made available again.

Website also unavailable for a while.

Paul.
----------------------------------------
[Nov 4, 2022 7:43:41 PM]
TPCBF
Master Cruncher
USA
Joined: Jan 2, 2011
Post Count: 1951
Re: 2022-11-04 Update (ARP units & Device Manager issues)

Not seeing any ARP1 WUs, but this crash was kind of announcing itself.

First occasional upload errors, followed by occasional MCM1 download errors.
Then more substantial OPNG download errors, followed in short order by a complete web site/forum outage...

Ralf :'(
----------------------------------------

[Nov 4, 2022 8:01:04 PM]
Jean-David Beyer
Senior Cruncher
USA
Joined: Oct 2, 2007
Post Count: 337
Re: 2022-11-04 Update (ARP units & Device Manager issues)

TPCBF wrote:
"Not seeing any ARP1 WUs, but this crash was kind of announcing itself.

First occasional upload errors, followed by occasional MCM1 download errors.
Then more substantial OPNG download errors, followed in short order by a complete web site/forum outage..."


I got a big bunch of downloads early this afternoon: 4 ARP1, 4 OPN1, and 4 MCM1 work units. They all had trouble downloading, but I did a lot of retries. Six of them are now running on my machine.

(I also got 24 Rosetta work units.)

(As usual, no ClimatePrediction work units.)
----------------------------------------

[Nov 4, 2022 8:37:05 PM]
Gretar
Cruncher
Iceland
Joined: Dec 28, 2008
Post Count: 23
Re: 2022-11-04 Update (ARP units & Device Manager issues)

Thanks for the update.
[Nov 4, 2022 9:37:38 PM]
kraftcore
Cruncher
Sverige
Joined: Jan 22, 2016
Post Count: 10
Re: 2022-11-04 Update (ARP units & Device Manager issues)

It's starting to feel like a tradition to get HTTP errors when the weekend starts, haha!

Got one ARP WU, and all OPN WUs are resends, even a third resend?
----------------------------------------

[Nov 4, 2022 9:46:11 PM]