Thread Status: Active
Thread Type: Sticky Thread
Total posts in this thread: 75
Cyclops
Senior Cruncher
Joined: Jun 13, 2022
Post Count: 295
Status: Offline
May 2023 workunit update

In light of recent reports of inconsistent work unit availability across all WCG projects, we want to provide some background.

OPN1 & OPNG
As stated in our April OPN project update, the supply of OPN1 work units has run out as the OPN team refocuses on their GPU work. 8,954 OPN1 work units remain in progress, and an additional 1,380 work units are in error states and will be redistributed. OPNG batches 185219-187768 were accelerated at the request of the researchers, and the tail of work units from the accelerated batches should complete this week. The OPN team is extremely grateful for all of the support and assistance our community has given to their project. They are preparing more OPNG work units, and we will make them available as soon as they are uploaded to our servers.

MCM1
About 32.5 days of MCM1 batches remain, based on 1,075 available batches and an estimated rate of 33 batches/day (28-day average = 35, 7-day average = 31).
We have a steady number of MCM1 work units in reserve, but they seem to be slow to distribute. To counteract this, we increased the number of threads used to create MCM1 work on our server, which increases the amount of work available. The likely cause of the slowdown is competition for CPU resources on our backend between the OPNG and MCM1 build jobs. With more CPU resources devoted to MCM1, volunteers can now expect more MCM1 work units overall. However, we are still investigating recent failures of the transitioner daemons, which sometimes interrupt all work unit distribution.
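
For anyone who wants to check the MCM1 estimate, here is a minimal sketch of the arithmetic in Python. The function name and the idea that the 33/day figure is the midpoint of the 28-day and 7-day averages are assumptions made for illustration; only the batch count and the three rates come from the figures above.

```python
def days_of_work_remaining(batches_available: int, batches_per_day: float) -> float:
    """Return how many days the available batches last at a given daily rate."""
    if batches_per_day <= 0:
        raise ValueError("batches_per_day must be positive")
    return batches_available / batches_per_day


# Figures quoted in the update.
BATCHES_AVAILABLE = 1075
RATE_28_DAY_AVG = 35
RATE_7_DAY_AVG = 31

# Assumption: the quoted 33/day estimate is the midpoint of the two averages.
estimated_rate = (RATE_28_DAY_AVG + RATE_7_DAY_AVG) / 2

for label, rate in [
    ("estimated rate", estimated_rate),
    ("28-day average", RATE_28_DAY_AVG),
    ("7-day average", RATE_7_DAY_AVG),
]:
    days = days_of_work_remaining(BATCHES_AVAILABLE, rate)
    print(f"{label}: {rate:g} batches/day -> {days:.1f} days remaining")
```

At 33 batches/day this comes out between 32.5 and 32.6 days, and the two averages bracket the estimate at roughly 31 to 35 days.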

SCC1
We observed a problem in the distribution of SCC1 work units, but it has since been resolved and they should be flowing normally.

ARP1
We do not yet know why ARP1 work units seem to be in short supply, and we are continuing to investigate the issue.

If you have any comments or questions, please leave them in this thread for us to answer.

WCG team
[May 5, 2023 10:33:17 PM]
Sgt.Joe
Ace Cruncher
USA
Joined: Jul 4, 2006
Post Count: 7237
Status: Recently Active
Re: May 2023 workunit update

Thank you for the update. Remember, don't be shy about keeping the volunteer corps up to date on the back office operations. An informed volunteer tends to be a satisfied volunteer.
Cheers
----------------------------------------
Sgt. Joe
*Minnesota Crunchers*
[May 5, 2023 11:51:26 PM]
alanb1951
Veteran Cruncher
Joined: Jan 20, 2006
Post Count: 738
Status: Offline
Re: May 2023 workunit update

Cyclops,

Thanks for the update, and for confirming that WCG will no longer have any work for ARM-based systems such as Raspberry Pi and other SBCs... I understand why the OPN people are going to concentrate on GPU work, but it's a bit sad to see CPU-only users being told "No more..."

I'll continue watching in the hope that MCM1 might get ported [again] at some point, as was the case during the Beta test that started in July 2020 but didn't seem to result in a version change at the time (or an ARM application :-( ...)

Again, thanks for the clarity...

Cheers - Al.
[May 6, 2023 12:20:37 AM]
hchc
Veteran Cruncher
USA
Joined: Aug 15, 2006
Post Count: 735
Status: Offline
Re: May 2023 workunit update

Thanks for the update. Have a great weekend.
----------------------------------------
  • i3-8100 (Coffee Lake, 4C/4T) @ 3.6 GHz
  • i5-4590 (Haswell, 4C/4T) @ 3.3 GHz
  • E5800 (Wolfdale, 2C/2T) @ 3.2 GHz

[May 6, 2023 12:42:14 AM]
Unixchick
Veteran Cruncher
Joined: Apr 16, 2020
Post Count: 742
Status: Offline
Re: May 2023 workunit update

Thank you for the update! I'm so happy to have ARP WUs again. I hope they can keep them flowing.
[May 6, 2023 1:10:12 AM]
uglyphilbert
Cruncher
Joined: Mar 11, 2017
Post Count: 9
Status: Offline
Re: May 2023 workunit update

Thanks for the update, appreciate it.

SCC work units are dribbling in 1 or 2 at a time for me; I'll assume the supply is just building up slowly.

On a side note, I reinstalled Windows on one of my devices and WCG now recognises it as a new device. Is this normal? The device took a few weeks to appear without me contacting anyone about it, so maybe the new device issue has been solved.
[May 6, 2023 4:25:39 AM]
Sgt.Joe
Ace Cruncher
USA
Joined: Jul 4, 2006
Post Count: 7237
Status: Recently Active
Re: May 2023 workunit update

I did get 1 ARP. The SCC units are very slowly becoming available, dribbling in like molasses in January. MCM supply appears to be adequate. The transient HTTP errors are appearing haphazardly. Progress is happening, albeit not nearly as quickly as we would like.
Cheers
----------------------------------------
Sgt. Joe
*Minnesota Crunchers*
[May 6, 2023 12:52:52 PM]
hchc
Veteran Cruncher
USA
Joined: Aug 15, 2006
Post Count: 735
Status: Offline
Re: May 2023 workunit update

uglyphilbert wrote:
"Thanks for the update, appreciate it.

SCC work units are dribbling in 1 or 2 at a time for me; I'll assume the supply is just building up slowly.

On a side note, I reinstalled Windows on one of my devices and WCG now recognises it as a new device. Is this normal? The device took a few weeks to appear without me contacting anyone about it, so maybe the new device issue has been solved."

There's documentation somewhere on the WCG website that describes how WCG deviates from standard BOINC to do "host" matching. In other words, instead of matching by hostID, it tries to match by the device's username. It's possible your Windows reinstall was named differently? Or perhaps the username and/or exact build version of Windows didn't match?

Kind of wish WCG would go back to standard BOINC, but a lot of custom development work was done under IBM WCG, and it would take a lot of dedicated resources to undo all that customization work.

I have some duplicate device IDs that I would love to "merge," and some with zero results that I would like to delete.
----------------------------------------
  • i3-8100 (Coffee Lake, 4C/4T) @ 3.6 GHz
  • i5-4590 (Haswell, 4C/4T) @ 3.3 GHz
  • E5800 (Wolfdale, 2C/2T) @ 3.2 GHz

[May 6, 2023 9:52:17 PM]
hchc
Veteran Cruncher
USA
Joined: Aug 15, 2006
Post Count: 735
Status: Offline
Re: May 2023 workunit update

Would it be possible for a technical WCG employee to provide us with a very technical explanation of the current WCG architecture? I'm curious about all the server hardware used for work unit generation, processing, sending, receiving, and storage, including any load balancers or proxies, internal NICs, internal network speeds, and external ISP throughput.

The more detail the better, for those of us who are IT professionals (or advanced enthusiasts). I'm just trying to understand the bottlenecks and the exact root cause of the transient HTTP errors for both uploading and downloading.

Heck, a network diagram would be best + a description in writing.

Ideally a WCG tech can post in the forum directly instead of relaying through a middleman.
----------------------------------------
  • i3-8100 (Coffee Lake, 4C/4T) @ 3.6 GHz
  • i5-4590 (Haswell, 4C/4T) @ 3.3 GHz
  • E5800 (Wolfdale, 2C/2T) @ 3.2 GHz

----------------------------------------
[Edit 1 times, last edit by hchc at May 6, 2023 10:01:12 PM]
[May 6, 2023 9:59:54 PM]
SKEPTICINFORMED
Cruncher
Joined: Jul 12, 2015
Post Count: 31
Status: Offline
Re: May 2023 workunit update

I'm using an iMac with an M1 chip and the latest OS. I'm finding that the vast majority of my MCM WUs do NOT complete. They revert to 0% after reaching between 1% and 1.4%, then restart and stop within the same range over and over again until I abort the units. I've been told that this is only affecting a minority of WCG users. However, I hope it gets fixed ASAP, because I've contributed to MCM for many years and would love to be able to continue contributing to it. All non-MCM WUs are working normally.
[May 6, 2023 10:04:56 PM]