World Community Grid Forums
Thread Status: Active Total posts in this thread: 75
Author |
|
Cyclops
Senior Cruncher Joined: Jun 13, 2022 Post Count: 295 Status: Offline |
Considering recently reported inconsistent availability of work units across all WCG projects, we wanted to provide some background.
OPN1 & OPNG
As stated in our April OPN project update, the supply of OPN1 work units has run out as the OPN team re-focuses on their GPU work. 8,954 OPN1 work units remain in progress and an additional 1,380 work units are in error states that will be redistributed. OPNG batches 185219-187768 were accelerated at the request of the researchers, and the tail of work units from the accelerated batches should complete this week. The OPN team is extremely grateful for all of the support and assistance our community has given to their project. They are preparing more OPNG work units and we will make them available as soon as they are uploaded to our servers.
MCM1
There are 32.5 days of batches for MCM1 remaining based on 1,075 batches available and an estimated rate of 33/day (28 day average = 35, 7 day average = 31). We have a steady number of MCM1 work units in reserve but they seem to be slow in distribution. To counteract this, we increased the number of threads used to create MCM1 work on our server to increase the amount of work available. The likely cause of the slowdown is competition for CPU resources on our backend between the OPNG and MCM1 build jobs, and with more CPU resources devoted to MCM1, volunteers can now expect more MCM1 work units overall. However, we are investigating recent failure scenarios of the transitioner daemons, which sometimes cause interruptions to all work unit distribution.
SCC1
We observed a problem in the distribution of SCC1 work units, but it has since been solved and they should be flowing normally.
ARP1
We are unsure as to why ARP1 work units seem to be in short supply and we continue investigating the issue.
If you have any comments or questions, please leave them in this thread for us to answer.
WCG team |
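For readers who want to redo the MCM1 estimate from the post above, here is a minimal sketch of the arithmetic. The function name `days_of_work_remaining` is hypothetical and not part of any WCG or BOINC tooling; the inputs are the figures quoted in the post (1,075 batches, 33 batches/day, with the 28 day and 7 day averages of 35 and 31 as alternative rates).

```python
def days_of_work_remaining(batches_available: int, batches_per_day: float) -> float:
    """Return how many days the available batches last at a given daily rate.

    Hypothetical helper for illustration only; not a WCG or BOINC API.
    """
    if batches_per_day <= 0:
        raise ValueError("batches_per_day must be positive")
    return batches_available / batches_per_day


if __name__ == "__main__":
    batches = 1075  # batches available, as quoted in the post
    # Rates quoted in the post: estimate, 28 day average, 7 day average.
    for label, rate in [("estimated", 33), ("28 day average", 35), ("7 day average", 31)]:
        days = days_of_work_remaining(batches, rate)
        print(f"{label} rate of {rate}/day: about {days:.1f} days")
```

At the estimated rate of 33/day this gives roughly 32.6 days, in line with the 32.5 days stated in the post; the two averages bracket that figure at roughly 30.7 and 34.7 days.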
||
|
Sgt.Joe
Ace Cruncher USA Joined: Jul 4, 2006 Post Count: 7655 Status: Offline |
Thank you for the update. Remember, don't be shy about keeping the volunteer corps up to date on the back office operations. An informed volunteer tends to be a satisfied volunteer.
----------------------------------------
Cheers
Sgt. Joe
*Minnesota Crunchers* |
||
|
alanb1951
Veteran Cruncher Joined: Jan 20, 2006 Post Count: 945 Status: Offline |
Cyclops,
Thanks for the update, and for confirming that WCG will no longer have any work for ARM-based systems such as Raspberry Pi and other SBCs...
I understand why the OPN people are going to concentrate on GPU work, but it's a bit sad to see CPU-only users being told "No more..."
I'll continue watching in the hope that MCM1 might get ported [again] at some point, as was the case during the Beta test that started in July 2020 but didn't seem to result in a version change at the time (or an ARM application :-( ...)
Again, thanks for clarity...
Cheers - Al. |
||
|
hchc
Veteran Cruncher USA Joined: Aug 15, 2006 Post Count: 796 Status: Recently Active |
Thanks for the update. Have a great weekend.
----------------------------------------
|
||
|
Unixchick
Veteran Cruncher Joined: Apr 16, 2020 Post Count: 946 Status: Recently Active |
Thank you for the update! I'm so happy to have ARP WUs again. I hope they can keep them flowing.
|
||
|
uglyphilbert
Cruncher Joined: Mar 11, 2017 Post Count: 17 Status: Offline |
Thanks for the update, appreciate it.
SCC work units are dribbling in 1 or 2 at a time for me; I'll assume it's just building up slowly.
On a side note, I reinstalled Windows on one of my devices and WCG now recognises it as a new device. Is this normal? The device took a few weeks to appear without me contacting anyone about it, so maybe the new device issue has been solved. |
||
|
Sgt.Joe
Ace Cruncher USA Joined: Jul 4, 2006 Post Count: 7655 Status: Offline |
I did get 1 ARP. The SCC units are very slowly becoming available, dribbling in like molasses in January. MCM supply appears to be adequate. The transient HTTP errors are appearing haphazardly. Progress is happening albeit not nearly as quickly as we would like.
----------------------------------------
Cheers
Sgt. Joe
*Minnesota Crunchers* |
||
|
hchc
Veteran Cruncher USA Joined: Aug 15, 2006 Post Count: 796 Status: Recently Active |
Quoting uglyphilbert: "Thanks for the update, appreciate it. SCC work units are dribbling in 1 or 2 at a time for me, I'll assume it's just building up slowly. On a side note, I reinstalled windows on one of my devices and it now recognises it as a new device. Is this normal? The device took a few weeks to appear without me contacting anyone about it so maybe the new device issue has been solved."
There's documentation somewhere on the WCG website that describes how WCG deviates from standard BOINC to do "host" matching. In other words, instead of matching by hostID, it tries to match by the device's username. It's possible your Windows reinstall was named differently? Or perhaps the username and/or exact build version of Windows didn't match?
Kind of wish WCG would go back to standard BOINC, but a lot of custom development work was done under IBM WCG, and it would take a lot of dedicated resources to undo all that customization work. I have some duplicate device IDs that I would love to "merge," and some with zero results that I would like to delete.
|
||
|
hchc
Veteran Cruncher USA Joined: Aug 15, 2006 Post Count: 796 Status: Recently Active |
Would it be possible for a technical WCG employee to provide us a very technical explanation of the current WCG architecture? I'm curious about all server hardware used for work unit generation, processing, sending, receiving, and storage. Including any load balancers or proxies used, internal NICs used, internal network speeds, and external ISP throughputs.
The more detail the better, for those of us IT Professionals (or advanced enthusiasts). I'm just trying to understand the bottlenecks and the exact root cause of the transient HTTP errors for both uploading and downloading. Heck, a network diagram would be best + a description in writing. Ideally a WCG tech can post in the forum directly instead of relaying through a middleman.
----------------------------------------
[Edit 1 times, last edit by hchc at May 6, 2023 10:01:12 PM] |
||
|
SKEPTICINFORMED
Cruncher Joined: Jul 12, 2015 Post Count: 31 Status: Offline |
I'm using an iMac with an M1 chip and the latest OS. I'm finding that the vast majority of my MCM WUs do NOT complete: they reach between 1% and 1.4%, revert to 0%, then restart and stall in the same range over and over again until I abort the units. I've been told that this is only affecting a minority of WCG users. However, I hope it gets fixed ASAP, because I've contributed to MCM for many years and would love to be able to continue contributing to it. All non-MCM WUs are working normally.
|
||
|