Thread Status: Active
Total posts in this thread: 214
This topic has been viewed 84429 times and has 213 replies.
alanb1951
Veteran Cruncher
Joined: Jan 20, 2006
Post Count: 952
Status: Offline
Re: 2022-09-15 Update (Networking & Workunits)

Sgt. Joe, you say that there can be multiple servers. I also said this in the post you were replying to, and none of my logic in that post depended on there being just 1 server. However, regarding caching, I now mention that there probably aren't many download servers because it might be relevant??

I just presumed 1 server in your analysis since you used the singular, but if I misinterpreted what you meant, I apologize. If the default setup for BOINC is only one download server, that may be what they have. Until the techs let us know, I would just be guessing. It would be interesting to know the topology of their setup.
Cheers
Apparently they are using two upload/download servers behind a load-balancer -- WCG technician cubes mentioned it in this post in this thread about 3 weeks ago -- I read that as meaning two systems, each running both types of service.

For what it's worth, just about any of the core BOINC system programs can be configured to run multiple copies that deal with different subsets of work units. So it would, indeed, be interesting to see some sort of reasonably detailed WCG topology map (or description thereof...)

Cheers - Al.
[Oct 15, 2022 8:32:52 PM]
Ingleside
Veteran Cruncher
Norway
Joined: Nov 19, 2005
Post Count: 974
Status: Offline
Re: 2022-09-15 Update (Networking & Workunits)

the basic issue is that no other project runs different sub-project concurrently like WCG does.

Well, Einstein@home has 5 active sub-projects, Primegrid has 25 sub-projects, and LHC@home has 6 sub-projects, but granted only 3 seem to have any work. Even CPDN has 10 sub-projects, but granted CPDN is often out of any kind of work. I've not checked how many sub-projects the other BOINC projects have.

Meaning, having multiple active sub-projects isn't unique to WCG.
----------------------------------------


"I make so many mistakes. But then just think of all the mistakes I don't make, although I might."
[Oct 15, 2022 10:33:00 PM]
TPCBF
Master Cruncher
USA
Joined: Jan 2, 2011
Post Count: 1950
Status: Offline
Re: 2022-09-15 Update (Networking & Workunits)

the basic issue is that no other project runs different sub-project concurrently like WCG does.

Well, Einstein@home has 5 active sub-projects, Primegrid has 25 sub-projects, and LHC@home has 6 sub-projects, but granted only 3 seem to have any work. Even CPDN has 10 sub-projects, but granted CPDN is often out of any kind of work. I've not checked how many sub-projects the other BOINC projects have.

Meaning, having multiple active sub-projects isn't unique to WCG.
Well, for all I know, it is. I am crunching Einstein@Home myself, that is my "fallback" project at this time. And while it has different "applications", it does not really treat those with separate stats like WCG does.
I am not interested in PrimeGrid and just had a quick look at it, but at least from a cursory overview, without an account, it is the same there, so they seem to keep count of the WUs sent out. But checking the numbers of those on their home page, this is a fraction of what WCG is running in a day...

Ralf
[Oct 16, 2022 1:00:30 AM]
Ingleside
Veteran Cruncher
Norway
Joined: Nov 19, 2005
Post Count: 974
Status: Offline
Re: 2022-09-15 Update (Networking & Workunits)

A quick look at Primegrid's forums does reveal that users have many different types of badges, many of which show how much credit users have earned in sub-projects.

On a user's own account page there are per-sub-project counts of tasks done and credit, and depending on the sub-project these can include how many primes or factors were found, or something similar.

Since stats only use the database + web server, whether or not a project has sub-project stats is completely irrelevant when it comes to the ability of separate download server(s) to transfer files. Having multiple sub-projects does matter when it comes to tweaking scheduling server(s) and how best to configure download server(s).
----------------------------------------


"I make so many mistakes. But then just think of all the mistakes I don't make, although I might."
[Oct 16, 2022 2:13:45 AM]
cappucino
Cruncher
Joined: May 5, 2007
Post Count: 17
Status: Offline
Re: 2022-09-15 Update (Networking & Workunits)

Seems that Rosetta only has 1 web server. Also, some BOINC projects like Universe@home only have one server that does everything for all of their applications, which is the default BOINC-server configuration. Would useful caching occur in such a single-server configuration?
I don't know U@H, so I can't say much about it. I just know that there are quite a few "home brew" projects that work with minimal resources. But then their projects are also kind of limited, at least as far as the number of participants and scope of project are concerned.

And there is no problem with just having 1 web server; after all, it is just the web site and possibly the forum (I think the latter MIGHT be on a separate server (instance)). And the number of web servers is certainly NOT an issue regarding the problems we experience here with WCG. That is kind of the "easy" part...
I think we disagree on the definition of symptom, but that's just semantics, so not super important. Aren't symptoms usually the observable things like a sore throat and error messages? And the root causes *can* be things that are not readily apparent like the flu or server configurations?
Yes, symptoms are those things that you can observe, and there can be multiple reasons that cause the same (or very similar) symptoms. For your example of a sore throat, possible reasons can be a cold/flu, talking for hours or eating seriously spicy food.


For the record, most food that I eat on a daily basis is seriously spicy and I never have an issue with a sore throat. As a matter of fact, I'm having a bowl of chili right now that I made with 1 "ghost" pepper and 1 Carolina Reaper pepper that I grew myself.
Perhaps you're thinking of smoking crack and getting a sore throat?

Yes, I have a twisted sense of humor. :D

As far as WCG adding a server status page, I'd love to see one done. It's something I've found useful to just go to and quickly find out if there's something wrong with the grid or with the volume of WUs that are available. The other three grids I'm involved with while WCG gets back on the good foot like James Brown (Rosetta, DENIS and TN-GRID) all have them.
[Oct 17, 2022 3:04:02 AM]
erich56
Senior Cruncher
Austria
Joined: Feb 24, 2007
Post Count: 295
Status: Offline
Re: 2022-09-15 Update (Networking & Workunits)

For the past few hours, the HTTP transient error has been back :(

What a shame: downloads were going very well for several days, and now they get stuck again.

What's going on there, why did they again fiddle around with something that was working well ???
[Oct 17, 2022 6:21:33 PM]
Kirel2
Advanced Cruncher
United States
Joined: Sep 24, 2014
Post Count: 99
Status: Offline
Re: 2022-09-15 Update (Networking & Workunits)

Probably because they're trying to expand the number or types of WUs going out. I'd guess it's a scaling issue.
[Oct 17, 2022 7:10:52 PM]
cz50975
Advanced Cruncher
Joined: Dec 9, 2004
Post Count: 95
Status: Offline
Re: 2022-09-15 Update (Networking & Workunits)

After several days with OPN1 WUs only and without any HTTP transient error we have this problem back. Approximately from 17:10 UTC.
It may be due to MCM1 tasks, which I can't download even with several <Retry Now> attempts. The system is not requesting any other WUs due to "Not requesting tasks: some download is stalled".
[Oct 17, 2022 8:04:47 PM]
phillipspencer
Advanced Cruncher
France
Joined: Apr 9, 2015
Post Count: 71
Status: Offline
Re: 2022-09-15 Update (Networking & Workunits)

I have not seen any MCM Work Units recently but as others have noted, am now having transient errors again on OPN1. Such a shame after having been fine for a while.
[Oct 17, 2022 8:45:11 PM]
TPCBF
Master Cruncher
USA
Joined: Jan 2, 2011
Post Count: 1950
Status: Offline
Re: 2022-09-15 Update (Networking & Workunits)

Yup, after getting some brief UPLOAD errors with OPN1 WUs over the weekend, while I was out of the office for an hour, the old song and dance of the stuck OPNG and OPN1 WUs is back in force...

I really need to find one of those patience shops that Grumpy Swede was talking about... :'(

Ralf :(

PS: Indeed, after checking some other hosts within my reach, I did in fact get 20 MCM1 WUs among 8 hosts. But any stuck WU of those was able to get successful retries, and that large 102MB .txt file downloaded just fine on all the hosts I could check. I doubt that this is the source of the problem, but it would be really nice if we could have some dialog about all this with a tech at WCG...

PS2: Now I am getting more OPNG and MCM1 WUs, but 99% of the files are getting stuck again.

WCG, enjoy it while it lasts... :'(
[Edit 4 times, last edit by TPCBF at Oct 18, 2022 12:29:21 AM]
[Oct 17, 2022 11:25:09 PM]