Total posts in this thread: 74
This topic has been viewed 579803 times and has 73 replies
MJH333
Senior Cruncher
England
Joined: Apr 3, 2021
Post Count: 268
Re: 2023-07-31 Update (MCM1 issue resolved)

So does anybody know where that irksome 1000 job limit is exactly in the source?
thunder7,
I believe it is part of the BOINC client code, which you could potentially get round by compiling your own client. See this thread on the BOINC forum.
See also this FAQ on the WCG help pages, which indicates that there is a (server-side?) limit per core.
Cheers,
Mark
[Aug 8, 2023 8:10:24 AM]
hchc
Veteran Cruncher
USA
Joined: Aug 15, 2006
Post Count: 812
Re: 2023-07-31 Update (MCM1 issue resolved)

thunder7 said:

So does anybody know where that irksome 1000 job limit is exactly in the source? My larger machine just can't download enough to stay active over the (all too frequent) bumps in the road here. The last 55 jobs being crunched by 88 CPUs all show a report deadline of August the 11th, 23:45, so it should be possible to download more and still return them on time. I feel I'm at a disadvantage with one big machine compared to many smaller ones, each downloading 1000 jobs.

There are MAX_WU_RESULTS (which is at 100?), SELECT_LIMIT, QUERY_LIMIT, MAX_JOBS, and WF_MAX_RUNNABLE_JOBS, to name a few.


I opened an issue on the official BOINC GitHub repository back in 2019. This might help, but I haven't thought about this in a few years.

Improve logic behind WF_MAX_RUNNABLE_JOBS = 1000

There's also the official BOINC Message boards that are fairly active, and a lot of the core volunteer developers/founders are there too.

To answer your question, I believe it's WF_MAX_RUNNABLE_JOBS, which must be a constant hard-coded to 1000 somewhere, so changing that constant and building your own client might be a workaround. It's not pretty, but neither is the original limit. [Edit: To be honest, I'm not 100% sure that's the constant that needs to be changed, so let us all know if you wanna build your own BOINC :) ]
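
For anyone weighing up thunder7's point, here is a rough back-of-the-envelope sketch (not BOINC code) of how long a full queue lasts when the number of jobs per host is capped. The 1000-job cap and the 88-thread host come from the posts above; the 2-hour job runtime, the 8-thread comparison host, and the function name are assumptions made up purely for illustration.

```python
def hours_until_dry(job_cap: int, threads: int, hours_per_job: float) -> float:
    """Estimate how long a full queue of `job_cap` jobs keeps `threads` busy.

    Simplifying assumptions: every job takes the same time, and each thread
    runs exactly one job at a time.
    """
    if job_cap < 0 or threads <= 0 or hours_per_job < 0:
        raise ValueError("job_cap and hours_per_job must be >= 0, threads > 0")
    return job_cap * hours_per_job / threads


JOB_CAP = 1000       # per-host limit discussed in this thread
HOURS_PER_JOB = 2.0  # assumed runtime, purely illustrative

# One big host versus one (hypothetical) small host, same cap for both.
big_host = hours_until_dry(JOB_CAP, threads=88, hours_per_job=HOURS_PER_JOB)
small_host = hours_until_dry(JOB_CAP, threads=8, hours_per_job=HOURS_PER_JOB)

print(f"88-thread host: {big_host:.1f} h of work in a full queue")
print(f" 8-thread host: {small_host:.1f} h of work in a full queue")
```

Under these assumptions the 88-thread host holds less than a day of work while the 8-thread host holds more than ten days, so a fixed per-host cap shrinks the buffer in proportion to the thread count.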
----------------------------------------
  • i5-7500 (Kaby Lake, 4C/4T) @ 3.4 GHz
  • i5-4590 (Haswell, 4C/4T) @ 3.3 GHz
  • i5-3570 (Ivy Bridge, 4C/4T) @ 3.4 GHz

----------------------------------------
[Edit 1 times, last edit by hchc at Aug 8, 2023 8:12:44 AM]
[Aug 8, 2023 8:10:45 AM]
KerSamson
Master Cruncher
Switzerland
Joined: Jan 29, 2007
Post Count: 1677
Re: 2023-07-31 Update (MCM1 issue resolved)

Hi thunder7,
Even with 6-, 8-, or 16-thread machines and a 2-day "large" buffer, my machines regularly run dry.
A couple of years ago, I planned to replace my oldest machines with more powerful ones, and then came the migration to Krembil.
Because of the poor motivation shown by the Krembil team, I have not invested anything so far. Why should I, if there are no WUs to crunch?
Just keeping machines online "just in case" is simply a waste of energy.
Krembil's behaviour is totally disrespectful to the active WCG members. At the end of the month, we have to pay the bill, and the efficiency ratio (kW vs. work done) since the move from IBM is definitely catastrophic.
My concern is that many people have left. On my side, during the "best years" my contribution to my team was about 3%. Last week, even with my old, poorly performing machines, it was about 30%. In other words, many people have left and the available computational power has shrunk dramatically.
This situation is also the reason I strongly limit my participation in the forum; it is simply demotivating and frustrating.
Cheers,
Yves
----------------------------------------
[Aug 8, 2023 8:22:05 AM]
James C. Owens
Cruncher
Joined: Sep 9, 2006
Post Count: 2
Re: 2023-07-31 Update (MCM1 issue resolved)

Hi. This is actually quite important. Those stats exports are used by stats aggregation sites such as stats.free-dc.org and boincstats.com, as well as Gridcoin's stats collection. You should get an ETA.

Hi Aperture_Science_Innovators,

It seems that this is one of the interfaces that is not yet functioning as it was before the outage. Our data centre contact is investigating why the interface is not working so that the upload/download server can serve files from the filesystem where the stats reside. Unfortunately we are unable to provide a time frame at this point.

----------------------------------------
[Edit 1 times, last edit by James C. Owens at Aug 8, 2023 12:34:14 PM]
[Aug 8, 2023 12:33:06 PM]
bluestang
Senior Cruncher
USA
Joined: Oct 1, 2010
Post Count: 272
Re: 2023-07-31 Update (MCM1 issue resolved)

I am not a big fan of the outages, but not all of them are Krembil's fault. At least one has been the data center. I do agree they probably bit off more than they can chew, but at least they are giving it a try. The alternative was probably to just shut down the project. I would suspect the search is continuing for partner(s) to bolster their workforce, expertise, and reliability.
Cheers
Sorry Sarge, but here is already where the basic problem starts.

In his one and only reply after the hardware crash in Feb/March, Dr. Jurisica stated that Krembil isn't involved in WCG AT ALL, despite their name being plastered all over the place. Apparently it is UHN that has signed on the bottom line and is the entity dealing with any donations as well. I have yet to see him come back and provide some more (honest!) details about Krembil's involvement.

Yes, stuff can happen; nobody is contesting that. But unfortunately, far too many fancy stories that don't make any sense have been brought up over the last 14-15 months. The last two, the supposed "cluster of 260 Macs" and the "DHCP client failure" (even if this is a typo and was supposed to be "DHCP server"), just don't make any sense. How can Marist College still participate without any noticeable interference while a mere 260 hosts cause the system to run out of WUs? And what about the "data center outage", for which I can't find a single hint on any of the IT-related sites and blogs?

And how can they seriously expect to find anyone willing to put up money for the project if they are so lackluster and dishonest in their communication? If they are communicating in the first place! This is something that costs very little to nothing. And timely, honest communication is a BIG problem right from the (re)start.


Ralf

PS: @Tigerlily Again, PLEASE, stop this moderation nonsense, it didn't make any sense months ago, it makes even less sense now.


Exactly this in Bold above. Someone (luckily they're gone now) at IBM got butthurt with people talking the truth and put us in moderation...it is nonsense anymore!
----------------------------------------
[Aug 9, 2023 12:05:30 AM]
Mike.Gibson
Ace Cruncher
England
Joined: Aug 23, 2007
Post Count: 12436
Re: 2023-07-31 Update (MCM1 issue resolved)

As I see it, the hierarchy is:
Jurisica Lab run MCM
They are part of the Krembil Research Institute
Krembil is a part of a network of hospitals (UHN).
UHN would be the main fund raiser but those funds could be earmarked for different parts of the network.

I presume that Krembil wanted to save MCM when IBM pulled the plug, so persuaded UHN to take on WCG. I would assume that UHN would have wanted to get extra grants/sponsorship to fund WCG. These take time to get in place.

At least WCG is still alive thanks to them. Hopefully more funds will become available so WCG can flourish.

Mike
[Aug 9, 2023 8:37:43 PM]
TPCBF
Master Cruncher
USA
Joined: Jan 2, 2011
Post Count: 1957
Re: 2023-07-31 Update (MCM1 issue resolved)

And how can they seriously expect to find anyone willing to put up money for the project if they are so lackluster and dishonest in their communication? If they are communicating in the first place! This is something that costs very little to nothing. And timely, honest communication is a BIG problem right from the (re)start.


Ralf

PS: @Tigerlily Again, PLEASE, stop this moderation nonsense, it didn't make any sense months ago, it makes even less sense now.


Exactly this in Bold above. Someone (luckily they're gone now) at IBM got butthurt with people talking the truth and put us in moderation...it is nonsense anymore!
Well, that note was, as I tried to make clear, directed at TigerLily directly, as the person who is moderating the forum.
And it is not about "moderation" of the whole forum or anyone at IBM (I never had a problem speaking my mind when the project was under IBM; we all could just hope to have people like knreed and uplinger back on the WCG Towers side).
It is rather about the fact that Cyclops got butt hurt a few months ago when I called him out on his lackluster response to problems, especially as two snowflakes among the members of the forum complained and reported such "insubordination" (and one of those I had never seen before and have not seen since). Since then, all my posts have to be released by the moderator, which, while Cyclops was still "in charge", mostly happened within a day. But now that TigerLily is running the show, things have got even worse. Not only does it now take several days at times until my posts appear on the forum, it is also quite obvious that they are not even being read when I try to point out impending problems. Several times now I have posted about problems with the web site/servers very early, before a lot of other people (and WCG, obviously) took them seriously. This has commonly been DAYS before there was any kind of response from WCG acknowledging a problem, each and every time after a post from some other forum member.
And now TigerLily has decided not only to have my posts show up days late; she is also actively suppressing/censoring anything critical of their absolutely lackluster communication performance by not letting those posts appear on the forum. Well, for that she apparently had to read them. But then, when asked to please stop this nonsense, those requests, directly addressed to her, are just being posted verbatim instead of being taken note of and removed.

This clearly shows the level of hypocrisy that seems to have taken hold at WCG Towers. For my part, this is the only way to reach someone at WCG, as their support email hasn't been working since the end of last year! And I wouldn't be surprised if this general attitude from WCG Towers is rather detrimental to finding anyone (person or organization) to provide the funds needed to improve the technical situation of the project. Who wants to put up money if it just seems to disappear into a big black hole?


Ralf
----------------------------------------

[Aug 10, 2023 2:41:52 PM]
Grumpy Swede
Master Cruncher
Svíþjóð
Joined: Apr 10, 2020
Post Count: 2209
Re: 2023-07-31 Update (MCM1 issue resolved)

How about fixing this MCM1 that's been waiting to be sent for way too long:

MCM1_0202246_3961
[Aug 10, 2023 2:46:02 PM]
Grumpy Swede
Master Cruncher
Svíþjóð
Joined: Apr 10, 2020
Post Count: 2209
Re: 2023-07-31 Update (MCM1 issue resolved)

If the WCG team would tell us why last weekend was totally dry of "new" work, with only resends available, it would leave no room for speculation. However, there has not been a word about that so far this week.

Tomorrow is Friday again. Let's see if my suspicions about how they handle sending out "new" tasks during weekends prove correct.
----------------------------------------
[Edit 1 times, last edit by Grumpy Swede at Aug 10, 2023 4:28:06 PM]
[Aug 10, 2023 4:25:55 PM]
Speedy51
Veteran Cruncher
New Zealand
Joined: Nov 4, 2005
Post Count: 1297
Re: 2023-07-31 Update (MCM1 issue resolved)


Tomorrow is Friday again. Let's see if my suspicions about how they handle sending out "new" tasks during weekends prove correct.

Could I suggest you start a thread in the chat room if you would like to talk about "your suspicions", since this thread says "(MCM1 issue resolved)"?

Where I live it is Friday already :-)
----------------------------------------

[Aug 10, 2023 9:31:55 PM]