Total posts in this thread: 3321
Posts: 3321   Pages: 333
This topic has been viewed 3323390 times and has 3320 replies
alanb1951
Veteran Cruncher
Joined: Jan 20, 2006
Post Count: 986
Re: Work Available

Adri, Mike;

As Mike says, it's fairly obvious that there's a tag associated with each cell, and state.txt seems to use that tag rather than matching the cell's generation against the limits. And, as we have noted in the past, a marked discrepancy has developed between the state.txt view of counts per category and the counts to be derived from generations.txt :-)

The max_generation in completed.txt will always be the highest of that category to have reported a move [according to generations.txt], and it appears to be based on querying the same data used to build the generations report (using current category boundaries) rather than paying any attention to category tags. Also, the counts in completed.txt seem to track generations.txt fairly well once one remembers that units are usually visible to the counter for more than 24 hours, and it's effectively a two-day moving average for the completion times :-)

Those differences (along with the lag in state.txt counts) can lead to some interesting oddities -- I've seen at least one case where the number of Accelerated tasks reported as completed over two days exceeded the number of tasks that state.txt said existed in that category :-)
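
To make the two ways of counting concrete, here is a minimal sketch that tallies categories once from a stored per-cell tag and once from the cell's current generation. Everything in it is an assumption for illustration: the bound values, the function names, the sample cells, and the reading that the furthest-behind (lowest-generation) cells are the Extremes. It is not the actual WCG logic, which is not described in this thread.

```python
from collections import Counter

# Hypothetical inclusive upper bounds; the real category definitions
# are not given in this thread.
EXTREME_MAX_GEN = 130
ACCELERATED_MAX_GEN = 140


def category_from_generation(gen):
    """Derive a category purely from a cell's current generation."""
    if gen <= EXTREME_MAX_GEN:
        return "Extreme"
    if gen <= ACCELERATED_MAX_GEN:
        return "Accelerated"
    return "Normal"


def compare_counts(cells):
    """cells: iterable of (cell_id, generation, stored_tag) tuples.

    Returns the tag-based counts, the generation-based counts, and the
    ids of cells whose stored tag disagrees with the derived category.
    """
    by_tag = Counter()
    by_generation = Counter()
    mismatched = []
    for cell_id, gen, tag in cells:
        derived = category_from_generation(gen)
        by_tag[tag] += 1
        by_generation[derived] += 1
        if derived != tag:
            mismatched.append(cell_id)
    return by_tag, by_generation, mismatched


# Made-up sample: one cell still tagged Normal although its generation
# falls inside the (hypothetical) Extreme range.
cells = [
    ("cell-a", 129, "Normal"),
    ("cell-b", 138, "Accelerated"),
    ("cell-c", 150, "Normal"),
]
by_tag, by_generation, mismatched = compare_counts(cells)
```

With tags that lag behind the bounds, the two tallies diverge in exactly the way described above: the tag-based view still counts "cell-a" as Normal while the generation-based view counts it as Extreme.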

I'm currently going through all my notes and collected data to see whether there have ever been major discrepancies between the completions counts and generations.txt (haven't found any huge mismatches yet) and to see if there are any "hints" as to why re-classification of tasks away from Normal seems to have been so slow for so long. Regarding the latter, I haven't yet got as far as the point where state.txt lost over 6000 units (1st November 2022), let alone when it re-found all bar 2 of them (19th May 2023). If there's anything odd that can be detected, I suspect it's in that interval somewhere, as recategorizations seem to have picked up a bit in the last few days!

Curiosity is quite time-consuming, isn't it :-)

Cheers - Al.
[May 30, 2023 2:31:25 AM]
Mike.Gibson
Ace Cruncher
England
Joined: Aug 23, 2007
Post Count: 12439
Re: Work Available

Sunday Report (late)

Only 601 units have been validated this week.

Assuming that a full generation 182 will be the last, there are 1,599,413 units still outstanding. Based on the last 5 weeks, my forecast end date is now in 2027, but, based on this last week, it would be in 2074.
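
For anyone wanting to check the "based on this last week" figure, here is a minimal sketch of a straight-line extrapolation. It assumes the forecast is simply the outstanding units divided by the weekly validation rate, counted from the date of the report; the exact method behind the figures above is not stated, and the function name forecast_end_date is made up for illustration.

```python
from datetime import date, timedelta


def forecast_end_date(outstanding, validated_per_week, as_of):
    """Naive linear forecast: weeks left = outstanding / weekly rate."""
    if validated_per_week <= 0:
        raise ValueError("weekly validation rate must be positive")
    weeks_left = outstanding / validated_per_week
    # date + timedelta only uses whole days, which is plenty of
    # precision for a forecast measured in decades.
    return as_of + timedelta(weeks=weeks_left)


# Figures from this report: 1,599,413 units outstanding, 601 validated
# in the week, report dated May 30, 2023.
end = forecast_end_date(1_599_413, 601, date(2023, 5, 30))
print(end.year)  # prints 2074
```

The five-week forecast would use the same function with the average weekly rate over those five weeks, which is not listed here.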

All definitions are unmoved.

There are now 401 Extremes and 83 Accelerated units listed. The numbers in their generations are now 3,747 & 2,532.

Mike
[May 30, 2023 12:31:57 PM]
alanb1951
Veteran Cruncher
Joined: Jan 20, 2006
Post Count: 986
Re: Work Available

Firstly:

Mike, thanks for the report. They really do need to do something to reduce the number of tasks that end up with deadline-related retries, don't they :-) Keeping stuff out of PVal jail would certainly help throughput, but it's not the whole solution! Shorter deadlines for Normal tasks, perhaps?

And it doesn't look as if the generation classification system is working to help them at present, does it? I wish there was a re-tagger to force cells into the appropriate category, but I guess there isn't...

Secondly, further to my overnight comment:

Today's generations file contains a single item going from generation 128 to generation 129 -- it ought to be an Extreme but it has been counted as Normal in the completed.txt file! If the bounds were still at pre-October values this might make sense, so that one probably hasn't moved in a while and the "completed" counts might be tag-based after all...

The state.txt file now sees one more Extreme and one less Normal -- I wonder if that is this one being reclassified? (I haven't seen any evidence of more new work at the time of writing this...)

I really do need to try to finish up that "inspection" I mentioned above; it's beginning to look as if I will find some huge discrepancies in completed counts once I start looking at the data from November 1st onwards, and the reason I wasn't spotting discrepancies recently was more to do with the vast majority of the tasks being genuine Normals anyway! -- I'm glad I said "appears" and "seems to", because today's singleton acts as a counter-example (I genuinely wasn't 100% convinced, which is why I was going to keep digging, and will continue[1]!)

Cheers - Al.

[1] I'm confused (which I don't like!)... I want to know what it's doing, and no-one can/will tell me/us :-)
[May 30, 2023 2:02:37 PM]
Unixchick
Veteran Cruncher
Joined: Apr 16, 2020
Post Count: 997
Re: Work Available

The different categories and shorter deadlines are meaningless with the big pauses between running the next one.

Thanks for the update Mike.
[May 30, 2023 3:02:54 PM]
Mike.Gibson
Ace Cruncher
England
Joined: Aug 23, 2007
Post Count: 12439
Re: Work Available

Al

This last batch from 18 May does not seem to have so many re-sends, so maybe your first prayer has been answered to some extent. PV jail seems to be haphazard, so maybe it depends on workloads or staff levels. Shorter deadlines would seem to be unnecessary with long spells between batches.

On the classification issue, I have said several times now that I believe it relates to when that grid square was previously run rather than when it is now being run. Some had not been run for over a year, which would have thrown the classification out if I am correct.

Running all 35,609 grid squares would, I believe, solve the issue. However, they would need to be fed to us slowly so as not to clog up the system. As you say, there has been some pick up recently, probably due to the 26,308 units validated over a week ago.

Mike
[May 30, 2023 5:15:51 PM]
hchc
Veteran Cruncher
USA
Joined: Aug 15, 2006
Post Count: 812
Re: Work Available

Where is the work? At this rate that poor graduate student won't get the data she needs to ever graduate if she depends on the timely completion of this project.

There really should be a constant flow of work units. It'd be great to finish this project within the next few months if several thousand work units can be reliably churned through and validated per day.

I miss the IBM WCG monthly updates.
----------------------------------------
  • i5-7500 (Kaby Lake, 4C/4T) @ 3.4 GHz
  • i5-4590 (Haswell, 4C/4T) @ 3.3 GHz
  • i5-3570 (Ivy Bridge, 4C/4T) @ 3.4 GHz

[Jun 2, 2023 1:57:46 AM]
Mike.Gibson
Ace Cruncher
England
Joined: Aug 23, 2007
Post Count: 12439
Re: Work Available

Sunday Report

Only 10 units have been validated this week.

Assuming that a full generation 182 will be the last, there are 1,599,403 units still outstanding. Based on the last 5 weeks, my forecast end date is now in 2027, but, based on this last week, it would be in 5090!!!!

All definitions are unmoved.

There are now 403 Extremes and 83 Accelerated units listed. The numbers in their generations are now 3,747 & 2,532.

Mike
[Jun 5, 2023 12:04:23 AM]
Unixchick
Veteran Cruncher
Joined: Apr 16, 2020
Post Count: 997
Re: Work Available

I wanted to make sure everyone saw this post on June 2 about how ARP is still having space issues. https://www.worldcommunitygrid.org/forums/wcg...ad,45380_offset,20#686862
[Jun 5, 2023 5:02:22 AM]
pwhidden
Cruncher
USA
Joined: Nov 17, 2004
Post Count: 32
Re: Work Available

Mike,

Thanks for your Sunday update.

The reason I am so interested in this project is that in 1972-73 I was a volunteer with an international aid organization trying to address the issues of food insecurity in Africa. I saw thousands of children with kwashiorkor and marasmus. The death toll is quite staggering but only part of the problem, especially in children. Stunted growth in those who survive is a symptom, but poor nutrition in children also affects brain development, and that will last for the rest of their lives. Many children don't technically die from malnutrition, but from simple things such as diarrhea or malaria, which take their toll on weak bodies. One of the programs we ran was a school lunch program where children would receive one meal a day at school. When our food supply was interrupted, it was quite typical that a school's attendance would drop by half within 2 weeks.

The issues we faced 50 years ago are still there today. In Africa, one in five people faced hunger in 2020. One third of the continent's population was undernourished. 12.8 million children in East Africa alone were acutely malnourished. In March of this year, WHO appealed for $178 million in additional assistance for the 7 countries in the Greater Horn region.

So I am very passionate about this project. I realize that at the end, the results may not give better rain forecasting, but it is well worth the risk to help solve one of the basic problems on the continent... and save millions of lives.

I know that most if not all of the people volunteering their excess computer time to this project are also passionate about it. You can see it in the posts to this thread. And it was great to have the technical guys at IBM working with us to fix problems, keep us informed, and keep us excited about this project.

The technical problems that we now have.... bandwidth, disk space, tape transfer... are really simple problems. They are faced and solved in thousands of data centers around the world every day.

With 10 units validated last week and 1,599,403 left to complete, the word that comes to mind is "unacceptable". Is there any other word that would describe an end date of 5090? Someone in Africa dies of malnutrition every 48 seconds. Even if we assume an end date of 2027, that is 7-8 years from project start to simulate 1 year of weather.

The final term that comes to mind to describe this project is "lack of urgency". Did I mention that someone in Africa dies of malnutrition every 48 seconds? We don't have enough information to know if the problem is with WCG or Delft. Perhaps both. There could be valid reasons for that. Perhaps the lead researcher died? Are they having trouble getting students to sign up for a project that may take somewhere between 8 and 3000 years to complete? Do they know enough from what has been completed to know that this project will have only marginal impact? Are the people involved too overcommitted to other, more important research projects?

Did I mention that someone in Africa dies of malnutrition every 48 seconds? Did I mention that I am frustrated with this project? Does anyone care?
[Jun 5, 2023 7:26:45 AM]
alanb1951
Veteran Cruncher
Joined: Jan 20, 2006
Post Count: 986
Re: Work Available

"The technical problems that we now have.... bandwidth, disk space, tape transfer... are really simple problems. They are faced and solved in thousands of data centers around the world every day."

Yes, those problems are solved every day in places where there are enough financial [and other] resources to do so.

Unfortunately, that's not currently the case at the new home of WCG -- there are plenty of posts elsewhere on the forum about why there isn't 24/7 support and about trying to attract extra funding (sponsorship and so on...) but for now the situation is far from perfect. That said, at least WCG didn't vanish when IBM wanted rid of it...

As for ARP specifically -- as Unixchick reminded us a few posts ago, there are still storage issues -- given that WCG does not have infinite resources, they can't just keep shovelling out work and hanging on to many Terabytes of data[1] for weeks/months until they can eventually offload them to the scientists...

You're right -- the volunteers care, but I suspect most of us realize that there isn't a magic money tree either here or at Delft, so we process what we can and hope for improvement. On which note, if you happen to know of any potential sponsors, I'm sure Dr. Jurisica and his colleagues would be delighted to hear from you :-)

Cheers - Al.

[1] I'm not sure exactly how much data is involved, but if one copy of each of the large files volunteers send back to WCG for a grid cell has to go to Delft, one complete generation of data is likely to be counted in Terabytes :-)
[Jun 5, 2023 4:14:09 PM]