World Community Grid Forums
Thread Status: Active Total posts in this thread: 3268
Mike.Gibson
Ace Cruncher England Joined: Aug 23, 2007 Post Count: 12397 Status: Offline |
Thank you, Sgt. Joe
That is a tenth unstuck unit identified. Only 50 to go! Mike |
||
|
Mike.Gibson
Ace Cruncher England Joined: Aug 23, 2007 Post Count: 12397 Status: Offline |
Thank you, Crystal Pellet.
That unstuck one is going well - 2 generations in 2 days. Mike |
||
|
Mike.Gibson
Ace Cruncher England Joined: Aug 23, 2007 Post Count: 12397 Status: Offline |
The leading ultra is now on the tail of the last stuck unit.
Mike |
||
|
geophi
Advanced Cruncher U.S. Joined: Sep 3, 2007 Post Count: 104 Status: Offline |
Just picked up a 071 ultra
ARP1_0010090_071_0 |
||
|
Hype
Cruncher Germany Joined: Nov 18, 2011 Post Count: 43 Status: Offline |
Mike wrote: "RAM is also a factor. I can run all 8 threads with ARP because it has 20 GB RAM, albeit taking a third more time than if I run 4 threads with ARP. The problem with a large number of threads is that the checkpoints become more frequent. With 16 ARP threads out of 32, you are likely to have checkpoints every 5 minutes, say, if they are well spread. As they don't take the same amount of time, some checkpoints clash and clog up the machine. And if more than 2 clash you have a bigger problem. Mike"
I have 32 GB of RAM, so that should be fine, I guess. I did some testing with OPN1 and ARP, comparing OPN1 WU runtimes across three setups: 32 OPN1 WUs; 24 OPN1 and 8 ARP; and 28 OPN1 and 4 ARP.
With 24 OPN1 and 8 ARP, OPN1 WUs are on average 18% slower.
With 28 OPN1 and 4 ARP, OPN1 WUs are on average 10% slower.
I might do more testing with more ARP, but is this normal and to be expected? |
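The checkpoint arithmetic in the quoted post can be sketched as follows. This is only an illustration: the 80-minute per-task checkpoint interval is a hypothetical figure back-derived from "16 threads, a checkpoint every 5 minutes or so", and the 2-minute checkpoint duration and both function names are assumptions, not values reported in this thread.

```python
def mean_checkpoint_gap(per_task_interval_min: float, n_tasks: int) -> float:
    """Average gap between checkpoints on the whole machine,
    assuming every task checkpoints at the same interval and the
    tasks are evenly staggered."""
    if n_tasks <= 0:
        raise ValueError("n_tasks must be positive")
    return per_task_interval_min / n_tasks


def count_clashes(offsets_min, duration_min, period_min):
    """Count pairs of tasks whose checkpoint windows overlap.

    offsets_min  : start time of each task's checkpoint within one period
    duration_min : assumed length of one checkpoint write (hypothetical)
    period_min   : per-task checkpoint interval (hypothetical)
    """
    clashes = 0
    for i in range(len(offsets_min)):
        for j in range(i + 1, len(offsets_min)):
            # Circular distance between the two start times.
            d = abs(offsets_min[i] - offsets_min[j]) % period_min
            d = min(d, period_min - d)
            if d < duration_min:
                clashes += 1
    return clashes


# Hypothetical figures: 16 ARP tasks, each checkpointing every 80 minutes.
evenly_spread = [5 * k for k in range(16)]
gap = mean_checkpoint_gap(80, 16)
no_clash = count_clashes(evenly_spread, duration_min=2, period_min=80)
one_clash = count_clashes([0, 1, 40], duration_min=2, period_min=80)
```

With evenly staggered tasks no windows overlap; as soon as two start times drift to within one checkpoint duration of each other, a clash appears, which is the situation the quoted post describes.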
||
|
Sgt.Joe
Ace Cruncher USA Joined: Jul 4, 2006 Post Count: 7670 Status: Offline |
Hype wrote: "I might do more testing with more ARP, but is this normal and to be expected?"
Yes, there are other bottlenecks in addition to memory bottlenecks. See Amdahl's Law.
Cheers
Sgt. Joe
*Minnesota Crunchers* |
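For reference, Amdahl's Law in its usual form gives the speedup of one job whose parallelisable fraction p is spread over n workers: S(n) = 1 / ((1 - p) + p / n). The sketch below is a generic illustration with made-up fractions, not a model fitted to the OPN1/ARP timings above; contention between independent tasks for memory bandwidth or cache is a related but different effect.

```python
def amdahl_speedup(parallel_fraction: float, n_workers: int) -> float:
    """Speedup predicted by Amdahl's Law.

    parallel_fraction : share of the work that can run in parallel (0..1)
    n_workers         : number of workers the parallel part is spread over
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_workers)


# Illustrative values only: even a small serial share caps the gain.
for p in (0.50, 0.90, 0.99):
    print(f"p={p:.2f}: 32 workers give a speedup of {amdahl_speedup(p, 32):.2f}x")
```

The point of the law is that the serial share, not the worker count, sets the ceiling: the speedup can never exceed 1 / (1 - p).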
||
|
Mike.Gibson
Ace Cruncher England Joined: Aug 23, 2007 Post Count: 12397 Status: Offline |
geophi
That is a known ultra and has moved 3 generations in only 2 days. Mike |
||
|
geophi
Advanced Cruncher U.S. Joined: Sep 3, 2007 Post Count: 104 Status: Offline |
Just picked up a 092 unstuck
ARP1_0033316_092
Edit: this one is definitely running slower than the other unstuck units this PC has run. It's early, but it looks like it could take 14 or 15 hours compared to the 9-10 hours my other unstuck tasks have taken. It's running on the 64-bit executable.
[Edit 1 times, last edit by geophi at Jan 8, 2022 9:32:50 PM] |
||
|
alanb1951
Veteran Cruncher Joined: Jan 20, 2006 Post Count: 970 Status: Offline |
Another triplet... ARP1_0033633_096
This one is using the 64-bit application on my Linux Ryzen and is a bit more sluggish than usual, so I had a look at the file namelist.input in its slots directory. The parameter I presume to be important here is time_step=24. The other two ARP1 tasks running on that system at the same time both have the value 36 for that parameter (which tallies with your earlier post in this thread about the default time step being 36 seconds).
If that time-step change is typical for these unstuck cases, that would suggest about a 50% increase in run-time, provided it runs on the same application version as usual.
It might be interesting if other folks could delve into that file for apparent stragglers and "awkward" tasks to see if they have any value other than 24 or 36 for that parameter :-)
Cheers - Al. |
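For anyone wanting to check their own tasks, here is a minimal sketch of the lookup and the arithmetic behind the 50% estimate. It assumes namelist.input holds Fortran-namelist style `name = value` lines; the sample text and the function names are hypothetical, and the run-time estimate assumes the cost per model step stays the same, so run-time scales with the number of steps.

```python
import re


def read_time_step(namelist_text: str):
    """Return the integer time_step found in namelist text, or None.

    Matches 'time_step = 24' or 'time_step=24' at the start of a line;
    longer names that merely begin with 'time_step' are not matched.
    """
    match = re.search(r"^\s*time_step\s*=\s*(\d+)", namelist_text,
                      re.MULTILINE | re.IGNORECASE)
    return int(match.group(1)) if match else None


def runtime_factor(default_step_s: int, actual_step_s: int) -> float:
    """Expected run-time multiplier when the time step shrinks:
    a shorter step means proportionally more steps to cover the same period."""
    return default_step_s / actual_step_s


# Hypothetical excerpt, for illustration only (not copied from a real task).
SAMPLE = """&domains
 time_step_fract_num = 0,
 time_step           = 24,
/
"""

step = read_time_step(SAMPLE)
factor = runtime_factor(36, step)
```

To use it on a real task, read the file from the task's slot directory (for example with `pathlib.Path(...).read_text()`) and pass the text to `read_time_step`. A factor of 1.5 applied to the usual 9-10 hours gives 13.5-15 hours, which is in line with the 14-15 hours geophi estimated for the slow unstuck task.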
||
|
geophi
Advanced Cruncher U.S. Joined: Sep 3, 2007 Post Count: 104 Status: Offline |
My formerly stuck unit has 24 for time_step as well.
|
||