World Community Grid Forums
Thread Status: Active Total posts in this thread: 38
Speedy51
Veteran Cruncher New Zealand Joined: Nov 4, 2005 Post Count: 1292 Status: Offline |
If ARP work can be processed adequately on a host, I do not see any reason for that work to stop being processed. The choice is completely up to the individual who owns the hardware ARP is running on.
||
|
adriverhoef
Master Cruncher The Netherlands Joined: Apr 3, 2009 Post Count: 2161 Status: Offline |
- there are no break point: I had to reboot with all tasks having 20 hours of calculation and most of them restarted from 0 (but a few ones restarted with a few hours of calculation, very strange)

You should know that ARP1 tasks have only 8 checkpoints, divided evenly across their run, that is, one every 12.5%. Some devices are even slower than yours:

ARP1_0030988_140_3 Linux Ubuntu Valid 2024-11-06T14:54:43 2024-11-11T06:46:56 49.45/51.46 817.7/652.8

In the ARP1 run above you can see that this device needed more than 12 hours to reach the task's 7th checkpoint. If you know you must reboot, better do it right after reaching a checkpoint, or pause your task right after reaching a checkpoint.

Adri |
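For anyone who wants to put rough numbers on the checkpoint advice above, here is a minimal Python sketch. It assumes progress grows linearly with run time and that a restarted task resumes from its most recent checkpoint; the function name and parameters are hypothetical and are not part of BOINC or the ARP1 application.

```python
import math

def work_lost_on_restart(elapsed_hours, progress, n_checkpoints=8):
    """Estimate the hours of computation repeated if a task restarts now.

    Assumptions (illustrative only): `progress` is a fraction in (0, 1],
    it grows linearly with run time, and the task resumes from the last
    checkpoint. Checkpoints are evenly spaced, so 8 of them means one
    every 12.5% of the run.
    """
    if not 0 < progress <= 1:
        raise ValueError("progress must be in (0, 1]")
    # Fraction of the run covered by the most recent checkpoint.
    last_checkpoint = math.floor(progress * n_checkpoints) / n_checkpoints
    lost_fraction = progress - last_checkpoint
    lost_hours = elapsed_hours * lost_fraction / progress
    return lost_hours, last_checkpoint

# A task 31.25% done after 10 hours last checkpointed at 25%.
print(work_lost_on_restart(10, 0.3125))
```

Under these assumptions, the closer a reboot or pause falls to a 12.5% boundary, the less work is repeated.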
||
|
TPCBF
Master Cruncher USA Joined: Jan 2, 2011 Post Count: 1951 Status: Offline |
This combined with the previous issues (status too late, download terror) = I'm done for now with ARP.

ARP has steep requirements for a reason, i.e. don't try to run it on a machine with inadequate resources. I think you are right in giving up on ARP. 36 hours for an ARP unit indicates that, although your machine can adequately process ARP units, it is probably marginal at best.

And if he has 15 ARP1 WUs to terminate, on one host, that means he is just one of those hoarders and doesn't understand why those restrictions have been put in place. It's 1GB of RAM per ARP1 WU, so if he has 15 of them to terminate, that means those alone take up 15GB of RAM. Add another couple of GB for the OS itself and consider that, by general rule of thumb, your RAM usage should never exceed 80% of your physical RAM installed; this would mean that he would have to have at least 20GB of RAM in the system. Anything less than that and the system will start swapping excessively to the disk (maybe less noticeable immediately on an SSD system), but this will drive up the time it takes to process each WU and thus the (clock) time between checkpoints, which are, as you mentioned, fixed at every 12.5%.

Given that he even noticed this WU restart, it means that he is resetting the processing constantly (laptop moving around, sleep settings?), thus never passing that first 12.5% checkpoint. So all his complaints seem completely self-induced...

Ralf |
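The RAM estimate above can be sketched as a quick calculation. This is only back-of-the-envelope Python using the figures from the post (1 GB per ARP1 WU, a couple of GB for the OS, the 80% rule of thumb); the function names and the 2 GB default for the OS are assumptions for illustration, and real per-task memory use may differ.

```python
def required_ram_gb(n_tasks, gb_per_task=1.0, os_gb=2.0, max_usage=0.8):
    """Physical RAM needed so that tasks plus OS stay under max_usage."""
    return (n_tasks * gb_per_task + os_gb) / max_usage

def max_concurrent_tasks(total_ram_gb, gb_per_task=1.0, os_gb=2.0, max_usage=0.8):
    """Largest task count that fits within the same rule of thumb."""
    usable = total_ram_gb * max_usage - os_gb
    return max(0, int(usable // gb_per_task))

# RAM suggested for 15 concurrent WUs, and how many WUs fit in 16 GB.
print(required_ram_gb(15))
print(max_concurrent_tasks(16))
```

With these defaults, 15 WUs come out a little above the "at least 20GB" figure quoted in the post; memory is only one limit, and the number of CPU cores caps the task count as well.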
||
|
catchercradle
Advanced Cruncher Joined: Jan 16, 2009 Post Count: 127 Status: Offline |
1GB RAM/core is not exactly high end these days. I can happily run 12 or more concurrently and still be only using 20% of memory on my machine. That said, in testing for other projects I have run tasks that get through 8GB peak on each task.

The other thing that slows down ARP tasks is that they hammer the level 3 cache on the CPU. I find maximum throughput on my machine is 15 out of 16 real cores. Using virtual cores does not help throughput on these tasks!

Edit: Just been reading the whole thread. Lots of work in VM now coming to an end, so I will gradually ramp up the number of tasks in native Linux client, as it seems the problems were machine related and others are completing tasks in Linux OK.

[Edit 1 times, last edit by catchercradle at Nov 11, 2024 5:09:09 PM] |
||
|
gj82854
Advanced Cruncher Joined: Sep 26, 2022 Post Count: 104 Status: Offline |
I'm running 32 ARP1 WUs concurrently on one host and it is using 23.444 GB memory (less than 50% of the memory). Total run time per WU is about 10.5 to 11 hours
|
||
|
Boca Raton Community HS
Advanced Cruncher Joined: Aug 27, 2021 Post Count: 126 Status: Offline |
I'm running 32 ARP1 WUs concurrently on one host and it is using 23.444 GB memory (less than 50% of the memory). Total run time per WU is about 10.5 to 11 hours

And here we are, where just getting the files to run 1 ARP1 work unit is a cause for celebration on our end.... |
||
|
gj82854
Advanced Cruncher Joined: Sep 26, 2022 Post Count: 104 Status: Offline |
I'm running 32 ARP1 WUs concurrently on one host and it is using 23.444 GB memory (less than 50% of the memory). Total run time per WU is about 10.5 to 11 hours

And here we are- happy when the files to run 1 ARP1 work unit is a cause of celebration on our end....

I downloaded 48 ARP1 tasks in about 3 hours this morning. It's hard for me to believe they have given me priority in the download queue. I would like to think it is a FIFO queue. |
||
|
Mike.Gibson
Ace Cruncher England Joined: Aug 23, 2007 Post Count: 12376 Status: Offline |
Once the empty caches have filled, the demand slows down. Everyone was totally empty this time, so it has taken a while, but they should be able to cope with the demand now that we are about full.
Mike |
||
|
[AF>Le_Pommier] Jerome_C2005
Cruncher Joined: Aug 17, 2006 Post Count: 29 Status: Offline |
This combined with the previous issues (status too late, download terror) = I'm done for now with ARP.

ARP has steep requirements for a reason, i.e. don't try to run it on a machine with inadequate resources. I think you are right in giving up on ARP. 36 hours for an ARP unit indicates that, although your machine can adequately process ARP units, it is probably marginal at best.

The processing speed of a host for a single ARP1 WU is not the issue. The real problem is what our friend isn't explicitly stating: that he is likely trying to run multiple WUs at once, in defiance of the default restrictions set when the ARP project was first introduced years ago. Looking at his badges, I think it is safe to assume (as far as WCG is concerned) that he likely isn't running more than one host to crunch on WCG projects.

And if he has 15 ARP1 WUs to terminate, on one host, that means he is just one of those hoarders and doesn't understand why those restrictions have been put in place. It's 1GB of RAM per ARP1 WU, so if he has 15 of them to terminate, that means those alone take up 15GB of RAM. Add another couple of GB for the OS itself and consider that, by general rule of thumb, your RAM usage should never exceed 80% of your physical RAM installed; this would mean that he would have to have at least 20GB of RAM in the system. Anything less than that and the system will start swapping excessively to the disk (maybe less noticeable immediately on an SSD system), but this will drive up the time it takes to process each WU and thus the (clock) time between checkpoints, which are, as you mentioned, fixed at every 12.5%.

Given that he even noticed this WU restart, it means that he is resetting the processing constantly (laptop moving around, sleep settings?), thus never passing that first 12.5% checkpoint. So all his complaints seem completely self-induced...

Ralf

I have an i9 with 20 threads and 40 GB of RAM, it's a fix computer, Sherlock.

I had to reboot it (once) when most tasks had over 20 hours of calculation, and most of them restarted from 0. So the "breakpoints" were just implemented with 2 left feet. Anyway, I stopped trying, hoarder. |
||
|
Sgt.Joe
Ace Cruncher USA Joined: Jul 4, 2006 Post Count: 7666 Status: Offline |
I have an i9 with 20 threads and 40 GB of RAM, it's a fix computer

Given the specs of your computer, there really should be no way the ARP work units should take 36 hours. If you are running all twenty threads with ARP units, I would suspect your computer is doing some self-throttling due to heat issues. If that is not the case, you must have some other bottleneck in the processing stream someplace. I have an i7-7700 with 8 GB RAM and ARP units take 18-20 hours. I only run 2 threads on ARP; the rest are MCM. I am curious to know what a "fix" computer is.

Cheers
Sgt. Joe
*Minnesota Crunchers* |
||