World Community Grid Forums
Thread Status: Active Total posts in this thread: 14
Boca Raton Community HS
Advanced Cruncher Joined: Aug 27, 2021 Post Count: 136 Status: Offline |
I think I found one of the bottlenecks, and it might have to do with the memory. I did not realize the memory was "unbalanced" because of the number of DIMMs in use. I took some of the memory out so it is now balanced, and I am waiting to see if there is any change. There are basically no hard faults anymore for any process. Here is where things stand right now:
Mapping CM: ~2:15
Open Pan: anywhere from 1:45 to 3:00
ARP: not enough time yet to know if it helped.
Also running Einstein@Home on the GPUs, and those tasks seem to be processing at the expected speed for dual RTX A6000s. We also completed the install of the new power line (208 V), which seems to have actually sped processing up by a few tenths of a GHz, based on what I am seeing in Task Manager. Not sure whether this is because of removing a few DIMMs or the 208 V line... |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I'm running a Dell R7425 server with two AMD EPYC processors (64 cores / 128 threads) and 256 GB of memory. My setup isn't exactly like yours, but it is somewhat similar. If I run 128 ARP work units simultaneously, they take close to 26 hours to complete (some a little more, some a little less). What I see is a lot of hardware interrupts (4% to 5% of the system), probably due to context switching. I worked with Dell technical support and we discussed putting the server into HPC mode, but we both came to the conclusion that it wouldn't make a noticeable difference.
There is also an issue with the L3 cache becoming saturated, so the CPUs have to wait for data to come in from external memory. This wait doesn't show up in the OS stats because of the way the OS determines CPU wait time. Others here have noticed it because their systems seemed to run cooler than normal at 100% utilization. Since you have an Intel processor: Intel has a downloadable tool that monitors L3 cache misses and shows how much wait time is attributable to transferring data across the memory link from the other processor, if you are motivated to look that deeply into the problem.
|
||
|
KerSamson
Master Cruncher Switzerland Joined: Jan 29, 2007 Post Count: 1679 Status: Offline |
Hi Boca Raton Community HS,
I agree with entity and Sgt.Joe: you are definitely suffering from an L3 cache that is too small. At the beginning of the ARP1 project, I ran most of my systems as ARP1 only, and I quickly noticed that the CPU temperature did not really correlate with the CPU load. After some investigation, I observed a high rate of cache misses. After this first experience and some discussions with other contributors, we came to the overall recommendation never to run more than (threads/2)-1 ARP1 WUs concurrently on a CPU. In your case, because you have plenty of threads available and a relatively limited L3 cache (38 MB is not that much once you consider the average L3 cache size per thread), you should be more restrictive with the maximum number of concurrently computed ARP1 WUs. You can adjust the setting using the app_config.xml file. After some tries, you will surely find the most effective mix of projects (OPN1, MCM1, ARP1) for your system.
Additionally, I can only support Adri's statement regarding the choice of operating system. Linux is not that complicated, and it opens broader perspectives for pupils and students ... and teachers than Windows ever will. As a teacher, it is your role to motivate your pupils and students to discover the world and to become autonomous. Yesterday, I presented my job (I am an automation engineer) to 13- and 14-year-old pupils. I brought two Raspberry Pis just to illustrate that it is possible to achieve a lot with "limited" resources. Three of the pupils mentioned that they already have experience with the RPi (in one case because the pupil's father is also an automation engineer) and that they have a lot of fun with it. I really think it would be a good idea to leverage projects like WCG to motivate your pupils to think "outside the box" and to be willing to discover new horizons.
A computer system is not simply a black box that just needs more CPU performance, more RAM, and more disk space to run better. A computer system needs to be understood. In the end, it is a matter of providing enough background knowledge to make people capable of analysing, understanding ... and solving problems on their own. A lot of people and politicians call for more sustainability, but at the same time they still support a "single-use" electronic environment, treating learning to use the available resources well as less important than buying newer, more "modern", and more powerful equipment. We see the damage caused by such behaviour with mobile phones, PCs, IT infrastructure, cars, etc. We really need people, in particular young people, to be able to better master the resources they use. One needs to be a little knowledgeable and open to learning in order to apply critical thinking and not just be a "follower".
I wish you a lot of success.
Cheers,
Yves |
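A rough sketch of the (threads/2)-1 rule of thumb and the app_config.xml setting mentioned in this post, in Python. It is only an illustration, not a tuned recommendation: the function names are made up for this example, the app name `arp1` is an assumption that should be checked against what your BOINC client reports, and the thread counts in the usage lines are hypothetical. The `<app_config>`, `<app>`, `<name>` and `<max_concurrent>` elements are the standard BOINC app_config.xml layout.

```python
def max_arp_tasks(threads: int) -> int:
    """Rule of thumb from the post: at most (threads / 2) - 1 ARP1 tasks.

    'threads' is the hardware thread count of one CPU. The result is
    clamped to 1 so that very small CPUs still get one task.
    """
    return max(1, threads // 2 - 1)


def l3_per_task_mb(l3_mb: float, tasks: int) -> float:
    """Average share of L3 cache (in MB) per concurrently running task."""
    return l3_mb / tasks


def build_app_config(app_name: str, max_concurrent: int) -> str:
    """Return app_config.xml text limiting one application.

    app_name is an assumption here ("arp1"); verify it against the
    application names your BOINC client lists for the project.
    """
    return (
        "<app_config>\n"
        "  <app>\n"
        f"    <name>{app_name}</name>\n"
        f"    <max_concurrent>{max_concurrent}</max_concurrent>\n"
        "  </app>\n"
        "</app_config>\n"
    )


# Hypothetical example: a CPU with 40 hardware threads and 38 MB of L3.
threads_per_cpu = 40   # assumption, not taken from the thread
l3_cache_mb = 38       # the figure quoted in the post
limit = max_arp_tasks(threads_per_cpu)
print(f"ARP1 limit: {limit} tasks, "
      f"about {l3_per_task_mb(l3_cache_mb, limit):.1f} MB L3 per task")
print(build_app_config("arp1", limit))
```

The generated text would be saved as app_config.xml in the project's folder inside the BOINC data directory, after which the client has to re-read its config files. As the post suggests, the computed value is only an upper bound to start from; with little L3 per thread a lower limit may work better.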
||
|
nyanthiss
Cruncher Joined: Nov 23, 2012 Post Count: 15 Status: Offline |
Hello, Boca Raton Community HS,
I myself have been tinkering in my free time, trying to get the most performance out of my bunch of random machines running BOINC :) There are quite a few tools to answer questions like "is my CPU wasting time waiting on memory", and there are also a few knobs you can turn to make even ARP slightly (a few %) faster. You can also learn a lot in the process. But I can only help you with this on Linux; I'm completely clueless in the Windows world.
----------------------------------------
Intel Xeon E3-1231 v3
AMD A10 7800
AMD Ryzen 5 3500U
AMD Ryzen 1700X
AMD Ryzen 5900X
2x RaspberryPi, 1x Odroid |
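The post does not name any tool, so the following is only one possible way to look at the "waiting on memory" question on Linux: a minimal Python sketch around `perf stat`, assuming perf is installed and the generic `cache-references` and `cache-misses` events are available on the CPU (on many CPUs they refer to the last-level cache, but that is hardware dependent). The helper names are invented for this example, and the parser assumes the comma-separated layout produced by `perf stat -x ,`, where the counter value is the first field and the event name the third.

```python
from typing import Dict, List, Optional


def build_perf_command(pid: int, seconds: int = 10) -> List[str]:
    """Command line that counts cache events of one process for a while.

    Assumes the Linux 'perf' tool is installed; '-x ,' asks for CSV output.
    """
    return ["perf", "stat", "-x", ",",
            "-e", "cache-references,cache-misses",
            "-p", str(pid), "--", "sleep", str(seconds)]


def parse_perf_csv(text: str) -> Dict[str, int]:
    """Extract {event name: count} from 'perf stat -x ,' output.

    Lines whose first field is not a plain number (for example
    '<not supported>') are skipped.
    """
    counts: Dict[str, int] = {}
    for line in text.splitlines():
        fields = line.split(",")
        if len(fields) < 3:
            continue
        value, event = fields[0].strip(), fields[2].strip()
        if value.isdigit():
            counts[event] = int(value)
    return counts


def miss_ratio(counts: Dict[str, int]) -> Optional[float]:
    """Cache misses divided by cache references, or None if unknown."""
    refs = counts.get("cache-references", 0)
    if refs == 0:
        return None
    return counts.get("cache-misses", 0) / refs
```

To try it, run the list from `build_perf_command(pid)` with `subprocess.run(..., capture_output=True, text=True)` against the PID of one running work unit, then pass the captured stderr (where perf writes its statistics) to `parse_perf_csv`. Comparing the ratio with many ARP tasks running against the ratio with only a few gives a rough indication of how much the tasks compete for cache.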
||
|
|