World Community Grid Forums
Thread Status: Active Total posts in this thread: 9
Thanassos
Cruncher Joined: Jun 21, 2013 Post Count: 24 Status: Offline
Hi Guys,
I've decided to run CEP2 WUs again, as I've installed enough RAM (256 GB) to use it as a RAM disk and avoid the reads/writes to my SSD. I've noticed, however, that the units do not use 100% of the CPU like the rest of them. Elapsed time is 20-30% higher than CPU time, and with 64 CEP2 units, CPU usage as a whole won't go above 70-75%.

The hardware in this rig is 4x Intel Xeon E5-4650s @ 2.7 GHz (32 cores/64 threads) with 32x 8 GB sticks of 1066 MHz ECC/registered DRAM. I'm assuming maybe it's because CEP2 uses more of the CPU, so HT can't work at maximum, resulting in lost CPU time between the physical core and the HT core. My quad AMD rig runs at 100% just fine.

However, any advice would be fantastic, as I'd rather be running at 100% than losing time on this project. Thanks.
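To put the 20-30% gap described above in terms of efficiency: if efficiency is taken as CPU time divided by elapsed time (an assumption about how the figure is defined, though it matches common usage), the gap works out to roughly 77-83%. A minimal Python sketch of that arithmetic, with hypothetical function names:

```python
def cpu_efficiency(cpu_time_s: float, elapsed_s: float) -> float:
    """Fraction of wall-clock time a task actually spent on the CPU."""
    if elapsed_s <= 0:
        raise ValueError("elapsed time must be positive")
    return cpu_time_s / elapsed_s


def efficiency_from_overhead(overhead_fraction: float) -> float:
    """Efficiency when elapsed time exceeds CPU time by the given fraction.

    elapsed = cpu * (1 + overhead), so efficiency = 1 / (1 + overhead).
    """
    return 1.0 / (1.0 + overhead_fraction)


# The two ends of the range quoted in the post (20% and 30% longer elapsed time).
for overhead in (0.20, 0.30):
    eff = efficiency_from_overhead(overhead)
    print(f"elapsed {overhead:.0%} above CPU time -> efficiency {eff:.1%}")
```

That range is in the same ballpark as the 70-75% overall CPU usage reported for 64 concurrent CEP2 units.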
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline
Since running multiple tasks concurrently on Windows 7, with HT on and a standard HD, gives me >95% efficiency, there is likely something else causing your greater CPU/elapsed differential. The biggest losses are incurred during simultaneous startups and checkpointing; once the tasks have been running longer and asynchronously, having started with an offset, efficiency tends to improve. A dev ticket is outstanding to automate staggered starting for CEP2 and other sciences that have heavy I/O activity; some crunchers control the starts manually, launching one task at a time.
[Edit 1 times, last edit by Former Member at Jan 19, 2015 7:17:55 AM]
Jim1348
Veteran Cruncher USA Joined: Jul 13, 2009 Post Count: 1066 Status: Offline
"I've noticed however that the units do not use 100% of the CPU like the rest of them. Time elapsed is 20-30% higher than CPU Time, and with 64 CEP2 units CPU Usage as a whole won't go above 70-75% ... I'm assuming maybe it's because CEP2 uses more of the CPU so the HT can't work at maximum resulting in lost CPU time between the Physical Core / HT Core. My Quad AMD rig runs at 100% just fine."

CEP2 has huge I/O demands, so it is not just a hyperthreading issue. With 64 cores, there is probably a bottleneck somewhere, though I am not familiar enough with multi-CPU motherboards to know where it is. At least your RAM disk has eliminated the disk drive as a potential problem, but getting data to and from it is another thing.
[Edit 2 times, last edit by Jim1348 at Jan 19, 2015 10:03:59 AM]
KLiK
Master Cruncher Croatia Joined: Nov 13, 2006 Post Count: 3108 Status: Offline
Maybe you can experiment a little with this, just bump the numbers up a little for your 64 cores:
https://secure.worldcommunitygrid.org/forums/...ead,37629_offset,0#481389

The bottleneck is probably the RAM! So you can give 60-70% of the CPU to CEP2 running at 100%, while the rest crunches other projects on WCG...so your rig can use 100% of its CPU time! ;)
Thanassos
Cruncher Joined: Jun 21, 2013 Post Count: 24 Status: Offline
Thanks for the replies, guys. I tried staggered starts and it doesn't seem to solve anything when running 32 or more CEP2 units.

If RAM is the bottleneck, there's not a lot I can do :( I can't get faster ECC DIMMs in 8 GB without taking out an epic loan.

KLiK, thanks for that link. I'll try setting that up with a max of 16 of each project, and that'd be amazing.
Sgt.Joe
Ace Cruncher USA Joined: Jul 4, 2006 Post Count: 7668 Status: Offline
I agree with Jim1348. Your problem is the I/O. I think you do not have enough internal bus width to handle the data transfer involved in running 64 CEP2 jobs. I think you are on the right track with splitting your jobs into 16 CEP2 and 16 of something else.
Cheers
Sgt. Joe
*Minnesota Crunchers*
[Edit 1 times, last edit by Sgt.Joe at Jan 20, 2015 4:36:14 AM]
KLiK
Master Cruncher Croatia Joined: Nov 13, 2006 Post Count: 3108 Status: Offline
"Thanks for the replies guys, I tried staggered starts and it doesn't seem to solve anything when running 32 or more CEP2 Units. If RAM is the bottleneck there's not a lot I can do :( Can't get faster ECC DIMMs in 8GB without taking out an epic loan. KLiK thanks for that link, I'll try setting that up with 16 max of each project and that'd be amazing."

Be careful with that...I've tested the app_config, and this morning only 3 of 4 cores were working...it ran out of work for the other projects, and only MCM was running! So set MCM to 64...and only cut the CEP2 project down to 32! ;)
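For anyone wanting to try the app_config approach mentioned above, here is a rough Python sketch that writes out a BOINC app_config.xml capping CEP2 at 32 concurrent tasks and MCM at 64. The linked thread is not reproduced here, so this is not necessarily the exact setup it describes. The app short names "cep2" and "mcm1" are assumptions (check the names your own client reports), and build_app_config is a hypothetical helper; only the app_config / app / name / max_concurrent elements are standard BOINC.

```python
import xml.etree.ElementTree as ET


def build_app_config(limits: dict[str, int]) -> str:
    """Return app_config.xml text limiting concurrent tasks per application.

    `limits` maps an application short name to its max_concurrent value.
    """
    root = ET.Element("app_config")
    for app_name, max_concurrent in limits.items():
        app = ET.SubElement(root, "app")
        ET.SubElement(app, "name").text = app_name
        ET.SubElement(app, "max_concurrent").text = str(max_concurrent)
    ET.indent(root)  # pretty-print; requires Python 3.9+
    return ET.tostring(root, encoding="unicode")


# Assumed short names: "cep2" for CEP2 and "mcm1" for MCM.
config_xml = build_app_config({"cep2": 32, "mcm1": 64})
print(config_xml)
```

The resulting text would be saved as app_config.xml in the World Community Grid project folder inside the BOINC data directory, after which the client needs to re-read its config files (or be restarted) to pick it up.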
ryan222h
Senior Cruncher Joined: Sep 4, 2006 Post Count: 425 Status: Offline
"I'm assuming maybe it's because CEP2 uses more of the CPU so the HT can't work at maximum resulting in lost CPU time between the Physical Core / HT Core. My Quad AMD rig runs at 100% just fine. However any advice would be fantastic, as I'd rather be running @ 100% than losing time on this project. Thanks."

It must be a Hyperthreading issue. Have you tried disabling HT, at least just to see if that is the issue? My AMD system is also running at 95% efficiency on a single Intel SSD with 24 cores on CEP2. It has quad-channel RAM, 16 GB total system RAM.
https://www.youtube.com/watch?v=2VgEhxCB1ek
Thanassos
Cruncher Joined: Jun 21, 2013 Post Count: 24 Status: Offline
If I disable HT and use only 32 cores then there is no issue, but I get fewer points/hour overall.

I've decided to just run a maximum of 32 CEP2 units at a time, following KLiK's advice. I've got all projects ticked again anyway, though, so it rarely gets 32 CEP2s at once.

Like yours, my quad G34 6174 rig can get close to 100% when running on all physical cores. But it is much slower than the Xeons.

Thanks for all the advice, guys.