World Community Grid Forums
Thread Status: Active Total posts in this thread: 100
wolfman1360
Senior Cruncher Canada Joined: Jan 17, 2016 Post Count: 176 Status: Offline
Hi,
Should I stop crunching this on mechanical disks (i.e. hard drives) and only keep letting it run on SSDs? I want to crunch as much as possible, but I also want my CPU to be computing as much as it can at any given time to help the project. I'm also getting quite a few errors on this particular project.
----------------------------------------
Crunching for the betterment of human kind and the canines who will always be our best friends.
AWOU!
||
|
adriverhoef
Master Cruncher The Netherlands Joined: Apr 3, 2009 Post Count: 2167 Status: Offline
Tony, have you tried the 'code' tag for useful formatting of the 'pidstat' command output?
----------------------------------------
[Edit 1 times, last edit by adriverhoef at Oct 18, 2017 1:11:20 PM]
||
|
TonyEllis
Senior Cruncher Australia Joined: Jul 9, 2008 Post Count: 261 Status: Offline
Thanks adriverhoef - that fixed the formatting... sorry for the delay in thanking you - I suffer from the "You don't have permission to access /forums/wcg/addpostprocess on this server." problem...

wolfman1360 - I don't seem to have the 'slow-down' others report when running all mip1 - even on the i7 with 8 CPUs, so I don't know that this will help. Most of my machines have a single SATA drive with platters. The machine here most likely to slow down would be one i7 - but it doesn't. This Intel i7 has conventional spinning disk drives for the wcg 'work' files, WD 'Red' NAS drives. However, the partition wcg uses is a RAID1 with two drives... this has the benefit of sustaining two data transfers simultaneously.

    [root@danda ~]# hdparm -t /dev/md2     # one transfer
    /dev/md2:
     Timing buffered disk reads: 408 MB in 3.00 seconds = 136.00 MB/sec

    [root@danda ~]# hdparm -t /dev/md2 & hdparm -t /dev/md2     # 2 transfers
    [2] 26647
    /dev/md2:
    /dev/md2:
     Timing buffered disk reads: Timing buffered disk reads: 404 MB in 3.00 seconds = 134.63 MB/sec
    414 MB in 3.00 seconds = 137.78 MB/sec
    [1]   Done                    hdparm -t /dev/md2
    [2]+  Done                    hdparm -t /dev/md2

    [root@danda ~]# cat /proc/mdstat | grep -A 3 'md2'
    md2 : active raid1 sdd2[1] sdc2[0]
          10485696 blocks [2/2] [UU]
          bitmap: 2/160 pages [8KB], 32KB chunk

As you can see - two reads at once - no slowdown... maybe that helps.
----------------------------------------
Run Time Stats https://grassmere-productions.no-ip.biz/
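For anyone wanting to repeat the comparison on their own array, here is a minimal sketch of turning captured `hdparm -t` output into a single "did concurrent readers slow down?" figure. It only parses text that has already been captured; the helper names `read_rates` and `concurrent_slowdown` are made up for illustration, and the sample strings are the figures quoted in the post above.

```python
import re

# Matches the throughput figure at the end of an `hdparm -t` result line,
# e.g. "... 408 MB in 3.00 seconds = 136.00 MB/sec".
RATE_RE = re.compile(r"=\s*([0-9]+(?:\.[0-9]+)?)\s*MB/sec")


def read_rates(hdparm_output):
    """Return every MB/sec figure found in captured `hdparm -t` output."""
    return [float(m) for m in RATE_RE.findall(hdparm_output)]


def concurrent_slowdown(single, concurrent):
    """Ratio of the slowest concurrent rate to the single-reader rate.

    A value near 1.0 means each concurrent reader kept roughly the
    throughput that a lone reader achieved.
    """
    baseline = read_rates(single)[0]
    return min(read_rates(concurrent)) / baseline


# Figures copied from the post above (the concurrent output is interleaved
# because both hdparm processes wrote to the same terminal).
single_run = "Timing buffered disk reads: 408 MB in 3.00 seconds = 136.00 MB/sec"
concurrent_run = (
    "Timing buffered disk reads: Timing buffered disk reads: "
    "404 MB in 3.00 seconds = 134.63 MB/sec\n"
    "414 MB in 3.00 seconds = 137.78 MB/sec"
)

ratio = concurrent_slowdown(single_run, concurrent_run)
print(f"slowest concurrent reader kept {ratio:.1%} of single-reader throughput")
```

A ratio close to 1.0 is what the RAID1 result above shows; on a single spinning disk one would expect the two readers to share the drive's throughput instead, though that is an expectation rather than something measured in this thread.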
|
||
|
adriverhoef
Master Cruncher The Netherlands Joined: Apr 3, 2009 Post Count: 2167 Status: Offline
Tony, this may be the wrong subforum for it, but I clicked 'Current Targets' in your current signature and my eyes fell on these two links:

World Community Grid Projects
World Community Grid Progress

It looks as if both have their underlying URLs swapped:

World Community Grid Projects = www.sraellis.tk/altern.php?number=17&monitor=wcg_progress&alternative=26
World Community Grid Progress = www.sraellis.tk/altern.php?number=17&monitor=wcg_projects&alternative=26

The naming is a bit odd …
----------------------------------------
[Edit 3 times, last edit by adriverhoef at Oct 19, 2017 9:05:32 AM]
||
|
TonyEllis
Senior Cruncher Australia Joined: Jul 9, 2008 Post Count: 261 Status: Offline
Thanks adriverhoef
It was a bad decision made when the original html was created - and then forgotten about. I am about 80% of the way through a conversion from .html to .php with variables and realized that anomaly had never been fixed, so it was added to the todo list. Since you noticed - I pushed it to the top and it is now completed - both are completely renamed...
----------------------------------------
Run Time Stats https://grassmere-productions.no-ip.biz/
|
||
|
alanb1951
Veteran Cruncher Joined: Jan 20, 2006 Post Count: 971 Status: Offline
Tony,

Interesting I/O comparison. I think you'll find that the 2G of memory is the reason for the large amount of I/O activity, but perhaps you already knew that. (I have lots of spare RAM and never see disk read figures like that!)

I suspect what happens on my machines is that the data file(s) load into the spare memory as a side-effect of the jobs being set up, and because I have so much spare RAM they don't get removed. So all my read operations are effectively from RAM, not disk, once the application starts.

pidstat doesn't report non-disk reads, so I actually watched the rchar counter in /proc/{PID}/io (which climbs quite rapidly whilst read_bytes stays on zero!...)

Just an information point as part of the ongoing performance discussion...

Cheers - Al.
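The rchar versus read_bytes check described above can be sketched as follows. This assumes a Linux system, where `/proc/<pid>/io` is a text file of `name: value` lines; the function names are hypothetical and the sample values are made up to show the layout, not measurements from any machine in this thread.

```python
from pathlib import Path


def parse_proc_io(text):
    """Parse the 'name: value' lines of /proc/<pid>/io into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep:
            counters[name.strip()] = int(value)
    return counters


def bytes_served_without_disk(counters):
    """Rough estimate of read traffic that never reached the storage layer.

    rchar counts bytes requested through read-style system calls, while
    read_bytes counts bytes actually fetched from storage. The difference
    is only an approximation (rchar also covers pipes and terminals, and
    memory-mapped reads are not counted in it), so it is clamped at zero.
    """
    return max(counters["rchar"] - counters["read_bytes"], 0)


def read_proc_io(pid):
    """Read the live counters for one process (Linux only, needs permission)."""
    return parse_proc_io(Path(f"/proc/{pid}/io").read_text())


# Made-up sample in the /proc/<pid>/io layout, for illustration only.
sample = """rchar: 123456789
wchar: 4096
syscr: 1500
syscw: 12
read_bytes: 0
write_bytes: 8192
cancelled_write_bytes: 0
"""

counters = parse_proc_io(sample)
print(bytes_served_without_disk(counters), "bytes read without touching the disk")
```

Calling `read_proc_io(pid)` twice a few seconds apart and comparing the two dicts would show the pattern described in the post: rchar climbing while read_bytes stays flat when the data is already in the page cache.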
||
|
dkester788
Cruncher USA Joined: May 3, 2007 Post Count: 44 Status: Offline
My Microbiome jobs seem to use CPU time very sporadically. Typically, I have 8-10 jobs running at one time but the Progress bar seems all over the board. It appears all the jobs don't really run at the same time. Hard to really put into words, but has anybody else noticed similar results with this project?
|
||
|
dkester788
Cruncher USA Joined: May 3, 2007 Post Count: 44 Status: Offline
My WU completion time has increased from 1:15 to almost 4:00. I'm running Windows 7 Pro on an i7 3930 with 32GB RAM.
|
||
|
Sgt.Joe
Ace Cruncher USA Joined: Jul 4, 2006 Post Count: 7675 Status: Offline
My impression is that the work units have increased in size recently. Mine have gone from 1.5 hours to about 4.5 hours, starting a week or so ago. Other than that, they seem to be processing normally. Given that we are doing many different organisms in the human biome, I would expect that the work units will vary quite a bit over time.
Cheers
Sgt. Joe
----------------------------------------
*Minnesota Crunchers*
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
After monitoring both chips on a machine running 32 threads with the large memory WUs, both chips report L1_Cache_Miss=0, L2_Cache_Miss=0, and L3_Cache_Miss=0. Since there aren't any L3 cache misses, it's unlikely there are any cross-chip "snoops" happening. The QPI links are very busy: in the range of 15.3GB/s across 2 links, with a 5 to 1 read to write ratio. Unfortunately, I don't have the QPI counters turned on in the BIOS, so I can't see QPI details, but it looks like the QPI links are being saturated, and I assume this would happen on the AMD HyperTransport links also. Since flow control for the links happens on-chip, the OS doesn't see any wait state, so it reports the thread as dispatchable. That also explains why chips with fewer threads see less of a problem than chips with high thread counts. Without QPI counters, it's only a guess. I could try configuring the machine for non-NUMA and see if that helps, but I suspect it would only be a minor benefit.
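As a rough worked example of the figures quoted above, the sketch below splits the observed traffic per link and by direction. It assumes that 15.3GB/s is the combined total over both links and that the 5 to 1 ratio applies to that total; neither is confirmed in the post, and the function name is made up for illustration.

```python
def split_link_traffic(total_gb_s, links, reads_per_write):
    """Split a combined interconnect rate per link and into reads/writes.

    Assumes total_gb_s is the sum over all links and that reads_per_write
    (e.g. 5 for a 5 to 1 ratio) applies to that combined figure.
    """
    read_share = reads_per_write / (reads_per_write + 1)
    return {
        "per_link": total_gb_s / links,
        "reads": total_gb_s * read_share,
        "writes": total_gb_s * (1 - read_share),
    }


traffic = split_link_traffic(total_gb_s=15.3, links=2, reads_per_write=5)
for name, value in traffic.items():
    print(f"{name}: {value:.2f} GB/s")
```

Whether the per-link figure amounts to saturation depends on the link speed of the particular platform, which is not stated here, so the comparison against peak bandwidth is left to the reader.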
|
||
|