Total posts in this thread: 100
This topic has been viewed 17897 times and has 99 replies
Crystal Pellet
Veteran Cruncher
Joined: May 21, 2008
Post Count: 1322
Re: WU Characteristics - Performance problems - Feedback from scientists is expected

The memory drop seems to occur only once, after the restart, and to last until the next checkpoint.
Name: MIP1_00004948_0463_0
Virtual memory size   Working set size   State
363.68 MB             264.88 MB          running, 80% done
157.45 MB             69.79 MB           after resume/restart
363.61 MB             264.84 MB          running, after next checkpoint, 87% done

[Sep 22, 2017 7:59:14 AM]
armstrdj
Former World Community Grid Tech
Joined: Oct 21, 2004
Post Count: 695
Re: WU Characteristics - Performance problems - Feedback from scientists is expected

Crystal,

The jump in memory use I am seeing is not after a checkpoint; it occurs during the calculation of the structure. But I thought someone in the thread had said that the memory usage never increases at all after a restart.

Thanks,
armstrdj
[Sep 22, 2017 12:49:06 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Re: WU Characteristics - Performance problems - Feedback from scientists is expected

Mine did, but I didn't pay close enough attention to see whether they had another checkpoint after the reduction. I can test again and watch for that... All I noticed was that they reduced and were still in the reduced state at the end of the task.
[Sep 22, 2017 1:19:57 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Re: WU Characteristics - Performance problems - Feedback from scientists is expected

It didn't take long to verify... The WU in question dropped from 290 MB to 92.44 MB after restart, ran that way for about 4 minutes, then increased back to 290 MB. There was no intervening checkpoint. The WUs I was watching previously were probably close to finishing anyway and therefore stayed that way until they ended. This time I purposely picked one that had just started and had reached one checkpoint. Some on the same host dropped and are still at the reduced level after 15 minutes. Others have increased back to their original amount.

IIRC, the WUs I was watching before were made up of 2 or 3 structures and ran longer. The one I tested this morning seems to have more structures and they run for a shorter amount of time. I guess that means the older WUs were already in the last structure when the reduction took place.
----------------------------------------
[Edit 2 times, last edit by Doneske at Sep 22, 2017 1:44:33 PM]
[Sep 22, 2017 1:34:45 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Re: WU Characteristics - Performance problems - Feedback from scientists is expected

One other comment about the run times. It seems that the "issue" has an effect on other WUs too. Just briefly looking at a box with HST1 work on it: those WUs used to complete in 16 to 19 hours on that machine, depending on the WU. They now seem to be in the 20 to 24+ hour range on the same machine, which runs mostly MIP1 work. Just a brief observation. The HST1 history for that machine is gone from the DB, so I can't verify...
[Sep 22, 2017 2:04:40 PM]
TPCBF
Master Cruncher
USA
Joined: Jan 2, 2011
Post Count: 1951
Re: WU Characteristics

I modified the app config file to only run 6 MIP units instead of 24. Runtimes dropped from a little over 5 hours per WU to between 2.5 and 2.8 hours. I was able to get the runtime under 2 hours by continuing to reduce the number of concurrent MIP WUs. Tried the same experiment on an AMD 8 core system (no hyperthreading) and saw the same result just not to the same extent. Runtimes on the AMD dropped from about 3.5 hours to about 2. The more you run concurrent the worse it is. On my 32 core system, the runtimes are up over 7 hours per WU. Something is definitely wrong with these WUs.
Efficiency on all work units remained at 99.9+.
Well, that is a lot of testing you have done, but I think there is a very simple reason for that behavior. Just check the properties of a(ny) MIP1 WU, in particular the "working set size" and the "virtual memory size". Multiply these by 8, 24 or 32 and check the result against the physical RAM in that host, in particular if that machine is "crunching on the side"...
There is nothing "wrong" with those WUs; MIP1 is simply the most demanding project at the moment, resource-wise. With 650 MB RAM indicated in the system requirements, you would need at least 32 GB of RAM in a 32-core (thread) box with just a bare OS running, and more likely 64 GB or more if that host is doing "real life" work besides crunching for WCG, which pretty much all of my hosts do...
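The multiplication suggested above can be sketched in a few lines of Python. This is only a rough estimate: the 650 MB figure is the one quoted from the system requirements, while the function names and the 2 GB reserved for the OS are assumptions made for illustration.

```python
# Rough RAM estimate for running several MIP1 tasks at once.
# 650 MB per task is the figure quoted in the post; everything else is assumed.

MIP1_MB_PER_TASK = 650


def ram_needed_gb(tasks, mb_per_task=MIP1_MB_PER_TASK):
    """Return the RAM in GB (1 GB = 1024 MB) needed by `tasks` concurrent tasks."""
    return tasks * mb_per_task / 1024


def max_tasks(ram_gb, os_reserve_gb=2.0, mb_per_task=MIP1_MB_PER_TASK):
    """Return how many tasks fit in `ram_gb` after setting aside `os_reserve_gb`.

    The 2 GB default reserve is a placeholder, not a measured value.
    """
    usable_mb = max(ram_gb - os_reserve_gb, 0) * 1024
    return int(usable_mb // mb_per_task)


for n in (8, 24, 32):
    print(f"{n} tasks: about {ram_needed_gb(n):.1f} GB")
```

For 32 tasks this comes to a little over 20 GB before the OS and any other work are counted, which is why 32 GB is the next practical size. A number from `max_tasks` could serve as a starting point for the `max_concurrent` setting in the app config file mentioned in the quoted post.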

Ralf
[Sep 23, 2017 6:32:12 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Re: WU Characteristics

If you are referring to swapping, all swap use is 0. That was the first suspect that was checked when the run times went up beyond expectations. There is plenty of available memory.

KiB Mem : 32901384 total, 21042676 free, 7238884 used, 4619824 buff/cache
KiB Swap: 33517564 total, 33517564 free, 0 used. 25185392 avail Mem
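
For anyone who wants to run the same check across several hosts, here is a minimal Python sketch that reads the two summary lines exactly as pasted above. The regular expression assumes this particular `top` layout, which differs between versions, so treat it as a starting point rather than a general parser.

```python
import re

# The two summary lines as posted above (from `top` on the host).
TOP_SUMMARY = """\
KiB Mem : 32901384 total, 21042676 free, 7238884 used, 4619824 buff/cache
KiB Swap: 33517564 total, 33517564 free, 0 used. 25185392 avail Mem
"""


def parse_top_summary(text):
    """Return {'Mem': {...}, 'Swap': {...}} with all values in KiB."""
    result = {}
    for line in text.splitlines():
        header = re.match(r"KiB (Mem|Swap)\s*:", line)
        if not header:
            continue
        # Each figure is a number followed by its label, e.g. "7238884 used".
        pairs = re.findall(
            r"(\d+) (total|free|used|buff/cache|avail Mem)",
            line[header.end():],
        )
        result[header.group(1)] = {label: int(value) for value, label in pairs}
    return result


stats = parse_top_summary(TOP_SUMMARY)
swap_used_kib = stats["Swap"]["used"]
# In this layout, top prints "avail Mem" at the end of the Swap line.
avail_gib = stats["Swap"]["avail Mem"] / 1024**2
print(f"swap used: {swap_used_kib} KiB, available memory: {avail_gib:.1f} GiB")
```

With the figures above, swap in use is 0 KiB and roughly 24 GiB of memory is still available.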
[Sep 23, 2017 11:49:05 AM]
KerSamson
Master Cruncher
Switzerland
Joined: Jan 29, 2007
Post Count: 1673
Re: WU Characteristics

Hi TPCBF,
you may be right with your remark in some cases. However, on my side, 8 WUs + OS use about 8 GB of 16 GB RAM. The bad performance is not related to swapping or to a RAM bottleneck. My host still has about 8 GB RAM free. Likewise, there is plenty of free space on the partition concerned.
The problem lies in the science code.
Cheers,
Yves
----------------------------------------
[Edit 1 times, last edit by KerSamson at Oct 8, 2017 2:06:30 AM]
[Sep 23, 2017 12:31:00 PM]
KerSamson
Master Cruncher
Switzerland
Joined: Jan 29, 2007
Post Count: 1673
Re: WU Characteristics

I made a new attempt with MIP1 over the last 10 days, this time on an Athlon II X4, Linux Mint 17.03 x64, 4 GB RAM, no swapping. The CPU does not have any hyperthreading capability, i.e. it has 4 real cores.
The outcome is pathetic:
  • OET1 only: daily average: 4'000+ points
  • MIP1 only: daily average: 1'650 points

Unfortunately, it confirms my initial statement at the MIP1 launch last August regarding the very poor science efficiency; this science is dramatically badly designed.
The performance difference is a factor of 2.4, i.e. a loss of about 60%; we are far from a 10 to 15% decrease.
How is it possible that nobody noticed it during science development, alpha and beta testing?
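
The arithmetic behind these two figures, as a short Python sketch. The daily averages are the ones reported above; "4'000+" is taken as exactly 4000, so the result is a lower bound on the gap.

```python
# Daily point averages as reported in the post.
oet1_points_per_day = 4000  # "4'000+" in the post, taken as a lower bound
mip1_points_per_day = 1650

# How many times more points OET1 earns per day on the same host.
ratio = oet1_points_per_day / mip1_points_per_day

# Fraction of the OET1 daily points that is lost when running MIP1 instead.
loss = 1 - mip1_points_per_day / oet1_points_per_day

print(f"OET1/MIP1 ratio: {ratio:.2f}")
print(f"relative loss:   {loss:.0%}")
```

That gives a ratio of about 2.4 and a loss of about 59%, which rounds to the 60% quoted above.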
@MIP1 scientists: please give us some feedback and improve your science!
Cheers,
Yves
----------------------------------------
[Edit 1 times, last edit by KerSamson at Oct 9, 2017 2:27:31 AM]
[Oct 8, 2017 2:05:24 AM]
wolfman1360
Senior Cruncher
Canada
Joined: Jan 17, 2016
Post Count: 176
Re: WU Characteristics

Hi,
Currently crunching 2 MIP WUs on an AMD A6-4455M, with Turbo Core not enabled, on a Samsung laptop.
Current runtimes are around 5.1 hours per WU. This seems terribly long even for this processor, though I could be mistaken?
How are folks locating each checkpoint, the structures, etc.? Any recommendations for monitoring multiple systems in an easy-to-use format on Windows over LAN? This has me curious about my own memory use. Normally I just let the crunching commence, but...
----------------------------------------
Crunching for the betterment of human kind and the canines who will always be our best friends.
AWOU!
[Oct 13, 2017 6:28:41 AM]