Thread Status: Active
Total posts in this thread: 23
This topic has been viewed 4072 times and has 22 replies
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Seeing lower than beta19 efficiencies

The machine, left alone, did 8 work units of the last beta at 98.5 to 99.1 percent efficiency. The first dozen UGM units are barely hitting 96-97 percent. Not running the graphics. What could that be? The toughness of the varying sequences being compared?
[Oct 15, 2014 9:23:03 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Seeing lower than beta19 efficiencies

It's not me, nor the sandman. The situation has deteriorated: the compared txt files have grown to 13 MB and efficiency has dropped to 92-93 percent under Windows. Per Task Manager, with the machine running UGM exclusively, the largest CPU-time competitor was 6 minutes for the System Idle Process. I switched all cores to MCM and they quickly showed the normal for this node, 99 percent plus. Switched to Linux and ran UGM on all cores: got 99.8 percent. What's up with that? Have we got another science that favors a particular platform?
[Oct 15, 2014 7:27:21 PM]
OldChap
Veteran Cruncher
UK
Joined: Jun 5, 2009
Post Count: 978
Status: Offline
Re: Seeing lower than beta19 efficiencies

The first batch on each rig was at ~96% for me on Linux. Mid-run I changed the setting <no_priority_change> to 1 in cc_config and it improved CPU%.

Came home from work to find everything running in the high 98% or low 99% range.

Thinking of limiting write-to-disk somewhat to see if that helps too, because even my rig running @ 3.1 is getting low points compared to claim, suggesting that over 90% of the results so far are from much faster beasts, so in this respect every little helps.
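For reference, that option goes in the <options> section of cc_config.xml in the BOINC data directory; a minimal sketch:

```xml
<cc_config>
    <options>
        <!-- Run science apps at normal process priority
             instead of the default idle priority -->
        <no_priority_change>1</no_priority_change>
    </options>
</cc_config>
```

The client picks this up after a restart or a "re-read config files" from the manager.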

[Oct 15, 2014 8:36:36 PM]
deltavee
Ace Cruncher
Texas Hill Country
Joined: Nov 17, 2004
Post Count: 4890
Status: Offline
Re: Seeing lower than beta19 efficiencies

Quote: "Have we got another science that favors a particular platform?"

Getting 0.99+ on my Windows machines.
[Oct 15, 2014 8:40:30 PM]
seippel
Former World Community Grid Tech
Joined: Apr 16, 2009
Post Count: 392
Status: Offline
Re: Seeing lower than beta19 efficiencies

Not sure on the Linux/Windows question, although there will be some variation between work units, which will be especially true between batches. As for the beta vs. production question, there are a few key differences between what was run in beta and what's currently being run in production. During beta, we ran a sampling of all work units for the entire project (or at least what the researchers have provided so far). The researchers also requested that we run some reference sequences through first. Early indications are that these reference sequences are generating larger output files (and more IO) than the average for what we saw in beta (which should be more in line with the whole of the project). Starting with batch 45, non-reference sequences will be worked into the mix and we may see the IO being generated (on average) start to drop.

Also, with the exception of a subset of the last beta run, the sequences were grouped by similar sources. This resulted in a few work units generating very large output files, but most generating smaller-than-average output files and less IO. In production, the order of the sequences in the sequence files is randomized, so we don't see such wide variations in the size of the output files generated. Each work unit's sequence file is still only from a single source, though, so there are likely to still be variations between batches in the size of the results files.

Seippel
[Oct 15, 2014 8:41:03 PM]
seippel
Former World Community Grid Tech
Joined: Apr 16, 2009
Post Count: 392
Status: Offline
Re: Seeing lower than beta19 efficiencies

Quote: "Have we got another science that favors a particular platform?"
Quote: "Getting .99+ on my windows machines."

I just did some quick database checks and, to expand on what I mentioned above, the average size of results files does vary quite a bit by batch. For example, based on a large number of results, batch 00000 has generated nearly twice the output of batch 00001. This isn't too surprising, since batch 00000 compares sequences from reference source "A" with other sequences from reference source "A", while batch 00001 compares sequences from reference source "A" with sequences from reference source "B." When comparing stats from one machine to the next, you'll want to make sure you are also comparing work units from the same batch to control for those other factors.

Seippel
[Oct 15, 2014 8:54:18 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Seeing lower than beta19 efficiencies

Well, let me retry:

1) On the same device, after suspending UGM on all cores, which had been running in the 92-93 percent efficiency range for batches 17, 18, and 33, I switched all cores to MCM to ascertain whether something was awry. After 1:43 hours, all of batch 8304, started simultaneously, remained at 99 percent and better.

2) Booted into Linux and got 99.8 percent running UGM on all cores.

Now I have increased write-to-disk under Windows from 120 seconds to 300 seconds and restarted the UGM tasks with LAIM (leave applications in memory) off, so they pick up the new write-interval setting, i.e. checkpointing no sooner than after 5 minutes have passed. Just to see if efficiency climbs back up again.
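The write-to-disk interval is a BOINC computing preference; when overriding it locally rather than via the website, it can be set in global_prefs_override.xml in the BOINC data directory (a minimal fragment, assuming a local override is wanted):

```xml
<global_preferences>
    <!-- Checkpoint to disk at most every 300 seconds -->
    <disk_interval>300.0</disk_interval>
</global_preferences>
```

The client applies this after a restart or a "read local prefs file" from the manager.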
[Oct 15, 2014 9:17:15 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Seeing lower than beta19 efficiencies

Watching it in BoincStats, I see the efficiency drop right at the time of checkpointing. They all did it simultaneously, in the same second, which may point at an IO bottleneck. The second checkpoints were also written exactly simultaneously, 5:28 minutes after the previous, and the third checkpoints, again all simultaneous, 5:28 minutes later. Metronomic, it appears, but the efficiency is creeping toward 94 percent. I will increase write-to-disk to 600 seconds; then, if they eventually start saving checkpoints asynchronously, things may improve. Lol, we could be needing staggered starting, just as with CEP2, to optimize utilization and performance.
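Staggered starting amounts to offsetting each task's checkpoint clock so the writes don't collide. A toy sketch of the schedule (pure illustration; BOINC has no such scheduling knob, and the numbers are only examples):

```python
def checkpoint_schedule(n_tasks: int, interval: int, horizon: int) -> list[list[int]]:
    """Checkpoint times (seconds) for each task when task starts are
    staggered by interval / n_tasks, spreading the disk writes out
    instead of having every task checkpoint in the same second."""
    stagger = interval // n_tasks
    return [
        list(range(i * stagger + interval, horizon + 1, interval))
        for i in range(n_tasks)
    ]

# Four tasks, a 600-second write interval, first half hour:
for task, times in enumerate(checkpoint_schedule(4, 600, 1800)):
    print(task, times)
```

With simultaneous starts every task would checkpoint at 600, 1200, 1800...; with a 150-second stagger the writes land 150 seconds apart.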
[Oct 15, 2014 9:36:12 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Seeing lower than beta19 efficiencies

So when I suspended all the ready-to-start UGM tasks and let MCM take over gradually, BoincTasks logged ever-increasing efficiency as fewer UGM tasks were running concurrently, the last one up to 95.78 percent. The 600-second write interval did add a percent to UGM performance, not much.

Now testing 1 UGM task next to only MCM for the duration. UGM needs about 3:35 hours on this node.
[Oct 16, 2014 7:24:34 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Seeing lower than beta19 efficiencies

Well, the data speaks for itself: 1 UGM task alongside all other cores running MCM, which run at 99+ percent efficiency, gives 98 percent for UGM. That's 4-5 percent better than running UGM on all cores.

7.22 ugm1 ugm1_ugm1_00033_0259_0 03:21:59 (03:18:00) 10/16/2014 12:41:47 PM 10/16/2014 12:42:18 PM 98.03 Reported: OK + 33.53 MB 67.50 MB
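For the record, the efficiency figure is simply CPU time divided by elapsed wall-clock time. A quick check of the line above (assuming the log shows elapsed time first, with CPU time in parentheses):

```python
def hms_to_seconds(hms: str) -> int:
    """Convert an H:MM:SS time string to whole seconds."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h * 3600 + m * 60 + s

# Figures from the BoincTasks line: elapsed 03:21:59, CPU 03:18:00.
elapsed = hms_to_seconds("03:21:59")
cpu = hms_to_seconds("03:18:00")
efficiency = 100 * cpu / elapsed
print(f"{efficiency:.2f}")  # matches the 98.03 reported
```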

Next test: 2 tasks quasi-synchronous, still with 600 seconds write-to-disk. I'll have them start with a 30-second delay between them, though, so the checkpoints will initially be out of sync.
[Oct 16, 2014 12:00:55 PM]