Thread Status: Active
Total posts in this thread: 15
This topic has been viewed 2053 times and has 14 replies
PMH_UK
Veteran Cruncher
UK
Joined: Apr 26, 2007
Post Count: 741
Est. Duration - huge increase for OPN1 and wild variations for OPNG

Today, new OPN1 tasks on all PCs have estimated durations around 10 times the normal value, but those already running appear to take the normal time.

OPNG units in recent days have had wildly varying estimated durations: 92 hours for one batch, then 76 minutes for the next. All completed in around an hour (on a slow GPU) or less (on slowish GPUs).

Anyone else seeing this?

Paul.
----------------------------------------
Paul.
[Sep 27, 2022 3:33:29 PM]
adriverhoef
Master Cruncher
The Netherlands
Joined: Apr 3, 2009
Post Count: 1978
Re: Est. Duration - huge increase for OPN1 and wild variations for OPNG

Today new OPN1 tasks on all PCs have estimated durations around 10 times normal but those running appear normal duration.
Paul,

I have noticed this, too. Instead of lasting 1½ to 2 hours, they would now need 11 to 12 hours.

As a consequence, I'd figure that your computer will not request any CPU work for some time if your work cache is of average size. This might be the case in post 676861, but I can't tell for sure (if it is possible to tell at all) without further insight into the problem described there.
----------------------------------------
[Edit 2 times, last edit by adriverhoef at Sep 27, 2022 5:07:05 PM]
[Sep 27, 2022 4:43:52 PM]
TPCBF
Master Cruncher
USA
Joined: Jan 2, 2011
Post Count: 1842
Re: Est. Duration - huge increase for OPN1 and wild variations for OPNG

Today new OPN1 tasks on all PCs have estimated durations around 10 times normal but those running appear normal duration.
Paul,

I have noticed this, too. Instead of lasting 1½ to 2 hours, they would now need 11 to 12 hours.

I only noticed that some OPN1 WUs show more than a day of estimated remaining runtime after download, but once they start, they finish in a few hours, depending on the crunching host.
MCM1 WUs also seem to show a somewhat increased initial estimate, but they take only slightly longer than usual. That could be within the variation between batches...

But in general, it would be nice to get another update on what's going on by someone in the know at Krembil...

Ralf
----------------------------------------

[Sep 27, 2022 5:02:36 PM]
adriverhoef
Master Cruncher
The Netherlands
Joined: Apr 3, 2009
Post Count: 1978
Re: Est. Duration - huge increase for OPN1 and wild variations for OPNG

UPDATE: Just got a new batch of MCM1 and OPN1, not more than five minutes ago; the estimated time of *ALL* uninitialized OPN1-tasks in my queue has now dropped to 9½ hours.

UPDATE 2: Also, I just received an ARP1-task with an estimated time of only 9 hours, which is currently less than that of an OPN1-task. That is rather normal, as ARP1-tasks last somewhere between 6 and 12 hours, depending on the number of concurrently running OPN1- and MCM1-tasks. I'm sure this will all blow over and it's next to nothing to worry about.
----------------------------------------
[Edit 1 times, last edit by adriverhoef at Sep 27, 2022 5:41:19 PM]
[Sep 27, 2022 5:16:38 PM]
Sgt.Joe
Ace Cruncher
USA
Joined: Jul 4, 2006
Post Count: 7219
Re: Est. Duration - huge increase for OPN1 and wild variations for OPNG

The MCM tasks show normal estimated times, but the OPN (CPU) tasks now show an estimated time of 19 hours. Even with that estimated time, I can tell they are running normally and should finish in their 3+ hour range. This machine has no GPU, so that is not a factor in the weird time estimates.
Cheers
----------------------------------------
Sgt. Joe
*Minnesota Crunchers*
[Sep 27, 2022 6:30:04 PM]
MarkH
Cruncher
United States of America
Joined: May 16, 2020
Post Count: 49
Re: Est. Duration - huge increase for OPN1 and wild variations for OPNG

Ref: Incredibly long completion time estimates.

Confirmed here in the US as well; here's some of the jobs I received:

MCM1_0191079_7809_0 = 11:26:00
MCM1_0191079_7590_0 = 9:05:23

OPN1_0114218_00711_0 = 16:02:56 (!)
OPN1_0114218_01041_0 = 1 Day, 14:31:23 (!)

Jobs run but complete much earlier, even on my slowest machine.
----------------------------------------
"That science of the people, by the people, for the people, shall not perish from the Earth."
[Sep 27, 2022 8:04:41 PM]
alanb1951
Veteran Cruncher
Joined: Jan 20, 2006
Post Count: 729
Re: Est. Duration - huge increase for OPN1 and wild variations for OPNG

Regarding OPN1 tasks and increased run time estimates...

There's nothing obvious about the work units in the current batch(es) that would suggest a reason for changing the rsc_fpops_est value in the workunit data sent to the client -- it's still the same target (receptor) and the jobs within the individual tasks sent out are fairly similar in nature!

As the FPOPS estimate appears to have been constant before the change, and now looks as if it's constant again (but over 6 times higher!) it looks like a configuration change somewhere. Does that number come from the scientists or from the WCG team??? As Ralf remarked, it would be nice to hear from "the source" :-)
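
To make the link between rsc_fpops_est and the displayed estimate concrete, here is a rough sketch. It assumes the simplified relation "estimated time = rsc_fpops_est divided by the speed the client projects for the host" and ignores any correction factors the real client may apply. The function name and all numbers are hypothetical, chosen only so that the old estimate comes out at 2 hours.

```python
def estimated_duration_hours(rsc_fpops_est: float, projected_flops: float) -> float:
    """Simplified estimate: operations to perform divided by host speed, in hours."""
    return rsc_fpops_est / projected_flops / 3600.0


# Hypothetical values for illustration only (not real WCG numbers).
HOST_FLOPS = 4.0e9                 # assumed effective speed of the host for this app
OLD_FPOPS_EST = 2.88e13            # gives a 2-hour estimate on this host
NEW_FPOPS_EST = OLD_FPOPS_EST * 6  # the "over 6 times higher" case, rounded to 6

old_hours = estimated_duration_hours(OLD_FPOPS_EST, HOST_FLOPS)
new_hours = estimated_duration_hours(NEW_FPOPS_EST, HOST_FLOPS)
print(f"old estimate: {old_hours:.1f} h, new estimate: {new_hours:.1f} h")
```

Under that assumption the estimate scales linearly with rsc_fpops_est, so a sixfold jump in the value turns a 2-hour estimate into roughly 12 hours even though the actual run time on the host is unchanged.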

And as for the MCM1 tasks - the FPOPS estimate there also seems to have changed from one constant value to another (over 2.5 times higher), then down to another that's only about twice the original -- again, looking at task parameters and data files there doesn't seem to be any obvious reason for the change... And as Sgt. Joe said, the displayed estimates don't seem to be badly out -- the client seems to adjust better than it does for OPN1.

For what it's worth, ARP1 tasks don't seem to always have the same FPOPS estimate anyway, and [so far] I've seen no evidence of a massive leap in said estimates -- however, I don't get many ARP1 tasks at the moment (and they're mostly retries!), so my sample may be flawed :-)

And [returning to the original post] I've not seen any wildly different OPNG estimates, but again my sample may be too small, or I just got lucky... That said, actual run times can vary a lot, depending on the complexity of the ligands being looked at.

Cheers - Al.

[Edit: added remark about OPNG run time and ligands]

P.S. Retries may still have the old FPOPS estimate, as it should be a feature of the work unit, not the specific task...
----------------------------------------
[Edit 1 times, last edit by alanb1951 at Sep 27, 2022 11:35:41 PM]
[Sep 27, 2022 11:31:18 PM]
adriverhoef
Master Cruncher
The Netherlands
Joined: Apr 3, 2009
Post Count: 1978
Re: Est. Duration - huge increase for OPN1 and wild variations for OPNG

If you ask me - and you didn't, I know - then this is what can happen when a largish batch of tasks with much longer runtimes gets validated: it influences the estimates for the tasks that are waiting to be sent from the BOINC-server.

Possible explanation:

Imagine the case where a large number of longer running OPN1-tasks from Android-devices are waiting to be validated. At some point in time, they'll start validating while results from 'fast' devices have to wait. In the meantime, the expected runtime of tasks in the server queue (both sent and waiting to be sent) increases because of this, and keeps increasing while they are being sent out. This is the effect that you see.

In practice:

I'm keeping a close watch on my validated results and suddenly there was a huge increase in OPN1-validations from 21:00 UTC 27-09-2022 for about three hours. It's just possible that they were held up by a 'largish' batch of longer running OPN1-tasks from Android-devices being validated.

Answering Al's question:

Does that number come from the scientists or from the WCG team???
In my opinion, the fluctuations in estimated times come from BOINC itself. In short: this is how BOINC works.

Additional effect:

In addition, what I'm seeing at the moment is a considerable increase in credit being awarded to validated OPN1-tasks. Instead of about 80-90 credit being awarded for an OPN1-task, the new normal now seems to be 494.7 since 2022-09-28T09:06:58, but I already know that this will be temporary.
----------------------------------------
[Edit 3 times, last edit by adriverhoef at Sep 28, 2022 12:36:52 PM]
[Sep 28, 2022 12:00:31 PM]
Sgt.Joe
Ace Cruncher
USA
Joined: Jul 4, 2006
Post Count: 7219
Re: Est. Duration - huge increase for OPN1 and wild variations for OPNG

My OPN(CPU) units are now down to estimated 9 hours. I suspect they will return to a more normal figure in a few days.
Cheers
----------------------------------------
Sgt. Joe
*Minnesota Crunchers*
[Sep 29, 2022 1:16:02 AM]
alanb1951
Veteran Cruncher
Joined: Jan 20, 2006
Post Count: 729
Re: Est. Duration - huge increase for OPN1 and wild variations for OPNG

Adri,

Thanks for highlighting the "loads of long-running units validating" possibility -- I had forgotten that the assessment of FPOPS varied [slightly] more frequently than there are major changes in supplied data -- it takes a major variation like this one to draw one's attention :-)
Answering Al's question:

Does that number come from the scientists or from the WCG team???
In my opinion, the fluctuations in estimated times come from BOINC itself. In short: this is how BOINC works.
If that is the explanation it would be a WCG issue in a way, but it would not be easy to resolve...

The setting of the FPOPS option is up to the work generator, and if it doesn't have some constants built in or an XML(?) file to parse for "configuration" it has to get it from somewhere else -- if it's doing it by looking at recent time consumed versus results counted without access to some sort of scaling information it will do exactly what we've just observed...
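
As a toy illustration of that last case, the sketch below averages elapsed time over a window of recently validated results without any per-device scaling. This is not how the WCG work generator is known to operate; the function name, the window contents and the run times are all made up for the example.

```python
from statistics import mean


def naive_runtime_estimate(recent_runtimes_hours):
    """Average elapsed time over recent validated results, with no per-device scaling."""
    return mean(recent_runtimes_hours)


# Hypothetical windows of 100 recently validated results (hours of elapsed time).
normal_window = [2.0] * 90 + [20.0] * 10   # mostly fast hosts, a few slow devices
skewed_window = [2.0] * 20 + [20.0] * 80   # a burst of slow-device results validates

print(f"normal window: {naive_runtime_estimate(normal_window):.1f} h")
print(f"skewed window: {naive_runtime_estimate(skewed_window):.1f} h")
```

With these made-up numbers the unscaled average moves from 3.8 to 16.4 hours purely because the mix of reporting devices changed, which is the kind of jump that would then be passed on to every host in the new estimates.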

Unfortunately, the basic client-server communication (workunit and result packets) doesn't lend itself to applying that scaling factor, and digging extra stuff out of the database to try to make the estimate more precise would probably be a pain...

So all we can do is hope that it resolves quickly (whatever the cause) and that it doesn't happen again...

Cheers - Al.

P.S. I'd love to see some of those high-credit OPN1 tasks :-) -- most of mine self-verify so I get what I ask for, which is based on the reality of run-time and local benchmarks (rather than the FPOPS estimate) as I understand it...
[Sep 29, 2022 2:25:35 AM]