Thread Status: Active
Total posts in this thread: 7
This topic has been viewed 1041 times and has 6 replies
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
New Library hitting the grid

Hi Crunchers!

We are putting an exciting new library on the grid in the next couple of days, which focusses on molecules that are more synthetically accessible. Our initial molecules (which we think should provide about a week's worth of crunching on the grid) will range from very small (i.e. quick for you to complete - yay!) to a little larger than we have tried before. This means that there may be a few more failed WUs than you are used to, but don't worry - we will be keeping an eye on the jobs that come back and trying to stay within the 'sweet spot' with our subsequent batches of molecules, so you guys are crunching as efficiently as possible.

Thanks for your continued support - you guys are awesome :D

Your Harvard CEP Team
[Jul 30, 2014 9:51:03 PM]
AgrFan
Senior Cruncher
USA
Joined: Apr 17, 2008
Post Count: 358
Status: Offline
Re: New Library hitting the grid

Did the new library start hitting the grid today? I am now seeing runtimes between 1 and 18 hours. Several workunits failed in Job #2 and were marked valid. Isn't Job #2 the most important job to complete? Does this mean I wasted hours of crunching time? I run my boxes 24x7 so checkpointing is not an issue for me but what about those members who do not? I would expect errors/resends to increase with these long runtimes. I may need to turn CEP2 off until the runtimes stabilize. The workunits have been running for quite some time in the 6-8 hour range. Can't the workunit sizing be checked first with a Beta test before unleashing them into the wild?

E224168_ 856_ I.63.C50H26N10O2S.00321331.0.set1d06_ 0-- XXXXX-XX Valid 7/30/14 12:48:41 7/30/14 13:53:39 1.02 / 1.04 13.9 / 13.9 <= failed 0xb Job #1
E224148_ 945_ I.64.C48H22N8O8.00226840.3.set1d06_ 0-- XXXXX-XX Valid 7/30/14 03:16:19 7/30/14 15:31:37 12.08 / 12.21 166.8 / 166.8 <= failed 0xb Job #2
E224137_ 687_ I.64.C51F6H21N7.00390094.4.set1d06_ 0-- XXXXX-XX Valid 7/29/14 22:46:38 7/30/14 17:28:07 18.00 / 18.65 214.4 / 214.4 <= time limit reached in Job #6
E224141_ 471_ I.62.C51F4H27N5O2.00421925.4.set1d06_ 0-- XXXXX-XX Valid 7/29/14 22:28:52 7/30/14 12:07:30 13.48 / 13.60 256.2 / 256.2 <= failed 0xb Job #2
E224136_ 716_ I.63.C45H23N9O8S.00121530.2.set1d06_ 0-- XXXXX-XX Valid 7/29/14 18:35:31 7/30/14 12:48:41 18.00 / 18.18 287.1 / 287.1 <= time limit reached in Job #2
E224135_ 139_ I.64.C47F6H21N7O4.00325289.0.set1d06_ 0-- XXXXX-XX Valid 7/29/14 17:45:00 7/30/14 03:16:19 9.39 / 9.48 204.5 / 204.5 <= failed 0xb Job #2
----------------------------------------
[Edit 6 times, last edit by AgrFan at Jul 31, 2014 12:34:13 AM]
[Jul 31, 2014 12:15:08 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: New Library hitting the grid

Hi,
Those batch numbers do not correspond to the new batches that were put in this evening; the new ones probably won't hit the grid until about Friday afternoon. The new batches are also ordered by number of electrons, which is the quantity the computational cost of a job scales with. This means the initial jobs will be easier, and the cost of a work unit will be much easier for you guys to predict (which is also why I put this notice up in advance of the units hitting the grid).
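
For anyone curious what ordering by electron count looks like, here is a minimal sketch. It assumes the formula-like token in a work unit name (e.g. C50H26N10O2S) is the molecular formula and that the molecule is neutral, so the electron count is just the sum of atomic numbers. The function name and element table are illustrative only, not the project's actual tooling.

```python
import re

# Atomic numbers for the elements that appear in the work unit names in this thread.
ATOMIC_NUMBER = {"H": 1, "C": 6, "N": 7, "O": 8, "F": 9, "S": 16}

def electron_count(formula):
    """Total electrons in a neutral molecule, e.g. 'C50H26N10O2S' (hypothetical helper)."""
    total = 0
    # Each match is (element symbol, optional count); a missing count means 1.
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_NUMBER[symbol] * (int(count) if count else 1)
    return total

# Formulas taken from the work unit names posted earlier in this thread.
formulas = ["C45H23N9O8S", "C50H26N10O2S", "C51F4H27N5O2"]

# Fewest electrons first, i.e. the cheapest jobs first under the stated assumption.
ordered = sorted(formulas, key=electron_count)
for f in ordered:
    print(f, electron_count(f))
```

This only illustrates the ordering idea; the actual run time also depends on the machine and on how far a job gets.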

I have to admit I am unsure whether the job numbers shown here are 0-indexed or 1-indexed, but the most important job in the old library is the third one to be run (a geometry optimization). This will change with the new batches, where we have placed the optimization as the first job.

With regard to beta testing, we do perform some beta testing and have a good idea of the size limits of the grid. However, given the diverse nature of machines on the grid, I wanted to give all the crunchers a heads up, so they could put the failure, or otherwise, of work units into the context of the project.

Your Harvard CEP Team
[Jul 31, 2014 2:02:54 AM]
AgrFan
Senior Cruncher
USA
Joined: Apr 17, 2008
Post Count: 358
Status: Offline
Re: New Library hitting the grid

Thanks for the update. I'm guessing these long units are cleanup tasks for the current work being crunched. I will watch for the new work units on Friday.
----------------------------------------
[Edit 1 times, last edit by AgrFan at Jul 31, 2014 2:14:45 AM]
[Jul 31, 2014 2:12:23 AM]
ca05065
Senior Cruncher
Joined: Dec 4, 2007
Post Count: 325
Status: Offline
Re: New Library hitting the grid

Recently CEP2 work units have had consistent run times between 4 and 6 hours. Since 29th July they have dropped to 40 to 55 minutes. All tasks end in job 1:
[06:36:28] Finished Job #0
[06:36:28] Starting job 1,CPU time has been restored to 415.243462.
Application exited with RC = 0xc0000005
[07:06:24] Finished Job #1
[07:06:24] Starting job 2,CPU time has been restored to 2192.157252.
[07:06:24] Skipping Job #2

All are valid in results status section except one error and a few in Pval and Pver.
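
For anyone wanting to check their own result logs for the same pattern, here is a rough sketch. It assumes the log looks like the excerpt above and that an "RC =" line belongs to the most recent "Starting job" line; the function name is made up for illustration.

```python
import re

def failed_jobs(log_text):
    """Return (job_number, exit_code) pairs for non-zero 'RC =' lines (hypothetical helper)."""
    current = None   # job number from the most recent "Starting job N" line
    failures = []
    for line in log_text.splitlines():
        start = re.search(r"Starting job (\d+)", line)
        if start:
            current = int(start.group(1))
            continue
        rc = re.search(r"RC = (0x[0-9a-fA-F]+)", line)
        if rc and int(rc.group(1), 16) != 0:
            failures.append((current, rc.group(1)))
    return failures

# The log excerpt posted above.
log = """[06:36:28] Finished Job #0
[06:36:28] Starting job 1,CPU time has been restored to 415.243462.
Application exited with RC = 0xc0000005
[07:06:24] Finished Job #1
[07:06:24] Starting job 2,CPU time has been restored to 2192.157252.
[07:06:24] Skipping Job #2"""

print(failed_jobs(log))
```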
[Jul 31, 2014 7:36:38 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: New Library hitting the grid

Hi ca05065,

Thanks for the info. We are reaching the end of the old library and so the jobs are getting tougher. I can see here that finishing on job 1 means that the optimisation of the molecular geometry failed.

My feeling about the new library is that the jobs towards the end of next week will run longer than you have been seeing recently, but the jobs at the start will complete pretty quickly. Since we are now starting with the geometry optimisation, if you get a job in which this does not converge, it will also be less time until you get a new job to replace it.

Your Harvard CEP Team
[Jul 31, 2014 10:09:56 AM]
cjslman
Master Cruncher
Mexico
Joined: Nov 23, 2004
Post Count: 2082
Status: Offline
Re: New Library hitting the grid

I'm used to seeing the CEP2 WUs last between 6-8 hours on my machine (I only crunch CEP2 on weekends), but now they're lasting a little over an hour!!! I'm not complaining... I like it!!!

CJSL

Crunching for a brighter future...
----------------------------------------
I follow the Gimli philosophy: "Keep breathing. That's the key. Breathe."
Join The Cahuamos Team


[Jul 31, 2014 11:55:59 PM]