Thread Status: Active
Total posts in this thread: 59
This topic has been viewed 9885 times and has 58 replies
hunter1978
Advanced Cruncher
United States
Joined: Apr 24, 2010
Post Count: 110
Re: New Beta Test starting Nov 5, 2013 v7.22 [ Issues Thread ]

Got two 7.22 beta units. At the start, both showed an estimated run time of 5.25 hrs. The first one took 10.59 minutes to finish. The second one has been running for 25 minutes with 5 hrs. to go. Go figure. This is on an i7-3770K.
[Nov 6, 2013 1:18:23 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0

I've put the estimates being really off down to this post from the second beta test.
[Nov 6, 2013 1:25:55 AM]
gomeyer
Senior Cruncher
USA
Joined: Jul 11, 2008
Post Count: 161

Twelve v7.22 WUs on 5 machines, both win32 and linux64 (4 on one linux64 machine at the same time). All are running very smoothly so far.
[Nov 6, 2013 1:31:53 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0

Received 9 tasks; OS is Windows 8 64-bit. Everything went OK: no errors, suspending tasks works, and after restarting the computer the tasks continued without error.
[Nov 6, 2013 2:31:11 AM]
Rickjb
Veteran Cruncher
Australia
Joined: Sep 17, 2006
Post Count: 666

Haven't been all the way round the "farm" yet, but let's start with my "console" machine:
i7-970 / W7-x64 Pro / BOINC 7.0.64 x64 Service install, LAIM ON, running FAAH, CEP2 + betas.

Have 3 betas in the queue:

One is a v7.21 resend (BETA_BETA_9999982_0788_3) & seems to be running OK.
For some reason, a fresh copy of the v7.21 software was downloaded with it, tho that should have already been present from yesterday.

The other 2 WUs are v7.22, and were repeatedly doing the exit with zero status/restart thing when I resumed checking the machine after my 24-hourly ZZZ-period and morning cuppa.
I decided to suspend them before aborting them - they are still suspended.

IT SEEMS THAT THE PROBLEM OF THESE WUs FAILING TO RESPOND TO AN ORDER FROM BOINC MANAGER TO SUSPEND HAS NOT BEEN FIXED.

Sure, they're now showing in my BOINC Manager as suspended, but they did not actually suspend!
They are no longer shown in Windows Task Manager, but if they had suspended properly, they should still be there and showing 0% CPU share.
Have a look at the BOINC Manager Event Log:

This is what was happening before I intervened:
> 6/11/2013 11:30:20 AM | World Community Grid | Task BETA_BETA_9999987_0374b_1 exited with zero status but no 'finished' file
> 6/11/2013 11:30:20 AM | World Community Grid | If this happens repeatedly you may need to reset the project.
> 6/11/2013 11:30:20 AM | World Community Grid | Restarting task BETA_BETA_9999987_0374b_1 using beta17 version 722 in slot 10

Now I suspend _0374b_1:
> 6/11/2013 11:30:21 AM | World Community Grid | task BETA_BETA_9999987_0374b_1 suspended by user
> 6/11/2013 11:30:22 AM | World Community Grid | Resuming task FAHV_x3AVHbINfbB_0301948_0537_0 using fahv version 706 in slot 3

Now _0374b_1 crashes, 9 sec after I issued the suspend:
> 6/11/2013 11:30:30 AM | World Community Grid | Task BETA_BETA_9999987_0374b_1 exited with zero status but no 'finished' file
> 6/11/2013 11:30:30 AM | World Community Grid | If this happens repeatedly you may need to reset the project.

Now _0374b_1 does not restart, is still shown as Suspended in boincmgr Tasks, and has disappeared from Windows Task Manager. Next entry in Event Log is:
> 6/11/2013 11:32:13 AM | World Community Grid | [checkpoint] result FAHV_x3AVHbINfbB_0301948_0121_0 checkpointed
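
For anyone who wants to check their own Event Log for the same pattern, here is a minimal Python sketch. It assumes the log lines have the `timestamp | project | message` shape of the excerpts above; the function name `scan_event_log` and the regular expressions are made up for illustration and are not part of BOINC. It counts the "exited with zero status" messages per task and flags any such exit logged after a "suspended by user" message for the same task. It does not track resumes, since no resume-by-user message appears in the excerpts.

```python
import re
from collections import defaultdict

# Assumed line shape, copied from the excerpts in this post:
#   "> 6/11/2013 11:30:20 AM | World Community Grid | <message>"
LINE_RE = re.compile(
    r"^>?\s*(?P<stamp>[^|]+?)\s*\|\s*(?P<project>[^|]+?)\s*\|\s*(?P<msg>.*)$"
)
EXIT_RE = re.compile(
    r"^Task (?P<task>\S+) exited with zero status but no 'finished' file"
)
SUSPEND_RE = re.compile(r"^task (?P<task>\S+) suspended by user")


def scan_event_log(lines):
    """Return (exit_counts, exits_after_suspend) for the given log lines.

    exit_counts maps task name -> number of zero-status exits seen.
    exits_after_suspend lists (timestamp, task) for exits logged after
    the same task was reported as suspended by the user.
    Limitation: a later resume of the task is not tracked.
    """
    exit_counts = defaultdict(int)
    suspended = set()
    exits_after_suspend = []
    for raw in lines:
        line = LINE_RE.match(raw.strip())
        if not line:
            continue  # skip anything that is not a log line
        msg = line.group("msg").strip()
        suspend = SUSPEND_RE.match(msg)
        if suspend:
            suspended.add(suspend.group("task"))
            continue
        exited = EXIT_RE.match(msg)
        if exited:
            task = exited.group("task")
            exit_counts[task] += 1
            if task in suspended:
                exits_after_suspend.append((line.group("stamp"), task))
    return dict(exit_counts), exits_after_suspend


# The relevant lines from the log excerpts above.
SAMPLE_LOG = [
    "> 6/11/2013 11:30:20 AM | World Community Grid | Task BETA_BETA_9999987_0374b_1 exited with zero status but no 'finished' file",
    "> 6/11/2013 11:30:20 AM | World Community Grid | Restarting task BETA_BETA_9999987_0374b_1 using beta17 version 722 in slot 10",
    "> 6/11/2013 11:30:21 AM | World Community Grid | task BETA_BETA_9999987_0374b_1 suspended by user",
    "> 6/11/2013 11:30:30 AM | World Community Grid | Task BETA_BETA_9999987_0374b_1 exited with zero status but no 'finished' file",
    "> 6/11/2013 11:32:13 AM | World Community Grid | [checkpoint] result FAHV_x3AVHbINfbB_0301948_0121_0 checkpointed",
]

if __name__ == "__main__":
    counts, late_exits = scan_event_log(SAMPLE_LOG)
    print("zero-status exits per task:", counts)
    print("exits logged after a user suspend:", late_exits)
```

To use it on a real log, pass the lines of a saved copy of the Event Log instead of `SAMPLE_LOG`.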


Next, I'll check my other 4 machines.
--------
[Edit]: This issue exposes a hole in the specs for the BOINC client software: it does not seem to check that tasks actually suspend when they are commanded to do so, and then its various assumptions about the state of a task that fails to obey get out of sync with reality. Whether this happens often enough to warrant fixing BOINC is up to the BOINC programmers and WCG. [/Edit]
----------------------------------------
[Edit 2 times, last edit by Rickjb at Nov 6, 2013 5:19:58 AM]
[Nov 6, 2013 2:33:20 AM]
rbotterb
Senior Cruncher
United States
Joined: Jul 21, 2005
Post Count: 401

My first Beta 7.22 WU finished OK. Estimated time was 13+ hours, but actual completion time was 0.55 hours, quite a difference. I still have three Beta 7.22 WUs in process; we'll see how long those three take to finish.
[Nov 6, 2013 2:48:06 AM]
BobCat13
Senior Cruncher
Joined: Oct 29, 2005
Post Count: 295

AMD Athlon64 X2 6000
Win XP SP3
2GB RAM
Boinc 7.2.26

Tested suspend/resume. First time, with LAIM on: task stopped within 1 second and resumed fine. Second time, with LAIM off at 61.647% complete: task stopped within 1 second and unloaded from memory. On resume, progress dropped to 0.500%, the task ran for approximately 7 minutes, then progress jumped back to 61.647% and it ran to completion without problems. Now waiting on wingman for validation.
[Nov 6, 2013 4:24:54 AM]
knreed
Former World Community Grid Tech
Joined: Nov 8, 2004
Post Count: 4504

For those of you experiencing the "too many exits" issue, can you tell us how you have your software installed? Is it a service installation?

Also, does the problem occur immediately or only after you log out and then back in? Have you noticed any patterns like that?
[Nov 6, 2013 5:14:30 AM]
KWSN - A Shrubbery
Master Cruncher
Joined: Jan 8, 2006
Post Count: 1585

Service installation; the problem occurs immediately.
----------------------------------------

Distributed computing volunteer since September 27, 2000
[Nov 6, 2013 5:15:35 AM]
TPCBF
Master Cruncher
USA
Joined: Jan 2, 2011
Post Count: 1953

> For those of you experiencing the too many exits issue can you tell us how you have your software installed? Is it a service installation?
>
> Also - does the problem occur immediately or only after you logout and then back in? Have you noticed any patterns like that?

All my hosts are service installations, and the problem appears on its own; the WCG/BOINC stuff is usually not touched at all on them, as it is normally just running. I aborted the bad tasks on the three machines that I got access to today, and have to wait for the others to (hopefully) time out...

Ralf
[Nov 6, 2013 5:54:58 AM]