Thread Status: Active
Total posts in this thread: 5
This topic has been viewed 754 times and has 4 replies
stoneysilence
Cruncher
Joined: May 2, 2007
Post Count: 10
Status: Offline
WU's getting stuck and not completing

Since the server migration, I have been having a lot of issues with WUs getting stuck at 99%-and-a-bit or 100%: they still show as "running" but are not using any CPU time. I've been running BOINC/WCG for over 10 years and this is the first time I've had any issues.

I can "Abort" the stuck WUs, but then I lose the time I spent on them and they don't get done. It's not very long before I get another stuck WU, and I don't want to be checking BOINC every day for stuck WUs.
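
One way to avoid checking by hand is to compare two readings of `boinccmd --get_tasks` taken a few minutes apart and flag tasks that claim to be executing but gained no CPU time. The Python sketch below is only an illustration: the field labels (`name`, `active_task_state`, `current CPU time`) and the `EXECUTING` state are assumptions about the output format and may differ between client versions, and `parse_tasks`, `stalled` and `snapshot` are hypothetical helper names.

```python
import subprocess


def parse_tasks(text):
    """Split `key: value` lines into one dict per task.

    Assumes each task block starts with a line whose key is exactly `name`.
    """
    tasks, current = [], None
    for line in text.splitlines():
        key, sep, value = line.strip().partition(": ")
        if not sep:
            continue  # separators and headers carry no "key: value" pair
        if key == "name":
            current = {}
            tasks.append(current)
        if current is not None:
            current[key] = value
    return tasks


def _cpu(task):
    """CPU seconds for a task, or None if the field is missing or malformed."""
    try:
        return float(task["current CPU time"])
    except (KeyError, ValueError):
        return None


def stalled(before, after, min_cpu_gain=1.0):
    """Names of tasks marked as executing that gained less than min_cpu_gain CPU seconds."""
    earlier = {task["name"]: _cpu(task) for task in before}
    names = []
    for task in after:
        if task.get("active_task_state") != "EXECUTING":  # assumed state label
            continue
        old, new = earlier.get(task["name"]), _cpu(task)
        if old is not None and new is not None and new - old < min_cpu_gain:
            names.append(task["name"])
    return names


def snapshot():
    """Take one reading from the local client (requires boinccmd on PATH)."""
    result = subprocess.run(["boinccmd", "--get_tasks"],
                            capture_output=True, text=True, check=True)
    return parse_tasks(result.stdout)
```

Calling `snapshot()` twice, a few minutes apart, and passing both results to `stalled` would give a list of suspect task names. The threshold is deliberately crude; a task that is legitimately waiting on disk would also be flagged.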

Things I have tried: uninstalled BOINC, reinstalled BOINC, cleaned the WCG cache, stopped all new WUs, finished the ones in progress, and then removed WCG and added it again.

Specs: Windows 10
AMD FX-8370 OC'ed to 4.2 GHz
Corsair H100i CPU Cooler
AMD R9 280x
16 GB RAM
512MB SSD
2 TB HDD

I have attached screenshots of the stuck WUs, one of the details of a stuck work unit, and one of my task manager.
https://www.dropbox.com/s/s19lg3dwna65gew/Scr...06-04%2023.03.52.jpg?dl=0
https://www.dropbox.com/s/et5dnputk1a3e14/Scr...06-04%2023.04.15.jpg?dl=0
https://www.dropbox.com/s/sc2x7fztaapf53h/Scr...06-04%2023.04.42.jpg?dl=0
https://www.dropbox.com/s/okiyunpth42s2mi/Scr...06-04%2023.14.47.jpg?dl=0
----------------------------------------
[Edit 3 times, last edit by stoneysilence at Jun 5, 2017 6:30:37 AM]
[Jun 5, 2017 6:27:25 AM]
SekeRob
Master Cruncher
Joined: Jan 7, 2013
Post Count: 2741
Status: Offline
Re: WU's getting stuck and not completing

I am running 7.6.33 64-bit in W8.1 and W10 environments without issue and, AFAIK, you're the first to report this, with it cutting across multiple sciences. I'm pretty sure the correlation with the migration is not causation. The usual way to get stuck units running again is to suspend the tasks in question for 30 seconds (with 'Leave non-GPU tasks in memory while suspended' off) and then resume them. Verify in task manager that these tasks have left memory; they must, or else they remain stuck. Certainly what you did to resolve the matter has erased all traces of a culprit, if it was any piece of BOINC.
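
The suspend, wait, resume sequence can also be driven from the command line with `boinccmd --task <project URL> <task name> suspend|resume`. A minimal Python sketch, assuming `boinccmd` is on the PATH and that the 'Leave non-GPU tasks in memory while suspended' preference has already been switched off in BOINC Manager; `task_command` and `bounce` are hypothetical helper names, and the project URL and task name must be taken from your own client:

```python
import subprocess
import time


def task_command(project_url, task_name, action):
    """Build the boinccmd argument list for one task operation."""
    if action not in ("suspend", "resume"):
        raise ValueError(f"unsupported action: {action}")
    return ["boinccmd", "--task", project_url, task_name, action]


def bounce(project_url, task_name, pause=30.0,
           run=subprocess.run, sleep=time.sleep):
    """Suspend a task, wait `pause` seconds, then resume it.

    `run` and `sleep` are injectable so the sequence can be dry-run.
    """
    run(task_command(project_url, task_name, "suspend"), check=True)
    sleep(pause)  # give the science app time to leave memory
    run(task_command(project_url, task_name, "resume"), check=True)
```

This does not check that the process actually left memory during the pause; that still needs a look at task manager.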

When a task finishes, it does a little housekeeping, zipping up result files and such, so the question is whether something on the host is blocking that. E.g. check the security software, noting that WCG swapped IP addresses during the migration, although you don't seem to have upload/download problems. I strongly recommend setting a scanning exception in the AV for the BOINC data directory and its subdirectories [Is Sandboxed], usually C:\ProgramData\BOINC.

As for the uninstall/reinstall, it's important to reboot between these two steps, as special 'limited rights' BOINC accounts are created by the installer as part of the sandboxing. Uninstalling and then not rebooting does not remove those from Windows memory, i.e. the environment is potentially still polluted.

At the moment I can't think of other things to check.
[Jun 5, 2017 9:34:31 AM]
stoneysilence
Cruncher
Joined: May 2, 2007
Post Count: 10
Status: Offline
Re: WU's getting stuck and not completing

I think I fixed it. I had to uncheck "leave in memory" like you said. Then I suspended all tasks except the stuck ones and forced BOINC to shut down (the client, not just the GUI). I checked task manager to make sure everything was gone, but even after all that there were still 4 BOINC tasks listed, so I force-ended those. I then restarted BOINC and it started working on the stuck ones again, completed the two at 100% almost immediately, and started finishing the others.
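
The "is anything still running after exit" check can be scripted by filtering the CSV output of the Windows command `tasklist /FO CSV /NH`. A rough Python sketch; the image-name prefixes `boinc` and `wcgrid_` are assumptions about how the client and science apps are named on a given machine, and `matching_processes` is a hypothetical helper name:

```python
import csv
import io


def matching_processes(tasklist_csv, prefixes=("boinc", "wcgrid_")):
    """Return (image name, PID) pairs whose image name starts with a prefix.

    Expects text in the shape produced by `tasklist /FO CSV /NH`:
    image name, PID, session name, session number, memory usage.
    """
    wanted = tuple(p.lower() for p in prefixes)
    found = []
    for row in csv.reader(io.StringIO(tasklist_csv)):
        if len(row) < 2 or not row[1].isdigit():
            continue  # skip blank or unexpected lines
        if row[0].lower().startswith(wanted):
            found.append((row[0], int(row[1])))
    return found
```

On Windows the input text could come from `subprocess.run(["tasklist", "/FO", "CSV", "/NH"], capture_output=True, text=True).stdout`. An empty result after shutting the client down would suggest nothing was orphaned, at least under the assumed names.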
[Jun 6, 2017 2:47:15 AM]
SekeRob
Master Cruncher
Joined: Jan 7, 2013
Post Count: 2741
Status: Offline
Re: WU's getting stuck and not completing

Stranger things happen, but science app processes getting orphaned is the strangest. My recommendation stands on setting scan excludes on the BOINC data dir, though that does not stop in-memory scanning. Suspending the stuck tasks through BOINC Manager with LAIM off would have shown whether the client was still in control of those processes, without having to exit BOINC.

Right now the immediate issue is solved, but not the root cause of why they get stuck; I am sure it's something in your local environment. Report back if these stuckers return.
[Jun 6, 2017 6:58:31 AM]
stoneysilence
Cruncher
Joined: May 2, 2007
Post Count: 10
Status: Offline
Re: WU's getting stuck and not completing

OK, so far no new stuck WUs. I put an exception in ESET to exclude the BOINC data directory.
[Jun 6, 2017 11:27:22 PM]