Thread Status: Active
Total posts in this thread: 19
This topic has been viewed 1179 times and has 18 replies.
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Stuck task?

Hmm, this is indeed bizarre. The RAIDing shouldn't be a problem. I guess you probably already had a look at our Tips and Tricks sheet linked in the footer? Maybe the IBM WCG crew can chime in on this one.
Best wishes from
Your Harvard CEP team
[Sep 12, 2012 2:30:36 PM]
armstrdj
Former World Community Grid Tech
Joined: Oct 21, 2004
Post Count: 695
Status: Offline
Re: Stuck task?

Rpugh,

When you say 124 hours of CPU time, do you mean elapsed time, or is the task really using the CPU? While one of these tasks is stuck, which processes starting with wcgrid_cep2 show up in Task Manager? Are any of them getting CPU time? One thing you can check is stderr.txt: look in the BOINC data directory, find the slots directory that the task is running in, and open the file. The BOINC data directory on Windows 7 is typically "C:\ProgramData\BOINC" (I believe ProgramData is hidden). Inside it you should see a directory named "slots", and inside that there are numbered directories. One of these is the current working directory for CEP2. stderr.txt should be locked, so you will have to open it with Notepad. If it has any contents, can you post them here?
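
If clicking through the numbered slot directories by hand is tedious, the lookup described above could be scripted. The following is only a minimal sketch: it assumes the default Windows 7 data directory quoted above, and the function name find_stderr_files is made up for illustration. It simply walks slots and collects whatever each stderr.txt contains.

```python
from pathlib import Path

# Assumption: the default BOINC data directory on Windows 7, as quoted in this post.
BOINC_DATA_DIR = Path(r"C:\ProgramData\BOINC")


def find_stderr_files(data_dir):
    """Return (slot_name, text) pairs for every slots/<n>/stderr.txt that exists."""
    results = []
    slots = Path(data_dir) / "slots"
    if not slots.is_dir():
        return results
    for slot in sorted(slots.iterdir(), key=lambda p: p.name):
        stderr = slot / "stderr.txt"
        if stderr.is_file():
            # errors="replace" keeps the read from failing on unexpected bytes.
            text = stderr.read_text(encoding="utf-8", errors="replace")
            results.append((slot.name, text))
    return results


if __name__ == "__main__":
    for name, text in find_stderr_files(BOINC_DATA_DIR):
        print(f"--- slot {name} ---")
        print(text if text.strip() else "(empty)")
```

The script only reads files, so it should be safe to run while BOINC is active, although a locked file may still refuse to open on some setups.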

Thanks,
armstrdj
[Sep 13, 2012 1:53:58 PM]
Rpugh
Cruncher
Joined: Nov 16, 2004
Post Count: 8
Status: Offline
Re: Stuck task?

Hi armstrdj,

I have no idea why I said 124 hours CPU. I meant, of course, 124 hours elapsed! You've hit on the issue - a check on the tasks shows the CEP2 processes are using no CPU time.

At the moment I have 4 BOINC tasks in my system. 3 of them are for HCC and show 13% CPU each. The CEP2 task has spawned 2 processes:
wcgrid_cep2_6.40_windows_intelx86*32 00% CPU 1,688k memory
wcgrid_cep2_qchem_6.40_windows_intelx86*32 00% CPU 124k memory
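
A single 00% reading only describes one instant. One way to confirm that a process is really idle is to note its cumulative CPU time twice, some minutes apart, and compare. The sketch below assumes the two readings have been copied by hand into dictionaries; the function name stalled_processes and the sample numbers are hypothetical, not measurements from this machine.

```python
def stalled_processes(before, after, prefix="wcgrid_cep2"):
    """Return names whose cumulative CPU seconds did not increase between samples.

    `before` and `after` map process name -> cumulative CPU seconds.
    Only names starting with `prefix` are considered.
    """
    stalled = []
    for name, cpu_before in before.items():
        if not name.startswith(prefix):
            continue
        cpu_after = after.get(name)
        # A process missing from the second sample has exited, so it is not reported.
        if cpu_after is not None and cpu_after <= cpu_before:
            stalled.append(name)
    return sorted(stalled)


# Hypothetical readings taken ten minutes apart.
first = {
    "wcgrid_cep2_6.40_windows_intelx86": 3.0,
    "wcgrid_cep2_qchem_6.40_windows_intelx86": 0.0,
    "wcgrid_hcc1_6.40_windows_intelx86": 500.0,
}
second = {
    "wcgrid_cep2_6.40_windows_intelx86": 3.0,
    "wcgrid_cep2_qchem_6.40_windows_intelx86": 0.0,
    "wcgrid_hcc1_6.40_windows_intelx86": 580.0,
}
print(stalled_processes(first, second))
```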

Contents of stderr.txt:

INFO: No state to restore. Start from the beginning.
[22:10:35] Number of jobs = 16
[22:10:35] Starting job 0,CPU time has been restored to 0.000000.
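
For what it's worth, the log above can be summarized mechanically: it says 16 jobs were planned and only job 0 was ever started. The sketch below assumes the line format is exactly what these three lines show (the real application may log other messages), and the function name summarize_stderr is made up for illustration.

```python
import re


def summarize_stderr(text):
    """Extract the planned job count and the last job started from a CEP2 stderr.txt.

    Assumes lines shaped like 'Number of jobs = N' and 'Starting job K', as in the
    excerpt posted above. Missing values are returned as None.
    """
    total = re.search(r"Number of jobs = (\d+)", text)
    started = re.findall(r"Starting job (\d+)", text)
    return {
        "total_jobs": int(total.group(1)) if total else None,
        "last_started_job": int(started[-1]) if started else None,
    }


sample = """INFO: No state to restore. Start from the beginning.
[22:10:35] Number of jobs = 16
[22:10:35] Starting job 0,CPU time has been restored to 0.000000.
"""
print(summarize_stderr(sample))
```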

I've left the CEP2 task in the system in case there are any further diagnostics you want me to perform.

Thanks,

Ray
[Sep 14, 2012 12:47:56 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Stuck task?

Hi everyone
I have the same problem. CEP2 tasks take several days to finish. Sometimes a task shows about 30% progress, then, after the machine restarts, it shows 1.2%. This happens only with this project.
It DOES finish at some point. It just takes several days (it restarts two or more times per file).
This is a recurring problem. I have seen this behavior every time a file is downloaded.

I have 8 GB of RAM and allow WCG to use up to 4 GB.

Oh!! One more thing. The files of this project tend to do the following:
It shows 2 hours of progress (out of 8 hours total), at 20%. Then (I believe after restarting) it shows 20 minutes out of 1:40 hours, at 20%. After a while, it shows 20 minutes out of 8 hours. (These are not the exact times, but that is the idea.)
----------------------------------------
[Edit 2 times, last edit by Former Member at Sep 20, 2012 5:23:38 PM]
[Sep 20, 2012 5:17:40 PM]
Yarensc
Advanced Cruncher
USA
Joined: Sep 24, 2011
Post Count: 134
Status: Offline
Re: Stuck task?

You might not be making it to the second checkpoint. Do you have it set so that the messages section displays when a checkpoint is reached? Mine reach the first checkpoint around 1%, but the next ones are much farther apart (around 30% for the second one).
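
To make the rollback concrete: a task that is interrupted resumes from the last checkpoint it wrote, not from where it stopped. The sketch below is only an illustration of that rule. The checkpoint positions (1% and 30%) are the rough figures from my own machines, not fixed values, and the function name progress_after_restart is made up.

```python
def progress_after_restart(progress_at_stop, checkpoints):
    """Return the progress (in percent) a task resumes from after an interruption.

    The task falls back to the highest checkpoint at or below the point where it
    stopped, or to 0 if no checkpoint had been reached yet.
    """
    reached = [c for c in checkpoints if c <= progress_at_stop]
    return max(reached, default=0.0)


# Illustrative checkpoint positions, in percent.
checkpoints = [1.0, 30.0]
print(progress_after_restart(20.0, checkpoints))  # stopped between checkpoints
print(progress_after_restart(35.0, checkpoints))  # stopped after the second one
```

So a task stopped at 20% would come back at about 1%, which matches the kind of reset described earlier in the thread.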
[Sep 23, 2012 4:20:09 AM]
DadX
Advanced Cruncher
Joined: Sep 9, 2006
Post Count: 56
Status: Offline
Re: Stuck task?

On one of my machines, an old warhorse P4, BOINC Manager locks up after a few hours and I have to exit all BOINC processing to resolve the problem. This affects CEP2 tasks more than anything else I run because of the long intervals between checkpoints.
Of course, now I just close BOINC Manager and leave the processes to run in the background, and I don't run CEP2 on machines that may be shut down while a CEP2 WU is running.

Thanks Yarensc, you've answered my question before I asked: why does the progress for CEP2 reset to ~3% when I restart BOINC? Because I haven't made it to the 2nd checkpoint!!!
[Sep 23, 2012 6:52:36 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Stuck task?

Thanks everybody for resolving this!
Best wishes from
Your Harvard CEP team
[Sep 28, 2012 2:13:05 PM]
Rpugh
Cruncher
Joined: Nov 16, 2004
Post Count: 8
Status: Offline
Re: Stuck task?

Well it's not exactly resolved. All we've established is the CEP2 tasks in my system are stuck not using any CPU time. Would really appreciate someone suggesting diagnostics I can run to find out what these tasks are waiting on.
[Sep 30, 2012 11:35:26 PM]
armstrdj
Former World Community Grid Tech
Joined: Oct 21, 2004
Post Count: 695
Status: Offline
Re: Stuck task?

Do you use the BOINC CPU throttle setting? If so, try running a CEP2 workunit with the throttle off (use 100% CPU) and see if this makes a difference. Also, CEP2 is unique in that it forks a second process to run the research application, and it could be that AV software is stopping this executable; however, this typically results in an error and not a hung workunit.
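
If you are not sure whether a throttle is set, the local preferences file can be inspected. The sketch below rests on an assumption: that local overrides live in a file named global_prefs_override.xml in the BOINC data directory and store the throttle in a cpu_usage_limit element. Please verify that against your own installation. The function name cpu_usage_limit and the sample XML are for illustration only.

```python
import xml.etree.ElementTree as ET


def cpu_usage_limit(xml_text):
    """Return the cpu_usage_limit percentage found in preferences XML, or None.

    Assumes the throttle is stored in an element named cpu_usage_limit somewhere
    below the root element.
    """
    root = ET.fromstring(xml_text)
    node = root.find(".//cpu_usage_limit")
    if node is None or node.text is None or not node.text.strip():
        return None
    return float(node.text)


# Hypothetical file contents; a value below 100 would mean throttling is on.
sample_prefs = """<global_preferences>
   <cpu_usage_limit>60.000000</cpu_usage_limit>
</global_preferences>
"""
limit = cpu_usage_limit(sample_prefs)
print("throttled" if limit is not None and limit < 100 else "not throttled")
```

To test with the throttle off, set "use at most 100% CPU time" in the BOINC Manager preferences rather than editing the file by hand.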

Thanks,
armstrdj
[Oct 1, 2012 2:20:50 PM]