Thread Status: Active
Total posts in this thread: 9
This topic has been viewed 858 times and has 8 replies
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Error code 161 out of nowhere

Hi there. I have had errors of various kinds with the BOINC agent before, mostly claiming RAM overruns.

Today, though, it looks like a WU was cut short for a different reason: error 161. I looked it up and found that it relates to having a faulty client_state.xml. Can someone advise me as to why this might be the case and what I need to do to remedy this problem? Many thanks in advance!

Error log below.

<core_client_version>6.2.28</core_client_version>
<![CDATA[
<stderr_txt>
Calling initGraphics()
INFO: No state to restore. Start from the beginning.
Calling initGraphics()
called boinc_finish

</stderr_txt>
<message>
<file_xfer_error>
<file_name>E000682_042C_006x00214_1_2</file_name>
<error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
<file_name>E000682_042C_006x00214_1_3</file_name>
<error_code>-161</error_code>
</file_xfer_error>

</message>
]]>
[Jun 9, 2009 4:08:56 AM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: Error code 161 out of nowhere

sevenveils,

Something must have interfered with writing a good client_state.xml from memory, maybe during a result download. One possible source is conflicting file locking by overzealous antivirus programs, so given that the BOINC data directory is pretty well sandboxed, you can tell the scanner to exclude that zone from scanning (find the path in the BOINC startup message log). I have a periodic full memory / disk scan scheduled anyhow.

Now to remedy, the question is whether you have many jobs in the buffer. If the troublesome job is still listed in the task view of BOINC Manager, abort it. I suggest stopping work fetch (project tab of BOINC Manager), suspending any jobs that have not started, letting the running ones finish, and then doing a project reset. That will tell the servers to reassign the work not started.

A more radical alternative to that last step is to stop BOINC completely, delete client_state.xml and its backup copies in the data directory, and restart BOINC. That creates a new clean set.
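For anyone who would rather set the state files aside than delete them outright, here is a minimal sketch in Python. It is an illustration only, not something from BOINC itself: the function name, the backup folder, and the glob pattern `client_state*.xml` (which assumes the backup copies share the `client_state` prefix) are all assumptions, and BOINC should be stopped completely before running anything like this against a real data directory.

```python
import shutil
import tempfile
from pathlib import Path

def set_aside_client_state(data_dir, backup_dir):
    """Move client_state*.xml out of data_dir into backup_dir; return the moved names.

    The glob pattern is an assumption about how the state file and its
    backup copies are named; check your own data directory first.
    """
    data_dir, backup_dir = Path(data_dir), Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for path in sorted(data_dir.glob("client_state*.xml")):
        shutil.move(str(path), str(backup_dir / path.name))
        moved.append(path.name)
    return moved

# Demonstration on a throwaway directory, not on a real BOINC installation.
with tempfile.TemporaryDirectory() as tmp:
    fake_data = Path(tmp) / "data"
    fake_data.mkdir()
    for name in ("client_state.xml", "client_state_prev.xml", "other.xml"):
        (fake_data / name).write_text("<placeholder/>")
    demo_moved = set_aside_client_state(fake_data, Path(tmp) / "backup")
    demo_left = sorted(p.name for p in fake_data.iterdir())
    print(demo_moved, demo_left)
```

Moving rather than deleting keeps the old files around in case the fresh state turns out not to help.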

The (partial?) remedy is coming with the next client version. There is really a whole lot of reading/writing to this client_state.xml, if I remember correctly sometimes 10x a second. Part of that is split off and moved to the slots where the progress data is kept for jobs underway. I am testing 6.6, but that one has known bugs such as an unstoppable client, so there is no point in trying it, in case you were about to volunteer.
----------------------------------------
WCG Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All!
[Jun 9, 2009 7:38:46 AM]
Ingleside
Veteran Cruncher
Norway
Joined: Nov 19, 2005
Post Count: 974
Status: Offline
Re: Error code 161 out of nowhere

Hi there. I have had errors of various kinds with the BOINC agent before, mostly claiming RAM overruns.

Today, though, it looks like a WU was cut short for a different reason: error 161. I looked it up and found that it relates to having a faulty client_state.xml. Can someone advise me as to why this might be the case and what I need to do to remedy this problem? Many thanks in advance!

You made a small error when you looked up the error code: -161 is "ERR_NOT_FOUND" == "inconsistent client state", not an inconsistent client_state.xml.

Most often it comes from whatever project application you're running: the application suddenly can't find one or more of its files and gives -161 "ERR_NOT_FOUND". If my recollection isn't too fuzzy, Human Proteome Folding 2 seems to be more prone to spitting out this error than other applications...
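To make the reading of the result message concrete, here is a small Python sketch that pulls the file names and error codes out of the `<file_xfer_error>` blocks quoted in the opening post. The function name and the lookup table are hypothetical; the table holds only the one code discussed in this thread, and the regular expression assumes the blocks look exactly like the ones pasted above.

```python
import re

# Only the code discussed in this thread; other codes are deliberately left out.
KNOWN_CODES = {-161: "ERR_NOT_FOUND"}

FILE_XFER_RE = re.compile(
    r"<file_xfer_error>\s*"
    r"<file_name>(?P<name>[^<]+)</file_name>\s*"
    r"<error_code>(?P<code>-?\d+)</error_code>\s*"
    r"</file_xfer_error>"
)

def file_xfer_errors(result_text):
    """Return (file_name, error_code, symbolic_name) for each <file_xfer_error> block."""
    found = []
    for m in FILE_XFER_RE.finditer(result_text):
        code = int(m.group("code"))
        found.append((m.group("name"), code, KNOWN_CODES.get(code, "unknown")))
    return found

# The <message> section copied from the opening post.
sample = """<message>
<file_xfer_error>
<file_name>E000682_042C_006x00214_1_2</file_name>
<error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
<file_name>E000682_042C_006x00214_1_3</file_name>
<error_code>-161</error_code>
</file_xfer_error>
</message>"""

for name, code, label in file_xfer_errors(sample):
    print(f"{name}: {code} ({label})")
```

Both entries name output files of the task (suffixes _2 and _3), which fits the reading that the application did not leave behind the files the client expected.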

Virus scanners are often the culprit, so disabling scanning of the BOINC data directory and its sub-directories is a good suggestion.

Less commonly, it can also be due to a buggy application, or a buggy WU where one or more files are missing or corrupt on the server. Transfer errors will be caught by the BOINC client on download, while corruption on disk before startup can be guarded against if WCG has enabled file verification (I can't check from my current location whether they're doing this or not...).

A corrupt client_state.xml is unlikely, but it is a small possibility. If this is the reason, you'll run into problems immediately at startup of the BOINC client, so if, for example, you've been running the erroring task for some time and it craps out without the BOINC client having been shut down in between, it's not due to client_state.xml.
----------------------------------------


"I make so many mistakes. But then just think of all the mistakes I don't make, although I might."
[Jun 9, 2009 9:41:01 AM]
Ingleside
Veteran Cruncher
Norway
Joined: Nov 19, 2005
Post Count: 974
Status: Offline
Re: Error code 161 out of nowhere

sevenveils,

Something must have interfered with writing a good client_state.xml from memory, maybe during a result download. One possible source is conflicting file locking by overzealous antivirus programs, so given that the BOINC data directory is pretty well sandboxed, you can tell the scanner to exclude that zone from scanning (find the path in the BOINC startup message log). I have a periodic full memory / disk scan scheduled anyhow.

Problems writing client_state.xml will show up in the log, and if it can't write after multiple attempts, the BOINC client should exit. On startup, if client_state.xml is write-protected, the BOINC client will terminate after waiting 30 seconds or so.

Now to remedy, the question is whether you have many jobs in the buffer. If the troublesome job is still listed in the task view of BOINC Manager, abort it. I suggest stopping work fetch (project tab of BOINC Manager), suspending any jobs that have not started, letting the running ones finish, and then doing a project reset. That will tell the servers to reassign the work not started.

A more radical alternative to that last step is to stop BOINC completely, delete client_state.xml and its backup copies in the data directory, and restart BOINC. That creates a new clean set.

Going by his quoted log snippets, he's already reported this task to the server, so...

The (partial?) remedy is coming with the next client version. There is really a whole lot of reading/writing to this client_state.xml, if I remember correctly sometimes 10x a second. Part of that is split off and moved to the slots where the progress data is kept for jobs underway. I am testing 6.6, but that one has known bugs such as an unstoppable client, so there is no point in trying it, in case you were about to volunteer.

v6.6.xx reduces how often the client writes, while v6.10.xx should move the checkpoint writes to the slot directories. Writes due to uploads/downloads and new or reported tasks will continue as before.

WCG using a ton of small files for uploads/downloads doesn't help here, and you'll likely see performance degradation from WCG sooner than from running other projects whose tasks are 1/10th as long, so that you've got 10x more tasks...
----------------------------------------
[Edit 1 times, last edit by Ingleside at Jun 9, 2009 10:08:55 AM]
[Jun 9, 2009 10:08:05 AM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: Error code 161 out of nowhere

sevenveils,

Something must have interfered with writing a good client_state.xml from memory, maybe during a result download. One possible source is conflicting file locking by overzealous antivirus programs, so given that the BOINC data directory is pretty well sandboxed, you can tell the scanner to exclude that zone from scanning (find the path in the BOINC startup message log). I have a periodic full memory / disk scan scheduled anyhow.

Problems writing client_state.xml will show up in the log, and if it can't write after multiple attempts, the BOINC client should exit. On startup, if client_state.xml is write-protected, the BOINC client will terminate after waiting 30 seconds or so.

What's the relevance of that to momentary file locking by AV while it scans the file? Do all problems actually get logged, and are all the standard log flags on?

Now to remedy, the question is whether you have many jobs in the buffer. If the troublesome job is still listed in the task view of BOINC Manager, abort it. I suggest stopping work fetch (project tab of BOINC Manager), suspending any jobs that have not started, letting the running ones finish, and then doing a project reset. That will tell the servers to reassign the work not started.

A more radical alternative to that last step is to stop BOINC completely, delete client_state.xml and its backup copies in the data directory, and restart BOINC. That creates a new clean set.

Going by his quoted log snippets, he's already reported this task to the server, so...

Yes, and the result can still be listed in the task view of BOINC (which, to kick in that open door, should cause more warning messages in a default config).

The (partial?) remedy is coming with the next client version. There is really a whole lot of reading/writing to this client_state.xml, if I remember correctly sometimes 10x a second. Part of that is split off and moved to the slots where the progress data is kept for jobs underway. I am testing 6.6, but that one has known bugs such as an unstoppable client, so there is no point in trying it, in case you were about to volunteer.

v6.6.xx reduces how often the client writes, while v6.10.xx should move the checkpoint writes to the slot directories. Writes due to uploads/downloads and new or reported tasks will continue as before.

WCG using a ton of small files for uploads/downloads doesn't help here, and you'll likely see performance degradation from WCG sooner than from running other projects whose tasks are 1/10th as long, so that you've got 10x more tasks...

I'm not sure how projects whose tasks always take a few seconds or minutes compare to the projects here: HCMD2 is the shortest with a mean of 2.45 hours and climbing, and all others are ~7 hours each. If there were a performance concern, the technicians would surely already have looked at it and remedied it, with 428,000 results having validated yesterday.
[Jun 9, 2009 10:32:28 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Error code 161 out of nowhere

Thanks muchly to you both for your answers! Good to know that something as central as client_state.xml isn't the problem, since I've got 2 long WUs that will finish in the next couple of hours. I'd hate for them to come up as errors too at the end!

I've taken the BOINC data directory out of my Avast auto-Shields--hopefully that will prevent any unwanted alteration/interference that could result in errors. Of course, a partially corrupted WU is something that a mere peon like me can do nothing about! ;)

Here's the message log--sorry for not including it in my first post. It starts with my restarting my machine, then one WU finishing (at the appropriate time), and then the problem WU finishing too early right afterwards for some reason, and missing its two output files.....

6/8/2009 7:00:03 PM|World Community Grid|Restarting task E000681_827C_006w04711_1 using cep1 version 632
6/8/2009 7:00:03 PM|World Community Grid|Restarting task E000682_042C_006x00214_1 using cep1 version 632
6/8/2009 7:02:36 PM|World Community Grid|Computation for task E000681_827C_006w04711_1 finished
6/8/2009 7:02:37 PM|World Community Grid|Starting E000690_745C_003z08500_1
6/8/2009 7:02:37 PM|World Community Grid|Starting task E000690_745C_003z08500_1 using cep1 version 632
6/8/2009 7:02:38 PM|World Community Grid|Started upload of E000681_827C_006w04711_1_0
6/8/2009 7:02:38 PM|World Community Grid|Started upload of E000681_827C_006w04711_1_1
6/8/2009 7:02:38 PM|World Community Grid|Sending scheduler request: To fetch work. Requesting 17181 seconds of work, reporting 0 completed tasks
6/8/2009 7:02:44 PM|World Community Grid|Scheduler request succeeded: got 1 new tasks
6/8/2009 7:02:46 PM|World Community Grid|Started download of E000692_312C_00400520f_004005.crd.gzb
6/8/2009 7:02:46 PM|World Community Grid|Started download of E000692_312C_00400520f_00400520f.rkrun.gzb
6/8/2009 7:02:47 PM|World Community Grid|Finished download of E000692_312C_00400520f_004005.crd.gzb
6/8/2009 7:02:47 PM|World Community Grid|Finished download of E000692_312C_00400520f_00400520f.rkrun.gzb
6/8/2009 7:02:47 PM|World Community Grid|Started download of E000692_312C_00400520f_004005.ewald
6/8/2009 7:02:47 PM|World Community Grid|Started download of E000692_312C_00400520f_0040.top.gzb
6/8/2009 7:02:48 PM|World Community Grid|Finished download of E000692_312C_00400520f_004005.ewald
6/8/2009 7:02:48 PM|World Community Grid|Finished download of E000692_312C_00400520f_0040.top.gzb
6/8/2009 7:02:48 PM|World Community Grid|Started download of E000692_312C_00400520f_masses_cep.inp.gzb
6/8/2009 7:02:48 PM|World Community Grid|Started download of E000692_312C_00400520f_004005.crystal
6/8/2009 7:02:49 PM|World Community Grid|Finished download of E000692_312C_00400520f_masses_cep.inp.gzb
6/8/2009 7:02:49 PM|World Community Grid|Finished download of E000692_312C_00400520f_004005.crystal
6/8/2009 7:02:49 PM|World Community Grid|Started download of E000692_312C_00400520f_par_all22_prot_cep.inp.gzb
6/8/2009 7:02:49 PM|World Community Grid|Computation for task E000682_042C_006x00214_1 finished
6/8/2009 7:02:49 PM|World Community Grid|Output file E000682_042C_006x00214_1_2 for task E000682_042C_006x00214_1 absent
6/8/2009 7:02:49 PM|World Community Grid|Output file E000682_042C_006x00214_1_3 for task E000682_042C_006x00214_1 absent
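A long message log like this can be scanned for the telltale "absent" lines with a short Python sketch. It assumes the pipe-separated layout seen above (timestamp, project, message) and the exact "Output file ... for task ... absent" wording from this log; the function name is made up for illustration.

```python
import re

# Wording copied from the log lines in this post.
ABSENT_RE = re.compile(r"Output file (?P<file>\S+) for task (?P<task>\S+) absent")

def absent_outputs(log_text):
    """Map each task name to the output files the client reported as absent."""
    missing = {}
    for line in log_text.splitlines():
        parts = line.split("|", 2)  # timestamp | project | message
        if len(parts) != 3:
            continue  # skip anything that is not a log line
        m = ABSENT_RE.search(parts[2])
        if m:
            missing.setdefault(m.group("task"), []).append(m.group("file"))
    return missing

sample_log = """6/8/2009 7:02:49 PM|World Community Grid|Computation for task E000682_042C_006x00214_1 finished
6/8/2009 7:02:49 PM|World Community Grid|Output file E000682_042C_006x00214_1_2 for task E000682_042C_006x00214_1 absent
6/8/2009 7:02:49 PM|World Community Grid|Output file E000682_042C_006x00214_1_3 for task E000682_042C_006x00214_1 absent"""

print(absent_outputs(sample_log))
```

The file names it reports are the same two that carry error code -161 in the result message from the opening post.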
----------------------------------------
[Edit 1 times, last edit by Former Member at Jun 9, 2009 5:28:23 PM]
[Jun 9, 2009 5:23:33 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: Error code 161 out of nowhere

To split hairs, this was what I looked at before and you probably did too, sevenveils:
ERR_NOT_FOUND -161

This happens when you have an inconsistent client_state.xml file. Files aren't written to it.
Task not found would be the error message.


http://boincfaq.mundayweb.com/index.php?language=1&view=77

It very clearly has the .xml suffix, but if the issue does not return, please mark the opening post's title by inserting [RESOLVED]

(who knows, maybe the wiki author made a mistake too ;>)

Added:

The "Output file absent" messages

6/8/2009 7:02:49 PM|World Community Grid|Output file E000682_042C_006x00214_1_2 for task E000682_042C_006x00214_1 absent
6/8/2009 7:02:49 PM|World Community Grid|Output file E000682_042C_006x00214_1_3 for task E000682_042C_006x00214_1 absent

have occasionally been reported.
----------------------------------------
[Edit 2 times, last edit by Sekerob at Jun 9, 2009 6:20:59 PM]
[Jun 9, 2009 5:29:17 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Error code 161 out of nowhere

Yes indeed, that's what Google brought me to!
Please see the message log I edited in above, though--looks like it may indeed be more of a "NOT FOUND" error...
Though the reason that would happen is still totally unclear to me.
[Jun 9, 2009 5:31:29 PM]
Ingleside
Veteran Cruncher
Norway
Joined: Nov 19, 2005
Post Count: 974
Status: Offline
Re: Error code 161 out of nowhere

To split hairs, this was what I looked at before and you probably did too, sevenveils:
ERR_NOT_FOUND -161

This happens when you have an inconsistent client_state.xml file. Files aren't written to it.
Task not found would be the error message.


http://boincfaq.mundayweb.com/index.php?language=1&view=77

It very clearly has the .xml suffix, but if the issue does not return, please mark the opening post's title by inserting [RESOLVED]

(who knows, maybe the wiki author made a mistake too ;>)

Well, I didn't look at any wiki; I looked directly at the source code instead. :)


A variation of the -161 is this:

<core_client_version>5.10.45</core_client_version>
<![CDATA[
<stderr_txt>

Model crashed: 
Leaving CPDN_Main::Monitor...
called boinc_finish

</stderr_txt>
<message>
<file_xfer_error>
<file_name>hadcm3l_pnw_ckgx_2000_1_000012765_14_1.zip</file_name>
<error_code>-161</error_code>
</file_xfer_error>

</message>
]]>


This particular error was due to an application bug, and had nothing to do with client_state.xml at all...

... well, ok, "application didn't create the file it was supposed to create" does mean you'll have an "inconsistent" client_state.xml, but it's not a BOINC client error, and the client does auto-recover from it, so...
[Jun 9, 2009 7:32:28 PM]