World Community Grid Forums
Thread Status: Active Total posts in this thread: 30
gb009761
Master Cruncher Scotland Joined: Apr 6, 2005 Post Count: 2987 Status: Offline |
orangepeel13, if all of your SCC jobs have failed so far, could you please post the error messages, i.e., the result logs and your BOINC Event Log? Include both the excerpt from around the time the SCC jobs try to run and abort, and the first couple of dozen lines from the top (i.e., your configuration). With those, we may be able to find out why you're having a zero success rate with this project. |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
One of the key suspects is security software preventing the loading and/or saving of files for a new science application.
|
||
|
TPCBF
Master Cruncher USA Joined: Jan 2, 2011 Post Count: 1957 Status: Offline |
There is/are also Win7 machine(s) returning errored WU's...
I am seeing this too with almost all WUs that I currently have in PVa jail...
ETA: e.g., one of them: Microsoft Windows 7, Enterprise x64 Edition, Service Pack 1
Also quite a few where the wingmen are running older versions of Linux...
Ralf
[Edit 1 times, last edit by TPCBF at Jan 30, 2017 9:14:00 PM] |
||
|
orangepeel13
Cruncher USA Joined: Jul 22, 2014 Post Count: 11 Status: Offline |
"orangepeel13, if all of your SCC jobs have failed so far, please can you post the error message(s) - i.e., the logs and your BOINC Event Log - both the excerpt at/around when the SCC jobs try to run/abort and also the first couple of dozen lines from the top (i.e., your configuration), as then, we may be able to find out as to why you're having a zero-success rate with this project."

Well, after failing 6 - 10 tasks on each of the machines, the tasks started to complete and validate. I don't know what happened, but I am glad they are working now. It may have been a file download problem; there are several error messages in the BOINC event log:

Sat 28 Jan 2017 02:21:27 AM EST | | Project communication failed: attempting access to reference site
Sat 28 Jan 2017 02:21:27 AM EST | World Community Grid | Temporarily failed download of scc1_image04_7.08.tga: transient HTTP error
Sat 28 Jan 2017 02:21:27 AM EST | World Community Grid | Started download of scc1_image05_7.08.tga
Sat 28 Jan 2017 02:21:28 AM EST | | Internet access OK - project servers may be temporarily down.

The errors all looked like:

<core_client_version>7.2.42</core_client_version>
<![CDATA[
<message>
app_version download error: couldn't get input files:
<file_xfer_error>
<file_name>wcgrid_scc1_vina_7.08_x86_64-pc-linux-gnu</file_name>
<error_code>-120 (RSA key check failed for file)</error_code>
<error_message>signature verification failed</error_message>
</file_xfer_error>
</message>
]]> |
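For anyone wanting to pull the failed downloads out of a longer event log before posting, here is a minimal sketch in Python. It assumes the log has been saved as plain text in the same `timestamp | project | message` layout as the excerpt above; the function names `parse_line` and `failed_downloads` are made up for illustration and are not part of BOINC.

```python
# Hypothetical helper: not part of BOINC. Assumes each event log line has the
# form "timestamp | project | message", as in the excerpt posted above.

def parse_line(line):
    """Split one event log line into (timestamp, project, message)."""
    parts = [p.strip() for p in line.split("|", 2)]
    if len(parts) != 3:
        return None  # not an event log line in the assumed layout
    return tuple(parts)


def failed_downloads(lines):
    """Return (file_name, reason) for each 'Temporarily failed download' message."""
    prefix = "Temporarily failed download of "
    results = []
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        _timestamp, _project, message = parsed
        if message.startswith(prefix):
            # Message looks like "<prefix><file name>: <reason>"
            name, _, reason = message[len(prefix):].partition(": ")
            results.append((name, reason))
    return results


sample_log = [
    "Sat 28 Jan 2017 02:21:27 AM EST | | Project communication failed: attempting access to reference site",
    "Sat 28 Jan 2017 02:21:27 AM EST | World Community Grid | Temporarily failed download of scc1_image04_7.08.tga: transient HTTP error",
    "Sat 28 Jan 2017 02:21:27 AM EST | World Community Grid | Started download of scc1_image05_7.08.tga",
    "Sat 28 Jan 2017 02:21:28 AM EST | | Internet access OK - project servers may be temporarily down.",
]

for name, reason in failed_downloads(sample_log):
    print(name, "->", reason)
```

Run against the four lines quoted in the post, this should list only the `scc1_image04_7.08.tga` download. It does not look at the `-120` signature errors, which appear in the result log rather than the event log.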
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
"When a new project is launched, each device gets a new chance to prove itself for the new project. For devices that return nothing but errors, it will still take a bit before they reach the point where they are no longer sent work units at all. This can skew the proportion of repair jobs in the beginning hours of a new project, but it should reach a more normal state fairly quickly."

Hmm, still happening 9 days later to a wingman with the same device "signature":

SCC1_ 0000009_ Bct-A_ 19363_ 0-- Microsoft Windows 8.1 Enterprise x64 Edition, (06.03.9600.00) 708 Error 2/4/17 12:47:42 2/4/17 12:50:07 0.00 71.2 / 0.0

<core_client_version>7.2.47</core_client_version>
<![CDATA[
<message>
couldn't start app: CreateProcess() failed - A required privilege is not held by the client. (0x522) |
||
|
seippel
Former World Community Grid Tech Joined: Apr 16, 2009 Post Count: 392 Status: Offline |
"When a new project is launched, each device gets a new chance to prove itself for the new project. For devices that return nothing but errors, it will still take a bit before they reach the point where they are no longer sent work units at all. This can skew the proportion of repair jobs in the beginning hours of a new project, but it should reach a more normal state fairly quickly."

"Hmm, still happening 9 days later to a wingman with the same device 'signature':"

My apologies, the part of the post where I said they wouldn't be sent work at all anymore wasn't accurate. Even machines that return nothing but errors will still get a very small number of work units a day in order to test whether they are fixed. Also, as your machines become trusted for this project, they won't need a wingman to verify them as often, although they will still sometimes get a wingman through random verification, and they may get selected as a wingman for someone else who needs verification.
Seippel |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
"When a new project is launched, each device gets a new chance to prove itself for the new project. For devices that return nothing but errors, it will still take a bit before they reach the point where they are no longer sent work units at all. This can skew the proportion of repair jobs in the beginning hours of a new project, but it should reach a more normal state fairly quickly."

"Hmm, still happening 9 days later to a wingman with the same device 'signature':"

Thanks, Al, that explains why I keep on noticing the occasional case. |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
It's a very old rule... The quota goes down to 1 per day if a device keeps on returning errors for an app. With a non-error result it goes back up to 2, 4, 8, etc., back and forth. I guess it is in some old FAQ by the undersigned.
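To make the rule concrete, here is a rough sketch in Python of how such a per-device, per-app daily quota could move. Only the floor of 1 per day and the 2, 4, 8 progression come from the description above; the halving on each error, the cap of 64, and the names `adjust_quota` and `simulate` are assumptions for illustration, not the actual server code.

```python
# Illustrative sketch only. The post states a floor of 1 per day and a
# 2, 4, 8 ... recovery; halving on error and the cap are assumptions.

def adjust_quota(quota, returned_error, cap=64):
    """Return the next daily quota for one device/app pair."""
    if returned_error:
        # Assumed: each error halves the quota, never below 1 per day.
        return max(1, quota // 2)
    # A non-error result doubles the quota, up to an assumed cap.
    return min(cap, quota * 2)


def simulate(start, outcomes, cap=64):
    """Apply a sequence of outcomes (True = error) and record each quota."""
    history = []
    quota = start
    for returned_error in outcomes:
        quota = adjust_quota(quota, returned_error, cap)
        history.append(quota)
    return history


# Four errors in a row, then three good results.
print(simulate(8, [True, True, True, True, False, False, False]))
```

Under these assumptions a device that keeps erroring sits at 1 work unit per day, which matches the point that failing machines still receive a trickle of work to test whether they have been fixed.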
|
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Raising this issue again, because I'm sad to notice that 10 wingman machines, out of a recent download of 21 units, had errored with either
couldn't start app: CreateProcess() failed - A required privilege is not held by the client. (0x522)
or
couldn't start app: Can't get shared memory segment name: shmget() failed
In all these cases, the OS type was Microsoft Windows 8.1, Enterprise x64 Edition, (06.03.9600.00). Again, I suspect that one or two machine farms are causing these errors (the wingman Sent Times were all one of two times, identical to the second), and they must be large groups of machines for me to keep noticing them. It's sad for 2 reasons: they're erroring instead of producing useful work, and they are Replication 2, thereby causing 2 trusted machines to work unnecessarily on the same unit. |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I have to agree with tony: 90% of all repair jobs I get, over a dozen today, are for this same reason. This is across all projects, not just SCC1.
----------------------------------------[Edit 1 times, last edit by Former Member at Jun 5, 2017 10:25:38 PM] |
||
|