Thread Status: Active
Total posts in this thread: 20
This topic has been viewed 2179 times and has 19 replies
gj82854
Advanced Cruncher
Joined: Sep 26, 2022
Post Count: 104
Re: At least one task ending in Error, Error Log apparently showing successful result

Are there any errors showing up on version 6 of the kernel or later releases of libc? All the errors here seem to be on Version 5 of the Linux kernel and older libc releases.
[Feb 6, 2025 1:25:35 PM]
MarkH
Advanced Cruncher
United States of America
Joined: May 16, 2020
Post Count: 56

Hello all. For what it's worth, I've had at least two ARP units end with "Computation Error" in the last day or so. MCM units remain unaffected. I only run 1 ARP unit at a time, so I can't show a long-term trend myself. Maybe the logs would tell someone with more knowledge something.
----------------------------------------
"That science of the people, by the people, for the people, shall not perish from the Earth."
[Feb 7, 2025 3:50:37 AM]
adriverhoef
Master Cruncher
The Netherlands
Joined: Apr 3, 2009
Post Count: 2160

What did your wingmen do with their ARP1-tasks, Mark, did they turn in a valid result? And in what generation were the workunits?

Adri
[Feb 7, 2025 10:56:40 AM]
MarkH
Advanced Cruncher
United States of America
Joined: May 16, 2020
Post Count: 56

Hello, Adri.

I'm sorry, I don't understand what you mean by "wingmen", so I cannot answer your first question. (I'm not technically adept in the WCG infrastructure.) Are those the multiple copies being run (xxx_1, xxx_2, etc.)?

Here are the ARP WUs that failed with "Computation Error":

https://www.worldcommunitygrid.org/contribution/workunit/660721703
ARP1_0018981_136_1

https://www.worldcommunitygrid.org/contribution/workunit/655572704
ARP1_0029596_133_3
----------------------------------------
"That science of the people, by the people, for the people, shall not perish from the Earth."
[Feb 8, 2025 3:09:12 AM]
adriverhoef
Master Cruncher
The Netherlands
Joined: Apr 3, 2009
Post Count: 2160

Hi Mark, I will explain.
Thanks for supplying the links to the workunits; this makes it so much easier!
If you look at the URL www.worldcommunitygrid.org/contribution/workunit/660721703, you'll notice the last three parts:
contribution / workunit / 660721703, where 660721703 is the numerical ID of a workunit.
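
For anyone who wants to pull the ID out of such a link automatically, here is a minimal sketch. It assumes only the URL layout described above (contribution / workunit / numerical ID); the function name `workunit_id_from_url` is made up for this example and is not part of any WCG tool.

```python
from urllib.parse import urlparse


def workunit_id_from_url(url: str) -> int:
    """Return the numerical workunit ID from a workunit URL (hypothetical helper)."""
    # Tolerate URLs written without a scheme, as in the post above.
    if "://" not in url:
        url = "https://" + url
    # Split the path into its non-empty parts.
    parts = [p for p in urlparse(url).path.split("/") if p]
    # Assumption: the path ends in contribution / workunit / <id>.
    if len(parts) < 3 or parts[-3:-1] != ["contribution", "workunit"]:
        raise ValueError(f"not a workunit URL: {url}")
    return int(parts[-1])


print(workunit_id_from_url("www.worldcommunitygrid.org/contribution/workunit/660721703"))
```
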
If you inspect a workunit, you'll see - in the first case of workunit 660721703 - this, essentially:

ARP1_0018981_136 [*1]

Result name [*2]   OS type              Status      Sent time           Due / Return time   CPUtime/Elapsed
ARP1_0018981_136_0 Microsoft Windows 10 In Progress 2025-02-06 13:27:37 2025-02-09 13:27:37 -/-
ARP1_0018981_136_1 Microsoft Windows 10 Error       2025-02-06 13:27:38 2025-02-06 20:51:51 6.99/6.99
ARP1_0018981_136_2 Microsoft Windows 10 In Progress 2025-02-06 20:51:55 2025-02-09 20:51:55 -/-


(*1) This is the name of the workunit.
(*2) This lists the tasks (or results) that make up the workunit.
You can see that this workunit consists of three tasks, each of whose names begins with the name of the workunit (see *1).
Each task has its own suffix: _0, _1 and _2.
Since your computer (or client) ran the task with suffix _1 (ARP1_0018981_136_1 in full), the other tasks are each running on somebody else's client; they are your wingmen.
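
The naming scheme above can be sketched in a few lines of Python. This is only an illustration based on the names shown in this thread: it assumes a result name is always the workunit name, an underscore, and a numeric suffix. The helper names `split_result_name` and `wingmen` are invented for this example.

```python
def split_result_name(result_name: str) -> tuple[str, int]:
    """Split e.g. 'ARP1_0018981_136_1' into ('ARP1_0018981_136', 1)."""
    # Assumption: the task suffix is whatever follows the last underscore.
    workunit, _, suffix = result_name.rpartition("_")
    if not workunit or not suffix.isdigit():
        raise ValueError(f"unexpected result name: {result_name}")
    return workunit, int(suffix)


def wingmen(my_result: str, all_results: list[str]) -> list[str]:
    """Return the other results that belong to the same workunit as my_result."""
    my_workunit, _ = split_result_name(my_result)
    return [
        r for r in all_results
        if r != my_result and split_result_name(r)[0] == my_workunit
    ]


results = ["ARP1_0018981_136_0", "ARP1_0018981_136_1", "ARP1_0018981_136_2"]
print(split_result_name("ARP1_0018981_136_1"))
print(wingmen("ARP1_0018981_136_1", results))
```
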

Since neither of your wingmen in workunit 660721703 (the ones running the tasks with suffixes _0 and _2) has returned a result yet, we can only wait and see what they deliver.

Adri
[Feb 8, 2025 10:38:26 AM]
adriverhoef
Master Cruncher
The Netherlands
Joined: Apr 3, 2009
Post Count: 2160

Hi Mark,
Looking at the error log of your task, the error seems to revolve around the words "Access Violation (0xc0000005) at address 0x02759D59". I've found this post (#619575) in which "Access Violation" is 'translated' into the Linux universe wording "Segmentation violation". We'll see what the other clients (your wingmen) are saying …

Adri
[Feb 8, 2025 10:57:43 AM]
Sgt.Joe
Ace Cruncher
USA
Joined: Jul 4, 2006
Post Count: 7665

A "segmentation violation" is an indication of some type of memory problem. It is not very specific.

Cheers
----------------------------------------
Sgt. Joe
*Minnesota Crunchers*
[Feb 8, 2025 3:53:08 PM]
adriverhoef
Master Cruncher
The Netherlands
Joined: Apr 3, 2009
Post Count: 2160

Alright!
Mark, we have a Pending Validation now:
Result name        OS type  Status   Sent time           Due / Return time   CPUtime/Elapsed Claimed/Granted
ARP1_0018981_136_0 MSWin 10 No Reply 2025-02-06 13:27:37 2025-02-09 13:27:37 -/-             -/-
ARP1_0018981_136_1 MSWin 10 Error    2025-02-06 13:27:38 2025-02-06 20:51:51 6.99/6.99       281.3/0
ARP1_0018981_136_2 MSWin 10 No Reply 2025-02-06 20:51:55 2025-02-09 20:51:55 -/-             -/-
ARP1_0018981_136_3 MSWin 11 P. Val.  2025-02-09 13:27:42 2025-02-10 12:28:32 13.29/16.87     672.6/0
ARP1_0018981_136_4 MSWin 11 In Prog. 2025-02-09 20:51:58 2025-02-12 20:51:58 -/-             -/-

In general, this means your Error has nothing to do with the workunit itself (ARP1_0018981_136): one of your wingmen (the one with suffix _3) returned a result that was not in error. So there is a strong indication that the problem lies with your computer or its configuration, I'm afraid.
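
The reasoning here can be written down as a small rule of thumb. The sketch below is only a heuristic, not a diagnosis: it assumes the status strings exactly as abbreviated in the table above ("Error", "P. Val.", "Valid"), and the function name `error_looks_client_side` is made up for this example.

```python
# Statuses copied from the table above, at the time of posting.
STATUSES = {
    "ARP1_0018981_136_0": "No Reply",
    "ARP1_0018981_136_1": "Error",
    "ARP1_0018981_136_2": "No Reply",
    "ARP1_0018981_136_3": "P. Val.",
    "ARP1_0018981_136_4": "In Prog.",
}

# Assumption: these statuses mean a wingman returned a result without an error.
RETURNED_OK = {"P. Val.", "Valid"}


def error_looks_client_side(statuses: dict[str, str], my_result: str) -> bool:
    """True if my result errored while at least one wingman returned a result without error."""
    if statuses[my_result] != "Error":
        return False
    return any(
        status in RETURNED_OK
        for name, status in statuses.items()
        if name != my_result
    )


print(error_looks_client_side(STATUSES, "ARP1_0018981_136_1"))
```

If no wingman has returned anything yet, the function returns False, which only means "no indication either way".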

Adri

PS For the sake of completeness, three minutes ago three tasks from the workunit were marked 'Valid':
Result name        OS type  Status   Sent time           Due / Return time   CPUtime/Elapsed Claimed/Granted
ARP1_0018981_136_0 MSWin 10 No Reply 2025-02-06 13:27:37 2025-02-09 13:27:37 -/-             -/-
ARP1_0018981_136_1 MSWin 10 Error    2025-02-06 13:27:38 2025-02-06 20:51:51 6.99/6.99       281.3/0
ARP1_0018981_136_2 MSWin 10 Valid    2025-02-06 20:51:55 2025-02-11 07:05:47 34.58/34.97     1,192.6/741.3
ARP1_0018981_136_3 MSWin 11 Valid    2025-02-09 13:27:42 2025-02-10 12:28:32 13.29/16.87     672.6/741.3
ARP1_0018981_136_4 MSWin 11 Valid    2025-02-09 20:51:58 2025-02-11 14:53:09 19.15/20.16     810/741.3

----------------------------------------
[Edit 1 times, last edit by adriverhoef at Feb 11, 2025 3:36:48 PM]
[Feb 10, 2025 1:41:50 PM]
Crystal Pellet
Veteran Cruncher
Joined: May 21, 2008
Post Count: 1320

Very rarely, one of my ARP1 tasks also runs into an error at runtime.
Examples:
https://www.worldcommunitygrid.org/contribution/workunit/657952170
https://www.worldcommunitygrid.org/contribution/workunit/657429818
I also suspect this is caused by memory access failures.
The software was probably developed and tested on a standalone system running a single task.
As long as it does not happen too often, it is nothing to worry about.
----------------------------------------
[Edit 1 times, last edit by Crystal Pellet at Feb 10, 2025 3:33:02 PM]
[Feb 10, 2025 3:29:52 PM]
adriverhoef
Master Cruncher
The Netherlands
Joined: Apr 3, 2009
Post Count: 2160

Hi Crystal Pellet,
Just noticed that one of your wingmen returned their result much too late and I suspect that they will get a nasty surprise when they look into their Results:
Result name        OS type  Status   Sent time           Due / Return time   CPUtime/Elapsed Claimed/Granted
ARP1_0022688_140_0 MSWin 11 Too Late 2025-01-31 07:00:58 2025-02-09 16:56:49 18.1/94.65      498.8/0
ARP1_0022688_140_1 MSWin 10 Valid    2025-01-31 07:00:57 2025-02-02 07:28:41 17.12/17.67     639.6/615.8
ARP1_0022688_140_2 MSWin 10 Error    2025-02-06 07:01:27 2025-02-06 15:24:43 5.49/5.49       253.7/0
ARP1_0022688_140_3 MSWin 11 Valid    2025-02-06 15:25:28 2025-02-08 12:59:55 5.58/11.8       592/615.8

Adri
[Feb 11, 2025 3:46:05 PM]