This topic has been viewed 3125 times and has 14 replies
Blount
Senior Cruncher
Joined: Aug 19, 2005
Post Count: 463
Status: Offline
files missing to upload(?) after completed

Finished and then this error:

7/23/2022 2:17:03 PM | World Community Grid | Computation for task ARP1_0034797_127_2 finished
7/23/2022 2:17:03 PM | World Community Grid | Output file ARP1_0034797_127_2_r1538955219_0 for task ARP1_0034797_127_2 absent
7/23/2022 2:17:03 PM | World Community Grid | Output file ARP1_0034797_127_2_r1538955219_1 for task ARP1_0034797_127_2 absent
7/23/2022 2:17:03 PM | World Community Grid | Output file ARP1_0034797_127_2_r1538955219_2 for task ARP1_0034797_127_2 absent
7/23/2022 2:17:03 PM | World Community Grid | Output file ARP1_0034797_127_2_r1538955219_3 for task ARP1_0034797_127_2 absent
7/23/2022 2:17:03 PM | World Community Grid | Output file ARP1_0034797_127_2_r1538955219_4 for task ARP1_0034797_127_2 absent
7/23/2022 2:17:03 PM | World Community Grid | Output file ARP1_0034797_127_2_r1538955219_5 for task ARP1_0034797_127_2 absent
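
For anyone who wants to pull these lines out of a longer event log, here is a minimal Python sketch. It assumes the log text has been copied out of the BOINC Manager and that the lines keep the "Output file ... for task ... absent" wording shown above; the function name `absent_outputs` is made up for this illustration.

```python
import re

# Matches the "Output file <file> for task <task> absent" wording from the log above.
ABSENT_RE = re.compile(r"Output file (\S+) for task (\S+) absent")


def absent_outputs(log_text):
    """Map each task name to the list of output files reported absent."""
    missing = {}
    for line in log_text.splitlines():
        match = ABSENT_RE.search(line)
        if match:
            filename, task = match.groups()
            missing.setdefault(task, []).append(filename)
    return missing


# Two of the lines from the post above, used as sample input.
sample = """\
7/23/2022 2:17:03 PM | World Community Grid | Computation for task ARP1_0034797_127_2 finished
7/23/2022 2:17:03 PM | World Community Grid | Output file ARP1_0034797_127_2_r1538955219_0 for task ARP1_0034797_127_2 absent
7/23/2022 2:17:03 PM | World Community Grid | Output file ARP1_0034797_127_2_r1538955219_1 for task ARP1_0034797_127_2 absent
"""

for task, files in absent_outputs(sample).items():
    print(task, "is missing", len(files), "output file(s)")
```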
[Jul 23, 2022 11:01:18 PM]
adriverhoef
Master Cruncher
The Netherlands
Joined: Apr 3, 2009
Post Count: 2156
Status: Recently Active
Re: files missing to upload(?) after completed

This looks like a Computation error.
Please provide a link to the workunit.
Something beginning with https://www.worldcommunitygrid.org/contribution/workunit/…
[Jul 24, 2022 12:02:11 AM]
alanb1951
Veteran Cruncher
Joined: Jan 20, 2006
Post Count: 953
Status: Offline
Re: files missing to upload(?) after completed

Adri,

If we're lucky, this will turn out to be an out of memory problem as we've seen reported elsewhere a couple of times recently (when it can't allocate 24MB for needs around a checkpoint); if we're unlucky, we might be heading for another stuck unit :-(

Hopefully, the OP will provide the work unit number as you asked, or will report back on whether the web-site result report says SIGSEGV or something else...

Cheers - Al.

[Edit.] P.S. I noticed you just picked Mike up about malloc errors in another thread -- I decided to give him a pass on that one as it is (sort of) an abort, albeit not a SIGSEGV or a deliberate user/server intervention; perhaps I shouldn't have :-)
----------------------------------------
[Edit 1 times, last edit by alanb1951 at Jul 24, 2022 12:50:58 AM]
[Jul 24, 2022 12:33:18 AM]
adriverhoef
Master Cruncher
The Netherlands
Joined: Apr 3, 2009
Post Count: 2156
Status: Recently Active
Re: files missing to upload(?) after completed

"P.S. I noticed you just picked Mike up about malloc errors in another thread -- I decided to give him a pass on that one as it is (sort of) an abort, albeit not a SIGSEGV or a deliberate user/server intervention; perhaps I shouldn't have :-)"

Al, Mike is a good guy. I feel like I'm often trying to teach him precisely where to look. 8-)

EDIT: I think Mike hates me for that, because it seems he never has a real conversation with me (and as long as there's no need, that's fine with me). It's just that I hate to see anyone providing false information or wrong explanations (and I think I know better :-D). In the meantime I'm just trying to learn from the situation, trying to express myself in the right way and trying to avoid treading on long toes (= stepping on toes).

Adri
----------------------------------------
[Edit 2 times, last edit by adriverhoef at Jul 24, 2022 9:33:01 AM]
[Jul 24, 2022 8:56:51 AM]
Blount
Senior Cruncher
Joined: Aug 19, 2005
Post Count: 463
Status: Offline
Re: files missing to upload(?) after completed

https://www.worldcommunitygrid.org/contribution/results/248654890/log


Results log


<core_client_version>7.16.11</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code 3221225477 (0xc0000005)</message>
<stderr_txt>
INFO: Initializing
INFO: No state to restore. Start from the beginning.
Starting WRFMain
[10:57:22] INFO: Checkpoint taken at 2019-03-12_06:00:00
[11:57:54] INFO: Checkpoint taken at 2019-03-12_12:00:00
[12:53:53] INFO: Checkpoint taken at 2019-03-12_18:00:00
[13:30:12] INFO: Checkpoint taken at 2019-03-13_00:00:00
[14:12:24] INFO: Checkpoint taken at 2019-03-13_06:00:00


Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x00007FF7E28F8E2D read attempt to address 0x5B4506F0

Engaging BOINC Windows Runtime Debugger...



********************


BOINC Windows Runtime Debugger Version 7.15.0


Dump Timestamp : 07/23/22 14:17:02
Install Directory : C:\Program Files\BOINC\
Data Directory : D:\ProgramData\BOINC
Project Symstore :
LoadLibraryA( D:\ProgramData\BOINC\dbghelp.dll ): GetLastError = 126
Loaded Library : dbghelp.dll
LoadLibraryA( D:\ProgramData\BOINC\symsrv.dll ): GetLastError = 126
LoadLibraryA( symsrv.dll ): GetLastError = 126
LoadLibraryA( D:\ProgramData\BOINC\srcsrv.dll ): GetLastError = 126
LoadLibraryA( srcsrv.dll ): GetLastError = 126
LoadLibraryA( D:\ProgramData\BOINC\version.dll ): GetLastError = 126
Loaded Library : version.dll
Debugger Engine : 4.0.5.0
[Jul 24, 2022 5:00:16 PM]
adriverhoef
Master Cruncher
The Netherlands
Joined: Apr 3, 2009
Post Count: 2156
Status: Recently Active
Re: files missing to upload(?) after completed

Okay Blount, thanks. You have provided the link to the error log of your task, which helps a bit.

It would've been better if you had provided the link to the workunit. (You can find that link if you click on the name of your task on the Results page.)
Example


Now, from your Results log, what strikes one most is the exit code 3221225477. I believe I can only find two other posts on the forum with that exit code in combination with African Rainfall:
  • this post by Crystal Pellet and
  • this post by erich56.

    Crystal Pellet's conclusion is: "The amount of memory is not the problem."
    However, he also says: "I think running ARP's together with 4-core VM's for LHC-ATLAS is sometimes conflicting."

    And erich56 concludes with: "this a a fairly old PC. So I am not too surprised that it cannot meet the challenges which ARP is posing to a system."

    I cannot see much more in my crystal ball, Blount, it's been hazy all day. I wonder why.
    [Jul 24, 2022 5:49:14 PM]
    Kirel2
    Advanced Cruncher
    United States
    Joined: Sep 24, 2014
    Post Count: 99
    Status: Offline
    Re: files missing to upload(?) after completed

    ----------------------------------------

    [Jul 24, 2022 9:23:04 PM]
    alanb1951
    Veteran Cruncher
    Joined: Jan 20, 2006
    Post Count: 953
    Status: Offline
    Re: files missing to upload(?) after completed

    Adri,

    Good follow-up post above regarding Blount's result error, but as another one has shown up I thought I'd just add a little bit of data regarding possible completion of the work unit(s) concerned :-)

    Firstly, on Windows, exit code 3221225477 (0xc0000005) is specifically an Access Violation (as per the report itself!); I suspect the scarcity of examples may have as much to do with how users post about errors as anything else (and the hex format report seems more commonly used here and elsewhere...)
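
To see that the decimal and hex forms are the same number, here is a small Python sketch. The lookup table holds only the one code discussed in this thread (its Windows name is STATUS_ACCESS_VIOLATION) and is not a complete list; the function name `describe_exit_code` is made up for this illustration.

```python
# Only the one code discussed in this thread; not a complete NTSTATUS table.
KNOWN_CODES = {0xC0000005: "STATUS_ACCESS_VIOLATION"}


def describe_exit_code(code):
    """Return the 32-bit hex form of an exit code, plus its name if known."""
    unsigned = code & 0xFFFFFFFF  # also copes with codes reported as signed 32-bit values
    name = KNOWN_CODES.get(unsigned, "unknown")
    return f"0x{unsigned:08x} ({name})"


print(describe_exit_code(3221225477))  # prints "0xc0000005 (STATUS_ACCESS_VIOLATION)"
```

Searching the forum for both the decimal and the hex form may turn up more reports than searching for the decimal form alone.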

    Now, as for the two work units with Errors now showing up in this thread... Unlike the [usually] identical stack traces we're used to for genuinely stuck unit SIGSEGVs on Linux, these two produced very different diagnostics, and Kirel2's stack trace is really strange!

    When I feel I need to reply to a post with only a result number, I've been using the "Swagger" GUI interface to the API to find out work unit names; in this case it seems Blount's WU is 152524445 (ARP1_0034797_127). Fortunately, that one seems to have a good chance of completing (only one No Reply!).
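
If only the work unit name is needed (not its numeric ID, which still requires a lookup on the website or through the API), it can be derived from the task name. This is a minimal Python sketch assuming the naming seen in this thread, where a task name is the work unit name plus a trailing "_<n>" replica number; the function name `workunit_name` is made up for this illustration.

```python
def workunit_name(result_name):
    """Strip the trailing "_<n>" replica suffix from a result (task) name."""
    base, sep, suffix = result_name.rpartition("_")
    if not sep or not suffix.isdigit():
        raise ValueError(f"no numeric replica suffix in {result_name!r}")
    return base


print(workunit_name("ARP1_0034797_127_2"))  # prints "ARP1_0034797_127"
```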

    However, Kirel2's WU is on its last chance (having 2 No Reply, 1 User Aborted and one Pending Validation as well as the Error...) so ARP1_0019859_126 may end up stalled.

    I'll watch out for these two work units in case either does stall...

    Hope that's of some use to someone :-)

    Cheers - Al.

    P.S. Your edit to your reply to my earlier post exactly tags my feelings on that sort of thing. However, I feel there are situations where it just isn't worth adding to the post count, and others (usually on other BOINC sites) where one knows a correction will be met with a hostile response...
    [Jul 24, 2022 11:19:59 PM]
    adriverhoef
    Master Cruncher
    The Netherlands
    Joined: Apr 3, 2009
    Post Count: 2156
    Status: Recently Active
    Re: files missing to upload(?) after completed

    "I had the same error code (3221225477) on this work unit: https://www.worldcommunitygrid.org/contribution/workunit/152511460"

    That is indeed what we would be looking for, Kirel2. Thank you for that.

    This is basically what we see:

    Result name         OS type   Status              Sent time            Due / Return time
    ARP1_0019859_126_0  MSWin 10  No Reply            2022-07-15 13:04:36  2022-07-21 13:04:36
    ARP1_0019859_126_1  MSWin 7   Pending Validation  2022-07-15 13:04:32  2022-07-17 07:30:47
    ARP1_0019859_126_2  MSWin 10  No Reply            2022-07-21 13:04:42  2022-07-24 13:04:42
    ARP1_0019859_126_3  MSWin 10  User Aborted        2022-07-24 13:06:52  2022-07-24 13:08:55
    ARP1_0019859_126_4  MSWin 11  Error               2022-07-24 13:09:13  2022-07-24 17:12:32
    ARP1_0019859_126_5  MSWin 10  In Progress         2022-07-24 17:12:41  2022-07-27 17:12:41


    So we can derive that it is possible to complete the task, because ARP1_0019859_126_1 completed without error and is awaiting its final validation.

    Since the task can be completed, and since tasks normally complete without throwing errors, we can draw this conclusion:
    - There must have been some anomaly on the device that caused the task to error out at the time. Knowing nothing apart from the name of the OS, the exact cause is anyone's guess.

    I would like to add that with healthy hardware, the first thing to suspect with such an error is a combination of running processes getting in each other's way. Let's just say that it's a possibility.
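
The status counts for this work unit can be tallied from the table rows with a short Python sketch. The rows are copied from the table in this post; the regular expression assumes every OS entry has the form "MSWin <number>" as it does here, and the function name `tally_statuses` is made up for this illustration.

```python
import re
from collections import Counter

# Rows copied from the table in this post.
ROWS = """\
ARP1_0019859_126_0 MSWin 10 No Reply 2022-07-15 13:04:36 2022-07-21 13:04:36
ARP1_0019859_126_1 MSWin 7 Pending Validation 2022-07-15 13:04:32 2022-07-17 07:30:47
ARP1_0019859_126_2 MSWin 10 No Reply 2022-07-21 13:04:42 2022-07-24 13:04:42
ARP1_0019859_126_3 MSWin 10 User Aborted 2022-07-24 13:06:52 2022-07-24 13:08:55
ARP1_0019859_126_4 MSWin 11 Error 2022-07-24 13:09:13 2022-07-24 17:12:32
ARP1_0019859_126_5 MSWin 10 In Progress 2022-07-24 17:12:41 2022-07-27 17:12:41
"""

TIMESTAMP = r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}"

# Result name, OS type, status (may contain spaces), sent time, due/return time.
ROW_RE = re.compile(
    rf"^(?P<result>\S+)\s+(?P<os>MSWin\s+\d+)\s+(?P<status>.+?)\s+"
    rf"(?P<sent>{TIMESTAMP})\s+(?P<due>{TIMESTAMP})$"
)


def tally_statuses(rows_text):
    """Count how many results of a work unit are in each status."""
    counts = Counter()
    for line in rows_text.splitlines():
        match = ROW_RE.match(line.strip())
        if match:
            counts[match.group("status")] += 1
    return counts


counts = tally_statuses(ROWS)
for status, number in counts.most_common():
    print(f"{status}: {number}")
```

A single Pending Validation result shows that the task can run to completion; whether the work unit itself validates still depends on the results that have yet to come in.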
    ----------------------------------------
    [Edit 1 times, last edit by adriverhoef at Jul 25, 2022 9:23:28 AM]
    [Jul 24, 2022 11:52:27 PM]
    Kirel2
    Advanced Cruncher
    United States
    Joined: Sep 24, 2014
    Post Count: 99
    Status: Offline
    Re: files missing to upload(?) after completed

    My crunching machine is just my home desktop; let me know if you need any hardware specs or any additional info about the system. This is the first time since the WCG test WUs started going out that I've noticed any kind of issue. My machine was able to process https://www.worldcommunitygrid.org/contribution/workunit/152550562 with no problems, for example.
    ----------------------------------------

    [Jul 24, 2022 11:58:53 PM]