World Community Grid Forums
Thread Status: Active Total posts in this thread: 15
Blount
Senior Cruncher Joined: Aug 19, 2005 Post Count: 463 Status: Offline
Finished and then this error:
7/23/2022 2:17:03 PM | World Community Grid | Computation for task ARP1_0034797_127_2 finished
7/23/2022 2:17:03 PM | World Community Grid | Output file ARP1_0034797_127_2_r1538955219_0 for task ARP1_0034797_127_2 absent
7/23/2022 2:17:03 PM | World Community Grid | Output file ARP1_0034797_127_2_r1538955219_1 for task ARP1_0034797_127_2 absent
7/23/2022 2:17:03 PM | World Community Grid | Output file ARP1_0034797_127_2_r1538955219_2 for task ARP1_0034797_127_2 absent
7/23/2022 2:17:03 PM | World Community Grid | Output file ARP1_0034797_127_2_r1538955219_3 for task ARP1_0034797_127_2 absent
7/23/2022 2:17:03 PM | World Community Grid | Output file ARP1_0034797_127_2_r1538955219_4 for task ARP1_0034797_127_2 absent
7/23/2022 2:17:03 PM | World Community Grid | Output file ARP1_0034797_127_2_r1538955219_5 for task ARP1_0034797_127_2 absent
adriverhoef
Master Cruncher The Netherlands Joined: Apr 3, 2009 Post Count: 2156 Status: Recently Active
This looks like a Computation error.
Please provide a link to the workunit. Something beginning with https://www.worldcommunitygrid.org/contribution/workunit/…
alanb1951
Veteran Cruncher Joined: Jan 20, 2006 Post Count: 953 Status: Offline
Adri,
If we're lucky, this will turn out to be an out of memory problem as we've seen reported elsewhere a couple of times recently (when it can't allocate 24MB for needs around a checkpoint); if we're unlucky, we might be heading for another stuck unit :-(
Hopefully, the OP will provide the work unit number as you asked, or will report back on whether the web-site result report says SIGSEGV or something else...
Cheers - Al.
[Edit.] P.S. I noticed you just picked Mike up about malloc errors in another thread -- I decided to give him a pass on that one as it is (sort of) an abort, albeit not a SIGSEGV or a deliberate user/server intervention; perhaps I shouldn't have :-)
[Edit 1 times, last edit by alanb1951 at Jul 24, 2022 12:50:58 AM]
adriverhoef
Master Cruncher The Netherlands Joined: Apr 3, 2009 Post Count: 2156 Status: Recently Active
"P.S. I noticed you just picked Mike up about malloc errors in another thread -- I decided to give him a pass on that one as it is (sort of) an abort, albeit not a SIGSEGV or a deliberate user/server intervention; perhaps I shouldn't have :-)"
Al, Mike is a good guy. I feel like I'm often trying to teach him precisely where to look.
EDIT: I think Mike hates me for that, because it seems he's never having a real conversation with me (and as long as there's no need, that's fine with me). It's just that I hate to see anyone providing false information or wrong explanations (and I think I know better).
Adri
[Edit 2 times, last edit by adriverhoef at Jul 24, 2022 9:33:01 AM]
Blount
Senior Cruncher Joined: Aug 19, 2005 Post Count: 463 Status: Offline
https://www.worldcommunitygrid.org/contribution/results/248654890/log
Results log
<core_client_version>7.16.11</core_client_version>
<![CDATA[
<message> (unknown error) - exit code 3221225477 (0xc0000005)</message>
<stderr_txt>
INFO: Initializing
INFO: No state to restore. Start from the beginning.
Starting WRFMain
[10:57:22] INFO: Checkpoint taken at 2019-03-12_06:00:00
[11:57:54] INFO: Checkpoint taken at 2019-03-12_12:00:00
[12:53:53] INFO: Checkpoint taken at 2019-03-12_18:00:00
[13:30:12] INFO: Checkpoint taken at 2019-03-13_00:00:00
[14:12:24] INFO: Checkpoint taken at 2019-03-13_06:00:00
Unhandled Exception Detected...
- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x00007FF7E28F8E2D read attempt to address 0x5B4506F0
Engaging BOINC Windows Runtime Debugger...
********************
BOINC Windows Runtime Debugger Version 7.15.0
Dump Timestamp : 07/23/22 14:17:02
Install Directory : C:\Program Files\BOINC\
Data Directory : D:\ProgramData\BOINC
Project Symstore :
LoadLibraryA( D:\ProgramData\BOINC\dbghelp.dll ): GetLastError = 126
Loaded Library : dbghelp.dll
LoadLibraryA( D:\ProgramData\BOINC\symsrv.dll ): GetLastError = 126
LoadLibraryA( symsrv.dll ): GetLastError = 126
LoadLibraryA( D:\ProgramData\BOINC\srcsrv.dll ): GetLastError = 126
LoadLibraryA( srcsrv.dll ): GetLastError = 126
LoadLibraryA( D:\ProgramData\BOINC\version.dll ): GetLastError = 126
Loaded Library : version.dll
Debugger Engine : 4.0.5.0
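For anyone who wants to pull the key facts out of a stderr dump like the one above, here is a minimal Python sketch. The function name `parse_exception` and the regular expression are assumptions made for illustration; the pattern only matches the wording of the "Reason:" line seen in this one log and may not fit other BOINC debugger output.

```python
import re

# Sample line copied from the stderr output in this post.
STDERR_SAMPLE = (
    "Reason: Access Violation (0xc0000005) at address 0x00007FF7E28F8E2D "
    "read attempt to address 0x5B4506F0"
)

# Illustrative pattern (an assumption): fits the wording seen in this log only.
EXCEPTION_RE = re.compile(
    r"Reason: (?P<reason>.+?) \((?P<code>0x[0-9a-fA-F]+)\) "
    r"at address (?P<pc>0x[0-9a-fA-F]+) "
    r"(?P<access>read|write) attempt to address (?P<target>0x[0-9a-fA-F]+)"
)


def parse_exception(text):
    """Return the fields of the exception record, or None if none is found."""
    match = EXCEPTION_RE.search(text)
    if match is None:
        return None
    return {
        "reason": match.group("reason"),
        "code": int(match.group("code"), 16),      # exception code as an integer
        "pc": int(match.group("pc"), 16),          # address of the faulting instruction
        "access": match.group("access"),           # "read" or "write"
        "target": int(match.group("target"), 16),  # address that could not be accessed
    }


info = parse_exception(STDERR_SAMPLE)
print(info["reason"], info["code"], info["access"], hex(info["target"]))
```

The integer in `info["code"]` is the same number BOINC reports as the exit code, which makes it easy to compare several failed results side by side.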
adriverhoef
Master Cruncher The Netherlands Joined: Apr 3, 2009 Post Count: 2156 Status: Recently Active
Okay Blount, thanks. You have provided the link to the error log of your task, which helps a bit.
It would've been better if you had provided the link to the workunit. (You can find that link if you click on the name of your task on the Results page.)
Example
Now, from your Results log, what strikes one most is the exit code 3221225477. I believe I can only find two other posts on the forum with that exit code in combination with African Rainfall:
Crystal Pellet's conclusion is: "The amount of memory is not the problem." However, he says: "I think running ARP's together with 4-core VM's for LHC-ATLAS is sometimes conflicting."
And erich56 concludes with: "this a a fairly old PC. So I am not too surprised that it cannot meet the challenges which ARP is posing to a system."
I cannot see much more in my crystal ball, Blount, it's been hazy all day. I wonder why.
Kirel2
Advanced Cruncher United States Joined: Sep 24, 2014 Post Count: 99 Status: Offline
I had the same error code (3221225477) on this work unit: https://www.worldcommunitygrid.org/contribution/workunit/152511460
Failure log: https://www.worldcommunitygrid.org/contribution/results/248677540/log
alanb1951
Veteran Cruncher Joined: Jan 20, 2006 Post Count: 953 Status: Offline
Adri,
Good follow-up post above regarding Blount's result error, but as another one has shown up I thought I'd just add a little bit of data regarding possible completion of the work unit(s) concerned :-)
Firstly, on Windows 3221225477 (0xc0000005) is specifically Access Violation (as per the report itself!); I suspect the scarcity of examples may have as much to do with how users post about errors as anything else (and the hex format report seems more commonly used here and elsewhere...)
Now, as for the two work units with Errors now showing up in this thread... Unlike the [usually] identical stack traces we're used to for genuinely stuck unit SIGSEGVs on Linux, these two produced very different diagnostics, and Kirel2's stack trace is really strange!
When I feel I need to reply to a post with only a result number, I've been using the "Swagger" GUI interface to the API to find out work unit names; in this case it seems Blount's WU is 152524445 (ARP1_0034797_127). Fortunately, that one seems to have a good chance of completing (only one No Reply!). However, Kirel2's WU is on its last chance (having 2 No Reply, 1 User Aborted and one Pending Validation as well as the Error...) so ARP1_0019859_126 may end up stalled.
I'll watch out for these two work units in case either does stall...
Hope that's of some use to someone :-)
Cheers - Al.
P.S. Your edit to your reply to my earlier post exactly tags my feelings on that sort of thing. However, I feel there are situations where it just isn't worth adding to the post count, and others (usually on other BOINC sites) where one knows a correction will be met with a hostile response...
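Since the decimal and hex forms of the exit code both turn up in posts, a short Python sketch of the conversion may help when searching for similar reports. The helper name `describe_exit_code` and the small lookup table are assumptions made for illustration; the table lists only a few well-known Windows NTSTATUS values and is nowhere near complete.

```python
# Illustrative table (an assumption): a few well-known NTSTATUS codes only.
KNOWN_NTSTATUS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC0000017: "STATUS_NO_MEMORY",
    0xC00000FD: "STATUS_STACK_OVERFLOW",
}


def describe_exit_code(code: int) -> str:
    """Show an exit code in decimal and hex, with a name if it is in the table."""
    code &= 0xFFFFFFFF  # some tools print the same value as a negative 32-bit number
    name = KNOWN_NTSTATUS.get(code, "unknown (not in this small table)")
    return f"{code} = 0x{code:08x} -> {name}"


print(describe_exit_code(3221225477))
print(describe_exit_code(-1073741819))  # signed form of the same code
```

Both calls describe the same value, so a forum search is worth trying with 3221225477, 0xc0000005 and -1073741819.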
adriverhoef
Master Cruncher The Netherlands Joined: Apr 3, 2009 Post Count: 2156 Status: Recently Active
"I had the same error code (3221225477) on this work unit: https://www.worldcommunitygrid.org/contribution/workunit/152511460"
That is indeed what we would be looking for, Kirel2. Thank you for that.
This is basically what we see:
Result name OS type Status Sent time Due / Return time
So we can derive that it is possible to complete the task (because ARP1_0019859_126_1 completed without error and is awaiting its final validation).
Now that we know that it is possible to complete the task, and that it is normal for a task to complete without throwing errors, we can draw this conclusion:
- There must have been some anomaly on the device that caused it to error out at the time.
Without knowing anything else, apart from the name of the OS, it's anyone's guess what that anomaly was.
I would like to add that in normal situations (with healthy hardware) the first thing to suspect with such an error is a combination of running processes that are getting in each other's way. Let's just say that it's a possibility.
[Edit 1 times, last edit by adriverhoef at Jul 25, 2022 9:23:28 AM]
Kirel2
Advanced Cruncher United States Joined: Sep 24, 2014 Post Count: 99 Status: Offline
My crunching machine is just my home desktop; let me know if you need any hardware specs or any additional info about the system. This is the first time since the WCG test WUs started going out that I've noticed any kind of issue. My machine was able to process https://www.worldcommunitygrid.org/contribution/workunit/152550562 with no problems, for example.