World Community Grid Forums
Thread Status: Active Total posts in this thread: 38
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I didn't see a separate thread for potential bad C Type WU's so I started this one.
All my C-Type pr WU's errored out on 3 different machines:

ts06_ b014_ pr34b0_ 0-- starbase2.command Error 4/10/10 16:42:54 4/10/10 16:54:12 0.00 0.0 / 0.0
ts06_ b014_ pr23b0_ 0-- starbase2.command Error 4/10/10 16:42:54 4/10/10 16:54:12 0.00 0.0 / 0.0
ts06_ b013_ pr78b1_ 0-- starbase2.command Error 4/10/10 16:42:35 4/10/10 16:54:12 0.00 0.0 / 0.0
ts06_ b011_ pr45a0_ 0-- starbase4.command Error 4/10/10 16:39:33 4/10/10 16:47:27 0.00 0.0 / 0.0
ts06_ b008_ pr45a0_ 0-- starbase4.command Error 4/10/10 16:36:44 4/10/10 16:47:27 0.00 0.0 / 0.0
ts06_ b006_ pr02b1_ 1-- starbase2.command Error 4/10/10 16:34:51 4/10/10 16:54:12 0.00 0.0 / 0.0
ts06_ b006_ pr02a1_ 0-- starbase2.command Error 4/10/10 16:34:49 4/10/10 16:54:12 0.00 0.0 / 0.0
ts06_ b005_ pr02a0_ 0-- starbase2.command Error 4/10/10 16:34:11 4/10/10 16:54:12 0.00 0.0 / 0.0
ts06_ b004_ pr91b1_ 0-- starbase2.command Error 4/10/10 16:34:10 4/10/10 16:54:12 0.00 0.0 / 0.0
ts06_ b003_ pr91b1_ 1-- starbase4.command Error 4/10/10 16:33:28 4/10/10 16:47:27 0.00 0.0 / 0.0

Make Up WU's:

ts06_ b008_ pr23a1_ 3-- starbase1.command Error 4/10/10 17:22:58 4/10/10 17:30:36 0.00 0.0 / 0.0
ts06_ a013_ pr89a0_ 2-- starbase1.command Error 4/10/10 17:22:58 4/10/10 17:30:36 0.00 0.0 / 0.0

Same error log for each WU including wingmen on make up WU's:

<core_client_version>6.10.17</core_client_version>
<![CDATA[
<message>
process exited with code 2 (0x2, -254)
</message>
<stderr_txt>
INFO: No state to restore. Start from the beginning.
At line 10923 of file pbeq2.f
Fortran runtime error: End of file
</stderr_txt>
]]>

Athlon64x2, 2GB, RHEL5.4, 6.10.17
Phenom 9600, 4GB, Fedora 12, 6.10.17
Phenom II 945, 4GB, Fedora 11, 6.10.17 |
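For anyone collecting these logs from several machines, here is a minimal sketch of how the stderr text could be bucketed by failure message. The function name `classify_stderr` and the labels are made up for illustration; the patterns are only the messages quoted in this thread, and nothing here is part of BOINC or the WCG tooling.

```python
import re

# Hypothetical signature table: labels are invented for this sketch,
# patterns are the error messages quoted by posters in this thread.
SIGNATURES = [
    ("fortran_eof", re.compile(r"Fortran runtime error: End of file")),
    ("missing_fort_file", re.compile(r"forrtl: severe \(29\): file not found")),
    ("energy_tolerance", re.compile(r"ENERGY CHANGE TOLERANCE EXCEEDED")),
]


def classify_stderr(text: str) -> str:
    """Return the label of the first signature found in text, else 'unknown'."""
    for label, pattern in SIGNATURES:
        if pattern.search(text):
            return label
    return "unknown"
```

Pasting the contents of a `<stderr_txt>` section into `classify_stderr` gives one label per result, which makes it easier to see whether different machines hit the same failure.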
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I posted here on the 10 or so that failed for me on all OS platforms.
|
||
|
Rickjb
Veteran Cruncher Australia Joined: Sep 17, 2006 Post Count: 666 Status: Offline |
All of the "pr" WUs that I received in this last shower crashed, too.
However, they behaved differently on different machines. Sorry, but I am uncertain which machine behaved in which way, because things happened quickly and I did not make written notes.

All machines that got these WUs are Intel Yorkfield quads (Q9650, QX9650). 2 run XP-32, 2 run XP-64, all run BOINC 6.2.19 32-bit. All machines are overclocked.

On the 2 XP-64 machines, the WUs crashed immediately, with error 29, and some diagnostics in the error logs. See descriptions below for devices RJB-Q9650A and rjb-q9650c.

On the 2 XP-32 machines, the WU ran for about 1-2 minutes, with the BOINC CPU Time field either empty or showing 00:00:00. Then CHARMM crashed, and Windows popped up a crash report window. On one machine, the window invited reporting the error to Microsoft, but on the other, it didn't. ([Edit]: Is this an XP system setting?). I think now that there would have been some Windows Error Report files (WER*) available until I clicked "Don't Send". The WCG error logs say "exit code 1282 (0x502)". ([Edit]: The initial time spent "running" may have been while Windows was dumping the crashed process memory image to disc.)

More info for each machine:
--------------------------------
Device RJB-Q9650A (XP-64): CHARMM crash popup windows probably did not occur. The error logs all show Error 0x1d, plus 12 lines of diagnostic parameters:

The system cannot write to the specified device. (0x1d) - exit code 29 (0x1d)
</message>
<stderr_txt>
INFO: No state to restore. Start from the beginning.
forrtl: severe (29): file not found, unit 30, file D:\BOINC_Data\slots\4\fort.30
Image PC Routine Line Source
wcg_dddt2_charmm_ 00B15D6E Unknown Unknown Unknown
wcg_dddt2_charmm_ 00B13028 Unknown Unknown Unknown
wcg_dddt2_charmm_ 00ABCFB2 Unknown Unknown Unknown
wcg_dddt2_charmm_ 00ABCBCF Unknown Unknown Unknown
wcg_dddt2_charmm_ 00AAE8E1 Unknown Unknown Unknown
wcg_dddt2_charmm_ 008EE66E Unknown Unknown Unknown
wcg_dddt2_charmm_ 0056892F Unknown Unknown Unknown
wcg_dddt2_charmm_ 004469FF Unknown Unknown Unknown
wcg_dddt2_charmm_ 00445640 Unknown Unknown Unknown
wcg_dddt2_charmm_ 0042D6E2 Unknown Unknown Unknown
wcg_dddt2_charmm_ 00B03052 Unknown Unknown Unknown
kernel32.dll 7D4E7D42 Unknown Unknown Unknown
-------------
Device rjb-q9650c (XP-64): Same error (0x1d) as for RJB-Q9650A, and same diagnostics in the WCG error log. I think that rjb-q9650c threw popup windows, but there was no invitation to send error reports to MS. (Unsure of this).
-------------
Device Rjb-q9650b (XP-32): Error log contained only the header, footer, and error code line:
- exit code 1282 (0x502)
There may have been a popup window due to CHARMM crashing, but without an invitation to send an error report to Microsoft. A screen dump of such a popup is at http://i293.photobucket.com/albums/mm57/BlindFreddie/CHARMMcrash1.gif
-----------
Device Rjb-q9650d (XP-32): The same short error logs as for Rjb-q9650b, with exit code 1282 (0x502). This one definitely threw popup windows, and a screen dump is at http://i293.photobucket.com/albums/mm57/BlindFreddie/CHARMMcrash4.gif
-------------
The discussion of this batch of error WUs is now split between this thread and the thread "It's raining Dengue, Hallelujah!". Check there before posting here.
[Edit 2 times, last edit by Rickjb at Apr 11, 2010 5:17:43 AM] |
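The logs print each exit code twice, in decimal and in hex (29 is 0x1d, 1282 is 0x502). A minimal sketch for pulling both values out of a log line and checking that they agree is shown here; the names `parse_exit_code` and `is_consistent` are hypothetical helpers, not anything provided by BOINC.

```python
import re

# Matches the "exit code N (0xH)" form seen in the logs quoted in this thread.
EXIT_CODE_RE = re.compile(r"exit code (\d+) \(0x([0-9a-fA-F]+)\)")


def parse_exit_code(line: str):
    """Return (decimal_value, hex_value) as ints, or None if no code is present."""
    match = EXIT_CODE_RE.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2), 16)


def is_consistent(line: str) -> bool:
    """True when the decimal and hex renderings describe the same number."""
    parsed = parse_exit_code(line)
    return parsed is not None and parsed[0] == parsed[1]
```

This only confirms that the two renderings match; it says nothing about what the codes mean to CHARMM or Windows.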
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Here is another candidate for server abortion:
ts01_ a189_ pe0000_ 6-- - In Progress 10.04.10 13:09:46 13.04.10 08:21:46 0.00 0.0 / 0.0
ts01_ a189_ pe0000_ 5-- 617 Error 10.04.10 12:15:04 10.04.10 17:08:02 0.63 9.6 / 0.0
ts01_ a189_ pe0000_ 4-- 617 Error 10.04.10 08:32:22 10.04.10 13:09:44 1.53 9.6 / 0.0
ts01_ a189_ pe0000_ 3-- 617 Error 10.04.10 07:38:21 10.04.10 12:14:58 0.54 6.2 / 0.0
ts01_ a189_ pe0000_ 2-- 617 Error 10.04.10 05:49:30 10.04.10 07:38:18 0.39 5.8 / 0.0
ts01_ a189_ pe0000_ 1-- 617 Error 09.04.10 05:58:02 10.04.10 08:32:20 0.32 6.7 / 0.0
ts01_ a189_ pe0000_ 0-- 617 Error 09.04.10 05:58:01 10.04.10 05:49:28 0.48 6.6 / 0.0 |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I only have 1 WU that errored out: ts01_ b028_ se0000_ 2--
The system cannot write to the specified device. (0x1d) - exit code 29 (0x1d)
</message>
<stderr_txt>
INFO: No state to restore. Start from the beginning.
ENERGY CHANGE TOLERANCE EXCEEDED
Encountered error. Exiting. |
||
|
I need a bath
Senior Cruncher USA Joined: Apr 12, 2007 Post Count: 347 Status: Offline |
I'm getting some pr retreads. Hopefully they will be server-aborted before I get to them, but I will not interfere with them otherwise.
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
"I posted here on the 10 or so that failed for me on all OS platforms."
Sorry Brink, I didn't see your post earlier. Maybe one of the CA's can combine things to a single C-Type problem thread. |
||
|
Somervillejudson@netscape.net
Veteran Cruncher USA Joined: May 16, 2008 Post Count: 1065 Status: Offline |
Yes, 3 errors on the latest batch. Hopefully that is all!
|
||
|
Trotador
Senior Cruncher Joined: Mar 26, 2009 Post Count: 154 Status: Offline |
Same here: all the pr WUs errored within the first second, on both XP32 and Ubuntu 64.
Edit: total of 21 WUs.
[Edit 1 times, last edit by Trotador at Apr 10, 2010 7:48:15 PM] |
||
|
wplachy
Senior Cruncher Joined: Sep 4, 2007 Post Count: 423 Status: Offline |
For the first 11, all my results look the same except for the \slots\x directory. The next 9 show a different error code and results. Wingman and repair WUs are either In Progress or in error, with most of the errors the same as mine. All errors occur at start and report 0.0 CPU hrs.

10 Vista 64 & 1 Win 7
Result Name: ts06_ a010_ pr56a0_ 1--
Result Name: ts06_ b003_ pr56a1_ 0--
Result Name: ts06_ b004_ pr23b1_ 0--
Result Name: ts06_ b005_ pr02b0_ 0--
Result Name: ts06_ b005_ pr91a1_ 1--
Result Name: ts06_ b009_ pr56a0_ 1--
Result Name: ts06_ b009_ pr56a1_ 1--
Result Name: ts06_ b010_ pr23a0_ 1--
Result Name: ts06_ b012_ pr34b1_ 0--
Result Name: ts06_ b012_ pr67b1_ 1--
Result Name: ts06_ b013_ pr89a1_ 1--

<core_client_version>6.2.28</core_client_version>
<![CDATA[
<message>
The system cannot write to the specified device. (0x1d) - exit code 29 (0x1d)
</message>
<stderr_txt>
INFO: No state to restore. Start from the beginning.
forrtl: severe (29): file not found, unit 30, file C:\WCG\BOINC Data\slots\5\fort.30
Image PC Routine Line Source
wcg_dddt2_charmm_ 00B15D6E Unknown Unknown Unknown
Stack trace terminated abnormally.
</stderr_txt>

9 - XP32
Result Name: ts06_ b008_ pr67a0_ 0--
Result Name: ts06_ b008_ pr56b0_ 1--
Result Name: ts06_ b008_ pr56b1_ 1--
Result Name: ts06_ b003_ pr89b0_ 0--
Result Name: ts06_ b003_ pr78a0_ 0--
Result Name: ts06_ b003_ pr02a1_ 1--
Result Name: ts06_ b002_ pr34b1_ 0--
Result Name: ts06_ b002_ pr34a1_ 0--
Result Name: ts06_ b002_ pr23a1_ 1--

<core_client_version>6.2.28</core_client_version>
<![CDATA[
<message>
- exit code 1282 (0x502)
</message>
<stderr_txt>
INFO: No state to restore. Start from the beginning.
</stderr_txt>

Edit: added 9 new WUs
Bill P
[Edit 1 times, last edit by wplachy at Apr 11, 2010 12:34:04 AM] |
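With this many result names being pasted, a rough tally by WU type (pr, pe, se) can help show which kinds are failing. The sketch below is only that, a sketch: the field names (target, batch, job, replica) are guesses at what the four parts of a result name mean, the spaces after the underscores are assumed to be a display artifact, and `tally_by_type` is a made-up helper.

```python
import re
from collections import Counter

# Assumed layout: <target>_<batch>_<job>_<replica>--, where the forum display
# may insert spaces after each underscore. Field meanings are guesses.
NAME_RE = re.compile(r"(ts\d+)_\s*([ab]\d+)_\s*([a-z]{2}[0-9a-z]+)_\s*(\d+)--")


def parse_result_names(text: str):
    """Return a list of (target, batch, job, replica) tuples found in text."""
    return [match.groups() for match in NAME_RE.finditer(text)]


def tally_by_type(text: str) -> Counter:
    """Count result names by the two-letter prefix of the job field (e.g. 'pr')."""
    return Counter(job[:2] for _, _, job, _ in parse_result_names(text))
```

Running `tally_by_type` over a pasted list like the ones above gives one count per two-letter prefix, so a batch where only the pr results fail stands out straight away.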
||
|