World Community Grid Forums
Category: Completed Research Forum: The Clean Energy Project - Phase 2 Forum Thread: Error 0xe7 (= exit code 231 decimal) - new to me |
Thread Status: Active Total posts in this thread: 9
Rickjb
Veteran Cruncher Australia Joined: Sep 17, 2006 Post Count: 666 Status: Offline Project Badges: |
[Edit] It seems like some defective WUs have slipped thru the WU generation script(s). I got 2. Same result logs from all returned wingmen's WUs that I examined.
WU names:
- E216970_900_I.47.C33F3H15N4O7.00016024.3.set1d06
- E216970_689_I.47.C34F6H15N5O2.00054060.3.set1d06

Example log:
---
Result Name: E216970_ 689_ I.47.C34F6H15N5O2.00054060.3.set1d06_ 7--
<core_client_version>7.0.28</core_client_version>
<![CDATA[
<message>
All pipe instances are busy. (0xe7) - exit code 231 (0xe7)
</message>
<stderr_txt>
INFO: No state to restore. Start from the beginning.
Error could not extract input.
17:53:00 (2316): called boinc_finish
</stderr_txt>
---

The error occurs during startup. For both of "my" WUs, BOINC manager showed 8 sec Elapsed when they crashed.

In one case, this behaviour threw up an error in the BOINC client that I had noticed before:
*** The number of WUs thought by the BOINC client/manager to be suspended can get out of sync with reality ***

BOINC manager on the machine that had one of these bad WUs now shows no suspended WUs in the cache, but when the client tries to fetch new work, the message is:
> 17/11/2013 3:37:16 PM | World Community Grid | Not requesting tasks: some task is suspended via Manager

I have closed and restarted the manager, but this did not help. I will have to restart the BOINC service ... and I have 4 x big long CEP2 WUs running. That fixed the situation previously.

This is the first time I have seen this behaviour in BOINC client 7.2.28 (x64) (Win7 Pro x64), but I had it before with v7.0.64 (x64) too. It is also the first time I have seen this out-of-sync situation where I can give some details of the user/WU actions that led to it. Basically, the situation arose because the WU had been suspended by me in BOINC manager, but was still actually running when it crashed.
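For anyone comparing the two spellings of the error in the thread title: 0xe7 and 231 are the same number, and 231 is also the Windows system error code ERROR_PIPE_BUSY, whose standard message text is "All pipe instances are busy." As far as I understand, the client only looks up that text from the exit code, so the stderr line "Error could not extract input." is probably the better clue to the real cause. Below is a minimal sketch that pulls both forms of the code out of a pasted `<message>` block. The function name `parse_exit_code` and the sample string are made up for illustration; they are not part of BOINC.

```python
import re

# Sample text copied from the result log quoted in this thread.
RESULT_LOG = """<message>
All pipe instances are busy. (0xe7) - exit code 231 (0xe7)
</message>"""


def parse_exit_code(log_text):
    """Return (decimal_code, hex_code_as_int) from a '<message>' block, or None.

    Hypothetical helper for illustration only.
    """
    match = re.search(r"exit code (\d+) \((0x[0-9a-fA-F]+)\)", log_text)
    if match is None:
        return None
    decimal_code = int(match.group(1))      # e.g. "231" -> 231
    hex_code = int(match.group(2), 16)      # e.g. "0xe7" -> 231
    return decimal_code, hex_code


decimal_code, hex_code = parse_exit_code(RESULT_LOG)
# Both spellings denote the same number: 0xe7 == 231.
print(decimal_code, hex(hex_code), decimal_code == hex_code)
```

Run against the log above, this prints `231 0xe7 True`.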
[Edit]: More details of the events leading up to BOINC manager saying there were suspended tasks when wanting to fetch new work, but showing no tasks suspended in the Tasks tab:

Background: Members may be aware that the BOINC client will run any tasks that have status "Waiting to run" (WTR) ahead of any that are "Ready to start", with exceptions:
- Tasks that BOINC deems to be "High priority", and
- Tasks for a science application that has a max no. of running tasks set in an app_config.xml file.

It is thus possible to keep the machine crunching the max no. of CEP2 tasks wanted to run simultaneously, by setting the CEP2 WUs in the cache to WTR ahead of time. To set a task to WTR, you just need to force it to run for a very short time, and then suspend it (and resume the CEP2 task that is being actively run). I run CEP2 tasks until the manager shows about 3-5 sec Elapsed when doing this.

Now, at this stage of their execution, CEP2 tasks uncompress their input files into literally thousands of small datafiles, and much of this activity gets cached by the O/S (Windows in my case). Sometimes this causes a delay between clicking "Suspend" and the manager showing the task as "Suspended by user" (SBU), but sometimes SBU shows immediately. In both cases, the Elapsed time usually continues to tick upwards for a few seconds after the task has been suspended.

Today, I saw in my work caches 2 CEP2 WUs that I suspected were problem kids because they were copies _8 and _5, so I looked them up in my Results Status pages. I ran them immediately, as above, to make space in my work caches to download new real CEP2 WUs.

In one case, I hit "Resume" as soon as the manager said it was SBU, then moved the scrollbar up to show the running tasks. Shortly afterwards, the status of the bad WU was shown as "Computation error". The WU had crashed while it was suspended and WTR. For the other WU, I think I did not hit "Resume" after it showed as SBU, and it switched from there to "Computation error".

Now .... I've forgotten which set of actions led to the out-of-sync condition, but I think that happened in the latter case, i.e. when the WU crashed while being shown as Suspended.[/Edit]

Seems like the techs may have yet more problems in their "In" tray. Hope they soon manage to fix the high server load problem, plus get MCM back under way & running smoothly, and that it produces meaningful useful results!
[Edit 4 times, last edit by Rickjb at Nov 17, 2013 1:35:43 PM] |
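On the app_config.xml limit mentioned in the post above: a rough sketch of what such a file could contain, generated from Python so the structure is easy to see. The `<app_config>`, `<app>`, `<name>` and `<max_concurrent>` elements are the ones BOINC documents for this file. The app name `cep2` and the limit of 4 are assumptions taken from values quoted in this thread, not a recommendation, and `build_app_config` is a made-up helper name.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name, max_concurrent):
    """Build the text of a minimal app_config.xml (illustrative helper)."""
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    # Short application name as the client reports it (assumed here: "cep2").
    ET.SubElement(app, "name").text = app_name
    # Upper limit on simultaneously running tasks of this application.
    ET.SubElement(app, "max_concurrent").text = str(max_concurrent)
    return ET.tostring(root, encoding="unicode")


# Assumed values: app name "cep2", at most 4 tasks at once.
xml_text = build_app_config("cep2", 4)
print(xml_text)
```

The resulting text would be saved as app_config.xml in the project's folder under the BOINC data directory, and the client told to re-read its config files (or restarted) before it takes effect.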
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
'new to me' is a red flag: http://www.worldcommunitygrid.org/forums/wcg/...key=exit+AND+code+AND+231
32 hits, now 33 with this post. Had the same error happen to my octo yesterday at about 5 PM, running 5 concurrent CEP2, hands-off standalone W7-64 with the 7.2.23 client. This took off while a result was uploading, followed by all CEP2 restarting with the usual. Scanning the log now, this series was showing multiple times, each time while a result was uploading:

29981 World Community Grid 11/16/2013 11:11:15 PM Computation for task E216970_599_I.47.C31F6H15N5O5.00089818.1.set1d06_4 finished
29982 World Community Grid 11/16/2013 11:11:15 PM Output file E216970_599_I.47.C31F6H15N5O5.00089818.1.set1d06_4_0 for task E216970_599_I.47.C31F6H15N5O5.00089818.1.set1d06_4 absent
29983 World Community Grid 11/16/2013 11:11:15 PM Output file E216970_599_I.47.C31F6H15N5O5.00089818.1.set1d06_4_1 for task E216970_599_I.47.C31F6H15N5O5.00089818.1.set1d06_4 absent
29984 World Community Grid 11/16/2013 11:11:15 PM Output file E216970_599_I.47.C31F6H15N5O5.00089818.1.set1d06_4_2 for task E216970_599_I.47.C31F6H15N5O5.00089818.1.set1d06_4 absent
29985 World Community Grid 11/16/2013 11:11:15 PM Output file E216970_599_I.47.C31F6H15N5O5.00089818.1.set1d06_4_3 for task E216970_599_I.47.C31F6H15N5O5.00089818.1.set1d06_4 absent
29986 World Community Grid 11/16/2013 11:12:08 PM Task E217002_997_I.47.C37H16N8OS.00033787.3.set1d06_0 exited with zero status but no 'finished' file
29987 World Community Grid 11/16/2013 11:12:08 PM If this happens repeatedly you may need to reset the project.
29988 World Community Grid 11/16/2013 11:12:08 PM Task E217014_170_I.47.C35F4H16N2O5S.00330286.2.set1d06_1 exited with zero status but no 'finished' file
29989 World Community Grid 11/16/2013 11:12:08 PM If this happens repeatedly you may need to reset the project.
29990 World Community Grid 11/16/2013 11:12:08 PM Task E217014_618_I.47.C39F4H16N2OS.00302194.0.set1d06_0 exited with zero status but no 'finished' file
29991 World Community Grid 11/16/2013 11:12:08 PM If this happens repeatedly you may need to reset the project.
29992 World Community Grid 11/16/2013 11:12:08 PM Task E216801_453_J.41.C32H16N6S2Se.00060053.3.set1d06_3 exited with zero status but no 'finished' file
29993 World Community Grid 11/16/2013 11:12:08 PM If this happens repeatedly you may need to reset the project.
29994 World Community Grid 11/16/2013 11:12:08 PM Restarting task E217002_997_I.47.C37H16N8OS.00033787.3.set1d06_0 using cep2 version 640 in slot 6

Each time, somewhat further down, there was a red-liner:

30012 11/16/2013 11:16:55 PM BOINC can't access Internet - check network connection or proxy configuration.

Now this is to me typical Linux CEP2 crashing, but on W7-64... can't remember when that last happened. No 'suspend'-like messages. New work was backfilled to a total of 10 as specified in the device profile. Think it's time to reboot everything, including the router, as there are dirty pipes somewhere.

Edit: But on the original 231 error, nothing has crashed fatally on other restarts since that one. All the wingmen suffered the same on this task, all going southbound within the setup phase, up to 0.5 hours into the job.

Result Name: E216970_ 599_ I.47.C31F6H15N5O5.00089818.1.set1d06_ 4--
<core_client_version>7.2.23</core_client_version>
<![CDATA[
<message>
All pipe instances are busy. (0xe7) - exit code 231 (0xe7)
</message>
<stderr_txt>
INFO: No state to restore. Start from the beginning.
Error could not extract input.
23:11:13 (41684): called boinc_finish
</stderr_txt>
]]>

E216970_ 599_ I.47.C31F6H15N5O5.00089818.1.set1d06_ 7-- - In Progress 11/17/13 04:10:43 11/20/13 04:10:43 0.00 0.0 / 0.0
E216970_ 599_ I.47.C31F6H15N5O5.00089818.1.set1d06_ 6-- 640 Error 11/17/13 03:26:59 11/17/13 03:47:38 0.00 0.4 / 0.0
E216970_ 599_ I.47.C31F6H15N5O5.00089818.1.set1d06_ 5-- 640 Error 11/16/13 22:40:18 11/17/13 03:18:58 0.00 0.4 / 0.0
E216970_ 599_ I.47.C31F6H15N5O5.00089818.1.set1d06_ 4-- 640 Error 11/16/13 10:44:10 11/16/13 22:18:54 0.00 0.2 / 0.0 < moi
E216970_ 599_ I.47.C31F6H15N5O5.00089818.1.set1d06_ 3-- 640 Error 11/16/13 07:59:29 11/16/13 10:29:09 0.00 0.3 / 0.0
E216970_ 599_ I.47.C31F6H15N5O5.00089818.1.set1d06_ 2-- 640 Error 11/16/13 07:35:23 11/16/13 07:44:47 0.00 0.3 / 0.0
E216970_ 599_ I.47.C31F6H15N5O5.00089818.1.set1d06_ 1-- - In Progress 11/15/13 14:46:48 11/25/13 14:46:48 0.00 0.0 / 0.0
E216970_ 599_ I.47.C31F6H15N5O5.00089818.1.set1d06_ 0-- 640 Error 11/15/13 14:29:57 11/16/13 07:14:45 0.00 0.5 / 0.0
[Edit 2 times, last edit by Former Member at Nov 17, 2013 10:00:39 AM] |
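If anyone wants to check how often the "exited with zero status" series recurs in their own event log, a small sketch along these lines might help. It only counts lines matching the message text quoted above. The sample lines are copied from this post, and the function name `count_unfinished_exits` is invented for the example.

```python
import re
from collections import Counter

# A few lines copied from the event log quoted in this post.
EVENT_LOG = """\
29986 World Community Grid 11/16/2013 11:12:08 PM Task E217002_997_I.47.C37H16N8OS.00033787.3.set1d06_0 exited with zero status but no 'finished' file
29987 World Community Grid 11/16/2013 11:12:08 PM If this happens repeatedly you may need to reset the project.
29988 World Community Grid 11/16/2013 11:12:08 PM Task E217014_170_I.47.C35F4H16N2O5S.00330286.2.set1d06_1 exited with zero status but no 'finished' file
"""

# Matches the message text as it appears in the quoted log.
PATTERN = re.compile(r"Task (\S+) exited with zero status but no 'finished' file")


def count_unfinished_exits(log_text):
    """Count 'exited with zero status' events per task name (illustrative)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


for task, hits in count_unfinished_exits(EVENT_LOG).items():
    print(hits, task)
```

A task that shows up many times in this tally would be the one to look at first. A single hit per task, as in the sample, points more towards one event hitting all running tasks at once.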
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Will have to do a run-dry > detach > attach cycle to get rid of all old/expired project/beta junk... this showed after the boot cycle, which seems like a bad [head] in the cloud status. If there's a more intelligent way for clients to not want to re-download ancient useless files, please let me know.
364 World Community Grid 17-11-2013 11:21 Started download of wcgrid_cfsw_gfx_prod_windows_64.exe.6.12 365 World Community Grid 17-11-2013 11:21 Finished download of dddt2a_image06_6.40.tga 366 World Community Grid 17-11-2013 11:21 Finished download of dddt2a_image10_6.40.tga 367 World Community Grid 17-11-2013 11:21 Started download of hpf2.bbdep02.May.sortlib.gz 368 World Community Grid 17-11-2013 11:21 [error] Unable to verify dddt2a_image06_6.40.tga using certificates 369 World Community Grid 17-11-2013 11:21 [error] Checksum or signature error for dddt2a_image06_6.40.tga 370 World Community Grid 17-11-2013 11:21 [error] Unable to verify dddt2a_image10_6.40.tga using certificates 371 World Community Grid 17-11-2013 11:21 [error] Checksum or signature error for dddt2a_image10_6.40.tga 372 World Community Grid 17-11-2013 11:21 Started download of cfsw.baygame.db 373 World Community Grid 17-11-2013 11:21 Finished download of wcgrid_cfsw_gfx_prod_windows_64.exe.6.12 374 World Community Grid 17-11-2013 11:21 Started download of wcg_dddt2_charmm_6.40_windows_intelx86 375 World Community Grid 17-11-2013 11:21 [error] Unable to verify wcgrid_cfsw_gfx_prod_windows_64.exe.6.12 using certificates 376 World Community Grid 17-11-2013 11:21 [error] Checksum or signature error for wcgrid_cfsw_gfx_prod_windows_64.exe.6.12 377 World Community Grid 17-11-2013 11:21 Finished download of wcg_dddt2_charmm_6.40_windows_intelx86 378 World Community Grid 17-11-2013 11:21 Started download of dddt2_image13_6.40.tga 379 World Community Grid 17-11-2013 11:21 [error] Unable to verify wcg_dddt2_charmm_6.40_windows_intelx86 using certificates 380 World Community Grid 17-11-2013 11:21 [error] Checksum or signature error for wcg_dddt2_charmm_6.40_windows_intelx86 381 World Community Grid 17-11-2013 11:21 Finished download of dddt2_image13_6.40.tga 382 World Community Grid 17-11-2013 11:21 Started download of dddt2_image12_6.40.tga 383 World Community Grid 17-11-2013 11:21 [error] Unable to 
verify dddt2_image13_6.40.tga using certificates 384 World Community Grid 17-11-2013 11:21 [error] Checksum or signature error for dddt2_image13_6.40.tga 385 World Community Grid 17-11-2013 11:21 Finished download of hpf2.bbdep02.May.sortlib.gz 386 World Community Grid 17-11-2013 11:21 Started download of dddt2_image09_6.40.tga 387 World Community Grid 17-11-2013 11:21 Finished download of cfsw.baygame.db 388 World Community Grid 17-11-2013 11:21 Finished download of dddt2_image12_6.40.tga 389 World Community Grid 17-11-2013 11:21 Started download of dddt2_image14_6.40.tga 390 World Community Grid 17-11-2013 11:21 Started download of dddt2_image08_6.40.tga 391 World Community Grid 17-11-2013 11:21 [error] Unable to verify dddt2_image12_6.40.tga using certificates 392 World Community Grid 17-11-2013 11:21 [error] Checksum or signature error for dddt2_image12_6.40.tga 393 World Community Grid 17-11-2013 11:21 Finished download of dddt2_image09_6.40.tga 394 World Community Grid 17-11-2013 11:21 Started download of dddt2_image05_6.40.tga 395 World Community Grid 17-11-2013 11:21 [error] Unable to verify dddt2_image09_6.40.tga using certificates 396 World Community Grid 17-11-2013 11:21 [error] Checksum or signature error for dddt2_image09_6.40.tga 397 World Community Grid 17-11-2013 11:21 Finished download of dddt2_image08_6.40.tga 398 World Community Grid 17-11-2013 11:21 Finished download of dddt2_image14_6.40.tga 399 World Community Grid 17-11-2013 11:21 Started download of dddt2_image10_6.40.tga 400 World Community Grid 17-11-2013 11:21 Started download of dddt2_image11_6.40.tga 401 World Community Grid 17-11-2013 11:21 [error] Unable to verify dddt2_image14_6.40.tga using certificates 402 World Community Grid 17-11-2013 11:21 [error] Checksum or signature error for dddt2_image14_6.40.tga 403 World Community Grid 17-11-2013 11:21 [error] Unable to verify dddt2_image08_6.40.tga using certificates 404 World Community Grid 17-11-2013 11:21 [error] Checksum or 
signature error for dddt2_image08_6.40.tga 405 World Community Grid 17-11-2013 11:21 Finished download of dddt2_image05_6.40.tga 406 World Community Grid 17-11-2013 11:21 Started download of dddt2_image07_6.40.tga 407 World Community Grid 17-11-2013 11:21 [error] Unable to verify dddt2_image05_6.40.tga using certificates 408 World Community Grid 17-11-2013 11:21 [error] Checksum or signature error for dddt2_image05_6.40.tga 409 World Community Grid 17-11-2013 11:21 Finished download of dddt2_image10_6.40.tga 410 World Community Grid 17-11-2013 11:21 Finished download of dddt2_image11_6.40.tga 411 World Community Grid 17-11-2013 11:21 Started download of dddt2_image04_6.40.tga 412 World Community Grid 17-11-2013 11:21 Started download of dddt2_image06_6.40.tga 413 World Community Grid 17-11-2013 11:21 [error] Unable to verify dddt2_image10_6.40.tga using certificates 414 World Community Grid 17-11-2013 11:21 [error] Checksum or signature error for dddt2_image10_6.40.tga 415 World Community Grid 17-11-2013 11:21 [error] Unable to verify dddt2_image11_6.40.tga using certificates 416 World Community Grid 17-11-2013 11:21 [error] Checksum or signature error for dddt2_image11_6.40.tga 417 World Community Grid 17-11-2013 11:21 Finished download of dddt2_image07_6.40.tga 418 World Community Grid 17-11-2013 11:21 Started download of wcgrid_dsfl_vina_prod_x86.exe.6.25 419 World Community Grid 17-11-2013 11:21 [error] Unable to verify dddt2_image07_6.40.tga using certificates 420 World Community Grid 17-11-2013 11:21 [error] Checksum or signature error for dddt2_image07_6.40.tga 421 World Community Grid 17-11-2013 11:21 Finished download of dddt2_image04_6.40.tga 422 World Community Grid 17-11-2013 11:21 Finished download of dddt2_image06_6.40.tga 423 World Community Grid 17-11-2013 11:21 Started download of hcmd2_image01_6.40.tga 424 World Community Grid 17-11-2013 11:21 Started download of dsfl_image19_6.25.tga 425 World Community Grid 17-11-2013 11:21 [error] Unable to 
verify dddt2_image04_6.40.tga using certificates 426 World Community Grid 17-11-2013 11:21 [error] Checksum or signature error for dddt2_image04_6.40.tga 427 World Community Grid 17-11-2013 11:21 [error] Unable to verify dddt2_image06_6.40.tga using certificates 428 World Community Grid 17-11-2013 11:21 [error] Checksum or signature error for dddt2_image06_6.40.tga 429 World Community Grid 17-11-2013 11:21 Finished download of dsfl_image19_6.25.tga 430 World Community Grid 17-11-2013 11:21 Started download of dddt2a_image09_6.40.tga 431 World Community Grid 17-11-2013 11:21 [error] Unable to verify dsfl_image19_6.25.tga using certificates 432 World Community Grid 17-11-2013 11:21 [error] Checksum or signature error for dsfl_image19_6.25.tga 433 World Community Grid 17-11-2013 11:21 Finished download of wcgrid_dsfl_vina_prod_x86.exe.6.25 434 World Community Grid 17-11-2013 11:21 Finished download of hcmd2_image01_6.40.tga 435 World Community Grid 17-11-2013 11:21 Started download of dddt2a_image14_6.40.tga 436 World Community Grid 17-11-2013 11:21 Started download of dddt2a_image12_6.40.tga 437 World Community Grid 17-11-2013 11:21 [error] Unable to verify wcgrid_dsfl_vina_prod_x86.exe.6.25 using certificates 438 World Community Grid 17-11-2013 11:21 [error] Checksum or signature error for wcgrid_dsfl_vina_prod_x86.exe.6.25 439 World Community Grid 17-11-2013 11:21 [error] Unable to verify hcmd2_image01_6.40.tga using certificates 440 World Community Grid 17-11-2013 11:21 [error] Checksum or signature error for hcmd2_image01_6.40.tga 441 World Community Grid 17-11-2013 11:21 Finished download of dddt2a_image09_6.40.tga 442 World Community Grid 17-11-2013 11:21 Started download of dddt2a_image13_6.40.tga 443 World Community Grid 17-11-2013 11:21 [error] Unable to verify dddt2a_image09_6.40.tga using certificates 444 World Community Grid 17-11-2013 11:21 [error] Checksum or signature error for dddt2a_image09_6.40.tga 445 World Community Grid 17-11-2013 11:21 Finished 
download of dddt2a_image14_6.40.tga 446 World Community Grid 17-11-2013 11:21 Started download of dddt2a_image08_6.40.tga 447 World Community Grid 17-11-2013 11:21 [error] Unable to verify dddt2a_image14_6.40.tga using certificates 448 World Community Grid 17-11-2013 11:21 [error] Checksum or signature error for dddt2a_image14_6.40.tga 449 World Community Grid 17-11-2013 11:21 Finished download of dddt2a_image12_6.40.tga 450 World Community Grid 17-11-2013 11:21 Started download of dddt2a_image04_6.40.tga 451 World Community Grid 17-11-2013 11:21 [error] Unable to verify dddt2a_image12_6.40.tga using certificates 452 World Community Grid 17-11-2013 11:21 [error] Checksum or signature error for dddt2a_image12_6.40.tga 453 World Community Grid 17-11-2013 11:21 Finished download of dddt2a_image13_6.40.tga 454 World Community Grid 17-11-2013 11:21 Finished download of dddt2a_image08_6.40.tga 455 World Community Grid 17-11-2013 11:21 Started download of dddt2a_image11_6.40.tga 456 World Community Grid 17-11-2013 11:21 Started download of wcgrid_cfsw_baygame_prod_gfx.exe.6.11 457 World Community Grid 17-11-2013 11:21 [error] Unable to verify dddt2a_image13_6.40.tga using certificates 458 World Community Grid 17-11-2013 11:21 [error] Checksum or signature error for dddt2a_image13_6.40.tga 459 World Community Grid 17-11-2013 11:21 [error] Unable to verify dddt2a_image08_6.40.tga using certificates 460 World Community Grid 17-11-2013 11:21 [error] Checksum or signature error for dddt2a_image08_6.40.tga 461 World Community Grid 17-11-2013 11:21 Finished download of dddt2a_image04_6.40.tga 462 World Community Grid 17-11-2013 11:21 Started download of dddt2a_image07_6.40.tga 463 World Community Grid 17-11-2013 11:21 [error] Unable to verify dddt2a_image04_6.40.tga using certificates 464 World Community Grid 17-11-2013 11:21 [error] Checksum or signature error for dddt2a_image04_6.40.tga 465 World Community Grid 17-11-2013 11:21 Finished download of dddt2a_image11_6.40.tga 466 
World Community Grid 17-11-2013 11:21 Started download of wcgrid_gfam_vina_prod_x86_64.exe.6.13 467 World Community Grid 17-11-2013 11:21 [error] Unable to verify dddt2a_image11_6.40.tga using certificates 468 World Community Grid 17-11-2013 11:21 [error] Checksum or signature error for dddt2a_image11_6.40.tga 469 World Community Grid 17-11-2013 11:21 Finished download of dddt2a_image07_6.40.tga 470 World Community Grid 17-11-2013 11:21 Started download of wcg_hpf2_rosetta_6.40_windows_intelx86 471 World Community Grid 17-11-2013 11:21 [error] Unable to verify dddt2a_image07_6.40.tga using certificates 472 World Community Grid 17-11-2013 11:21 [error] Checksum or signature error for dddt2a_image07_6.40.tga 473 World Community Grid 17-11-2013 11:22 Finished download of wcgrid_cfsw_baygame_prod_gfx.exe.6.11 474 World Community Grid 17-11-2013 11:22 Finished download of wcgrid_gfam_vina_prod_x86_64.exe.6.13 475 World Community Grid 17-11-2013 11:22 Started download of wcg_hpf2_6.40.tga 476 World Community Grid 17-11-2013 11:22 [error] Unable to verify wcgrid_cfsw_baygame_prod_gfx.exe.6.11 using certificates 477 World Community Grid 17-11-2013 11:22 [error] Checksum or signature error for wcgrid_cfsw_baygame_prod_gfx.exe.6.11 478 World Community Grid 17-11-2013 11:22 [error] Unable to verify wcgrid_gfam_vina_prod_x86_64.exe.6.13 using certificates 479 World Community Grid 17-11-2013 11:22 [error] Checksum or signature error for wcgrid_gfam_vina_prod_x86_64.exe.6.13 480 World Community Grid 17-11-2013 11:22 Started download of wcg_dddt2a_charmm_6.40_windows_intelx86 481 World Community Grid 17-11-2013 11:22 Finished download of wcg_hpf2_rosetta_6.40_windows_intelx86 482 World Community Grid 17-11-2013 11:22 [error] Unable to verify wcg_hpf2_rosetta_6.40_windows_intelx86 using certificates 483 World Community Grid 17-11-2013 11:22 [error] Checksum or signature error for wcg_hpf2_rosetta_6.40_windows_intelx86 484 World Community Grid 17-11-2013 11:22 Started download of 
wcgrid_gfam_vina_prod_x86_64.exe.6.12 485 World Community Grid 17-11-2013 11:22 Finished download of wcg_hpf2_6.40.tga 486 World Community Grid 17-11-2013 11:22 Started download of dddt2a_image05_6.40.tga 487 World Community Grid 17-11-2013 11:22 [error] Unable to verify wcg_hpf2_6.40.tga using certificates 488 World Community Grid 17-11-2013 11:22 [error] Checksum or signature error for wcg_hpf2_6.40.tga 489 World Community Grid 17-11-2013 11:22 Finished download of wcgrid_gfam_vina_prod_x86_64.exe.6.12 490 World Community Grid 17-11-2013 11:22 [error] Unable to verify wcgrid_gfam_vina_prod_x86_64.exe.6.12 using certificates 491 World Community Grid 17-11-2013 11:22 [error] Checksum or signature error for wcgrid_gfam_vina_prod_x86_64.exe.6.12 492 World Community Grid 17-11-2013 11:22 Finished download of dddt2a_image05_6.40.tga 493 World Community Grid 17-11-2013 11:22 [error] Unable to verify dddt2a_image05_6.40.tga using certificates 494 World Community Grid 17-11-2013 11:22 [error] Checksum or signature error for dddt2a_image05_6.40.tga 495 World Community Grid 17-11-2013 11:22 Started download of beta16.baygame.db 496 World Community Grid 17-11-2013 11:23 Finished download of beta16.baygame.db 497 World Community Grid 17-11-2013 11:25 Finished download of wcg_dddt2a_charmm_6.40_windows_intelx86 498 World Community Grid 17-11-2013 11:25 [error] Unable to verify wcg_dddt2a_charmm_6.40_windows_intelx86 using certificates 499 World Community Grid 17-11-2013 11:25 [error] Checksum or signature error for wcg_dddt2a_charmm_6.40_windows_intelx86 |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Sorry Rickjb, but one thing led to the next problem and the next problem and the next problem, basically revealing that not only things at WCG are broken, but with the 7.2.xx clients too.
1) On detach > attach, the application files do not succeed in downloading... all breaking with certificate errors.
2) Whilst 1) is ongoing, work is downloaded and the client is trying to start it, but the certificate check fails, so it's cycling through tasks (suspending work fetch also stops attempts to get the application files... catch-22).
3) Because of the deferral and idle cores, the client fetches work from backup projects to occupy all cores: 8 tasks, no more, equal to the threads allowed to BOINC.
4) Once the WCG deferrals expire, new work is fetched from WCG and is prioritized over the backup project... the backup project tasks are suspended by the client... WCG has work, so no need to run backup, but the WCG tasks crash again per point 2).
5) After the WCG tasks crash, the backup project tasks do not resume... just sitting there with 'waiting to run', because WCG is in deferral with failed tasks to report, and on and on goes the vicious circle.
6) Suspending WCG and forcing the reporting to clear them does not help: no amount of suspending and resuming the backup project sets the backup tasks in motion, and neither does restarting the client service... 8 cores idle.
7) Oh golly, though I have local prefs, the WCG detach somehow went to fetch the prefs from the BACKUP project, which injected processors allowed = 0%. (Is zero not BOINC-synonymous with "eat all you can for 9.99"?) Entered 100%, and now the backup project is running.

Hours of my volunteer time wasted. ... Too many riddles and ready for a walk in the countryside, what Aussies would call Out-Back.
[Edit 1 times, last edit by Former Member at Nov 17, 2013 1:06:18 PM] |
||
|
Rickjb
Veteran Cruncher Australia Joined: Sep 17, 2006 Post Count: 666 Status: Offline Project Badges: |
Thanks, Sek.
I haven't found details of the crashes in the log files so far, but no other running tasks seemed to have been affected when these WUs crashed, eg no "exited with zero status" things. I updated details of the crash circumstances in the post above, and posted a link to it in the BOINC Agent Support thread. |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Long walk, waited for the storm to pass and MCM 7.26 to be well past its relaunch point. Cycled a detach/attach with minimum cache and got the same certificate errors: 8 tasks sitting there with a download error, blocking the work fetch from the backup project, ergo an idling computer. Uninstalled and erased all of BOINC, booted, and installed 7.2.23 as user, not service. Same thing. Set the new data_dir location in the AV global exceptions. Same thing.
7.26 mcm1 MCM1_0000099_0760_1 - (-) 0.000 100.000 - 19-11-2013 09:01 09d,22:53:43 Download error
7.26 mcm1 MCM1_0000099_0746_1 - (-) 0.000 100.000 - 19-11-2013 09:01 09d,22:53:43 Download error
7.26 mcm1 MCM1_0000099_0722_0 - (-) 0.000 100.000 - 19-11-2013 09:01 09d,22:53:43 Download error
7.26 mcm1 MCM1_0000099_0717_0 - (-) 0.000 100.000 - 19-11-2013 09:01 09d,22:53:43 Download error
7.26 mcm1 MCM1_0000099_0772_0 - (-) 0.000 100.000 - 19-11-2013 09:01 09d,22:53:43 Download error
7.26 mcm1 MCM1_0000099_0781_1 - (-) 0.000 100.000 - 19-11-2013 09:01 09d,22:53:43 Download error
7.26 mcm1 MCM1_0000099_0762_1 - (-) 0.000 100.000 - 19-11-2013 09:01 09d,22:53:43 Download error
7.26 mcm1 MCM1_0000099_0767_0 - (-) 0.000 100.000 - 19-11-2013 09:01 09d,22:53:43 Download error

14993 World Community Grid 19-11-2013 08:58 work fetch resumed by user 15002 World Community Grid 19-11-2013 09:01 [sched_op] Starting scheduler request 15003 World Community Grid 19-11-2013 09:01 Sending scheduler request: To fetch work.
15004 World Community Grid 19-11-2013 09:01 Requesting new tasks for CPU 15005 World Community Grid 19-11-2013 09:01 [sched_op] CPU work request: 48.47 seconds; 8.00 devices 15006 World Community Grid 19-11-2013 09:01 Scheduler request completed: got 8 new tasks 15007 World Community Grid 19-11-2013 09:01 [sched_op] Server version 701 15008 World Community Grid 19-11-2013 09:01 Project requested delay of 182 seconds 15009 World Community Grid 19-11-2013 09:01 [sched_op] estimated total CPU task duration: 117739 seconds 15010 World Community Grid 19-11-2013 09:01 [sched_op] Deferring communication for 00:03:01 15011 World Community Grid 19-11-2013 09:01 [sched_op] Reason: requested by project 15012 World Community Grid 19-11-2013 09:02 Started download of wcgrid_mcm1_7.26_windows_x86_64 15013 World Community Grid 19-11-2013 09:02 Started download of wcgrid_mcm1_graphics_prod_64.exe.7.26 15014 World Community Grid 19-11-2013 09:02 Started download of mcm1_image01_7.26.tga 15015 World Community Grid 19-11-2013 09:02 Finished download of mcm1_image01_7.26.tga 15016 World Community Grid 19-11-2013 09:02 Started download of mcm1_image02_7.26.tga 15017 World Community Grid 19-11-2013 09:02 [error] Unable to verify mcm1_image01_7.26.tga using certificates 15018 World Community Grid 19-11-2013 09:02 [error] Checksum or signature error for mcm1_image01_7.26.tga 15019 World Community Grid 19-11-2013 09:02 Finished download of wcgrid_mcm1_7.26_windows_x86_64 15020 World Community Grid 19-11-2013 09:02 Finished download of wcgrid_mcm1_graphics_prod_64.exe.7.26 15021 World Community Grid 19-11-2013 09:02 Finished download of mcm1_image02_7.26.tga 15022 World Community Grid 19-11-2013 09:02 Started download of mcm1_image03_7.26.tga 15023 World Community Grid 19-11-2013 09:02 Started download of mcm1_image04_7.26.tga 15024 World Community Grid 19-11-2013 09:02 Started download of mcm1_image05_7.26.tga 15025 World Community Grid 19-11-2013 09:02 [error] Unable to verify 
wcgrid_mcm1_7.26_windows_x86_64 using certificates
15026 World Community Grid 19-11-2013 09:02 [error] Checksum or signature error for wcgrid_mcm1_7.26_windows_x86_64
15027 World Community Grid 19-11-2013 09:02 [error] Unable to verify wcgrid_mcm1_graphics_prod_64.exe.7.26 using certificates
15028 World Community Grid 19-11-2013 09:02 [error] Checksum or signature error for wcgrid_mcm1_graphics_prod_64.exe.7.26
15029 World Community Grid 19-11-2013 09:02 [error] Unable to verify mcm1_image02_7.26.tga using certificates
15030 World Community Grid 19-11-2013 09:02 [error] Checksum or signature error for mcm1_image02_7.26.tga
15031 World Community Grid 19-11-2013 09:02 Finished download of mcm1_image05_7.26.tga
15032 World Community Grid 19-11-2013 09:02 Started download of mcm1_image06_7.26.tga
15033 World Community Grid 19-11-2013 09:02 [error] Unable to verify mcm1_image05_7.26.tga using certificates
15034 World Community Grid 19-11-2013 09:02 [error] Checksum or signature error for mcm1_image05_7.26.tga
15035 World Community Grid 19-11-2013 09:02 [sched_op] Deferring communication for 00:07:50
15036 World Community Grid 19-11-2013 09:02 [sched_op] Reason: Unrecoverable error for task MCM1_0000099_0722_0
15037 World Community Grid 19-11-2013 09:02 [sched_op] Deferring communication for 00:11:03
15038 World Community Grid 19-11-2013 09:02 [sched_op] Reason: Unrecoverable error for task MCM1_0000099_0717_0
15039 World Community Grid 19-11-2013 09:02 [sched_op] Deferring communication for 00:30:17
15040 World Community Grid 19-11-2013 09:02 [sched_op] Reason: Unrecoverable error for task MCM1_0000099_0772_0
15041 World Community Grid 19-11-2013 09:02 [sched_op] Deferring communication for 00:32:11
15042 World Community Grid 19-11-2013 09:02 [sched_op] Reason: Unrecoverable error for task MCM1_0000099_0781_1
15043 World Community Grid 19-11-2013 09:02 [sched_op] Deferring communication for 01:24:40
15044 World Community Grid 19-11-2013 09:02 [sched_op] Reason: Unrecoverable error for task MCM1_0000099_0762_1
15045 World Community Grid 19-11-2013 09:02 [sched_op] Deferring communication for 02:14:25
15046 World Community Grid 19-11-2013 09:02 [sched_op] Reason: Unrecoverable error for task MCM1_0000099_0767_0
15047 World Community Grid 19-11-2013 09:02 Finished download of mcm1_image04_7.26.tga
15048 World Community Grid 19-11-2013 09:02 Finished download of mcm1_image06_7.26.tga
15049 World Community Grid 19-11-2013 09:02 Started download of mcm1_image07_7.26.tga
15050 World Community Grid 19-11-2013 09:02 Started download of mcm1_image08_7.26.tga
15051 World Community Grid 19-11-2013 09:02 [error] Unable to verify mcm1_image04_7.26.tga using certificates
15052 World Community Grid 19-11-2013 09:02 [error] Checksum or signature error for mcm1_image04_7.26.tga
15053 World Community Grid 19-11-2013 09:02 [error] Unable to verify mcm1_image06_7.26.tga using certificates
15054 World Community Grid 19-11-2013 09:02 [error] Checksum or signature error for mcm1_image06_7.26.tga
15055 World Community Grid 19-11-2013 09:02 Finished download of mcm1_image07_7.26.tga
15056 World Community Grid 19-11-2013 09:02 Finished download of mcm1_image08_7.26.tga
15057 World Community Grid 19-11-2013 09:02 Started download of MCM1_0000099_0762_MCM1_0000099_0762.txt
15058 World Community Grid 19-11-2013 09:02 Started download of mcm1.dataset-17_72_SDG_v1.txt
15059 World Community Grid 19-11-2013 09:02 [error] Unable to verify mcm1_image07_7.26.tga using certificates
15060 World Community Grid 19-11-2013 09:02 [error] Checksum or signature error for mcm1_image07_7.26.tga
15061 World Community Grid 19-11-2013 09:02 [error] Unable to verify mcm1_image08_7.26.tga using certificates
15062 World Community Grid 19-11-2013 09:02 [error] Checksum or signature error for mcm1_image08_7.26.tga
15063 World Community Grid 19-11-2013 09:02 Finished download of mcm1_image03_7.26.tga
15064 World Community Grid 19-11-2013 09:02 Finished download of MCM1_0000099_0762_MCM1_0000099_0762.txt
15065 World Community Grid 19-11-2013 09:02 Started download of MCM1_0000099_0772_MCM1_0000099_0772.txt
15066 World Community Grid 19-11-2013 09:02 Started download of MCM1_0000099_0722_MCM1_0000099_0722.txt
15067 World Community Grid 19-11-2013 09:02 [error] Unable to verify mcm1_image03_7.26.tga using certificates
15068 World Community Grid 19-11-2013 09:02 [error] Checksum or signature error for mcm1_image03_7.26.tga
15069 World Community Grid 19-11-2013 09:02 Finished download of MCM1_0000099_0772_MCM1_0000099_0772.txt
15070 World Community Grid 19-11-2013 09:02 Finished download of MCM1_0000099_0722_MCM1_0000099_0722.txt
15071 World Community Grid 19-11-2013 09:02 Started download of MCM1_0000099_0746_MCM1_0000099_0746.txt
15072 World Community Grid 19-11-2013 09:02 Started download of MCM1_0000099_0781_MCM1_0000099_0781.txt
15073 World Community Grid 19-11-2013 09:02 Finished download of MCM1_0000099_0746_MCM1_0000099_0746.txt
15074 World Community Grid 19-11-2013 09:02 Finished download of MCM1_0000099_0781_MCM1_0000099_0781.txt
15075 World Community Grid 19-11-2013 09:02 Started download of MCM1_0000099_0717_MCM1_0000099_0717.txt
15076 World Community Grid 19-11-2013 09:02 Started download of MCM1_0000099_0760_MCM1_0000099_0760.txt
15077 World Community Grid 19-11-2013 09:02 Finished download of MCM1_0000099_0717_MCM1_0000099_0717.txt
15078 World Community Grid 19-11-2013 09:02 Finished download of MCM1_0000099_0760_MCM1_0000099_0760.txt
15079 World Community Grid 19-11-2013 09:02 Started download of MCM1_0000099_0767_MCM1_0000099_0767.txt
15080 World Community Grid 19-11-2013 09:02 Finished download of MCM1_0000099_0767_MCM1_0000099_0767.txt
15082 World Community Grid 19-11-2013 09:02 Finished download of mcm1.dataset-17_72_SDG_v1.txt
15267 World Community Grid 19-11-2013 10:03 work fetch suspended by user
Did one more cycle after disabling all security software [what a fool I am], and same went down.
Then upgraded from 7.2.23 to 'newly Berkeley recommended', explicitly the version without the VM bolt-on... same repeats. Why WCG downloads fail [except for the small permanent tga/png files], for all sciences and only for WCG, I do not know. A small sample from after the last boot/install cycle is below. The backup project owners are content, though.
251 World Community Grid 19-11-2013 11:24 Started download of mcm1_00_v01.gif
252 World Community Grid 19-11-2013 11:24 Finished download of faah_02_v04.png
253 World Community Grid 19-11-2013 11:24 Finished download of mcm1_00_v01.gif
254 World Community Grid 19-11-2013 11:24 Started download of mcm1_01_v01.png
255 World Community Grid 19-11-2013 11:24 Started download of mcm1_02_v01.png
256 World Community Grid 19-11-2013 11:24 Finished download of mcm1_01_v01.png
257 World Community Grid 19-11-2013 11:24 Started download of mcm1_03_v01.png
258 World Community Grid 19-11-2013 11:24 Finished download of mcm1_02_v01.png
259 World Community Grid 19-11-2013 11:24 Finished download of mcm1_03_v01.png
260 World Community Grid 19-11-2013 11:24 Started download of mcm1_04_v01.png
261 World Community Grid 19-11-2013 11:24 Started download of mcm1_05_v01.png
262 World Community Grid 19-11-2013 11:24 Finished download of mcm1_04_v01.png
263 World Community Grid 19-11-2013 11:24 Started download of mcm1_06_v01.png
264 World Community Grid 19-11-2013 11:25 Finished download of mcm1_05_v01.png
265 World Community Grid 19-11-2013 11:25 Started download of mcm1_07_v01.png
The few attempts earned WCG a 3:15 hours deferral... that's 24 completed tasks at SIMAP [learned they're working on an Android version while reading up over there]. Sorry WCG, but my volunteering time for today ran out. |
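For anyone who wants to tally which files tripped the verification errors in an event log like the one in this post, here is a rough Python sketch. The function name `failed_files` and the regular expression are my own assumptions, modelled only on the message wording quoted in this thread; nothing here comes from BOINC itself.

```python
import re
from collections import Counter

# Assumed message shapes, copied from the log excerpts quoted in this thread.
ERROR_RE = re.compile(
    r"\[error\] (?:Checksum or signature error for|Unable to verify) (\S+)"
)

def failed_files(log_text):
    """Count verification-related [error] messages per file name."""
    return Counter(m.group(1) for m in ERROR_RE.finditer(log_text))

sample = (
    "15033 World Community Grid 19-11-2013 09:02 [error] Unable to verify "
    "mcm1_image05_7.26.tga using certificates\n"
    "15034 World Community Grid 19-11-2013 09:02 [error] Checksum or signature "
    "error for mcm1_image05_7.26.tga\n"
    "15031 World Community Grid 19-11-2013 09:02 Finished download of "
    "mcm1_image05_7.26.tga\n"
)

for name, count in failed_files(sample).items():
    print(name, count)
```

Because it scans with `finditer` rather than line by line, it should also cope with a log that has been pasted as one long run of text.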
||
|
knreed
Former World Community Grid Tech Joined: Nov 8, 2004 Post Count: 4504 Status: Offline Project Badges: |
@Sekerob - we flushed the cache on the servers and you should be able to download a clean copy now.
|
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Wallowing even deeper in my ignorance: since CEP2 is a 32-bit-only app anyhow, I copied all permanent CEP2 files from a W7-32 backup to the W7-64 client and ran a repair on the latter to ensure ownerships were set right [changed from service to user-level install]. It was of course to be expected that client_state.xml and the other housekeeping files would then be out of sync with the project folder content. Set the client to allow 1 CEP2 task, which was promptly fetched. To my considerable surprise, rather than the boatload of permanent files, this time just half a dozen downloads were attempted, including the 2 big application files which I had manually copied over. Miss Sophie acknowledged the procedure had not changed from last time... same signature errors, and size warnings.
2386 World Community Grid 19-11-2013 22:04 Requesting new tasks for CPU
2387 World Community Grid 19-11-2013 22:04 [sched_op] CPU work request: 16923.52 seconds; 8.00 devices
2388 World Community Grid 19-11-2013 22:04 [sched_op] ATI work request: 0.00 seconds; 0.00 devices
2390 World Community Grid 19-11-2013 22:05 Scheduler request completed: got 1 new tasks
2391 World Community Grid 19-11-2013 22:05 [sched_op] Server version 701
2392 World Community Grid 19-11-2013 22:05 Project requested delay of 182 seconds
2393 World Community Grid 19-11-2013 22:05 [sched_op] estimated total CPU task duration: 88168 seconds
2394 World Community Grid 19-11-2013 22:05 [sched_op] estimated total ATI task duration: 0 seconds
2395 World Community Grid 19-11-2013 22:05 [sched_op] Deferring communication for 00:03:01
2396 World Community Grid 19-11-2013 22:05 [sched_op] Reason: requested by project
1) 2397 World Community Grid 19-11-2013 22:05 Started download of wcgrid_cep2_6.40_windows_intelx86
2) 2398 World Community Grid 19-11-2013 22:05 Started download of wcgrid_cep2_qchem_6.40_windows_intelx86
3) 2399 World Community Grid 19-11-2013 22:05 Started download of wcgrid_cep2_graphics_6.40_windows_intelx86
2400 World Community Grid 19-11-2013 22:05 Finished download of wcgrid_cep2_qchem_6.40_windows_intelx86
4) 2401 World Community Grid 19-11-2013 22:05 Started download of cep2_images_6.40.zip
2402 World Community Grid 19-11-2013 22:05 [error] File wcgrid_cep2_qchem_6.40_windows_intelx86 has wrong size: expected 63270912, got 2765615
2403 World Community Grid 19-11-2013 22:05 [error] Checksum or signature error for wcgrid_cep2_qchem_6.40_windows_intelx86
2404 World Community Grid 19-11-2013 22:05 Finished download of wcgrid_cep2_graphics_6.40_windows_intelx86
5) 2405 World Community Grid 19-11-2013 22:05 Started download of cep2_qcaux_6.40.zip
2406 World Community Grid 19-11-2013 22:05 [error] Unable to verify wcgrid_cep2_graphics_6.40_windows_intelx86 using certificates
2407 World Community Grid 19-11-2013 22:05 [error] Checksum or signature error for wcgrid_cep2_graphics_6.40_windows_intelx86
2410 World Community Grid 19-11-2013 22:05 Finished download of wcgrid_cep2_6.40_windows_intelx86
6) 2411 World Community Grid 19-11-2013 22:05 Started download of 551773785fda0ea3e0e65c5c72ad0495.zip
2412 World Community Grid 19-11-2013 22:05 [error] Unable to verify wcgrid_cep2_6.40_windows_intelx86 using certificates
2413 World Community Grid 19-11-2013 22:05 [error] Checksum or signature error for wcgrid_cep2_6.40_windows_intelx86
2414 World Community Grid 19-11-2013 22:05 Finished download of 551773785fda0ea3e0e65c5c72ad0495.zip
2416 World Community Grid 19-11-2013 22:05 update requested by user
2417 World Community Grid 19-11-2013 22:06 sched RPC pending: Requested by user
2418 World Community Grid 19-11-2013 22:06 [sched_op] Starting scheduler request
2419 World Community Grid 19-11-2013 22:06 Sending scheduler request: Requested by user.
2420 World Community Grid 19-11-2013 22:06 Reporting 1 completed tasks
Have a good day [it's now 22:30 here... going off for required beautysleep] |
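The "wrong size: expected 63270912, got 2765615" message suggests the download was cut short or replaced by something else before the signature check even ran. If you want to check a file in the project folder by hand, something like the sketch below might help. `check_download` is a hypothetical helper of mine, not a BOINC tool; the expected size is the one from the log above, and the optional MD5 comparison assumes you have a trusted checksum from somewhere.

```python
import hashlib
import os

def check_download(path, expected_size, expected_md5=None):
    """Return a list of problems found with a downloaded file (empty if none)."""
    problems = []
    actual = os.path.getsize(path)
    if actual != expected_size:
        problems.append(f"wrong size: expected {expected_size}, got {actual}")
    if expected_md5 is not None:
        digest = hashlib.md5()
        # Read in 1 MiB chunks so large application files do not fill memory.
        with open(path, "rb") as fh:
            for chunk in iter(lambda: fh.read(1 << 20), b""):
                digest.update(chunk)
        if digest.hexdigest() != expected_md5.lower():
            problems.append("MD5 mismatch")
    return problems
```

For the file in the log this would be called as `check_download("wcgrid_cep2_qchem_6.40_windows_intelx86", 63270912)`, run from the project folder. An empty list only means the size (and MD5, if given) matched; it says nothing about the certificate verification the client also reports.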
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
The Clyde Brent Circus had its final center ring act [today]. The trapeze performance by the Berkeley's Overhead Idiot Nukers Corps, in short BOINC, fell to its demise [client uninstalled, all files erased, rebooted, installed, same same], so I formatted the BOINC partition. First install could not complete [there are reports of crashing 7.2.28 'recommended' managers]; one more boot cycle, install of 7.0.64 as user, and finally it took. Attached one backup project... no https... no certificate issues, crunching on 8. Added WCG... resistance. The 'default' project profile, the one to be used for new devices, was set to 1 CEP2 and 1 CEP2 only, but 8 MCM arrived. Then looked in device manager, no device, so looked in the RS page and sure enough, a device with the user-friendly network name was listed with 8 MCM... there are more devices with that name. Looked again in the device statistics ... no host, listed all/all [counted 52, not the 53 I'd expected with this cleanest install, but then there was of course no validated result]. Did it pick up a device ID assigned multiple install instances ago? Injected the network suppress option in config, hit update. Opening the DM devices one by one, I hit on internal ID 2372334, where before this trusty host had 1854592 [oh dear, why did it not recognize it, yet again?]
1854592 11/17/2013 10:29:58 12:135:03:42:23 13,172,217 28,535
Numeration backwards, my newer Linux has 2524499. Working through my stock of lollipops fast.... very bad for the dentals too, let alone the glycemic levels. |
||
|
|