World Community Grid Forums
Thread Status: Active Total posts in this thread: 38
SekeRob
Master Cruncher Joined: Jan 7, 2013 Post Count: 2741 Status: Offline
And the winner is ... :/. Incremented the buffer in small steps to see multiple downloads: nicely sequential, start-to-finish one by one, and on the 3rd fetch the .dms download failed its MD5 check again. :((((
237 World Community Grid 7/15/2016 6:22:39 PM Started download of 31f27d0f25eeef672a7a050808810cef.dms
238 7/15/2016 6:22:42 PM Project communication failed: attempting access to reference site
239 World Community Grid 7/15/2016 6:22:42 PM Temporarily failed download of 31f27d0f25eeef672a7a050808810cef.dms: transient HTTP error
240 World Community Grid 7/15/2016 6:22:42 PM Started download of c9bd8c8d409c1730d86a199ff040d36b.dms
241 World Community Grid 7/15/2016 6:22:43 PM Finished download of c9bd8c8d409c1730d86a199ff040d36b.dms
242 World Community Grid 7/15/2016 6:22:43 PM Started download of 95c7cf83602c87cff39d08148f097a58.rst
243 World Community Grid 7/15/2016 6:22:48 PM Finished download of 95c7cf83602c87cff39d08148f097a58.rst
244 World Community Grid 7/15/2016 6:22:48 PM Started download of e84ec1839974e6b94af96d8b1b14fbe3.inp
245 World Community Grid 7/15/2016 6:22:49 PM Finished download of e84ec1839974e6b94af96d8b1b14fbe3.inp
246 World Community Grid 7/15/2016 6:22:50 PM Started download of 31f27d0f25eeef672a7a050808810cef.dms
247 World Community Grid 7/15/2016 6:22:51 PM Finished download of 31f27d0f25eeef672a7a050808810cef.dms
248 World Community Grid 7/15/2016 6:22:51 PM [error] MD5 check failed for 31f27d0f25eeef672a7a050808810cef.dms
249 World Community Grid 7/15/2016 6:22:51 PM [error] expected 224e0e9b6bc62286cc8be9184da4aba4, got cbc5b11b53be477ca06b28ec5490e808
250 World Community Grid 7/15/2016 6:22:51 PM [error] Checksum or signature error for 31f27d0f25eeef672a7a050808810cef.dms
251 7/15/2016 6:23:03 PM Internet access OK - project servers may be temporarily down.
Back to the drawing board.
[Edit 1 times, last edit by SekeRob* at Jul 15, 2016 4:27:18 PM]
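The check that fails in the log above can be repeated by hand on a file that is still on disk. A minimal Python sketch, assuming you have the file path and the "expected" digest from the log; the helper names `md5_of_file` and `matches_expected` are illustrative and are not part of the BOINC client.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Compare the file's digest with the 'expected' value from the log."""
    return md5_of_file(path) == expected_hex.strip().lower()
```

A result of `False` only confirms that the bytes on disk differ from what the server advertised; it does not say where in the download path they went wrong.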

uplinger
Former World Community Grid Tech Joined: May 23, 2005 Post Count: 3952 Status: Offline
Sekerob,
What is the result name that had an issue? I want to see if I can spot a trend and debug.
Thanks, -Uplinger

SekeRob
Master Cruncher Joined: Jan 7, 2013 Post Count: 2741 Status: Offline
Here's another from start to finish:
286 World Community Grid 7/15/2016 6:26:51 PM Scheduler request completed: got 1 new tasks
287 World Community Grid 7/15/2016 6:26:51 PM [sched_op] Server version 701
288 World Community Grid 7/15/2016 6:26:51 PM Project requested delay of 121 seconds
289 World Community Grid 7/15/2016 6:26:51 PM [sched_op] estimated total CPU task duration: 57467 seconds
290 World Community Grid 7/15/2016 6:26:51 PM [sched_op] Deferring communication for 00:02:01
291 World Community Grid 7/15/2016 6:26:51 PM [sched_op] Reason: requested by project
292 World Community Grid 7/15/2016 6:26:53 PM Started download of 25e79aad51dfd891238e03780192fd4b.param
293 World Community Grid 7/15/2016 6:26:54 PM Finished download of 25e79aad51dfd891238e03780192fd4b.param
294 World Community Grid 7/15/2016 6:26:54 PM Started download of 5989a05872eb742a048f500c43a45143.dat
295 World Community Grid 7/15/2016 6:26:55 PM Finished download of 5989a05872eb742a048f500c43a45143.dat
296 World Community Grid 7/15/2016 6:26:55 PM Started download of f64aaf16dc5a5f9b1d41a47add2f235e.dat
297 World Community Grid 7/15/2016 6:26:56 PM Finished download of f64aaf16dc5a5f9b1d41a47add2f235e.dat
298 World Community Grid 7/15/2016 6:26:56 PM Started download of d4b5373f693ae2e0f49118334fa15579.dat
299 World Community Grid 7/15/2016 6:26:57 PM Finished download of d4b5373f693ae2e0f49118334fa15579.dat
300 World Community Grid 7/15/2016 6:26:57 PM Started download of f2640cb4d77198704e8b89ecc2de2712.dms
301 World Community Grid 7/15/2016 6:26:58 PM Temporarily failed download of f2640cb4d77198704e8b89ecc2de2712.dms: transient HTTP error
302 World Community Grid 7/15/2016 6:26:58 PM Started download of 28dacd144408636f6a550886dd9e2d5f.dms
303 World Community Grid 7/15/2016 6:26:59 PM [checkpoint] result ugm1_ugm1_26647_0097_0 checkpointed
304 World Community Grid 7/15/2016 6:26:59 PM [checkpoint] result ugm1_ugm1_26648_1972_1 checkpointed
305 World Community Grid 7/15/2016 6:26:59 PM [checkpoint] result ugm1_ugm1_26703_0392_3 checkpointed
306 7/15/2016 6:26:59 PM Project communication failed: attempting access to reference site
307 World Community Grid 7/15/2016 6:26:59 PM Finished download of 28dacd144408636f6a550886dd9e2d5f.dms
308 World Community Grid 7/15/2016 6:26:59 PM Started download of 53236c46d506b26bced68f39b09ac989.rst
309 World Community Grid 7/15/2016 6:27:00 PM [checkpoint] result ugm1_ugm1_26648_1022_0 checkpointed
310 World Community Grid 7/15/2016 6:27:00 PM [checkpoint] result ugm1_ugm1_26648_1922_0 checkpointed
311 7/15/2016 6:27:00 PM Internet access OK - project servers may be temporarily down.
312 World Community Grid 7/15/2016 6:27:01 PM Finished download of 53236c46d506b26bced68f39b09ac989.rst
313 World Community Grid 7/15/2016 6:27:01 PM Started download of 6128d410abf034e871a2e87336d2d778.inp
314 World Community Grid 7/15/2016 6:27:02 PM [checkpoint] result ugm1_ugm1_26648_0691_0 checkpointed
315 World Community Grid 7/15/2016 6:27:02 PM Finished download of 6128d410abf034e871a2e87336d2d778.inp
316 World Community Grid 7/15/2016 6:27:03 PM Started download of f2640cb4d77198704e8b89ecc2de2712.dms
317 World Community Grid 7/15/2016 6:27:04 PM Finished download of f2640cb4d77198704e8b89ecc2de2712.dms
318 World Community Grid 7/15/2016 6:27:04 PM [error] MD5 check failed for f2640cb4d77198704e8b89ecc2de2712.dms
319 World Community Grid 7/15/2016 6:27:04 PM [error] expected e6f84a45f61aa74af1f20a8115d40701, got 45cc159e2751c0d01109ea46e73ed872
320 World Community Grid 7/15/2016 6:27:04 PM [error] Checksum or signature error for f2640cb4d77198704e8b89ecc2de2712.dms
321 World Community Grid 7/15/2016 6:27:11 PM [sched_op] Deferring communication for 00:01:57
322 World Community Grid 7/15/2016 6:27:11 PM [sched_op] Reason: Unrecoverable error for task FAH2_000122_avx175573cs-ls_000099_0001_013_0
Now running on the latest available build, 7.6.33 [which does not have that last Dr.A check-in included regarding the multiple async downloads using the same tempfile name.] The failure rate seems to be around 50%. Config is such that just one task per fetch is received, lest I go wild on buffer incrementing.
Edit: Windows 10 64 bit 14372Rc1 build, client 64 bit 7.6.33, service install.
[Edit 1 times, last edit by SekeRob* at Jul 15, 2016 5:22:23 PM]
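In both logs posted so far, the file that fails the MD5 check is the same one that earlier hit a "transient HTTP error" and was then downloaded again. Two samples prove nothing, but the pattern is easy to check over a longer event log. A rough Python sketch, assuming the message wording shown above; the function name `summarize` and the result keys are made up for illustration.

```python
import re

# Patterns follow the message text seen in the event log excerpts above.
TRANSIENT = re.compile(r"Temporarily failed download of ([^\s:]+):")
MD5_FAILED = re.compile(r"MD5 check failed for (\S+)")


def summarize(lines):
    """Collect file names by failure type and report the overlap."""
    transient, md5_failed = set(), set()
    for line in lines:
        match = TRANSIENT.search(line)
        if match:
            transient.add(match.group(1))
        match = MD5_FAILED.search(line)
        if match:
            md5_failed.add(match.group(1))
    return {
        "transient": transient,
        "md5_failed": md5_failed,
        # Files that had a transient error AND later failed the MD5 check.
        "both": transient & md5_failed,
    }
```

If `both` equals `md5_failed` over many fetches, that would point at the retry path rather than at the first download attempt.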

uplinger
Former World Community Grid Tech Joined: May 23, 2005 Post Count: 3952 Status: Offline
Thanks for the information. Time for me to start digging :)
Thanks, -Uplinger

uplinger
Former World Community Grid Tech Joined: May 23, 2005 Post Count: 3952 Status: Offline
So, that file is a CDN file. I have checked on our local servers and the md5sum matches what is sent in the xml doc. I am checking to see if all the CDN servers are populating it properly. I will be running that test against multiple files, and it could take some time.
Thanks, -Uplinger
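The kind of test described here, comparing what each server hands out against the digest in the XML, might look roughly like the Python sketch below. It is not the script WCG uses. The function name `find_mismatched_sources` is hypothetical, and the download step is passed in as a `fetch` callable so that nothing is assumed about how the CDN nodes are addressed.

```python
import hashlib


def find_mismatched_sources(sources, expected_md5, fetch):
    """Return {source: actual_md5} for every source whose content
    does not hash to expected_md5.

    sources      -- iterable of URLs or labels (hypothetical values)
    expected_md5 -- hex digest advertised to the client
    fetch        -- callable taking a source and returning its bytes
    """
    expected = expected_md5.strip().lower()
    mismatches = {}
    for source in sources:
        actual = hashlib.md5(fetch(source)).hexdigest()
        if actual != expected:
            mismatches[source] = actual
    return mismatches
```

For plain HTTP sources, `fetch` could be as simple as `lambda url: urllib.request.urlopen(url).read()`. An empty result means every source served the expected bytes at the time of the check.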

SekeRob
Master Cruncher Joined: Jan 7, 2013 Post Count: 2741 Status: Offline
GLGH
Odd difference between the result logs from the 7.6.2 and 7.6.33 clients: under the former an actual task name, under the latter a hash of some sort:
Result Name: FAH2_000098_avx38789_000023_0017_028_0
--
<core_client_version>7.6.2</core_client_version>
<![CDATA[
<message>
WU download error: couldn't get input files:
<file_xfer_error>
<file_name>fahb.FAH2_000098_avx38789_000023-in1.dms</file_name>
<error_code>-119 (md5 checksum failed for file)</error_code>
<error_message>MD5 check failed</error_message>
</file_xfer_error>
</message>
]]>
Result Name: FAH2_000122_avx175573cs-ls_000099_0001_013_0
--
<core_client_version>7.6.33</core_client_version>
<![CDATA[
<message>
WU download error: couldn't get input files:
<file_xfer_error>
<file_name>f2640cb4d77198704e8b89ecc2de2712.dms</file_name>
<error_code>-119 (md5 checksum failed for file)</error_code>
<error_message>MD5 check failed</error_message>
</file_xfer_error>
</message>
]]>
IIRC, it is the task name length that causes this different treatment, but is that the real reason?
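To line up many such result logs side by side, the file name and error code can be pulled out of each one. A small Python sketch, assuming the element layout shown in the two excerpts above; it uses a regular expression rather than an XML parser because the excerpts are fragments, and `extract_xfer_errors` is an illustrative name only.

```python
import re

# Matches a <file_name> element followed by its <error_code> element,
# as laid out in the result logs quoted above.
XFER_ERROR = re.compile(
    r"<file_name>(?P<file>[^<]+)</file_name>\s*"
    r"<error_code>(?P<code>-?\d+)\s*\((?P<reason>[^)]*)\)</error_code>"
)


def extract_xfer_errors(text):
    """Return a list of (file_name, error_code, reason) tuples."""
    return [
        (m.group("file"), int(m.group("code")), m.group("reason"))
        for m in XFER_ERROR.finditer(text)
    ]
```

Grouping the tuples by error code would show at a glance whether the hashed and unhashed file names fail in the same way.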

SekeRob
Master Cruncher Joined: Jan 7, 2013 Post Count: 2741 Status: Offline
uplinger wrote:
So, that file is a CDN file. I have checked on our local servers and the md5sum matches what is sent in the xml doc. I am checking to see if all the CDN servers are populating it properly. I will be running that test against multiple files and could take some time. Thanks, -Uplinger
That was, I think, tonyh's suspicion: something in the CDN, or at least a difference between what's in the CDN and what's coming out of the source when falling back.

Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline
Thanks, SekeRob, for raising that. In case it got lost in the noise, here are the details from that observation, made with the file_xfer_debug log flag set for an <error_code>-120 (RSA key check failed for file), as reported in the Beta Test for Help Stop TB - June 17, 2016 issues thread:
The download URLs are different for the 2 attempts at downloading wcgrid_beta22_gromacs_7.21_windows_intelx86.
The first is via the CDN URL: http://bdd7.http.cdn.softlayer.net/80BDD7/gri...acs_7.21_windows_intelx86
whereas the second is the direct URL: https://grid.worldcommunitygrid.org/boinc/dow...acs_7.21_windows_intelx86
Could this changeover during the download be causing the error?
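One way to reason about that question: a changeover can only hurt if the two sources do not serve identical bytes. The Python toy below illustrates this with stand-in data. It does not claim that the BOINC client resumes a partial file from a second URL; the `splice` helper and both payloads are invented for the illustration.

```python
import hashlib


def md5_hex(data):
    """Hex MD5 digest of a bytes object."""
    return hashlib.md5(data).hexdigest()


def splice(first_source, second_source, switch_at):
    """Bytes held after taking the first `switch_at` bytes from one source
    and the remainder from another (hypothetical resume behaviour)."""
    return first_source[:switch_at] + second_source[switch_at:]


# Stand-in payloads of equal length; NOT real WCG data.
origin_copy = b"header" + b"\x01" * 64
stale_cdn_copy = b"header" + b"\x02" * 64

# Switching between differing copies yields a file matching neither digest.
mixed = splice(stale_cdn_copy, origin_copy, switch_at=32)

# Switching between identical copies is harmless.
same = splice(origin_copy, origin_copy, switch_at=32)
```

Here `md5_hex(mixed)` differs from the digest of either copy, while `md5_hex(same)` equals `md5_hex(origin_copy)`. So a changeover would matter only together with a content difference between the CDN and the direct source.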

uplinger
Former World Community Grid Tech Joined: May 23, 2005 Post Count: 3952 Status: Offline
So, the hashed file name is expected. If a file name is longer than, I think, 50 characters, then we hash it. It is a similar file to what you were seeing before and has nothing to do with the client being updated. It is something done on the server side before the data gets placed in the workunit and result table for you to download.
I am probably going to re-evaluate using the CDN for the dms files. They used to fit the requirements for the CDN, but with the current builds of workunits from the researchers, I don't believe they still do.
Thanks, -Uplinger
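As a rough picture of that server-side renaming, here is a Python sketch. Everything specific in it is an assumption: the post only says names over "I think 50" characters get hashed, so the threshold, the use of MD5 over the original name, and the helper name `download_name` are guesses. The one detail taken from the logs is that hashed names are 32 hex characters and keep their extension (.dms, .rst, .inp).

```python
import hashlib
import os

# Assumed threshold; the post says "i think 50 characters".
MAX_NAME_LENGTH = 50


def download_name(original_name, max_length=MAX_NAME_LENGTH):
    """Return the name unchanged if it is short enough, otherwise a
    hex digest of the name plus the original extension.

    Hashing the name with MD5 is an assumption made for illustration;
    the real server may hash something else entirely.
    """
    if len(original_name) <= max_length:
        return original_name
    _, extension = os.path.splitext(original_name)
    digest = hashlib.md5(original_name.encode("utf-8")).hexdigest()
    return digest + extension
```

Under this scheme the client version plays no part in which form of name shows up in a result log, which matches the explanation above.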

uplinger
Former World Community Grid Tech Joined: May 23, 2005 Post Count: 3952 Status: Offline
Greetings again,
I am going to work towards removing the DMS files from having a requirement to use the CDN. This will fix a large number of the -186 errors.
Thanks, -Uplinger