World Community Grid Forums
Thread Status: Active Total posts in this thread: 171
Falconet
Master Cruncher Portugal Joined: Mar 9, 2009 Post Count: 3295 Status: Offline |
Is checkpointing working?
Edit: Okay... after 20 minutes of CPU time, no checkpoint, and progress went from nearly 10% to 2% and seems to be stuck there.
Edit 2: After 5 CPU minutes stuck at 2%, progress increased to 4%. No checkpoint. 64-bit Linux.
I do not run Linux, but this sounds like normal behaviour; checkpoints are not at regular intervals.
Just seems way too long. Almost 40 CPU minutes, no checkpoint and still at 4%.
AMD Ryzen 5 1600AF 6C/12T 3.2 GHz - 85W
AMD Ryzen 5 2500U 4C/8T 2.0 GHz - 28W
AMD Ryzen 7 7730U 8C/16T 3.0 GHz |
||
|
marist_college
Advanced Cruncher USA Joined: Mar 30, 2005 Post Count: 107 Status: Offline |
Hi Uplinger,
Can either of you access this file? http://swift.worldcommunitygrid.org/v1/AUTH_0...6/beta26_image05_7.08.tga
Yes, it downloaded ok, but isn't viewable as an image (picture)... I'm assuming that's ok?
If you have access to a linux machine on the same network, can you try this command?
curl -Ov http://swift.worldcommunitygrid.org/v1/AUTH_0...6/beta26_image05_7.08.tga
Used the same machine with the Windows 10 Ubuntu on Windows feature. Here's the output:
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0* Trying 173.192.119.113...
* Connected to swift.worldcommunitygrid.org (173.192.119.113) port 80 (#0)
> GET /v1/AUTH_02593dc3-da28-4635-a1c8-8cc5e6e3772a/beta26/beta26_image05_7.08.tga HTTP/1.1
> Host: swift.worldcommunitygrid.org
> User-Agent: curl/7.47.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Content-Length: 66708
< Accept-Ranges: bytes
< Last-Modified: Fri, 07 Jul 2017 18:50:59 GMT
< Etag: 0a6dcb92d8ef4615cae514388e5bbd46
< X-Timestamp: 1499453458.51358
< Content-Type: application/x-www-form-urlencoded
< X-Trans-Id: tx5838c127bff3487586323-00597bbe4a
< Date: Fri, 28 Jul 2017 22:44:26 GMT
<
{ [11973 bytes data]
100 66708  100 66708    0     0   254k      0 --:--:-- --:--:-- --:--:--  254k
* Connection #0 to host swift.worldcommunitygrid.org left intact
We are seeing some new errors today. Files download but MD5 hash doesn't match. Here's an example of that:
<core_client_version>7.6.33</core_client_version>
<![CDATA[
<message>
WU download error: couldn't get input files:
<file_xfer_error>
<file_name>0befbaafffda7df7d78f1412272f4147.2</file_name>
<error_code>-119 (md5 checksum failed for file)</error_code>
</file_xfer_error>
</message>
]]>
Edit 1: replaced output of curl with correct output after using the full URL and not the truncated version
[Edit 1 times, last edit by marist_college at Jul 28, 2017 10:46:09 PM] |
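For anyone wanting to repeat the check by hand: error -119 means the MD5 the client computed for the downloaded file did not match the expected one. Below is a minimal Python sketch of that kind of comparison. It assumes the `Etag` header in the curl output is the MD5 of the file body (often the case for simple objects on OpenStack Swift, but not confirmed for this server), and the local file name is just what `curl -O` would have written. The function names are made up for illustration.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_md5):
    """Compare the file's MD5 with an expected value (quotes and case ignored)."""
    cleaned = expected_md5.strip().strip('"').lower()
    return md5_of_file(path) == cleaned


if __name__ == "__main__":
    # Assumed local name and Etag value, both copied from the curl output above.
    local_file = "beta26_image05_7.08.tga"
    etag = "0a6dcb92d8ef4615cae514388e5bbd46"
    if os.path.exists(local_file):
        print("size on disk:", os.path.getsize(local_file), "(curl reported 66708)")
        print("MD5 matches Etag:", matches_expected(local_file, etag))
    else:
        print("file not found:", local_file)
```

If the size matches but the MD5 does not, that would point at the content being altered in transit rather than the download being cut short.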
||
|
KerSamson
Master Cruncher Switzerland Joined: Jan 29, 2007 Post Count: 1677 Status: Offline |
Received 3 WUs on 3 different hosts; all valid.
However, based on the limited number of computed WUs, the following can be observed regarding the granted credit per hour:
Since I do not have more WUs available, I cannot refine the observation to identify a pattern: Win vs. Linux or i7 vs. Phenom II/Athlon II. In any case, the consistency of the granted credits should be improved for this new science. Cheers, Yves |
||
|
SekeRob
Master Cruncher Joined: Jan 7, 2013 Post Count: 2741 Status: Offline |
Yves, you CANNOT compare VINA science production on Linux with any other non-VINA science. I think we know by now that OET and its brethren process twice as fast ON LINUX and thus yield much more credit per hour.
[Edit 1 times, last edit by SekeRob* at Jul 29, 2017 6:33:29 AM] |
||
|
SekeRob
Master Cruncher Joined: Jan 7, 2013 Post Count: 2741 Status: Offline |
Hmmm, not for the faint-hearted: got 8, and after 8:26 hours the first has checkpointed 8 times, with 12 minutes since the last (BOINCTasks is great at monitoring checkpoint counts and keeping tabs on 'when last'). Cycling an 8-core machine would come at a costly price in lost progress. Of course, setting them to suspend at the next checkpoint gives an equal loss: no crunching until the last one suspends, which is just as costly.
After 1 hour of processing it said it was heading for 15 hours, but this morning it said it would complete at about 9:30. CEP2-like prediction. [Edit 1 times, last edit by SekeRob* at Jul 29, 2017 7:15:58 AM] |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Only 10 structures in batch 53, but they must be large structures, so with checkpointing only at the completion of each structure, it's a long interval between them. Those on my 8 core are 10 hours CPU done, 2 to 3 hours estimated remaining. However, batch 17 just loaded on another machine are also 10 structures but only 3 hrs estimated to completion. (All with <fraction_done_exact/>).
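If a checkpoint is only written when a structure completes, the average gap between checkpoints is roughly the total run time divided by the number of structures. A rough Python sketch using the figures in this post (10 hours done, 2 to 3 hours estimated remaining, 10 structures) follows. The function name is made up for illustration, and the even split across structures is an assumption, since individual structures can take quite different times.

```python
def avg_checkpoint_interval_hours(cpu_hours_done, est_hours_remaining, n_structures):
    """Average hours between checkpoints, assuming one checkpoint per structure."""
    if n_structures <= 0:
        raise ValueError("n_structures must be positive")
    return (cpu_hours_done + est_hours_remaining) / n_structures


# Figures from the post: 10 h CPU done, 2 to 3 h remaining, 10 structures.
low = avg_checkpoint_interval_hours(10, 2, 10)
high = avg_checkpoint_interval_hours(10, 3, 10)
print(f"about {low:.1f} to {high:.1f} hours between checkpoints")
```

Under that assumption, a restart at the wrong moment could cost up to about an hour or more of work per task on this batch.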
|
||
|
duanebong
Advanced Cruncher Singapore Joined: Apr 25, 2009 Post Count: 134 Status: Offline |
Just seems way too long. Almost 40 CPU minutes, no checkpoint and still at 4%.
I experienced the same long checkpoint intervals. After 4% it will jump to 6% after about an hour of run time. Just need to be patient and give it time to run. Some of the WUs took up to 16 hrs to run, but completed successfully in the end. |
||
|
SekeRob
Master Cruncher Joined: Jan 7, 2013 Post Count: 2741 Status: Offline |
The whole 2% incrementing could be artificial to get a feel of progress... sort of TTC 10 hours, 10 structures, thus 1 structure is 10%, no matter how long each structure takes.
This so has the odor of HPF3 (but a different name I'd vote for, as the old HPF has some legacy) |
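The "1 structure is 10%" idea can be written down as a tiny Python sketch: progress that only counts finished structures moves in fixed steps, however long each structure takes. This is only an illustration of that reasoning, with a hypothetical function name; it is not how the application is known to report progress, and it does not by itself explain the 2% steps seen earlier in the thread.

```python
def structure_progress_percent(structures_done, total_structures):
    """Progress in percent if only completed structures are counted."""
    if total_structures <= 0:
        raise ValueError("total_structures must be positive")
    if not 0 <= structures_done <= total_structures:
        raise ValueError("structures_done must be between 0 and total_structures")
    return 100.0 * structures_done / total_structures


# With 10 structures, each completed structure adds 10 percentage points.
for done in range(0, 11):
    print(done, "structures done ->", structure_progress_percent(done, 10), "%")
```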
||
|
TonyEllis
Senior Cruncher Australia Joined: Jul 9, 2008 Post Count: 261 Status: Offline |
duanebong commented...
Some of the WUs took up to 16hrs to run, but completed successfully in the end.
That's quick by comparison to the old atoms here...
BETA_ beta26_ 00000026_ 0347_ 0-- violetta.sraellis.com Valid 7/25/17 19:13:06 7/27/17 20:37:28 46.43 / 49.35 132.7 / 132.7
BETA_ beta26_ 00000026_ 0346_ 0-- violetta.sraellis.com Valid 7/25/17 19:13:05 7/27/17 19:57:25 45.82 / 48.71 130.9 / 130.9
BETA_ beta26_ 00000026_ 0345_ 0-- violetta.sraellis.com Valid 7/25/17 19:13:05 7/27/17 17:32:17 43.55 / 46.29 124.4 / 124.4
Naturally quite some time passes between checkpoints...
Run Time Stats https://grassmere-productions.no-ip.biz/
[Edit 1 times, last edit by TonyEllis at Jul 29, 2017 9:00:18 AM] |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Here's a strange problem with a wingman (noticed when I received the repair job). The only oddity I can see is "INFO: Could not determine result number" (so it was set to 15?); it should have been 0. FWIW, I think 06.02.9200.00 denotes Windows 8.
BETA_ beta26_ 00000053_ 0308_ 2-- Microsoft Windows 10 Professional x64 Edition, (10.00.14393.00) - In Progress 7/29/17 10:21:35 8/2/17 10:21:35 0.00 0.0 / 0.0
BETA_ beta26_ 00000053_ 0308_ 1-- Microsoft Windows 8.1 x64 Edition, (06.03.9600.00) - In Progress 7/29/17 10:21:34 8/2/17 10:21:34 0.00 0.0 / 0.0
BETA_ beta26_ 00000053_ 0308_ 0-- Microsoft x86 Edition, (06.02.9200.00) 708 Invalid 7/28/17 21:15:38 7/29/17 10:21:25 6.89 172.9 / 0.0
Result Log
Result Name: BETA_ beta26_ 00000053_ 0308_ 0--
<core_client_version>6.10.58</core_client_version>
<![CDATA[
<stderr_txt>
[2017- 7-29 3:56: 6:] :: BOINC:: Initializing ... ok.
[2017- 7-29 3:56: 6:] :: BOINC :: boinc_init()
INFO: Could not determine result number
INFO: result number = 15
BOINC:: Setting up shared resources ... ok.
BOINC:: Setting up semaphores ... ok.
BOINC:: Updating status ... ok.
BOINC:: Registering timer callback... ok.
BOINC:: Worker initialized successfully.
command: projects/www.worldcommunitygrid.org/wcgrid_beta26_rosetta_7.08_windows_intelx86 -in::file::zip beta26_databasev2.zip @./beta26_00000053.flags -out::file::silent result_silent.out -run:jran 1092152105 -nstruct 10 -out::level 100 -run::no_scorefile true
Registering options..
Registered extra options.
Initializing broker options ...
Registered extra options.
Initializing core...
Initializing options.... ok
Options::initialize()
Options::adding_options()
Options::initialize() Check specs.
Options::initialize() End reached
Loaded options.... ok
Processed options.... ok
Initializing random generators... ok
Initialization complete.
Initializing options.... ok
Options::initialize()
Options::adding_options()
Options::initialize() Check specs.
Options::initialize() End reached
Loaded options.... ok
Processed options.... ok
Initializing random generators... ok
Initialization complete.
Unpacking zip data: ../../projects/www.worldcommunitygrid.org/beta26.beta26_databasev2.zip
Setting database description ...
Setting up checkpointing ...
abrelax ...
abrelax.run
Setting up folding (abrelax) ...
Beginning folding (abrelax) ...
BOINC:: Worker startup.
Starting work on structure: _0001
Finished _0001 in 2722.14 seconds.
Starting work on structure: _0002
Finished _0002 in 2856.33 seconds.
Starting work on structure: _0003
Finished _0003 in 2528.64 seconds.
Starting work on structure: _0004
Finished _0004 in 2357.17 seconds.
Starting work on structure: _0005
Finished _0005 in 2604.7 seconds.
Starting work on structure: _0006
Finished _0006 in 1323.58 seconds.
Starting work on structure: _0007
Finished _0007 in 2596.16 seconds.
Starting work on structure: _0008
Finished _0008 in 2829.11 seconds.
Starting work on structure: _0009
Finished _0009 in 2282.36 seconds.
Starting work on structure: _0010
Finished _0010 in 2644.81 seconds.
======================================================
DONE :: 10 structures in 24794.9 cpu seconds
======================================================
BOINC :: BOINC support services shutting down cleanly ...
10:52:01 (8840): called boinc_finish(0)
</stderr_txt>
]]> |
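For anyone who wants the per-structure times out of a result log like this one, here is a small Python sketch. The regular expressions are written against the "Finished ... in ... seconds" and "DONE :: ..." lines exactly as they appear in this log; whether other application versions print the same wording is an assumption. The function names are made up for illustration.

```python
import re

# Patterns matched against the log wording shown in this post.
FINISHED_RE = re.compile(r"Finished (_\d+) in ([\d.]+) seconds")
DONE_RE = re.compile(r"DONE :: (\d+) structures in ([\d.]+) cpu seconds")


def structure_times(log_text):
    """Return a list of (structure_name, seconds) pairs found in the log."""
    return [(name, float(sec)) for name, sec in FINISHED_RE.findall(log_text)]


def reported_total(log_text):
    """Return (structure_count, cpu_seconds) from the DONE line, or None."""
    match = DONE_RE.search(log_text)
    if match is None:
        return None
    return int(match.group(1)), float(match.group(2))


# The relevant lines copied from the stderr output above.
sample = """
Finished _0001 in 2722.14 seconds.
Finished _0002 in 2856.33 seconds.
Finished _0003 in 2528.64 seconds.
Finished _0004 in 2357.17 seconds.
Finished _0005 in 2604.7 seconds.
Finished _0006 in 1323.58 seconds.
Finished _0007 in 2596.16 seconds.
Finished _0008 in 2829.11 seconds.
Finished _0009 in 2282.36 seconds.
Finished _0010 in 2644.81 seconds.
DONE :: 10 structures in 24794.9 cpu seconds
"""

times = structure_times(sample)
total = sum(sec for _, sec in times)
slowest = max(times, key=lambda item: item[1])
fastest = min(times, key=lambda item: item[1])

print(f"structures: {len(times)}, sum: {total:.2f} s, mean: {total / len(times) / 60:.1f} min")
print("slowest:", slowest, "fastest:", fastest)
print("reported by DONE line:", reported_total(sample))
```

The per-structure times add up to about 24745 seconds, a little under the 24794.9 CPU seconds on the DONE line, and average out to roughly 41 minutes per structure on that host.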
||
|