World Community Grid - View Thread - New Beta Test - July 21, 2017 [ Issues Thread ]

World Community Grid Forums

Category: Beta Testing

Forum: Beta Test Support Forum

Thread: New Beta Test - July 21, 2017 [ Issues Thread ]

Thread Status: Active
Total posts in this thread: 171

This topic has been viewed 23213 times and has 170 replies

slakin
Advanced Cruncher
Joined: Jul 4, 2008
Post Count: 79
Status: Offline

Re: New Beta Test - July 21, 2017 [ Issues Thread ]

I had a work unit error out on a Windows XP machine. I don't mean to fuel the debate, as I can always run another project on this old machine. Here is the log.

Result Log

Result Name: BETA_ beta26_ 00000049_ 0025_ 0--
<core_client_version>7.2.47</core_client_version>
<![CDATA[
<message>
app_version download error: couldn't get input files:
<file_xfer_error>
<file_name>wcgrid_beta26_gfx_7.08_windows_intelx86</file_name>
<error_code>-224 (permanent HTTP error)</error_code>
<error_message>permanent HTTP error</error_message>
</file_xfer_error>
<file_xfer_error>
<file_name>beta26_image01_7.08.tga</file_name>
<error_code>-224 (permanent HTTP error)</error_code>
<error_message>permanent HTTP error</error_message>
</file_xfer_error>

</message>
]]>

[Aug 4, 2017 12:57:36 AM]

duanebong
Advanced Cruncher
Singapore
Joined: Apr 25, 2009
Post Count: 134
Status: Offline

I had 3 beta WUs give errors on an Asus Ultrabook running Windows 7 SP1 x64. So, at least in this instance, it looks like the permanent HTTP error is not related to whether the machine is running Windows XP.

<core_client_version>7.4.22</core_client_version>
<![CDATA[
<message>
app_version download error: couldn't get input files:
<file_xfer_error>
<file_name>wcgrid_beta26_rosetta_7.08_windows_intelx86</file_name>
<error_code>-224 (permanent HTTP error)</error_code>
<error_message>permanent HTTP error</error_message>
</file_xfer_error>

</message>
]]>

[Edit 1 times, last edit by duanebong at Aug 5, 2017 9:54:00 AM]

[Aug 5, 2017 8:20:24 AM]

armstrdj
Former World Community Grid Tech
Joined: Oct 21, 2004
Post Count: 695
Status: Offline

A batch of the second type of workunit that will be run for this project is being made available now. If it runs well, we will load more batches of that type later. Again, if you are experiencing any file transfer or download errors in beta, please turn http_debug on in your client and post the event log when the issue occurs: https://boinc.berkeley.edu/wiki/Client_configuration
Thanks,
armstrdj
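
For anyone unsure how to switch that on: as far as I know, log flags live in cc_config.xml in the BOINC data directory, under a <log_flags> element. The sketch below only prints a minimal file with http_debug enabled; the helper name build_cc_config is made up for illustration. If you already have a cc_config.xml, add the flag to it by hand rather than overwriting the file.

```python
import xml.etree.ElementTree as ET


def build_cc_config(flags):
    """Return a minimal cc_config.xml document with the given log flags set to 1.

    build_cc_config is an illustrative helper, not part of BOINC.
    """
    root = ET.Element("cc_config")
    log_flags = ET.SubElement(root, "log_flags")
    for name in flags:
        # Each log flag is a child element whose text is 1 (on) or 0 (off).
        ET.SubElement(log_flags, name).text = "1"
    return ET.tostring(root, encoding="unicode")


config_xml = build_cc_config(["http_debug"])
print(config_xml)
```

Save the printed XML as cc_config.xml, then have the client re-read its config files (or restart it) so the extra HTTP lines show up in the event log.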

[Aug 8, 2017 7:15:45 PM]

slakin
Advanced Cruncher
Joined: Jul 4, 2008
Post Count: 79
Status: Offline

I had a WU download failure, this time on a Windows 10 machine. Here is the log.

Result Name: BETA_ beta26_ 00000056_ 1563_ 0--
<core_client_version>7.2.47</core_client_version>
<![CDATA[
<message>
app_version download error: couldn't get input files:
<file_xfer_error>
<file_name>wcgrid_beta26_rosetta_7.10_windows_intelx86</file_name>
<error_code>-224 (permanent HTTP error)</error_code>
<error_message>permanent HTTP error</error_message>
</file_xfer_error>

</message>
]]>

Per your update above, if I can figure out how :-), I will turn on http_debug and see if I can capture an error.

[Aug 8, 2017 7:22:10 PM]

Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline


With the current units, the first checkpoint has occurred when the second structure completed, not the first as previously. Is that the intended behaviour?

With one checkpoint having occurred, the stderr file ends with:

Setting up checkpointing ...
BOINC:: Worker startup.
Starting job S_0001
Finished job S_0001 in 1465.83 seconds
Starting job S_0002
Finished job S_0002 in 1439.67 seconds
Starting job S_0003

[Aug 8, 2017 9:09:00 PM]

armstrdj
Former World Community Grid Tech
Joined: Oct 21, 2004
Post Count: 695
Status: Offline

A checkpoint should be attempted after every structure is completed, but whether or not one is actually taken depends on your value for write to disk. I checked the results that are back and saw several examples of runs that took a checkpoint after the first structure was computed and restarted on the second structure. I will take a look at the code to make sure the right calls are being made to signal that the checkpoint was taken.

Thanks,
armstrdj

[Aug 9, 2017 2:38:21 PM]

Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline


Jonathan, that machine is set to checkpoint to disk at most every 300 sec, which is far less than the 1465.83 sec it took to complete structure 1. For checkpointing on this occasion, I was going by the task properties and the content of the boinc_checkpoint_count.txt file in the slots folder.
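
As a rough check of that comparison, the sketch below pulls the per-structure run times out of the stderr lines quoted earlier in the thread and counts how many finished structures ran longer than the 300 sec interval. It assumes a checkpoint is honoured at the end of each structure once the interval has elapsed, which is only my reading of the behaviour described in this thread and not something confirmed from the application code; job_durations is an illustrative helper.

```python
import re

# Tail of the stderr file as quoted earlier in this thread.
STDERR_TAIL = """\
Starting job S_0001
Finished job S_0001 in 1465.83 seconds
Starting job S_0002
Finished job S_0002 in 1439.67 seconds
Starting job S_0003
"""

FINISHED = re.compile(r"Finished job (\S+) in ([\d.]+) seconds")


def job_durations(text):
    """Map each finished job name to its run time in seconds."""
    return {m.group(1): float(m.group(2)) for m in FINISHED.finditer(text)}


durations = job_durations(STDERR_TAIL)
interval = 300.0  # "checkpoint to disk at most every" value on this machine

# Assumption: one checkpoint per finished structure that outlasted the interval.
expected_checkpoints = sum(1 for secs in durations.values() if secs >= interval)

for name, secs in durations.items():
    print(f"{name}: {secs:.2f} s")
print("expected checkpoints so far:", expected_checkpoints)
```

Under that assumption both finished structures should have produced a checkpoint, so a count of one in the task properties would point at the reporting call rather than the interval setting.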

[Aug 9, 2017 3:07:44 PM]

SekeRob
Master Cruncher
Joined: Jan 7, 2013
Post Count: 2741
Status: Offline


It also depends on whether the app is flagged to listen to the client's request to write to disk more or less often than the science app is compiled to do. Every structure is the logical checkpoint, but 1500 seconds seems a long time; the slower the device, the longer it takes, and I've seen 1 hour+ on a 3 GHz device. If you run such a science on an 8/16/32-threaded device, that's a lot of lost time every time the client is restarted. Better tell people who actually 'use' their computer, and do not wish to crunch during that 'use' time, to switch on 'keep in memory when suspended', or you'll encounter another walk-away uproar.

[Aug 9, 2017 3:20:15 PM]

Crystal Pellet
Veteran Cruncher
Joined: May 21, 2008
Post Count: 1320
Status: Offline

I got 14 betas on a Linux machine. All errored out with signal 11 after ~10 seconds.

Example:

BETA_ beta26_ 00000057_ 0129_ 0--
<core_client_version>7.4.22</core_client_version>
<![CDATA[
<message>
process got signal 11
</message>
<stderr_txt>
[2017- 8- 9 17:36:22:] :: BOINC:: Initializing ... ok.
[2017- 8- 9 17:36:22:] :: BOINC :: boinc_init()
INFO: result number = 0
BOINC:: Setting up shared resources ... ok.
BOINC:: Setting up semaphores ... ok.
BOINC:: Updating status ... ok.
BOINC:: Registering timer callback... ok.
BOINC:: Worker initialized successfully.
command: ../../projects/www.worldcommunitygrid.org/wcgrid_beta26_rosetta_7.10_x86_64-pc-linux-gnu -in::file::zip beta26_databasev2.zip @./beta26_00000057.flags -out::file::silent result_silent.out -run:jran 2069786245 -nstruct 10 -out::level 100 -run::no_scorefile true
Registering options..
Registered extra options.
Initializing broker options ...
Registered extra options.
Initializing core...
Initializing options.... ok
Options::initialize()
Options::adding_options()
Options::initialize() Check specs.
Options::initialize() End reached
Loaded options.... ok
Processed options.... ok
Initializing random generators... ok
Initialization complete.
Setting WU description ...
Unpacking zip data: ../../projects/www.worldcommunitygrid.org/beta26.beta26_databasev2.zip
Setting database description ...
Setting up checkpointing ...
Setting up graphics native ...
set_shared_memory_fully_initialized ...
BOINC:: Worker startup.
Starting job S_0001

</stderr_txt>

On a Windows 7 machine they're starting well.

[Aug 9, 2017 4:01:49 PM]

gb009761
Master Cruncher
Scotland
Joined: Apr 6, 2005
Post Count: 2982
Status: Offline

Hurray, this afternoon I received 2 WUs, one on each machine. As soon as I arrive home, I'll force them to the front of the queue.

[Edit 1 times, last edit by gb009761 at Aug 9, 2017 5:51:09 PM]

[Aug 9, 2017 5:50:33 PM]
