Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
BOINC in BrandZ: some sort of error

Hello,
I'm trying to run BOINC for WCG on my Solaris/x86 machine, and not having much luck. I first tried the solaris-i686 client and it reported that it couldn't attach to WCG because there are no clients available for that OS. Fair enough. So I installed the Centos 3 linux image found here. After logging into that, 'uname -a' shows this:
Linux linux 2.4.21 BrandZ fake linux i686 i686 i386 GNU/Linux
However, when I ran BOINC I got lots of errors like this:
2007-05-26 11:10:44 [World Community Grid] Restarting task lc249_00009_1 using hpf2 version 519
2007-05-26 11:10:46 [World Community Grid] Task lc249_00009_1 exited with zero status but no 'finished' file
2007-05-26 11:10:46 [World Community Grid] If this happens repeatedly you may need to reset the project.

I installed BOINC and left it running overnight. This morning I found thousands of these messages printed to the console. Not knowing exactly what "reset the project" means, I nuked a subdir of the projects directory and tried again:
# rm -rf projects/www.worldcommunitygrid.org/*
-bash-2.05b# ./boinc -attach_project www.worldcommunitygrid.org 8208ab8805f6e37c2c267f6c3b3fa82e
2007-05-26 10:58:58 [---] Starting BOINC client version 5.8.16 for i686-pc-linux-gnu
2007-05-26 10:58:58 [---] log flags: task, file_xfer, sched_ops
2007-05-26 10:58:58 [---] Libraries: libcurl/7.16.0 OpenSSL/0.9.8d zlib/1.2.3
2007-05-26 10:58:58 [---] Data directory: /root/BOINC
2007-05-26 10:58:58 [---] Processor: 2 GenuineIntel Intel(r) Pentium(r) III [Family 6 Model 8 Stepping 6][fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse syscall]
2007-05-26 10:58:58 [---] Memory: 1.99 GB physical, 517.71 MB virtual
2007-05-26 10:58:58 [---] Disk: 65.53 GB total, 44.35 GB free
2007-05-26 10:58:58 [---] Already attached to http://www.worldcommunitygrid.org/
2007-05-26 10:58:58 [World Community Grid] URL: http://www.worldcommunitygrid.org/; Computer ID: 198025; location: (none); project prefs: default
2007-05-26 10:58:58 [---] General prefs: from World Community Grid (last modified 1969-12-31 19:00:01)
2007-05-26 10:58:58 [---] Host location: none
2007-05-26 10:58:58 [---] General prefs: using your defaults

So far everything looks okay, yes? It went and downloaded a bunch of files, and then went back into the die/restart loop (omitted for space reasons, there's about 300 relevant lines of log). The processor flags are detected properly, that's the right amount of memory, and so forth.

So. What's going wrong? Is there a log file generated by the client anywhere? Is there another file I should show you the contents of? I haven't used the BOINC agent in Linux before, so I'm stumped by this.

TIA for help!
[May 26, 2007 3:36:49 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: BOINC in BrandZ: some sort of error

unhappy_mage,

I don't see a Linux expert on the board to answer at the moment, so here are some general pointers: do these messages appear every 60 seconds, or at what interval? BOINC writes to the program log files regularly, but the state file sometimes gets up to 10 hits per second. Possibly write permissions need elevating?

If the OS synchronises the system clock, this 'warning' appears, but that should only occur a few times in a given day. BOINC keeps quite exact time between the Core Client and the science application and hates it when the system clock is off, even for a fraction, while doing that. I've disabled the OS function on one machine and now it's about 2 minutes behind after several months... can't be bothered.

Log files are in the BOINC program dir, start with STDxxxxx and end with a txt extension. The Messages tab, if using the BOINC Manager GUI, shows all current session activity, and everything, the current session included, is also written to those log files.
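
To put a number on the interval question, here is a rough sketch that reads client messages and prints the seconds between consecutive "exited with zero status" lines. It assumes the "YYYY-MM-DD HH:MM:SS [project] text" layout seen in the excerpt above and ignores midnight rollover; the function name and the log file name in the usage line are made up for illustration.

```shell
# restart_intervals: read BOINC client messages on stdin and print, for each
# "exited with zero status" line after the first, the number of seconds
# elapsed since the previous such line.
# Assumption: field 2 of every message is the time as HH:MM:SS.
restart_intervals() {
    awk '/exited with zero status/ {
        split($2, t, ":")                       # t[1]=HH, t[2]=MM, t[3]=SS
        now = t[1] * 3600 + t[2] * 60 + t[3]    # seconds since midnight
        if (seen) print now - prev              # gap to the previous restart
        prev = now
        seen = 1
    }'
}
```

If the console output was captured to a file (say a hypothetical boinc.log), `restart_intervals < boinc.log | sort -n | uniq -c` gives a quick histogram of the gaps.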

cheers
----------------------------------------
WCG Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All!
[May 26, 2007 5:05:14 PM]
Former Member
Re: BOINC in BrandZ: some sort of error

The messages appear much more frequently than that, more like every 3 seconds. However, I think I've found the problem: in BOINC/slots/1/stderr.txt, it complains of libstdc++.so.6 missing. I'll see if I can find the proper RPM package and report back.
[May 26, 2007 5:47:09 PM]
Former Member
Re: BOINC in BrandZ: some sort of error

cat slots/0/stderr.txt
INFO: wcg_faah_autodock_5.30_i686-pc-linux-gnu Start AutoGrid...
About to call graphics init
INFO:[13:44:48] Start AutoGrid...

autogrid: autogrid4: Successful Completion.
Starting to checkpoint ...
Checkpoint complete
INFO:[13:49:20] End AutoGrid...
Beginning AutoDock...
Virtual File wcg_autodock4.dlg resolves as physical file ../../projects/www.worldcommunitygrid.org/faah1643_ZINC04343264_x2AZ8_00_1_1
INFO: Setting num_generations: 27000
Setting maxGen to 6750
autodock4: WARNING: Unrecognized keyword in docking parameter file, in line:
compute_unbound_extended # compute extended ligand energyINFO: No state to restore. Start from the beginning.
About to enter main loop...(dockings already completed: 0)
call_glss(): pop_size: 200 num_evals: 10000000 start: [13:49:30]
_maxGenSeenSoFar changed: 6750


This looks like it was working...

cat slots/1/stderr.txt
set_worker_timer(): pthread_create(): 11[ERROR] Initializing BOINC failed. BOINC error: 11: Error 11
set_worker_timer(): pthread_create(): 11[ERROR] Initializing BOINC failed. BOINC error: 11: Error 11
...


but this definitely wasn't. Any way I could tell which project was working and which failing? Even if I have to disable one project for a while while I figure this out, it's still more than zero work getting done...
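
One way to see at a glance which slot is the unhappy one is to count the "[ERROR]" lines in each slot's stderr.txt. This is only a sketch: it assumes the slots/N/stderr.txt layout shown in the two listings above, and the function name is made up. It tells you the slot number, not the task name. For a healthy slot, the first line of its stderr.txt (like the wcg_faah_autodock line above) at least hints at which application is running there.

```shell
# failing_slots: for the BOINC data directory given as $1 (default: current
# directory), print every slot whose stderr.txt contains "[ERROR]" lines,
# together with how many such lines it holds.
failing_slots() {
    data_dir=${1:-.}
    for f in "$data_dir"/slots/*/stderr.txt; do
        [ -f "$f" ] || continue                      # glob matched nothing
        # grep -c prints 0 and exits non-zero when nothing matches
        n=$(grep -c -F '[ERROR]' "$f" || true)
        if [ "$n" -gt 0 ]; then
            slot=$(basename "$(dirname "$f")")
            printf 'slot %s: %s error lines\n' "$slot" "$n"
        fi
    done
}
```

Run from the data directory as `failing_slots .`; slots that print nothing have no "[ERROR]" lines in their stderr.txt.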
[May 26, 2007 5:56:11 PM]
Sekerob
Re: BOINC in BrandZ: some sort of error

Hi,

looked up that missing lib.... quite a few versions floating around.

The 1st block of messages is benign; error 11 likely is not. With the missing lib fixed, you might want to do a project reset, as the slot may have gotten corrupted. If you know which job went into which slot, you could abort the particular work unit.

The next release, version 5.10, will have improved slot cleaning management.
[May 26, 2007 6:08:50 PM]
Former Member
Re: BOINC in BrandZ: some sort of error

I found libstdc++ for Centos that seems to have installed fine. That part I'm not worried about now.

Any idea what kind of thread library is needed? boinc seems to have all the libs it needs:
ldd boinc
libdl.so.2 => /lib/libdl.so.2 (0xce820000)
libc.so.6 => /lib/tls/libc.so.6 (0xce6d5000)
libm.so.6 => /lib/tls/libm.so.6 (0xce6b1000)
libpthread.so.0 => /lib/tls/libpthread.so.0 (0xce69f000)
/lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0xceb67000)
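
Note that the listing above covers only the client binary. The libstdc++.so.6 complaint came from a science application, so the same check is worth running on the executables under the projects directory. A small sketch for that (the function name is made up) filters ldd output down to the libraries the loader could not find. Separately, errno 11 on Linux is EAGAIN, which for pthread_create() means the thread could not be created for lack of resources, so that error may well not be a missing-library problem at all.

```shell
# missing_libs: read `ldd` output on stdin and print the name of every
# library that the dynamic loader reported as "not found".
missing_libs() {
    awk '/=> not found/ { print $1 }'
}
```

Usage would be along the lines of `ldd path/to/science_app | missing_libs`, where the path is whichever executable sits in the project's directory. No output means ldd resolved everything.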

How do I do a project reset? My first inclination would be to remove all the files in /slots/1, but if there's a guide on how to do this I'd appreciate a link.
[May 26, 2007 6:20:19 PM]
Sekerob
Re: BOINC in BrandZ: some sort of error

With the libs I can't help you... I just read that certain BOINC versions expect certain libraries. Here are 2 lists from a Google search:
http://rpm.pbone.net/index.php3/stat/3/srodza...ibstdc++.so.6(GLIBCXX_3.4)
http://rpmfind.net/linux/rpm2html/search.php?query=libstdc%2B%2B.so.6

If you start up BOINCmgr, select the suspect work unit(s) in the Tasks tab and use the Abort button in the left margin. For a general reset, go to the Projects tab, select WCG and hit the Reset Project button.
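
If there is no GUI available (the client here is being run from a console), the command-line tool shipped with the client should be able to do the same reset. The sketch below only prints the command instead of running it. It assumes the tool is called boinc_cmd in 5.x releases (later releases name it boinccmd) and that it accepts `--project URL reset`; check the tool's own usage output before relying on either. The function name is made up.

```shell
# reset_project_cmd: print (do not run) the command-line equivalent of the
# Reset Project button.
#   $1 = project URL
#   $2 = tool name; defaults to boinc_cmd (an assumption for 5.x clients)
reset_project_cmd() {
    url=$1
    tool=${2:-boinc_cmd}
    printf '%s --project %s reset\n' "$tool" "$url"
}
```

For example, `reset_project_cmd http://www.worldcommunitygrid.org/` prints the line to run by hand from the BOINC data directory while the client is running.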
[May 26, 2007 6:37:30 PM]