World Community Grid Forums
Thread Status: Active Total posts in this thread: 38
Mgruben
Advanced Cruncher Joined: May 26, 2013 Post Count: 94 Status: Offline |
Hey all, the message below is repeated hundreds of times in my boinccmd --get_messages output for both work units E219825_298_K.21.C14FH9N2OSSi2.00442751.2.set1d06_4 and E219830_386_K.21.C15FH7N2OSSe.00263858.0.set1d06_3:
2988: 10-Mar-2014 03:38:08 (internal error) [World Community Grid] [error] exceeded limit of 400 slot directories
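For anyone wanting to see how often each message repeats, here is a minimal Python sketch that tallies messages from saved boinccmd --get_messages output. The line layout is inferred only from the log line quoted above, so the regular expression is an assumption and may need adjusting for other client versions. The function name and the sample text are hypothetical; the second sample line is a made-up repeat for illustration.

```python
import re
from collections import Counter

# Layout inferred from the line quoted in this thread (an assumption):
#   "<seq>: <date> <time> (<priority>) [<project>] <message>"
LINE_RE = re.compile(
    r"^(?P<seq>\d+): (?P<date>\S+) (?P<time>\S+) \((?P<priority>[^)]*)\) "
    r"\[(?P<project>[^\]]*)\] (?P<message>.*)$"
)


def count_messages(text):
    """Return a Counter mapping each message text to how often it appears."""
    counts = Counter()
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:  # lines that do not fit the assumed layout are skipped
            counts[match.group("message")] += 1
    return counts


# First line is copied from the post; the second is an invented repeat.
sample = (
    "2988: 10-Mar-2014 03:38:08 (internal error) [World Community Grid] "
    "[error] exceeded limit of 400 slot directories\n"
    "2989: 10-Mar-2014 03:38:09 (internal error) [World Community Grid] "
    "[error] exceeded limit of 400 slot directories\n"
)

for message, n in count_messages(sample).most_common():
    print(n, message)
```

To use it on real output, save the messages to a file first and pass the file contents to count_messages.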
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Hello Mgruben,
Congratulations on being the first with that error message. I suggest that you reboot and see whether you get it again. If so, please post the first 50 or so lines of your event log so that everybody can see what sort of system you have.
Lawrence
||
|
BobCat13
Senior Cruncher Joined: Oct 29, 2005 Post Count: 295 Status: Offline |
This really shouldn't require a reboot. The client is supposed to delete any empty slot directories upon each startup, so stopping the client and then starting it again should clean up the slots directory.
|
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
The client is supposed to reuse empty slots, which suggests there's crud in them, implying that client_state.xml and who knows what else could be corrupted. If a client restart does not clear the situation, try a project reset, or even a project detach and re-attach. Of course, how on earth did it get to this state? One started job uses one slot; one completed job vacates the slot for reuse, and if the slot is redundant it is deleted on restart.
|
||
|
BobCat13
Senior Cruncher Joined: Oct 29, 2005 Post Count: 295 Status: Offline |
Don't know what caused that many slots directories, but at least it wasn't more than 12 million of them.
http://boinc.berkeley.edu/dev/forum_thread.php?id=8677
After that report, the client was set to ncpus * 100 for maximum slot directories.
http://lists.ssl.berkeley.edu/pipermail/boinc_dev/2013-October/020451.html
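The arithmetic behind that cap can be sketched in a few lines of Python. This is only an illustration of the ncpus * 100 rule as described above, not the client's actual code. The function names are hypothetical, and the CPU count the client uses may differ from what the operating system reports.

```python
def max_slot_dirs(ncpus, per_cpu=100):
    """Slot directory cap as described in the post: ncpus * 100."""
    return ncpus * per_cpu


def implied_ncpus(limit, per_cpu=100):
    """Work backwards from a reported limit to the CPU count it implies."""
    return limit // per_cpu


print(max_slot_dirs(4))    # 400
print(implied_ncpus(400))  # 4
```

Under this rule, the "limit of 400" in the reported error would be consistent with the client counting 4 CPUs, though that is an inference and not something confirmed in the thread.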
||
|
Mgruben
Advanced Cruncher Joined: May 26, 2013 Post Count: 94 Status: Offline |
The client cleared itself up after about an hour or so, so the issue is now largely moot for me (unfortunately for the resolution of this mystery error). While I was receiving the error, however, one core lay completely idle, so it's not harmless to uptime.
Note though that I only have the directories "0", "1", "2", "3", "4", and "5" in my /var/lib/boinc/slots directory, so my guess is that it may (well, must) be tallying subdirectories as well.
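One way to test that guess is to count the top-level slot directories and the nested ones separately. The Python sketch below does this with os.walk. Whether the client really tallies subdirectories is only a guess from this post, and count_slot_dirs is a hypothetical helper name.

```python
import os


def count_slot_dirs(slots_root):
    """Return (top_level, nested) directory counts under slots_root."""
    root = os.fspath(slots_root)
    top_level = 0
    nested = 0
    for dirpath, dirnames, _filenames in os.walk(root):
        if dirpath == root:
            top_level += len(dirnames)  # e.g. "0", "1", "2", ...
        else:
            nested += len(dirnames)     # anything deeper than a slot
    return top_level, nested


if __name__ == "__main__":
    # Path taken from the post; adjust for your own installation.
    path = "/var/lib/boinc/slots"
    if os.path.isdir(path):
        top, deep = count_slot_dirs(path)
        print("top-level:", top, "nested:", deep, "total:", top + deep)
```

If the total comes nowhere near the reported limit, the subdirectory theory becomes less likely.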
||
|
petnek
Advanced Cruncher Czech Republic Joined: Mar 17, 2008 Post Count: 89 Status: Offline |
Hello,
I also have error WUs.
1 on Xeon i5-1620 (32GB RAM):
E219856_415_K.22.C17FH9N2O2.00417287.0.set1d06_2
5 on Xeon W3530 (16GB RAM):
E219875_298_K.22.C16FH8N3S2.00384194.4.set1d06_0
E219875_303_K.22.C15FH6N3O2Se.00259838.2.set1d06_0
E219875_302_K.22.C16FH8N3S2.00290903.2.set1d06_0
E219875_515_K.22.C15FH8N3OSSi.00392895.2.set1d06_0
E219852_850_K.21.C17FH11N2Si.00394230.2.set1d06_0
The log is almost the same for all these WUs. As far as I can see, no one has finished these work units yet; everyone who crunches them gets an error result.
<core_client_version>6.10.18</core_client_version>
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Hello petnek,
The work unit you show appears to have behaved perfectly normally, but has still been marked as in error. This appears to be the same problem as reported in this thread.
||
|
Mgruben
Advanced Cruncher Joined: May 26, 2013 Post Count: 94 Status: Offline |
It returns:
3784: 25-Mar-2014 06:32:09 (internal error) [World Community Grid] [error] Can't create task for E220295_770_K.22.C18FH11OSeSi.00386875.3.set1d06_3
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
When you look in /boinc/slots now, does it show that many, i.e. slots/399 as the highest? If not, do the slots plus the subdirectories in there add up to this number? As lawrenceharding commented, this was not seen here before your report, so there's something special about your system. Is it caching the disc structures and not writing the updates to disc? Look at the write-to-disc delays. If there's a cache-flush command in Linux, run that.
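The two checks suggested here can be sketched in Python: finding the highest-numbered slot directory, and asking the operating system to flush pending writes. The helper names are hypothetical. os.sync() is the Python counterpart of the Unix sync command and is only available on Unix-like systems, hence the hasattr guard.

```python
import os


def highest_slot(slots_root):
    """Return the largest numeric slot directory name, or None if there is none."""
    numbers = [
        int(entry.name)
        for entry in os.scandir(slots_root)
        if entry.is_dir() and entry.name.isdecimal()
    ]
    return max(numbers) if numbers else None


def flush_disk_caches():
    """Ask the OS to write buffered data to disk (no-op where unsupported)."""
    if hasattr(os, "sync"):
        os.sync()


if __name__ == "__main__":
    # Path taken from an earlier post; adjust for your own installation.
    path = "/var/lib/boinc/slots"
    if os.path.isdir(path):
        flush_disk_caches()
        print("highest slot:", highest_slot(path))
```

A result close to 399 would point at a real pile-up of slot directories; a small number would suggest the count comes from somewhere else.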
|
||