World Community Grid Forums
Thread Status: Active Total posts in this thread: 11
Rickjb
Veteran Cruncher Australia Joined: Sep 17, 2006 Post Count: 666 Status: Offline |
Occasionally, but not frequently, I get stuck file uploads.
If WCG aims to be an install-and-forget application for large-scale deployment, I think it needs to ensure that such glitches do not occur. BOINC should automatically detect & correct such events.

System details: QX9650, Win XP-32 SP3, BOINC 6.2.19, installed as a service ("protected application")
Internet connection: ADSL2+ over twisted-pair copper telephone line. Other network activity may occasionally saturate the 1023kbits/s upload capacity for short bursts.

If network communications sections of BOINC have been overhauled in later versions, perhaps you can ignore this post, and I'll upgrade. Meanwhile I prefer the older UI.

The upload became stuck when BOINC tried to fetch new work and returned a server access error. Below are relevant extracts from the Messages tab of the BOINC Manager, showing the start of the problem, and my later successful manual intervention to free the upload. I have placed ** at the start of lines that I consider important, and added comments.
----
** 19/02/2011 9:00:53 AM|World Community Grid|Computation for task faah19044_ZINC05941740_xmdEq_2R5P1c_02_0 finished
19/02/2011 9:00:53 AM|World Community Grid|Starting faah19046_ZINC06006955_xmdEq_2R5P1c_00_0
19/02/2011 9:00:53 AM|World Community Grid|Starting task faah19046_ZINC06006955_xmdEq_2R5P1c_00_0 using faah version 607
19/02/2011 9:00:55 AM|World Community Grid|Started upload of faah19044_ZINC05941740_xmdEq_2R5P1c_02_0_0
** User comment: The next file upload gets stuck:
** 19/02/2011 9:00:55 AM|World Community Grid|Started upload of faah19044_ZINC05941740_xmdEq_2R5P1c_02_0_1
** 19/02/2011 9:00:57 AM|World Community Grid|Sending scheduler request: To fetch work. Requesting 1725 seconds of work, reporting 0 completed tasks
19/02/2011 9:01:16 AM|World Community Grid|Finished upload of faah19044_ZINC05941740_xmdEq_2R5P1c_02_0_0
19/02/2011 9:01:16 AM|World Community Grid|Started upload of faah19044_ZINC05941740_xmdEq_2R5P1c_02_0_2
** 19/02/2011 9:01:19 AM||Project communication failed: attempting access to reference site
** 19/02/2011 9:01:20 AM||Internet access OK - project servers may be temporarily down.
** 19/02/2011 9:01:20 AM|World Community Grid|Finished upload of faah19044_ZINC05941740_xmdEq_2R5P1c_02_0_2
** 19/02/2011 9:01:20 AM|World Community Grid|Started upload of faah19044_ZINC05941740_xmdEq_2R5P1c_02_0_3
** 19/02/2011 9:01:23 AM|World Community Grid|Scheduler request failed: Couldn't connect to server
** 19/02/2011 9:01:24 AM|World Community Grid|Finished upload of faah19044_ZINC05941740_xmdEq_2R5P1c_02_0_3
** 19/02/2011 9:02:24 AM|World Community Grid|Sending scheduler request: To fetch work. Requesting 1812 seconds of work, reporting 0 completed tasks
** 19/02/2011 9:02:29 AM|World Community Grid|Scheduler request succeeded: got 1 new tasks
...
** User comment: The log shows normal network activity after the error at 9:01:20am. (Lines deleted here).
...
** User comment: When I checked at 3:21pm, the BOINC Tasks tab showed faah19044_ZINC05941740_xmdEq_2R5P1c_02_0 as Uploading,
** and there were 4 tasks Ready to Report; the Transfers tab showed file faah19044_ZINC05941740_xmdEq_2R5P1c_02_0_1
** as Uploading at 0/36kB complete. I clicked "Projects >> Update".
** Next is the last line of the log before my intervention, plus what followed:
19/02/2011 2:38:34 PM|World Community Grid|Finished upload of faah19046_ZINC06006955_xmdEq_2R5P1c_00_0_3
19/02/2011 3:21:48 PM|World Community Grid|Sending scheduler request: Requested by user. Requesting 0 seconds of work, reporting 4 completed tasks
19/02/2011 3:21:53 PM|World Community Grid|Scheduler request succeeded: got 0 new tasks
** User comment: The 4 uploaded tasks reported OK, but the stuck file upload remained.
** I did "Activity >> Network activity suspended", then "Transfers >> Retry Now".
** The stuck file uploaded, and I restored "Activity >> Network activity always available":
19/02/2011 3:23:04 PM||Suspending network activity - user request
19/02/2011 3:23:11 PM||Resuming network activity
19/02/2011 3:23:11 PM|World Community Grid|Started upload of faah19044_ZINC05941740_xmdEq_2R5P1c_02_0_1
19/02/2011 3:23:17 PM|World Community Grid|Finished upload of faah19044_ZINC05941740_xmdEq_2R5P1c_02_0_1
** User comment: The WU that was previously stuck Uploading was then Ready to Report.
---------
HTH - Rick
[Edit 1 times, last edit by Rickjb at Feb 19, 2011 3:22:14 PM] |
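For anyone who wants to spot this kind of stall without reading the Messages tab by eye, here is a minimal sketch of a log scan. It assumes the log has been saved as text in the `timestamp|project|message` layout shown in the post above, and that every normal upload produces a "Started upload of" line followed later by a "Finished upload of" line. The function name `find_stuck_uploads` and the `SAMPLE_LOG` excerpt are illustrative only and are not part of BOINC.

```python
def find_stuck_uploads(log_lines):
    """Return {file_name: start_timestamp} for uploads with no matching finish.

    Assumes each line looks like 'timestamp|project|message', optionally
    prefixed with the '** ' marker used in the post above.
    """
    started = "Started upload of "
    finished = "Finished upload of "
    pending = {}
    for raw in log_lines:
        line = raw.strip().lstrip("* ")  # drop the poster's '**' emphasis marker
        parts = line.split("|", 2)
        if len(parts) != 3:
            continue  # comment lines, '...' elisions, blank lines
        stamp, _project, message = parts
        if message.startswith(started):
            pending.setdefault(message[len(started):].strip(), stamp.strip())
        elif message.startswith(finished):
            pending.pop(message[len(finished):].strip(), None)
    return pending


# Excerpt taken from the log lines quoted in the post.
SAMPLE_LOG = [
    "19/02/2011 9:00:55 AM|World Community Grid|Started upload of faah19044_ZINC05941740_xmdEq_2R5P1c_02_0_0",
    "** 19/02/2011 9:00:55 AM|World Community Grid|Started upload of faah19044_ZINC05941740_xmdEq_2R5P1c_02_0_1",
    "19/02/2011 9:01:16 AM|World Community Grid|Finished upload of faah19044_ZINC05941740_xmdEq_2R5P1c_02_0_0",
    "19/02/2011 9:01:16 AM|World Community Grid|Started upload of faah19044_ZINC05941740_xmdEq_2R5P1c_02_0_2",
    "** 19/02/2011 9:01:19 AM||Project communication failed: attempting access to reference site",
    "** 19/02/2011 9:01:20 AM|World Community Grid|Finished upload of faah19044_ZINC05941740_xmdEq_2R5P1c_02_0_2",
]

if __name__ == "__main__":
    for name, stamp in find_stuck_uploads(SAMPLE_LOG).items():
        print(f"No finish seen for {name} (started {stamp})")
```

On the excerpt, only the `_02_0_1` file is left pending, which matches the file reported as stuck in the Transfers tab. An upload that is simply still in progress would also show up here, so the start time needs to be compared against the current time before calling it stuck.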
||
|
KerSamson
Master Cruncher Switzerland Joined: Jan 29, 2007 Post Count: 1673 Status: Offline |
Hi Rickjb,
In the past, I occasionally experienced uploads getting stuck like this, even though the internet connection was working perfectly (on the same system, the browser could access the internet without problems). The workaround I found is to proceed as follows:
I have no idea what causes this trouble (BOINC 5.x.y). I mentioned it several times in the past (2 years ago? or longer? ...). During the last 6 or 8 months, I have not experienced this problem again, neither with BOINC 5.10.45 nor with BOINC 6.10.17 or 6.10.58. I never received an answer about this issue. Cheers, Yves |
||
|
Rickjb
Veteran Cruncher Australia Joined: Sep 17, 2006 Post Count: 666 Status: Offline |
Thanks KerSamson (Yves). I'm pleased to find that I'm not the only member to encounter this type of event. I have had these before, and I think I also reported them at the time.
If you read my post again, you will see that I was able to wake the sleeping upload by suspending network activity from the Activity menu, and then doing "Retry Now" in the Transfers tab. No crunching time was lost. If that had not worked, I would have tried stopping and re-starting BOINC. Rebooting the system would have been the next step if it was needed. In both cases where BOINC would have been stopped and restarted, WUs would have restarted from their most recent checkpoints, and crunching time would have been lost. A few lost or delayed results would not be of much direct importance to WCG because the WUs would time out and be re-issued. However, if I want to ask someone to join WCG and contribute, the software system has to have install-and-forget reliability, so having file uploads that get stuck deters me from trying to recruit new members. |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Nothing much to add, except that the BOINC 6.10 version endorsed for all platforms is the .58 release; see the FAQ: http://www.worldcommunitygrid.org/forums/wcg/viewthread?thread=21569
rickjb, personally, I think not recruiting "because of ..." is a shame. Your deep diagnostic and analytical skills are going to waste... you could easily guide those new entries past the humps... to me that is a much more important goal than a few easily scalable hiccups. Thanks |
||
|
Ingleside
Veteran Cruncher Norway Joined: Nov 19, 2005 Post Count: 974 Status: Offline |
Rickjb wrote:
"Occasionally, but not frequently, I get stuck file uploads. If WCG aims to be an install-and-forget application for large-scale deployment, I think it needs to ensure that such glitches do not occur. BOINC should automatically detect & correct such events. System details: QX9650, Win XP-32 SP3, BOINC 6.2.19, installed as a service ("protected application") Internet connection: ADSL2+ over twisted-pair copper telephone line. Other network activity may occasionally saturate the 1023kbits/s upload capacity for short bursts. If network communications sections of BOINC have been overhauled in later versions, perhaps you can ignore this post, and I'll upgrade. Meanwhile I prefer the older UI."

Uploads (or downloads) getting permanently stuck is probably the biggest problem with v5.10.45, and AFAIK this bug wasn't fixed before v6.6.xx, so v6.2.19 is also affected.

Manually suspending the network for some seconds and resuming should work, and any stop/start of the BOINC client will also work. Limiting network access to some hours per day should possibly also work.

But, apart from users running Domain Controllers, who are therefore stuck on v5.10.45, upgrading BOINC to v6.10.xx is better than working around the old network bugs.

BTW, v6.10.2x or thereabouts also fixed the long-standing DNS bug, another network-related reason for upgrading. Any older client affected by the DNS bug needed a restart of the BOINC client; just disabling the network didn't have any effect on that bug...

"I make so many mistakes. But then just think of all the mistakes I don't make, although I might." |
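The suspend-then-resume workaround can also be scripted instead of clicked through the Activity menu. The sketch below is only an illustration under stated assumptions: it assumes the `boinccmd` command-line tool is on the PATH, that the installed client supports its `--set_network_mode` option with the `never` and `always` modes (not verified here for clients as old as 6.2.19), and that "always" is the mode to restore, as in Rickjb's setup. The helper names `network_toggle_commands` and `kick_network` are hypothetical.

```python
import subprocess
import time


def network_toggle_commands(boinccmd="boinccmd"):
    """Build the two commands: suspend network activity, then restore it.

    Assumption: the installed boinccmd accepts '--set_network_mode never'
    and '--set_network_mode always'.
    """
    return [
        [boinccmd, "--set_network_mode", "never"],
        [boinccmd, "--set_network_mode", "always"],
    ]


def kick_network(pause_seconds=10, run=subprocess.run, sleep=time.sleep):
    """Suspend network activity for a few seconds, then resume it."""
    suspend, resume = network_toggle_commands()
    run(suspend, check=True)
    try:
        sleep(pause_seconds)
    finally:
        # Always try to restore networking, even if the pause is interrupted.
        run(resume, check=True)


if __name__ == "__main__":
    kick_network()
```

Anyone who normally runs with network activity "based on preferences" would want to restore `auto` rather than `always`. As the post notes, this only works around the stuck transfer; upgrading the client is the better fix.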
||
|
Rickjb
Veteran Cruncher Australia Joined: Sep 17, 2006 Post Count: 666 Status: Offline |
Thanks, Ingleside. I will upgrade.
BTW, I think the "Retry now" button in the BOINC Transfers tab is like the Open and Close Doors buttons in building lifts (US: "elevators"), and the pedestrians' want-to-walk buttons at traffic signals - they are not actually connected to anything, and are only there to fool you into thinking that you have some influence on outcomes ... [Edit 1 times, last edit by Rickjb at Feb 22, 2011 3:09:05 PM] |
||
|
JollyJimmy
Advanced Cruncher USA Joined: Aug 23, 2005 Post Count: 115 Status: Offline |
I got stuck too, this morning, and I am running 6.10.58.
2 result records, simply refusing to upload, regardless of how often I hit "retry now". First time ever I noticed that behavior. Then suddenly, this afternoon, they decided to upload all by themselves. So, issue resolved in a way (well, at least my issue), but clearly the culprit behind this is not just an old version! |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Looked and looked and looked, and discovered that in the few cases I've had, the actual result files were corrupted (one of 5 for the task)... If they had been valid, then with the extra log flags on, which Rickjb won't do, at least a diagnostic would have been possible. That's how I found out.
|
||
|
JollyJimmy
Advanced Cruncher USA Joined: Aug 23, 2005 Post Count: 115 Status: Offline |
Coelum Non Animum Mutant, Qui Trans Mare Currunt
"The sky does not change the spirit that runs across the seas"

What is that supposed to mean? You an ole Navy sailor who does not particularly care about the airforce? Also, "currunt" would be plural. Shouldn't it be "currit"? Getting a bit rusty myself there... (Sorry for getting off topic, but just couldn't resist.) |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
JollyJimmy wrote:
"Coelum Non Animum Mutant, Qui Trans Mare Currunt "The sky does not change the spirit that runs across the seas" What is that supposed to mean? You an ole Navy sailor who does not particularly care about the airforce? Also, "currunt" would be plural. Shouldn't it be "currit"? Getting a bit rusty myself there... (Sorry for getting off topic, but just couldn't resist.)"

The meaning is more along the lines of "He who crosses the seas will change the skies, but not the soul" (Horace). [/ot] [Edit 1 times, last edit by Former Member at Feb 23, 2011 9:11:24 AM] |
||
|