Thread Status: Active
Total posts in this thread: 9
This topic has been viewed 3748 times and has 8 replies
PMH_UK
Veteran Cruncher
UK
Joined: Apr 26, 2007
Post Count: 769
Status: Offline
3 copies for Quorum 2 Replication 2

Below workunit had 3 copies issued even though it is quorum 2 replication 2 (mine was the server abort).
I have seen this before with other projects.
Not a problem but could waste CPU if all 3 started before 2 returned.

Workunit Status

Project Name: OpenPandemics - COVID 19
Created: 05/21/2020 01:24:43
Name: OPN1_0000977_05646
Minimum Quorum: 2
Replication: 2


Result Name             App Version  Status          Sent Time         Due / Return Time  CPU/Elapsed (h)  Claimed / Granted Credit  OS type / OS version
OPN1_0000977_05646_2--  717          Valid           5/22/20 16:40:27  5/23/20 14:46:37   3.72             84.6 / 83.3               Linux Ubuntu Ubuntu 18.04.3 LTS [4.15.0-66-generic|libc 2.27 (Ubuntu GLIBC 2.27-3ubuntu1)]
OPN1_0000977_05646_1--  717          Server Aborted  5/22/20 16:27:21  5/23/20 14:52:04   0.00             76.7 / 0.0                Linux 4.4.0-179-generic
OPN1_0000977_05646_0--  717          Valid           5/22/20 16:26:54  5/22/20 19:28:03   1.80             82.1 / 83.3               Linux CentOS Linux CentOS Linux 7 (Core) [3.10.0-1062.18.1.el7.x86_64|libc 2.17 (GNU libc)]

Paul.
[May 23, 2020 5:10:51 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: 3 copies for Quorum 2 Replication 2

You're not showing the timestamp at which each copy went out, but I presume they were very close together. At 16:26:54 the server realized there was a redundant copy, so word went out, maybe to both, at 16:27. Had both responded started, the end, had both responded, OK, a _3 would likely have gone out. Techs might put this away as some race condition: maybe 1 of the first 2 was not fast enough acknowledging receipt, which is why _2 went out. Sheet happens, and the program took its course with 1,087,198 validated yesterday. That's a lot per second, as uplinger noted yesterday, doing 1000 validations per minute when there's a backlog.
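
To put those two figures side by side, here is a quick back-of-the-envelope calculation. It assumes the 1,087,198 validations were spread evenly over a 24-hour day, which nothing in this thread confirms.

```python
# Back-of-the-envelope rates.
# Assumption: the validations were spread evenly over 24 hours.
validated_per_day = 1_087_198   # figure quoted in the post
seconds_per_day = 24 * 60 * 60
minutes_per_day = 24 * 60

per_second = validated_per_day / seconds_per_day
per_minute = validated_per_day / minutes_per_day

backlog_per_minute = 1000       # rate attributed to uplinger when there is a backlog

print(f"daily average: {per_second:.1f} per second, {per_minute:.0f} per minute")
print(f"backlog rate:  {backlog_per_minute / 60:.1f} per second")
```

Under that assumption the daily average works out to roughly 12.6 validations per second (about 755 per minute), a little below the 1000 per minute quoted for backlog periods.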
[May 23, 2020 5:51:09 PM]
adriverhoef
Master Cruncher
The Netherlands
Joined: Apr 3, 2009
Post Count: 2153
Status: Offline
Re: 3 copies for Quorum 2 Replication 2

Below workunit had 3 copies issued even though it is quorum 2 replication 2 (mine was the server abort).
I have seen this before with other projects.
Not a problem but could waste CPU if all 3 started before 2 returned.

Workunit Status

Project Name: OpenPandemics - COVID 19
Created: 05/21/2020 01:24:43
Name: OPN1_0000977_05646
Minimum Quorum: 2
Replication: 2


Result Name            OS           Status         Sent Time         Due / Return Time CPUh  Claimed/Gr.
OPN1_0000977_05646_2-- Linux Ubuntu Valid          5/22/20 16:40:27  5/23/20 14:46:37  3.72  84.6/83.3
OPN1_0000977_05646_1-- Linux        Server Aborted 5/22/20 16:27:21  5/23/20 14:52:04  0.00  76.7/0.0
OPN1_0000977_05646_0-- Linux CentOS Valid          5/22/20 16:26:54  5/22/20 19:28:03  1.80  82.1/83.3
[Generated by wcgformat]

I'm seeing this, too, Paul. (BTW, your timestamps are OK. Your text was nicely formatted by my wcgformat computer program.)
Result Name            OS           Status Sent Time         Due / Return Time CPUh  Claimed/Granted
OPN1_0000543_12220_2-- Linux Ubuntu Valid  5/22/20 16:36:21  5/23/20 00:50:07  2.39  87.0/86.5
OPN1_0000543_12220_1-- Linux Fedora Valid  5/22/20 13:44:47  5/23/20 03:08:56  2.65  83.4/86.5
OPN1_0000543_12220_0-- Linux CentOS Valid  5/22/20 13:44:20  5/22/20 20:17:16  1.94  86.0/86.5
[Generated by wcgformat]

Project Name: OpenPandemics - COVID 19
Created: 05/20/2020 23:24:57
Name: OPN1_0000543_12221
Minimum Quorum: 2
Replication: 2
Result Name            OS           Status         Sent Time         Due / Return Time CPUh  Claimed/Granted
OPN1_0000543_12221_2-- Linux Debian Valid          5/22/20 16:35:55  5/23/20 22:19:40  4.97  84.3/81.2
OPN1_0000543_12221_1-- Linux Fedora Valid          5/22/20 13:44:47  5/23/20 09:59:03  2.51  78.2/81.2
OPN1_0000543_12221_0-- Linux CentOS Server Aborted 5/22/20 13:44:20  5/22/20 18:04:03  0.00  0.0/0.0
[Generated by wcgformat]


It looks as if some kind of hiccup occurred around 16:36-16:40 (22-05-2020).
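
That pattern can be checked mechanically. The sketch below is not wcgformat; `late_copies` is a hypothetical helper that takes result names and sent times copied from the tables in this thread and lists the copies that went out after the initial batch. It assumes a workunit's initial copies are sent within a minute of each other, which is a guess based on the timestamps shown here.

```python
from datetime import datetime, timedelta

REPLICATION = 2  # "Replication: 2" on the workunit status pages

# (result name, sent time) copied from the tables in this thread
SENT = [
    ("OPN1_0000977_05646_0", "5/22/20 16:26:54"),
    ("OPN1_0000977_05646_1", "5/22/20 16:27:21"),
    ("OPN1_0000977_05646_2", "5/22/20 16:40:27"),
    ("OPN1_0000543_12220_0", "5/22/20 13:44:20"),
    ("OPN1_0000543_12220_1", "5/22/20 13:44:47"),
    ("OPN1_0000543_12220_2", "5/22/20 16:36:21"),
    ("OPN1_0000543_12221_0", "5/22/20 13:44:20"),
    ("OPN1_0000543_12221_1", "5/22/20 13:44:47"),
    ("OPN1_0000543_12221_2", "5/22/20 16:35:55"),
]


def late_copies(sent, window=timedelta(minutes=1)):
    """Return (result name, sent time) for copies sent after the initial batch."""
    by_workunit = {}
    for name, stamp in sent:
        workunit = name.rsplit("_", 1)[0]  # drop the trailing copy number
        when = datetime.strptime(stamp, "%m/%d/%y %H:%M:%S")
        by_workunit.setdefault(workunit, []).append((when, name))

    late = []
    for copies in by_workunit.values():
        copies.sort()
        first_sent = copies[0][0]
        # Only workunits with more copies than the replication level are of interest.
        if len(copies) > REPLICATION:
            late += [(name, when) for when, name in copies
                     if when - first_sent > window]
    return late


for name, when in late_copies(SENT):
    print(name, when.strftime("%H:%M:%S"))
```

With the data above it lists the three _2 copies, sent between 16:35:55 and 16:40:27 on 22-05-2020.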
[May 23, 2020 11:20:55 PM]
manalog
Cruncher
Joined: Apr 9, 2015
Post Count: 18
Status: Offline
Re: 3 copies for Quorum 2 Replication 2

Result Name             App Version  Status                Sent Time         Due / Return Time  CPU/Elapsed (h)  Claimed / Granted Credit  OS type / OS version
OPN1_0000957_06768_2--  -            In Progress           5/24/20 01:30:02  5/27/20 13:30:02   0.00             0.0 / 0.0                 Linux 3.16.0-10-amd64
OPN1_0000957_06768_3--  -            In Progress           5/24/20 01:29:59  5/27/20 13:29:59   0.00             0.0 / 0.0                 Linux Ubuntu Ubuntu 18.04.3 LTS [5.0.0-23-generic|libc 2.27 (Ubuntu GLIBC 2.27-3ubuntu1)]
OPN1_0000957_06768_1--  717          Pending Verification  5/22/20 06:12:05  5/23/20 01:00:04   2.10             79.1 / 0.0                Linux openSUSE Leap openSUSE Leap 15.1 [4.12.14-lp151.28.48-default]
OPN1_0000957_06768_0--  717          Invalid               5/22/20 06:11:53  5/24/20 01:29:38   2.28             40.9 / 0.0                Linux 2.6.32-696.1.1.el6.x86_64

Same thing here; it's my grandparents' old Pentium D (Linux 3.16.0-10-amd64) and I can't go there just to abort this task :D
[May 24, 2020 9:50:33 AM]
PMH_UK
Veteran Cruncher
UK
Joined: Apr 26, 2007
Post Count: 769
Status: Offline
Re: 3 copies for Quorum 2 Replication 2

2 more
Workunit Status

Project Name: OpenPandemics - COVID 19
Created: 05/21/2020 20:26:31
Name: OPN1_0000595_00126
Minimum Quorum: 2
Replication: 2


Result Name             App Version  Status          Sent Time         Due / Return Time  CPU/Elapsed (h)  Claimed / Granted Credit  OS type / OS version
OPN1_0000595_00126_2--  717          Server Aborted  5/23/20 15:12:12  5/24/20 04:37:02   0.00             0.0 / 0.0                 Linux 3.13.0-170-generic
OPN1_0000595_00126_1--  717          Valid           5/23/20 15:11:48  5/24/20 04:34:07   4.75             86.7 / 85.6               Linux 2.6.32-504.16.2.el6.x86_64
OPN1_0000595_00126_0--  717          Valid           5/23/20 14:54:52  5/23/20 19:23:23   2.64             84.5 / 85.6               Linux CentOS Linux CentOS Linux 7 (Core) [3.10.0-1062.18.1.el7.x86_64|libc 2.17 (GNU libc)]

Workunit Status

Project Name: OpenPandemics - COVID 19
Created: 05/21/2020 10:55:55
Name: OPN1_0000846_05657
Minimum Quorum: 2
Replication: 2


Result Name             App Version  Status          Sent Time         Due / Return Time  CPU/Elapsed (h)  Claimed / Granted Credit  OS type / OS version
OPN1_0000846_05657_2--  717          Valid           5/23/20 03:59:22  5/24/20 01:28:56   2.70             32.9 / 54.1               Linux 2.6.32-696.1.1.el6.x86_64
OPN1_0000846_05657_1--  717          Server Aborted  5/23/20 03:50:34  5/24/20 01:37:07   0.00             76.7 / 0.0                Linux 4.4.0-179-generic
OPN1_0000846_05657_0--  717          Valid           5/23/20 03:49:35  5/23/20 08:52:10   2.38             75.3 / 54.1               Linux CentOS Linux CentOS Linux 7 (Core) [3.10.0-1062.18.1.el7.x86_64|libc 2.17 (GNU libc)]
----------------------------------------
Paul.
[May 24, 2020 9:57:08 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: 3 copies for Quorum 2 Replication 2

Is anyone seeing this on platforms other than Linux?
----------------------------------------
[Edit 1 times, last edit by Former Member at May 24, 2020 10:24:13 AM]
[May 24, 2020 10:23:38 AM]
adriverhoef
Master Cruncher
The Netherlands
Joined: Apr 3, 2009
Post Count: 2153
Status: Offline
Re: 3 copies for Quorum 2 Replication 2

manalog, reformatted:
Result Name            OS                  AVN Status               Sent Time         Due / Return Time CPUh  Claimed/Granted
OPN1_0000957_06768_2-- Linux               -   In Progress          5/24/20 01:30:02  5/27/20 13:30:02  0.00  0.0/0.0
OPN1_0000957_06768_3-- Linux Ubuntu        -   In Progress          5/24/20 01:29:59  5/27/20 13:29:59  0.00  0.0/0.0
OPN1_0000957_06768_1-- Linux openSUSE Leap 717 Pending Verification 5/22/20 06:12:05  5/23/20 01:00:04  2.10  79.1/0.0
OPN1_0000957_06768_0-- Linux               717 Invalid              5/22/20 06:11:53  5/24/20 01:29:38  2.28  40.9/0.0
[Generated by wcgformat]


But ... why would you abort this task? If the other one fails to report in some way and you've aborted the task, then another copy (_4) will be sent out.
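
A toy model makes that reasoning concrete. This is not the project's actual scheduling code; `copies_to_send` is a hypothetical rule that tops up a workunit only when too few usable copies remain to reach the quorum, and the status strings "User Aborted" and "No Reply" are illustrative labels for the two failure cases.

```python
QUORUM = 2  # "Minimum Quorum: 2" on the workunit status page

# Statuses that can still count toward the quorum in this simplified model.
USABLE = {"Valid", "Pending Verification", "In Progress"}


def copies_to_send(statuses, quorum=QUORUM):
    """Hypothetical rule: send new copies only if usable ones fall short of the quorum."""
    usable = sum(1 for status in statuses if status in USABLE)
    return max(0, quorum - usable)


# OPN1_0000957_06768 as posted: _0 Invalid, _1 Pending Verification, _2 and _3 In Progress.
as_posted = ["Invalid", "Pending Verification", "In Progress", "In Progress"]
print(copies_to_send(as_posted))      # 0: one copy more than the quorum needs

# _2 is aborted by its owner: _1 and _3 can still make the quorum.
after_abort = ["Invalid", "Pending Verification", "User Aborted", "In Progress"]
print(copies_to_send(after_abort))    # 0

# _3 then fails to report as well: one more copy (_4) is needed.
after_no_reply = ["Invalid", "Pending Verification", "User Aborted", "No Reply"]
print(copies_to_send(after_no_reply)) # 1
```

In this model the abort by itself costs nothing, but it removes the spare copy, so a later failure of _3 would trigger a _4.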
[May 24, 2020 10:48:22 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: 3 copies for Quorum 2 Replication 2

Memo to self.

I think uplinger wrote that he fixed something to stop the clogging by Android that was causing "Tasks committed to other platforms". For nearly a week I have had a problem on Android getting the work specified... "Tasks committed to other platforms". Argh.
[May 24, 2020 11:15:27 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: 3 copies for Quorum 2 Replication 2

manalog, reformatted:
Result Name            OS                  AVN Status               Sent Time         Due / Return Time CPUh  Claimed/Granted
OPN1_0000957_06768_2-- Linux               -   In Progress          5/24/20 01:30:02  5/27/20 13:30:02  0.00  0.0/0.0
OPN1_0000957_06768_3-- Linux Ubuntu        -   In Progress          5/24/20 01:29:59  5/27/20 13:29:59  0.00  0.0/0.0
OPN1_0000957_06768_1-- Linux openSUSE Leap 717 Pending Verification 5/22/20 06:12:05  5/23/20 01:00:04  2.10  79.1/0.0
OPN1_0000957_06768_0-- Linux               717 Invalid              5/22/20 06:11:53  5/24/20 01:29:38  2.28  40.9/0.0
[Generated by wcgformat]


But ... why would you abort this task? If the other one fails to report in some way and you've aborted the task, then another copy (_4) will be sent out.


Techs might put this away as some race condition: maybe 1 of the first 2 was not fast enough acknowledging receipt, which is why _2 went out. Sheet happens, and the program took its course with 1,087,198 validated yesterday.
[Jun 1, 2020 8:26:08 AM]