zdnko
Senior Cruncher
Joined: Dec 1, 2005
Post Count: 225
Linux and Windows on the same WU?

I seemed to remember that the various replicas of a WU were always processed on machines with the same OS, but this one has both Linux and Windows:

OPNG_0193377_00211_0   Linux Linuxmint, Linux Mint 21 [5.15.0-83-generic|libc 2.35]       Error
OPNG_0193377_00211_1   Microsoft Windows 10 Core x64 Edition, (10.00.19045.00)            In Progress
OPNG_0193377_00211_2   Microsoft Windows 11 Professional x64 Edition, (10.00.22621.00)    In Progress


https://www.worldcommunitygrid.org/contribution/workunit/379307359
[Sep 17, 2023 4:01:23 PM]
alanb1951
Veteran Cruncher
Joined: Jan 20, 2006
Post Count: 976
Re: Linux and Windows on the same WU?

Whether a platform switch is possible comes down to how platforms are decided at request time: the choice of host depends on whether there are any active tasks[*1] for the WU when a task is offered... This apparently unexpected behaviour is more likely to show up for SCC1 and OPN1/G, as they use Adaptive Replication, so there may be only one task in the field at once...

From the work-unit you cited:
OPNG_0193377_00211_0   Returned: 2023-09-17 15:25:51 UTC
OPNG_0193377_00211_1   Sent:     2023-09-17 15:25:55 UTC
OPNG_0193377_00211_2   Sent:     2023-09-17 15:25:58 UTC

In this case, the Linux task went out as a singleton (presumably the host was considered "reliable") but it failed to locate an OpenCL device when it tried to run... Because it failed (rather than having returned but not yet been validated/verified), the next attempt to issue a task for the work-unit could go to any platform. Apparently, the first Windows machine wasn't considered to be an AR candidate :-)

I see platform switches quite often, and they aren't always to do with Adaptive Replication :-) On some occasions I've seen a pair of MCM1 tasks go to Windows machines and fail (download errors or "failure to start process", usually!) which results in both returning a failure at about the same time; the next requests saw there were no active tasks and Linux got the retries...

Cheers - Al.

[*1] Active tasks are In Progress, Pending Validation/Verification and No Reply (in case it replies before a retry does).
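The rule described above can be sketched in a few lines of Python. This is only an illustration of the behaviour as described in this post, not the actual scheduler code: the `Task` class, the `allowed_platforms` function and the exact status strings are hypothetical names chosen for the example, and the assumption that an active task pins the WU to that task's platform is inferred from the explanation rather than taken from any server source.

```python
from dataclasses import dataclass

# Statuses counted as "active" in footnote [*1]; the spellings are assumed.
ACTIVE_STATUSES = {
    "In Progress",
    "Pending Validation",
    "Pending Verification",
    "No Reply",
}


@dataclass
class Task:
    name: str
    platform: str  # e.g. "Linux" or "Windows"
    status: str


def allowed_platforms(tasks, all_platforms):
    """Return the platforms that could receive the next task for a WU.

    Assumed rule: if any task is still active, the WU stays on the
    platform(s) of the active tasks; if none is active, any platform
    may be chosen.
    """
    active = {t.platform for t in tasks if t.status in ACTIVE_STATUSES}
    return active if active else set(all_platforms)


platforms = ["Linux", "Windows"]

# The Linux singleton has failed, so nothing is active for the WU.
tasks = [Task("OPNG_0193377_00211_0", "Linux", "Error")]
print(sorted(allowed_platforms(tasks, platforms)))  # ['Linux', 'Windows']

# Once a retry is in progress on Windows, later tasks follow it.
tasks.append(Task("OPNG_0193377_00211_1", "Windows", "In Progress"))
print(sorted(allowed_platforms(tasks, platforms)))  # ['Windows']
```

With the work-unit from this thread, the failed Linux task leaves no active task, so the first retry is free to go to Windows, and the second retry then matches it.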
[Sep 17, 2023 7:53:34 PM]
zdnko
Senior Cruncher
Joined: Dec 1, 2005
Post Count: 225
Re: Linux and Windows on the same WU?

thanks
[Sep 18, 2023 4:47:02 PM]