Thread Status: Active
Total posts in this thread: 781
This topic has been viewed 513282 times and has 780 replies.
widdershins
Veteran Cruncher
Scotland
Joined: Apr 30, 2007
Post Count: 674
Re: OpenPandemics - GPU Stress Test

Just enough time for dinner before hitting that retry button 100 times per second. :D
[May 3, 2021 4:58:07 PM]
knreed
Former World Community Grid Tech
Joined: Nov 8, 2004
Post Count: 4504
Re: OpenPandemics - GPU Stress Test

Quick update on the status of the stress test. Over the weekend (mostly later in the day yesterday), large numbers of batches started to complete. This allowed us to start packaging them and sending them back to the researchers. However, the packaging phase is fairly IO intensive, and that has put additional load on the filesystem, which caused the slowdowns you have seen. We have been digging into that yesterday and today to see if we can do anything to increase the throughput. We have made some adjustments to the configuration of the clustered filesystem that we hope will help, but we don't expect a dramatic improvement.

The current outage was caused by what should have been a quick restart of the filesystem after making the configuration changes. Unfortunately, some hung processes from last Monday were still stuck, and we were not able to shut down the cluster cleanly. As a result, when we started the cluster back up, a couple of nodes had been "kicked out", and the system has to go through a filesystem scan before we can bring it back online.

Once we are back up we will see if the changes are an improvement.
[May 3, 2021 5:23:57 PM]
True54Blue
Advanced Cruncher
Joined: Nov 17, 2004
Post Count: 97
Re: OpenPandemics - GPU Stress Test

My last ten results are showing error as status. Is this a result of the server or has something gone terribly wrong with my computer? I'm noticing that others who returned those jobs at the same time are also showing error and now they're being sent out again.
[May 3, 2021 6:02:05 PM]
spRocket
Senior Cruncher
Joined: Mar 25, 2020
Post Count: 274
Re: OpenPandemics - GPU Stress Test

I think I might bump up the queue length on my main cruncher once this all passes. I've already exhausted all of my GPU tasks about an hour and a half ago, and the remaining CPU tasks in the queue are rapidly dwindling.

It's a tradeoff, though - longer queues mean slower turnaround time for units.
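
A rough sketch of that tradeoff in Python. It assumes the client works through its queue in first-in, first-out order at a constant rate, so a freshly downloaded task waits about as long as the queue is deep. The function name and the numbers are made up for illustration and are not taken from BOINC or from any real host.

```python
def estimated_turnaround_hours(queue_days: float, task_runtime_hours: float) -> float:
    """Estimate hours from download to report for a task joining a full FIFO queue.

    Assumption: the client keeps `queue_days` of work buffered and processes
    it in order, so a new task waits roughly that long before it starts.
    """
    if queue_days < 0 or task_runtime_hours < 0:
        raise ValueError("queue_days and task_runtime_hours must be non-negative")
    wait_hours = queue_days * 24.0
    return wait_hours + task_runtime_hours


# Compare a few hypothetical queue lengths for a task that runs for 2 hours.
for days in (0.25, 1.0, 3.0):
    hours = estimated_turnaround_hours(days, task_runtime_hours=2.0)
    print(f"{days:>4} day queue -> about {hours:.0f} h turnaround")
```

Under this model the turnaround grows linearly with the queue length, which is the cost of having more work on hand during an outage.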
[May 3, 2021 6:40:00 PM]
True54Blue
Advanced Cruncher
Joined: Nov 17, 2004
Post Count: 97
Re: OpenPandemics - GPU Stress Test

I see some of them are now being sent 7 times, e.g.:
OPNG_0020000_00103
OPNG_0026809_00108
[May 3, 2021 6:53:04 PM]
Grumpy Swede
Master Cruncher
Svíþjóð
Joined: Apr 10, 2020
Post Count: 2154
Re: OpenPandemics - GPU Stress Test

Well, all my tasks are uploaded and reported now. Getting replacements though, seems at the moment not possible.
[May 3, 2021 7:12:23 PM]
Jorlin
Advanced Cruncher
Deutschland
Joined: Jan 22, 2020
Post Count: 89
Re: OpenPandemics - GPU Stress Test

Grumpy Swede wrote: "Well, all my tasks are uploaded and reported now. Getting replacements though, seems at the moment not possible."

Not getting NVIDIA jobs, but Intel are coming in.
[May 3, 2021 7:14:55 PM]
DennyInDurham
Cruncher
USA
Joined: Aug 4, 2020
Post Count: 23
Re: OpenPandemics - GPU Stress Test

Grumpy Swede wrote: "Well, all my tasks are uploaded and reported now. Getting replacements though, seems at the moment not possible."

Yes, it would seem the filesystem restart didn't help much... apparently the Stress Test has found another bottleneck.
[May 3, 2021 7:17:14 PM]
spRocket
Senior Cruncher
Joined: Mar 25, 2020
Post Count: 274
Re: OpenPandemics - GPU Stress Test

Got a few new CPU tasks not long ago, but uploads are spotty. Had over a page of them that were backed off past three hours that I just restarted, and I'm now back to the "some work, some don't" situation.
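
Waits of several hours are consistent with a client that lengthens its retry delay after each failed transfer. The sketch below shows generic capped exponential backoff in Python. It is not BOINC's actual retry code, and the base delay, the cap, and the function name are assumptions chosen only for illustration.

```python
import random
from typing import List, Optional


def backoff_delays(attempts: int,
                   base_seconds: float = 60.0,
                   cap_seconds: float = 4 * 3600.0,
                   jitter: float = 0.0,
                   rng: Optional[random.Random] = None) -> List[float]:
    """Return the wait (in seconds) before each retry.

    The delay doubles after every failed attempt and is clamped to
    cap_seconds. All default values are illustrative assumptions.
    """
    rng = rng or random.Random()
    delays = []
    for attempt in range(attempts):
        delay = base_seconds * 2 ** attempt
        # Optional jitter (a fraction such as 0.1) spreads retries out so that
        # many clients do not all hit the server at the same moment.
        delay *= 1.0 + rng.uniform(-jitter, jitter)
        delays.append(min(cap_seconds, delay))
    return delays


for n, seconds in enumerate(backoff_delays(10), start=1):
    print(f"retry {n:2d}: wait {seconds / 3600:.2f} h")
```

With these assumed defaults the delay first exceeds three hours at the ninth retry, which is why manually restarting a stalled transfer can get it moving again sooner.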

EDIT: Had to restart transfers on one of my Raspberry Pis as well.

EDIT 2: For a while, CPU work was going smoothly, and I got a few GPU units, but it was trying for more and not getting any. Now I just saw a bunch more GPU WUs coming in.

I wonder just how many teraflops (petaflops?) we're dumping into the project? >:)
----------------------------------------
[Edit 2 times, last edit by spRocket at May 3, 2021 10:03:15 PM]
[May 3, 2021 7:39:30 PM]
cehunt
Senior Cruncher
CANADA
Joined: Oct 10, 2011
Post Count: 172
Re: OpenPandemics - GPU Stress Test

Hi:

I have a system which has an Intel i7-8700K CPU and an NVIDIA GeForce GTX 1070 GPU. I am interested in getting more bang for my buck.

On the task page, it is showing 0.929 CPU + 1 GPU when the GPU is crunching. Can I change the GPU setting to 0.125 and thereby increase the number of GPU WUs that the GPU crunches at once?

Clive
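
The arithmetic behind that question can be sketched in Python. The sketch assumes the scheduler simply packs tasks onto a GPU until their fractions add up to 1, and that each task keeps the 0.929 CPU figure quoted above; the function names are hypothetical. In BOINC the per-task fraction is normally set through a gpu_usage entry in an app_config.xml file, so check the client configuration documentation for the exact layout.

```python
import math


def concurrent_gpu_tasks(gpu_usage: float, gpu_count: int = 1) -> int:
    """Number of tasks that fit if each one claims `gpu_usage` of a GPU."""
    if not 0 < gpu_usage <= 1:
        raise ValueError("gpu_usage must be in the range (0, 1]")
    # The small epsilon guards against floating-point error in 1 / gpu_usage.
    return gpu_count * math.floor(1.0 / gpu_usage + 1e-9)


def cpu_threads_reserved(tasks: int, cpu_per_task: float) -> float:
    """Total CPU share claimed by `tasks` GPU tasks running at once."""
    return tasks * cpu_per_task


tasks = concurrent_gpu_tasks(0.125)
reserved = cpu_threads_reserved(tasks, cpu_per_task=0.929)
print(f"{tasks} GPU tasks at once, reserving about {reserved:.2f} CPU threads")
```

A setting of 0.125 would therefore mean eight tasks sharing the GTX 1070 and about 7.4 of the i7-8700K's 12 threads set aside to feed them. Whether that raises throughput depends on GPU memory and on how well the tasks overlap, so a less aggressive value such as 0.5 is a safer first experiment.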
[May 3, 2021 10:55:39 PM]