Thread Status: Active
Total posts in this thread: 14
Posts: 14   Pages: 2   [ 1 2 | Next Page ]
This topic has been viewed 25969 times and has 13 replies
Bryn Mawr
Senior Cruncher
Joined: Dec 26, 2018
Post Count: 307
Status: Offline
Time to Retire My GPU?

I guess that the time has finally come to turn my GPU off.

Whilst it would churn through one of the OPNG WUs in about 4-5 hours, since the restart it’s failing them all within about a second.

https://www.worldcommunitygrid.org/contribution/workunit/282352221

Error: boinc_get_opencl_ids() failed with error -1

I had not realised that the tasks would be changing during the interim.
[Apr 14, 2023 5:12:06 PM]
Aperture_Science_Innovators
Advanced Cruncher
United States
Joined: Jul 6, 2009
Post Count: 139
Status: Offline
Re: Time to Retire My GPU?

I don't know the details of your GPU, but I was facing this on one of my systems a few months back. It turned out that the power supply in the computer wasn't up to the task, and when the GPU loaded up with the OPNG WUs, the computer wouldn't fully hang, but the screen would go black, and the GPU would become nonfunctional until a restart. New(er) power supply sorted it out, and it runs properly now.

I *think* I've also had this happen when I try to remote into the systems (with RDP, to either Windows or Linux hosts). It seems to do something with the GPU rendering sent over the network that messes up local access to the GPU :/ I don't have enough firm data to claim this concretely though.
[Apr 14, 2023 5:48:48 PM]
TPCBF
Master Cruncher
USA
Joined: Jan 2, 2011
Post Count: 1842
Status: Offline
Re: Time to Retire My GPU?

What kind of GPU are you running? On what OS? Any recent updates, either OS or GPU drivers?

My programming laptop, with an NVidia GeForce GTX 1060, processes the latest batch of OPNG WUs in about 25 min each. Successfully...

Ralf
[Apr 14, 2023 5:58:21 PM]
Grumpy Swede
Master Cruncher
Svíþjóð
Joined: Apr 10, 2020
Post Count: 1866
Status: Offline
Re: Time to Retire My GPU?

My old but still working GTX 660M crunches these 80+ "jobs" WUs in a little over 6 hours.
Example: https://www.worldcommunitygrid.org/contribution/workunit/280054486

It's still worth running, since the BOINC credit / WCG points those WUs give per hour are far higher than an extra CPU core can produce in the same number of hours.

My equally old GTX 980 Strix will crunch two 80+ "jobs" OPNG tasks at the same time in around 15 minutes.
But until there are plenty of OPNG tasks available, I'll let that computer rest.
----------------------------------------
[Edit 4 times, last edit by Grumpy Swede at Apr 14, 2023 6:32:34 PM]
[Apr 14, 2023 6:14:45 PM]
Bryn Mawr
Senior Cruncher
Joined: Dec 26, 2018
Post Count: 307
Status: Offline
Re: Time to Retire My GPU?

OK, so I’ve started to investigate rather than just react.

Both of my machines are Ryzen 3900s running Ubuntu 22.04.2 and BOINC 7.20.5, each fitted with a GT 710 GPU. One has 64 GB of RAM whilst the other has 16 GB.

It appears that only one of them is downloading and attempting to run OPNG, and that’s the smaller one, which had been on holiday until a couple of weeks ago. When I restarted it I had RAM problems and had to clean and reseat the DIMMs. It is possible that I’ll have to do the same with the GPU.

The other one, which has been running throughout, has lost the OpenCL driver. I’ll reload that in the morning and see if that works OK.
[Apr 14, 2023 7:36:50 PM]
Bryn Mawr
Senior Cruncher
Joined: Dec 26, 2018
Post Count: 307
Status: Offline
Re: Time to Retire My GPU?

So of course, as soon as I say all are failing one runs through to completion! It took 11 hours but it got there in the end.

I’ve reloaded the Nvidia drivers on the other machine and confirmed that OpenCL is now running so it’s now just a case of waiting until both machines download more OPNG to see what happens.
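
For anyone wanting to confirm the same thing on their own host, here is a rough sketch of one way to check that OpenCL can see a GPU. It assumes the `clinfo` utility is installed and that `clinfo -l` prints one line containing "Device #" per device; the function names are made up for illustration and are not part of BOINC.

```python
import shutil
import subprocess


def count_devices(clinfo_output: str) -> int:
    """Count device lines in `clinfo -l` style output.

    Assumption: each device appears on its own line containing 'Device #'.
    """
    return sum(1 for line in clinfo_output.splitlines() if "Device #" in line)


def run_clinfo_list():
    """Return the text printed by `clinfo -l`, or None if it cannot be run."""
    exe = shutil.which("clinfo")
    if exe is None:
        return None  # clinfo is not installed on this host
    try:
        result = subprocess.run(
            [exe, "-l"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    return result.stdout if result.returncode == 0 else None


if __name__ == "__main__":
    output = run_clinfo_list()
    if output is None:
        print("clinfo is missing or failed; cannot tell whether OpenCL works")
    else:
        print(f"OpenCL devices visible: {count_devices(output)}")
```

If this reports zero devices, the OpenCL driver is probably the thing to look at before blaming the card itself.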
[Apr 15, 2023 11:26:30 AM]
Bryn Mawr
Senior Cruncher
Joined: Dec 26, 2018
Post Count: 307
Status: Offline
Re: Time to Retire My GPU?

So both machines are now downloading and successfully running tasks in just over 11 hours.

However, about 15-20% of the tasks fail at 7.88 hours with time limit exceeded.

Is there any reason why some tasks have a shorter time limit - more importantly, is there any way of predicting which jobs will be affected?
[Apr 22, 2023 6:28:49 AM]
bfmorse
Senior Cruncher
US
Joined: Jul 26, 2009
Post Count: 274
Status: Offline
Re: Time to Retire My GPU?

Is the “time limit exceeded” based solely on processing time or has the WU’s “Deadline” time been passed?

If it is the latter, you might check your queue values to make sure they are not set too high, which would cause the WUs to sit idle until it is their turn to be processed. Meanwhile, the clock ticks on and the deadline expires.
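
The two cases can be told apart mechanically. Below is a minimal sketch, assuming you know the task's elapsed run time, its run-time limit, when it finished, and its deadline. The function and parameter names are hypothetical and the example values are invented; nothing here comes from BOINC itself.

```python
from datetime import datetime


def classify_time_failure(elapsed_hours, runtime_limit_hours, finished_at, deadline):
    """Say which time limit, if either, a task ran into.

    All parameters are illustrative: elapsed and limit are in hours,
    finished_at and deadline are datetime objects.
    """
    if elapsed_hours >= runtime_limit_hours:
        return "runtime limit"  # the task itself ran for too long
    if finished_at > deadline:
        return "deadline"  # the task waited in the queue too long
    return "neither"


if __name__ == "__main__":
    # Invented example: a task stopped at 7.88 hours, well before its deadline.
    print(
        classify_time_failure(
            elapsed_hours=7.88,
            runtime_limit_hours=7.88,
            finished_at=datetime(2023, 4, 21, 12, 0),
            deadline=datetime(2023, 4, 25, 12, 0),
        )
    )
```

Only the "deadline" case is helped by lowering the queue settings; a "runtime limit" result points at the speed of the device instead.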
[Apr 22, 2023 6:56:27 AM]
adriverhoef
Master Cruncher
The Netherlands
Joined: Apr 3, 2009
Post Count: 1978
Status: Offline
Re: Time to Retire My GPU?

Read through this thread and you might be able to understand and correct the problem on your machine.

Adri
[Apr 22, 2023 8:27:28 AM]
Bryn Mawr
Senior Cruncher
Joined: Dec 26, 2018
Post Count: 307
Status: Offline
Re: Time to Retire My GPU?

No, purely based on the runtime.
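
For context on a run-time limit like this: as far as I understand the BOINC client, each workunit carries an operations bound (`rsc_fpops_bound`), and the client aborts a task once its elapsed time passes that bound divided by the speed it assumes for the device. The sketch below only illustrates that arithmetic. The function names are hypothetical, and the bound and speed figures are invented so that the result comes out at the 7.88 hours seen here; they are not the real values for these tasks.

```python
def elapsed_limit_hours(rsc_fpops_bound: float, assumed_flops: float) -> float:
    """Run-time limit in hours, assuming limit = bound / assumed device speed."""
    return rsc_fpops_bound / assumed_flops / 3600.0


def will_exceed(estimated_runtime_hours: float, limit_hours: float) -> bool:
    """True if a task expected to run this long would pass the limit."""
    return estimated_runtime_hours > limit_hours


if __name__ == "__main__":
    # Invented figures, chosen only to reproduce a 7.88 hour limit.
    limit = elapsed_limit_hours(rsc_fpops_bound=2.8368e15, assumed_flops=1e11)
    print(f"limit: {limit:.2f} h")
    print("11 h task would be aborted:", will_exceed(11.0, limit))
```

If that reading is right, a task whose bound is lower than its neighbours' would hit the limit sooner on the same card, which would fit only some of the tasks failing.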
[Apr 22, 2023 11:04:01 AM]