Thread Status: Active
Total posts in this thread: 41
This topic has been viewed 8099 times and has 40 replies
[CSF] Aleksey Belkov
Cruncher
Russian Federation
Joined: Feb 28, 2013
Post Count: 3
Status: Offline
OPN1 WU memory leaks?

Good day
Today I noticed memory exhaustion on my main workstation because almost all OPN1 WUs were using much more virtual memory than usual (up to 2.5 GiB vs. 200-250 MiB per WU).


After restarting the BOINC client/manager on this host, virtual memory usage by OPN1 WUs returned to normal (200-250 MiB per WU).

But later I noticed virtual memory consumption by OPN1 WUs increasing again, and it keeps growing!

(See posts below.) At the same time, I do not observe similar problems on my other hosts.

Any thoughts what could be the cause of such a problem?

Tech Info:
Main host: AMD Ryzen Threadripper 2950X (c16/th32) / 64 GB ECC RAM / OS: Windows 10 Enterprise LTSC x64
Other hosts (7): Intel Core i5 3470 (c4/th4) / 8-16 GB RAM / OS: Windows 10 Enterprise LTSC x64
----------------------------------------
[Edit 3 times, last edit by [CSF] Aleksey Belkov at Oct 6, 2021 1:47:51 AM]
[Oct 4, 2021 6:21:58 PM]
Mike.Gibson
Ace Cruncher
England
Joined: Aug 23, 2007
Post Count: 12436
Status: Offline
Re: OPN1 WU memory leaks?

Aleksey

I cannot answer your question, but we have suddenly seen the batch numbers going up by thousands, with very small numbers of work units in the batches, as posted in Work Available. There may be a connection.

Mike
[Oct 4, 2021 7:29:09 PM]
Crystal Pellet
Veteran Cruncher
Joined: May 21, 2008
Post Count: 1323
Status: Offline
Re: OPN1 WU memory leaks?

I've seen the same memory growth with these tasks:
OPN1_0064414_00035_0 613.88 MB 492.34 MB from 2.6GB at first
OPN1_0063979_00005_0 2611.86 MB 2501.57 MB
OPN1_0064151_00026_1 295.47 MB 721.60 MB from 2.5 GB at first
OPN1_0064373_00024_1 Computation error (29539,)

Memory usage dropped after suspending the task (LAIM off) and resuming it.
[Oct 4, 2021 8:09:26 PM]
[CSF] Aleksey Belkov
Cruncher
Russian Federation
Joined: Feb 28, 2013
Post Count: 3
Status: Offline
Re: OPN1 WU memory leaks?

Mike,
Thank you for your comment.


It would be great to somehow draw the attention of the project engineers to this problem.
[Oct 4, 2021 8:21:54 PM]
[CSF] Aleksey Belkov
Cruncher
Russian Federation
Joined: Feb 28, 2013
Post Count: 3
Status: Offline
Re: OPN1 WU memory leaks?

Crystal Pellet,
Thanks for the feedback.
Now I know that this is not a local host-specific problem.

Now I see that the problem has started to show up on other hosts in my crunch-pool.
To avoid possible problems with memory exhaustion, as a workaround I decided to temporarily stop receiving OpenPandemics WUs and to abort all cached OPN1 WUs.
----------------------------------------
[Edit 1 times, last edit by [CSF] Aleksey Belkov at Oct 4, 2021 10:05:35 PM]
[Oct 4, 2021 9:56:08 PM]
Sgt.Joe
Ace Cruncher
USA
Joined: Jul 4, 2006
Post Count: 7697
Status: Offline
Re: OPN1 WU memory leaks?

That runaway memory problem has hit me also. So far it has only affected Linux hosts and not the one Windows machine I am running. I have suspended all of the running OPN programs and will restart them one at a time to try to nurse them to completion. I will switch my allocation to MCM for the time being until there is a fix for the OPN problem.
On a 32-thread system with 16 GB of memory, the sysmon showed 15.9 GB of memory in use plus a ton of swap space. This system usually runs with under 4 GB of memory in use. The system was as sluggish as a drunk snail until I got a bunch of the jobs suspended.
Edit: spelling
Cheers
----------------------------------------
Sgt. Joe
*Minnesota Crunchers*
----------------------------------------
[Edit 2 times, last edit by Sgt.Joe at Oct 5, 2021 12:46:20 AM]
[Oct 5, 2021 12:44:19 AM]
Grumpy Swede
Master Cruncher
Svíþjóð
Joined: Apr 10, 2020
Post Count: 2209
Status: Offline
Re: OPN1 WU memory leaks?

I'll abort all my OPN1 tasks on my low-memory laptop and switch to 100% MCM1 tasks instead. I hope the project staff takes care of this problem soon.

Edit: Aborted them from my 16GB computer also.
Goodnight for now, OPN1.
----------------------------------------
[Edit 2 times, last edit by Grumpy Swede at Oct 5, 2021 1:04:46 AM]
[Oct 5, 2021 1:02:01 AM]
pvh513
Senior Cruncher
Joined: Feb 26, 2011
Post Count: 260
Status: Offline
Re: OPN1 WU memory leaks?

Not sure if it is due to a memory leak, but the memory consumption has gone up on my Linux machines. I now frequently run into oom-kill because the machine runs out of memory (I typically have 1 GiB per thread). IIRC the memory consumption of OPN used to be quite modest.

Edit: I have now also aborted all OPN tasks; they were simply putting too much strain on my computers.
----------------------------------------
[Edit 1 times, last edit by pvh513 at Oct 5, 2021 6:12:59 AM]
[Oct 5, 2021 2:13:28 AM]
wujj123456
Cruncher
Joined: Jun 9, 2010
Post Count: 38
Status: Offline
Re: OPN1 WU memory leaks?

Same here. Many of my OPN1 WUs are now exceeding the memory usage of ARP1, for which I have a special config. It would be good to know whether it's a memory-leak bug or an intended behavior change. The latter would likely make me apply similar limits, so that I can fill all the cores without running out of memory.
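
For anyone sizing such a limit, here is a rough sketch of the arithmetic: divide the RAM left after a reserve for the OS by the worst-case per-task footprint, and cap the result at the thread count. The ~2.5 GiB peak and the 16 GB / 32-thread host are figures quoted earlier in this thread; the 2 GiB reserve and the function name are assumptions made up for illustration.

```python
def max_tasks_by_memory(total_ram_gib, per_task_gib, reserve_gib=2.0, threads=None):
    """Return how many tasks fit in RAM, optionally capped at the thread count.

    reserve_gib is an assumed allowance for the OS and other programs.
    """
    usable = max(total_ram_gib - reserve_gib, 0.0)
    fit = int(usable // per_task_gib)
    return fit if threads is None else min(fit, threads)


# 16 GB host with 32 threads, tasks growing to about 2.5 GiB each
print(max_tasks_by_memory(16, 2.5, threads=32))   # 5

# Same host with the usual 200-250 MiB footprint: threads are the limit
print(max_tasks_by_memory(16, 0.25, threads=32))  # 32
```

The result could be used, for example, as the `max_concurrent` value for the application in BOINC's app_config.xml.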
[Oct 5, 2021 6:46:22 AM]
Crystal Pellet
Veteran Cruncher
Joined: May 21, 2008
Post Count: 1323
Status: Offline
Re: OPN1 WU memory leaks?

To add to my previous post:

My high-memory OPN tasks were on a Windows 10 laptop with 8 GB RAM and 4 threads running.
The result log of the task that crashed with 'out of memory': https://www.worldcommunitygrid.org/contribution/results/1951023765/log
*** Dump of the Process Statistics: ***

- I/O Operations Counters -
Read: 1047947, Write: 0, Other 288187

- I/O Transfers Counters -
Read: 0, Write: 93015, Other 0

- Paged Pool Usage -
QuotaPagedPoolUsage: 362536, QuotaPeakPagedPoolUsage: 362672
QuotaNonPagedPoolUsage: 38408, QuotaPeakNonPagedPoolUsage: 38408

- Virtual Memory Usage -
VirtualSize: 1955250176, PeakVirtualSize: 1973018624

- Pagefile Usage -
PagefileUsage: 1853767680, PeakPagefileUsage: 1872244736

- Working Set Size -
WorkingSetSize: 1718099968, PeakWorkingSetSize: 1735819264, PageFaultCount: 10977269

*** Dump of thread ID 3116 (state: Waiting): ***


Normally the CPU OPN tasks only do 1 or 2 jobs with a maximum of 50 compounds each, but the task that crashed was busy with job 21.
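
To make the dump easier to compare with the usual 200-250 MiB per WU, the counters can be converted to MiB. This is a minimal sketch that assumes the values are byte counts (as Windows process counters normally are); the three lines are copied from the extract above, and the helper names are invented for illustration.

```python
import re

# Lines copied from the log extract above
DUMP = """\
VirtualSize: 1955250176, PeakVirtualSize: 1973018624
PagefileUsage: 1853767680, PeakPagefileUsage: 1872244736
WorkingSetSize: 1718099968, PeakWorkingSetSize: 1735819264, PageFaultCount: 10977269
"""


def parse_counters(text):
    """Collect every 'Name: number' pair in the dump into a dict."""
    return {name: int(value) for name, value in re.findall(r"(\w+):\s*(\d+)", text)}


def to_mib(n_bytes):
    """Convert a byte count to MiB (1 MiB = 2**20 bytes)."""
    return n_bytes / 2**20


counters = parse_counters(DUMP)
for name in ("PeakVirtualSize", "PeakPagefileUsage", "PeakWorkingSetSize"):
    print(f"{name}: {to_mib(counters[name]):.0f} MiB")
```

Under that assumption, the three peaks fall between roughly 1.6 and 1.9 GiB for this single task.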

Edit: Added extract of log
----------------------------------------
[Edit 1 times, last edit by Crystal Pellet at Oct 5, 2021 7:01:50 AM]
[Oct 5, 2021 6:55:46 AM]