Thread Status: Active
Total posts in this thread: 81
This topic has been viewed 535644 times and has 80 replies
goben_2003
Advanced Cruncher
Joined: Jun 16, 2006
Post Count: 146
Re: One terabyte written to my SSD since the stress test began

Thank you for all the help. I checked and TRIM is enabled. I reinstalled ImDisk with the same settings as last time, except I turned off dynamic disk, and now it seems to be working! Maybe that was it; my system did not like the dynamic disk setting. After that I also changed it so it will only contain the slots folder. I restarted a couple of times without starting up WCG. The final test is to see if it holds up after running WCG for a day and then restarting. Should I always do a manual sync to be safe? Maybe even do a copy the first time so I don't lose anything?

You're welcome!
Yes, it seems the dynamic disk setting is causing issues for people.
Feel free to do a manual sync to be safe. I had it save the ramdisk to an image file just to be safe when I was testing it. I never actually had to use the image file though. To save the image file:
Open up "ImDisk Virtual Disk Driver". It is in the start menu and at Control Panel->ImDisk Virtual Disk Driver.
Select the drive.
Click save image.
I used the option with 0 offset (no MBR) since it is mounted to a folder instead of as a drive.
Click OK. This will pop up a warning if it is in use. Make sure BOINC is completely shut down and you do not have any of the files open. I did not save it if it popped up a warning.
Select where you want to save it and what to call it.
Click Save.
----------------------------------------

[May 1, 2021 9:12:32 AM]
Andyman
Cruncher
Joined: Apr 9, 2021
Post Count: 17

Great, I will try that!
[May 1, 2021 10:28:26 AM]
Dayle Diamond
Senior Cruncher
Joined: Jan 31, 2013
Post Count: 452

I come back after two days and this thread blew up!
Still no ETA from the WCG.

Many thanks to those posters helping each other make RAM disks, although personally I don't have a lot of RAM headroom in my system (just didn't need more before this bug), and won't buy more RAM if the fix is coming anytime soon.

Keep crunching & stay safe out there!

For those who are counting, I'm at 8.4 TB written.
----------------------------------------
[Edit 1 times, last edit by Dayle Diamond at May 2, 2021 6:33:26 AM]
[May 2, 2021 6:33:04 AM]
bozz4science
Advanced Cruncher
Germany
Joined: May 3, 2020
Post Count: 104

Thanks for providing this very helpful step-by-step guide on page 2!!

Once you know how to implement this RAMdisk solution, it is very straightforward. I initially had some trouble rebooting after setting it up, which caused some weird BIOS issues at first.

Finally, it seems to work perfectly. The reason for shifting to the RAMdisk approach was that my Evo970 Plus was being completely trashed, with writes accumulating to about 25 TB since the start of the stress test. It is now sitting at <1 MB/s most of the time, except for the occasional new work download. Before, my NVMe SSD registered write activity somewhere between 70 and 90 MB/s with 13 concurrent GPU WUs.
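
To put that write rate in perspective, here is a rough back-of-the-envelope sketch. The function name mb_per_s_to_tb_per_day is made up for illustration, and it assumes decimal units (1 TB = 1,000,000 MB) and a constant rate around the clock, which real crunching will not match exactly.

```shell
#!/bin/sh
# Hypothetical helper: convert a sustained write rate in MB/s to TB per day.
# Assumes decimal units (1 TB = 1,000,000 MB) and a constant rate for 24 hours.
mb_per_s_to_tb_per_day() {
    awk -v rate="$1" 'BEGIN { printf "%.2f\n", rate * 86400 / 1000000 }'
}

# Lower and upper end of the 70-90 MB/s range mentioned above.
mb_per_s_to_tb_per_day 70
mb_per_s_to_tb_per_day 90
```

If sustained, that range works out to roughly 6 to 8 TB per day, so totals in the tens of TB add up quickly.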

I opted to mount the slots directory only and sized it accordingly at 4 GB (the directory is 3.8 GB). So far everything works smoothly. Tasks finish and validate as before.

Kudos to you for this elegant approach!
----------------------------------------

AMD Ryzen 3700X @ 4.0 GHz / GTX1660S
Intel i5-4278U CPU @ 2.60GHz
[May 2, 2021 7:58:34 AM]
sam6861
Advanced Cruncher
Joined: Mar 31, 2020
Post Count: 107

Slow computer, looks ok.
Intel Atom N270 2GB RAM, no OpenCL, Linux, EXT4 single SATA SSD storage.
143860298 sectors (0.06 TiB) read, 136 days, /proc/diskstats column 3.
314117464 sectors (0.14 TiB) written, 136 days, /proc/diskstats column 7.
643195368 sectors (0.29 TiB) written, lifetime from smartctl.

Fast computer, too many writes.
Intel i7-2600 16GB RAM, AMD RX 580, Linux Debian, BTRFS RAID1
53341744 sectors (0.02 TiB) read, 5 days, SATA SSD
54264880 sectors (0.02 TiB) read, 5 days, USB flash drive
3917875560 sectors (1.8 TiB) written, 5 days, SATA SSD
3917875560 sectors (1.8 TiB) written, 5 days, USB flash drive
13279749026 sectors (6.1 TiB) written, SATA SSD lifetime from gsmartcontrol.
Also, the fast computer sometimes becomes unresponsive because of the slow 10 MB/s USB flash drive and the constant writes, and CPU usage sometimes drops because of the slow storage. Oh, and I wonder which drive will fail first in the RAID1, the USB flash drive or the SATA SSD.
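
For anyone who wants to check their own numbers, here is a minimal sketch of the sector-to-TiB conversion used above. The function name sectors_to_tib is made up for illustration. It relies on /proc/diskstats counting in 512-byte sectors; the unit of a SMART lifetime counter can differ by drive, so treat that part as an assumption.

```shell
#!/bin/sh
# Hypothetical helper: convert a count of 512-byte sectors to TiB.
sectors_to_tib() {
    awk -v s="$1" 'BEGIN { printf "%.2f\n", s * 512 / (1024 ^ 4) }'
}

# The 5-day write count of the fast computer from the numbers above.
sectors_to_tib 3917875560

# Per-device totals since boot. In awk, $3 is the device name, $6 is sectors
# read (stat column 3) and $10 is sectors written (stat column 7).
if [ -r /proc/diskstats ]; then
    awk '{ printf "%-12s read %.2f TiB, written %.2f TiB\n",
           $3, $6 * 512 / (1024 ^ 4), $10 * 512 / (1024 ^ 4) }' /proc/diskstats
fi
```

The /proc/diskstats counters reset at boot, so the totals only cover the current uptime.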

I am sure my Windows 10 machine with a fast AMD 5500 XT has similar problems as well. OPNG is probably writing far too many checkpoint files for a task that completes in 20 minutes.

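You can use a Linux tmpfs to make it work faster and reduce the number of writes (run as root):
# Stop BOINC before touching its data directory
service boinc-client stop
cd /var/lib/boinc-client
# Keep the old slots on disk and mount a RAM-backed directory in their place
mv slots slots_old
mkdir slots
mount -t tmpfs -o size=8G tmpfs slots
# Copy the existing slot contents into the tmpfs
cp -rp slots_old/* slots/
service boinc-client start
Edit: Also do: chown boinc:boinc /var/lib/boinc-client/slots so BOINC can make more slot folders when needed.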

The problem with tmpfs or a RAM drive is that I don't really want ARP1 tasks, with their super long runtimes of 30 hours, to lose all their checkpoints when the computer crashes, freezes, or loses power.
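
One possible way to limit that risk is to copy the tmpfs contents back to disk every so often, for example from cron. The sketch below is only an illustration: backup_slots is a made-up helper and the paths in the usage comment are assumptions. A copy taken while BOINC is running may catch a checkpoint mid-write, so this reduces the risk rather than removing it, and every backup is itself a write to the SSD.

```shell
#!/bin/sh
# Hypothetical helper: copy a tmpfs-backed slots directory to persistent storage.
# Copies into a temporary directory first and only then replaces the old backup,
# so an interrupted copy leaves the previous backup untouched.
backup_slots() {
    src=$1
    dst=$2
    tmp="$dst.tmp.$$"
    rm -rf "$tmp"
    cp -Rp "$src" "$tmp" || return 1
    rm -rf "$dst"
    mv "$tmp" "$dst"
}

# Example (paths are assumptions based on the commands above):
# backup_slots /var/lib/boinc-client/slots /var/lib/boinc-client/slots_backup
```

How often to run it is a trade-off between how much work you are willing to lose and how many extra writes you accept.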
----------------------------------------
[Edit 1 times, last edit by sam6861 at May 2, 2021 9:29:00 AM]
[May 2, 2021 9:12:37 AM]
maeax
Advanced Cruncher
Joined: May 2, 2007
Post Count: 142

I have set it to 1200 sec. (20 min.)
BOINC Manager - Preferences:
Request tasks to checkpoint at most every N seconds: This controls how often tasks save their state to disk, so they can be restarted later.
----------------------------------------
AMD Ryzen Threadripper PRO 3995WX 64-Cores/ AMD Radeon (TM) Pro W6600. OS Win11pro
[May 2, 2021 9:33:44 AM]
Crystal Pellet
Veteran Cruncher
Joined: May 21, 2008
Post Count: 1323

The problem is that these OPNG GPU tasks don't obey the BOINC preference 'write to disk' every ... seconds.
I have a slow GPU card, and one GPU task with ~70 jobs needs about 43 minutes of elapsed time and will write a checkpoint to disk at least 70 times.
You can imagine what's happening with the normal standard GPUs nowadays.
Those tasks will write to disk every few seconds.
[May 2, 2021 9:55:21 AM]
goben_2003
Advanced Cruncher
Joined: Jun 16, 2006
Post Count: 146

The problem is that these OPNG GPU tasks don't obey the BOINC preference 'write to disk' every ... seconds.
I have a slow GPU card, and one GPU task with ~70 jobs needs about 43 minutes of elapsed time and will write a checkpoint to disk at least 70 times.
You can imagine what's happening with the normal standard GPUs nowadays.
Those tasks will write to disk every few seconds.

I think the key thing here is that OPN* does respect the write to disk request within jobs. However, it writes the result of each job when it completes. For OPNG, each job normally takes less (Intel GPU) or way less (more powerful discrete GPU) time to complete than the minimum write to disk request time.
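
As a rough illustration of why the preference has so little effect, here is a small sketch of the average time between result writes. The helper secs_between_writes is made up, and it assumes one write per completed job and jobs of roughly equal length. The 5-minute figure for a faster card is a hypothetical value, not a measurement.

```shell
#!/bin/sh
# Hypothetical helper: average seconds between result writes, assuming one
# write per completed job and jobs of roughly equal length.
secs_between_writes() {
    awk -v minutes="$1" -v jobs="$2" 'BEGIN { printf "%.1f\n", minutes * 60 / jobs }'
}

secs_between_writes 43 70   # slow GPU from the quote: ~70 jobs in about 43 minutes
secs_between_writes 5 70    # hypothetical faster card: same 70 jobs in 5 minutes
```

Even the slow card writes about every 37 seconds on average, well below a typical checkpoint interval, and a faster card only shortens that.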
----------------------------------------

[May 2, 2021 10:28:48 AM]
maeax
Advanced Cruncher
Joined: May 2, 2007
Post Count: 142

init_data.xml shows <disk_interval>1200.000000</disk_interval>
BUT there are still a lot of writes to the SSD in Resource Monitor, OMG
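
For Linux users who want to do the same check, here is a minimal sketch that prints the disk_interval each running task was given. The helper get_disk_interval is made up for illustration, and the slots path is an assumption taken from the tmpfs commands earlier in the thread; on Windows the slots folder lives elsewhere.

```shell
#!/bin/sh
# Hypothetical helper: print the <disk_interval> value from a BOINC init_data.xml.
get_disk_interval() {
    sed -n 's:.*<disk_interval>\([^<]*\)</disk_interval>.*:\1:p' "$1"
}

# Each running task has its own init_data.xml in its slot directory.
# The path below is an assumption for a Debian-style install.
for f in /var/lib/boinc-client/slots/*/init_data.xml; do
    if [ -r "$f" ]; then
        echo "$f: $(get_disk_interval "$f")"
    fi
done
```

This only shows what the task was asked to do; it does not show how often the task actually writes.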
----------------------------------------
AMD Ryzen Threadripper PRO 3995WX 64-Cores/ AMD Radeon (TM) Pro W6600. OS Win11pro
[May 2, 2021 4:04:42 PM]
nyanthiss
Cruncher
Joined: Nov 23, 2012
Post Count: 15

I don't think there is a need to write out intermediate results within a single job of a WU (which is the case currently; AFAICT it only writes out the final result of each job).

OTOH, it's not enough to just write once at the end of WU. While there are GPUs which run the entire WU in maybe 2 minutes, there are GPUs which can take 1.5 hours per WU (and they still do useful work).
----------------------------------------
Intel Xeon E3-1231 v3
AMD A10 7800
AMD Ryzen 5 3500U
AMD Ryzen 1700X
AMD Ryzen 5900X
2x RaspberryPi, 1x Odroid
[May 3, 2021 1:30:17 PM]