World Community Grid Forums
Category: Active Research Forum: OpenPandemics - COVID-19 Project Thread: Theory Crafting OPN CPU vs GPU |
Thread Status: Active Total posts in this thread: 15
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
EDIT: After waiting a week for my PPD to stop fluctuating and finding a good benchmark program, I have better data with which to estimate the maximum performance gains I could get once OPN adds GPU tasks. The benchmark I chose is called FFTW (web site is fftw.org). It has been around for decades, and it is included in the Phoronix Test Suite, so it should be readily available to anyone no matter which OS you use.

I decided on FFTW for three simple reasons. First, it displays its results in MegaFLOPS, which makes it easy to combine the data with what I know about my video cards. Second, it runs two tests: the first is based on the CPU core's Floating-Point Unit, and the second combines the FPU and the CPU core's SSE2 instruction set into one result. This is important because I am about 90% sure OPN tasks for the PC use SSE2, so I will be using the second set of results below. The final reason I chose FFTW is that it can be small and fast, which means it gives a good estimate of the raw hardware's maximum performance. That's important because the numbers I have for my GPUs are all theoretical maximums.

According to FFTW my Ryzen 7 3700X CPUs do 15211 MFLOPS on each core (far higher than my initial estimates). My Xeon D1520 scores 7662 MFLOPS per core. So the combined floating-point power of my CPUs is about:
[15211 x 8 = 121688] + [15211 x 8 = 121688] + [7662 x 4 = 30648] = 274024 MFLOPS, or 274 GFLOPS.

Each of my three boxes has its own GPU, listed from slowest to fastest by FP32 FLOPS:
Nvidia GTX 750 Ti @ 1.389 TFLOPS
Nvidia GTX 970 @ 3.920 TFLOPS
Nvidia RTX 2070 @ 7.465 TFLOPS
______________________________
3 GPUs Total @ 12.774 TFLOPS

12774 / 274 = 46.62
Therefore the raw hardware FLOPS of my 3 GPUs is about 4662% of what my 3 CPUs can do.

On my CPUs the OPN software is running at (my current PPD 200,173.11 / 700) 285.96 GFLOPS. Assuming that the programming overhead remains about the same when OPN is ported to GPU, I could see an increase to 46.62 x 285.96 = about 13,331 GFLOPS, or about 9.33 million PPD. This is almost six times lower than my original estimate but still a very substantial increase.
[Edit 7 times, last edit by Former Member at Jul 11, 2020 7:32:19 AM]
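For anyone who wants to rerun the estimate with their own hardware, here is a minimal Python sketch of the same arithmetic. The benchmark scores, GPU ratings, and PPD are the figures quoted in the post; the 700 points per GFLOPS divisor is the poster's working assumption rather than an official constant, and the function names are made up for illustration. The sketch also inherits the post's assumption that GPU throughput scales with theoretical FP32 FLOPS.

```python
# FFTW (FPU + SSE2) scores per core, in MFLOPS, as quoted in the post.
CPU_MFLOPS_PER_CORE = {"Ryzen 7 3700X": 15211, "Xeon D1520": 7662}

# One entry per box: (CPU model, physical cores).
CPU_BOXES = [("Ryzen 7 3700X", 8), ("Ryzen 7 3700X", 8), ("Xeon D1520", 4)]

# Theoretical FP32 peak per GPU, in TFLOPS, as quoted in the post.
GPU_TFLOPS = {"GTX 750 Ti": 1.389, "GTX 970": 3.920, "RTX 2070": 7.465}

CURRENT_PPD = 200_173.11
POINTS_PER_GFLOPS = 700  # assumption taken from the post, not an official value


def cpu_total_gflops():
    """Sum per-core MFLOPS over every core in every box, converted to GFLOPS."""
    mflops = sum(CPU_MFLOPS_PER_CORE[model] * cores for model, cores in CPU_BOXES)
    return mflops / 1000.0


def gpu_total_gflops():
    """Sum the theoretical FP32 peaks of all GPUs, converted to GFLOPS."""
    return sum(GPU_TFLOPS.values()) * 1000.0


def gpu_to_cpu_ratio():
    """How many times more raw FLOPS the GPUs have than the CPUs."""
    return gpu_total_gflops() / cpu_total_gflops()


def effective_cpu_gflops():
    """GFLOPS implied by the current PPD under the 700-points assumption."""
    return CURRENT_PPD / POINTS_PER_GFLOPS


def projected_gpu_ppd():
    """Projected PPD if the software overhead stays the same on GPU."""
    return gpu_to_cpu_ratio() * effective_cpu_gflops() * POINTS_PER_GFLOPS


if __name__ == "__main__":
    print(f"CPU total:     {cpu_total_gflops():.3f} GFLOPS")
    print(f"GPU total:     {gpu_total_gflops():.0f} GFLOPS")
    print(f"GPU/CPU ratio: {gpu_to_cpu_ratio():.2f}x")
    print(f"Projected PPD: {projected_gpu_ppd():,.0f}")
```

The sketch keeps the unrounded CPU total instead of rounding to 274 GFLOPS, so its ratio can differ slightly in the second decimal place from a hand calculation.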
||
|
William Albert
Cruncher Joined: Apr 5, 2020 Post Count: 34 Status: Offline Project Badges: |
"According to cpdn.org's stats page my Ryzen 7 3700X CPUs do 7.08 GFLOPS each."

That figure is per logical processor, not per CPU. You need to multiply it by the number of processors you've assigned, and multiply it again by the percentage of CPU time you've allocated. GPUs still have the potential to be more powerful than CPUs, particularly if the application is able to leverage single-precision or half-precision math, but the difference isn't going to be as extreme as your calculation would suggest.
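The correction described above can be sketched in a few lines of Python. The function name and parameters are hypothetical. The example values assume all 16 logical processors of a Ryzen 7 3700X are assigned at 100% CPU time, which is an assumption for illustration and not something stated in the thread.

```python
def host_gflops(per_processor_gflops, processors_assigned, cpu_time_fraction):
    """Scale a per-logical-processor GFLOPS figure up to a whole host.

    cpu_time_fraction is the allocated share of CPU time, from 0.0 to 1.0.
    """
    if processors_assigned < 0:
        raise ValueError("processors_assigned must not be negative")
    if not 0.0 <= cpu_time_fraction <= 1.0:
        raise ValueError("cpu_time_fraction must be between 0.0 and 1.0")
    return per_processor_gflops * processors_assigned * cpu_time_fraction


if __name__ == "__main__":
    # 7.08 GFLOPS per logical processor is the figure quoted from the stats page.
    # 16 processors at 100% CPU time is an assumed configuration.
    print(f"{host_gflops(7.08, 16, 1.0):.2f} GFLOPS")
```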
||
|
hwierzbicki
Advanced Cruncher Joined: May 1, 2016 Post Count: 55 Status: Offline Project Badges: |
What William Albert said. On average, the 3700X seems capable of 89.20 GFLOPS.
||
|
Sgt.Joe
Ace Cruncher USA Joined: Jul 4, 2006 Post Count: 7219 Status: Offline Project Badges: |
"12774 / 17.16 = 744.405594406 Therefore the raw hardware FLOPS of my 3 GPUs is 744.405594406% of what my 3 CPUs can do."

Sorry to have to make a basic correction, but according to your arithmetic, you can do 744 times as much with the GPUs. If you want to use percent, you need to move the decimal point 2 places to the right: the figure is 74,441%. (I rounded off the number.)
Cheers
Sgt. Joe
*Minnesota Crunchers* |
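As a small illustration of the decimal-point correction, here is a hedged Python sketch. The function names are made up for this example. It separates two figures that are easy to mix up: a ratio expressed as "percent of" the baseline, and the same ratio expressed as "percent greater than" the baseline.

```python
def ratio_to_percent(ratio):
    """Express 'N times as much' as a percentage of the baseline."""
    return ratio * 100.0


def percent_increase(ratio):
    """Express 'N times as much' as a percentage above the baseline."""
    return (ratio - 1.0) * 100.0


if __name__ == "__main__":
    # Figures from the quoted calculation: 12774 GFLOPS of GPU vs 17.16 GFLOPS of CPU.
    ratio = 12774 / 17.16
    print(f"{ratio:.1f} times as much")
    print(f"{ratio_to_percent(ratio):,.0f}% of the CPU figure")
    print(f"{percent_increase(ratio):,.0f}% greater than the CPU figure")
```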
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
"According to cpdn.org's stats page my Ryzen 7 3700X CPUs do 7.08 GFLOPS each."
"That figure is per logical processor, not per CPU. You need to multiply it by the number of processors you've assigned, and multiply it again by the percentage of CPU time you've allocated. GPUs still have the potential to be more powerful than CPUs, particularly if the application is able to leverage single-precision or half-precision math, but the difference isn't going to be as extreme as your calculation would suggest."

Yes, there is something here that I missed. I am playing with the Phoronix Test Suite today, looking for a benchmark that can give me something close to raw hardware (maybe even with SSE2?) FLOPS per core. Although that will not scale 100% per SMT logical processor, as William Albert suggests, it will give a better estimate to base the theory crafting on. I don't want to use the Whetstone benchmark because it aims for "real world complexity" instead of raw FLOPS + SSE2. The reason I am interested in SSE2 is that I am almost positive OPN uses it in AutoDock.

"12774 / 17.16 = 744.405594406 Therefore the raw hardware FLOPS of my 3 GPUs is 744.405594406% of what my 3 CPUs can do."
"Sorry to have to make a basic correction, but according to your arithmetic, you can do 744 times as much with the GPUs. If you want to use percent, you need to move the decimal point 2 places to the right: the figure is 74,441%. (I rounded off the number.) Cheers"

On a side note, my PPD has nearly doubled overnight. Combine that with my search for better benchmarks and it means I'll have to recalculate most of my original post. I'll wait another day or three and see where my PPD is before going forward.
[Edit 2 times, last edit by Former Member at Jul 5, 2020 6:18:20 PM]
||
|
Aurum
Master Cruncher The Great Basin Joined: Dec 24, 2017 Post Count: 2370 Status: Offline Project Badges: |
"On average, the 3700X seems capable of 89.20 GFLOPS."

Thanks for that list. Be nice if it was sortable. There are some major errors:
Intel(R) Xeon(R) CPU E5-2698 v4 @ 2.20GHz [Family 6 Model 63 Stepping 0] 25 4.00 2.78 11.11
An E5-2698v4 is 20c40t and not much less powerful than an E5-2699v4. Seeing numerous other examples. Also, they list fractional Average cores/computer, so pretty much every one of their values is less than full strength. I guess one must only use the GFLOPS/core values.

Anyone know of a similar list for GPUs?

...KRI please cancel all shadow-banning
[Edit 2 times, last edit by Aurum420 at Jul 5, 2020 6:46:39 PM]
||
|
Jim1348
Veteran Cruncher USA Joined: Jul 13, 2009 Post Count: 1066 Status: Offline Project Badges: |
"Anyone know of a similar list for GPUs?"

This is what I use. The GFLOPS (both single and double precision) are listed for the later-generation cards at the bottom.
AMD: https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units
Nvidia: https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units
||
|
Sgt.Joe
Ace Cruncher USA Joined: Jul 4, 2006 Post Count: 7219 Status: Offline Project Badges: |
"On average, the 3700X seems capable of 89.20 GFLOPS."
"Thanks for that list. Be nice if it was sortable."

It is sortable with minimal work. Merely highlight from the column headers on down, copy, and then paste into a spreadsheet. Once in the spreadsheet it is easily sortable any which way you want.
Cheers
Sgt. Joe
*Minnesota Crunchers* |
||
|
WMCheerman
Cruncher Joined: Nov 20, 2009 Post Count: 13 Status: Offline Project Badges: |
I have always taken the number of points my CPU gets on WCG and divided it by 700 to get my GFLOPS. Is that not a correct way as well?
|
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
OK, I have reworked my original post to include new data.
|
||
|
|