formation:astrosim2017gpu4dummies

Differences

Below are the differences between two revisions of the page.

--- formation:astrosim2017gpu4dummies [2017/07/07 14:22]
equemene
+++ formation:astrosim2017gpu4dummies [2017/07/10 18:52] (current version)
equemene [NBody, a simplistic simulator]
@@ Line 392: / Line 392: @@
   * ''Pi_FP32_MWC_xPU_OpenCL_1_1_1_1_01000000_Device0_InMetro_titan.npz''
   * ''Pi_FP32_MWC_xPU_OpenCL_1_1_1_1_01000000_Device0_InMetro_titan''
+=== Exercise #7: explore ''PiXPU.py'' with several simple configurations for ''PR=1'' ===
+  * Without any parameters (using the defaults):
+    * Which device is selected? How many itops (iterative operations per second) do you reach?
+  * With only the device parameter, for example ''-d 1'' to select device ''#1'', for each of the available devices:
+    * What are the ratios between the devices? Which one is the most powerful?
+  * With the device selector, while increasing the number of iterations (''-i'') and the number of repetitions (''-r''):
+    * What happens to the itops values? What is the typical variability of the results?
+<code>/scratch/$USER/PiXPU.py</code>
+<code>
+/scratch/$USER/PiXPU.py -d 1
+/scratch/$USER/PiXPU.py -d 2
+/scratch/$USER/PiXPU.py -d 3
+</code>
+<code>
+/scratch/$USER/PiXPU.py -d 0 -i 100000000 -r 10
+/scratch/$USER/PiXPU.py -d 1 -i 100000000 -r 10
+/scratch/$USER/PiXPU.py -d 2 -i 100000000 -r 10
+/scratch/$USER/PiXPU.py -d 3 -i 100000000 -r 10
+</code>
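To make the itops metric and its variability concrete, here is a minimal pure-Python sketch. It is not the code of ''PiXPU.py'': it estimates Pi by Monte Carlo on the host CPU only, times several repetitions, and reports the mean rate and its relative spread. The function names ''monte_carlo_pi'' and ''measure_itops'' and the use of Python's ''random'' module are assumptions made for illustration.

```python
import random
import statistics
import time


def monte_carlo_pi(iterations, seed=0):
    """Estimate Pi by drawing points in the unit square and counting
    those that fall inside the quarter disc of radius 1."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(iterations):
        x = rng.random()
        y = rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / iterations


def measure_itops(iterations, redo):
    """Run the estimator `redo` times and return one rate
    (iterations per second) per run."""
    rates = []
    for run in range(redo):
        start = time.perf_counter()
        monte_carlo_pi(iterations, seed=run)
        elapsed = time.perf_counter() - start
        rates.append(iterations / elapsed)
    return rates


if __name__ == "__main__":
    rates = measure_itops(iterations=200_000, redo=5)
    mean = statistics.mean(rates)
    # Relative spread: sample standard deviation divided by the mean.
    spread = statistics.stdev(rates) / mean
    print(f"mean itops: {mean:.3e}, relative spread: {spread:.1%}")
```

Short runs are dominated by start-up overhead and timer noise, which is one reason to increase both the number of iterations and the number of repetitions before comparing devices.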
+=== Exercise #8: explore ''PiXPU.py'' by increasing the Parallel Rate ''PR'' ===
+  * With a ''PR'' from ''1'' to ''64'' set by ''-b'' and ''-e'', 1 billion iterations, 10 repetitions, on the default device:
+    * How does the elapsed time decrease as ''PR'' increases?
+  * With the device selector, while increasing the number of iterations and the number of repetitions:
+    * What happens to the itops values? What is the typical variability of the results?
+<code>./PiXPU.py -d 0 -b 1 -e 64 -i 1000000000 -r 10</code>
+In this case, we define a gnuplot config file as follows. Adapt it to your files and configuration.
+<code>
+set xlabel 'Parallel Rate'
+set ylabel 'Itops'
+plot 'Pi_FP32_MWC_xPU_OpenCL_1_64_1_1_1000000000_Device0_InMetro_titan' using 1:9 title 'CPU with OpenCL'
+</code>
+{{ :formation:pimc_1_64_cpu.png?600 |}}
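The columns that gnuplot reads with ''using 1:9'' can also be post-processed in Python. The sketch below assumes the results file is whitespace-separated text with the Parallel Rate in column 1 and the itops in column 9, as the plot command suggests; the actual file layout is not shown here, and the helper names are hypothetical.

```python
def read_pr_itops(path, pr_col=1, itops_col=9):
    """Return a list of (parallel_rate, itops) tuples read from a
    whitespace-separated text file; columns are 1-based, like gnuplot."""
    rows = []
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            # Skip blank lines and comment lines starting with '#'.
            if not fields or fields[0].startswith("#"):
                continue
            rows.append((int(float(fields[pr_col - 1])),
                         float(fields[itops_col - 1])))
    return rows


def best_parallel_rate(rows):
    """Return the (parallel_rate, itops) pair with the highest itops."""
    return max(rows, key=lambda row: row[1])


def speedups(rows):
    """Return {parallel_rate: itops / itops at the smallest PR}."""
    reference = min(rows, key=lambda row: row[0])[1]
    return {pr: itops / reference for pr, itops in rows}
```

Calling ''read_pr_itops'' on your own results file and passing the rows to ''speedups'' gives the scaling relative to the smallest ''PR'', which can be compared against the ideal linear speedup.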
+=== Exercise #9: explore ''PiXPU.py'' with large ''PR'' values on GPU (mostly powers of 2) ===
+  * Explore with ''PR'' from ''2048'' to ''32768'' with a step of 128
+  * For which ''PR'' is the itops value the highest on your device?
 To explore the GPU device (device #1) on this platform, with parallel rates from 2048 to 32768 in steps of 128 and 1000000000 iterations: <code>
@@ Line 401: / Line 448: @@
   * ''Pi_FP32_MWC_xPU_OpenCL_2048_32768_1_1_1000000000_Device1_InMetro_titan''
-In this case, you can define a gnuplot confi file
+In this case, you can define a gnuplot config file
 <code>
 set xlabel 'Parallel Rate'
@@ Line 410: / Line 457: @@
 {{ :formation:pimc_2048_32768_gtx1080ti.png?600 |}}
+=== Exercise #10: explore ''PiXPU.py'' around a large ''PR'' ===
+<code>./PiXPU.py -d 1 -b $((2048-8)) -e $((2048+8)) -i 10000000000 -r 10</code>
+  * ''Pi_FP32_MWC_xPU_OpenCL_2040_2056_1_1_10000000000_Device1_InMetro_titan''
+  * ''Pi_FP32_MWC_xPU_OpenCL_2040_2056_1_1_10000000000_Device1_InMetro_titan.npz''
+In this case, you can define a gnuplot config file
+<code>
+set xlabel 'Parallel Rate'
+set ylabel 'Itops'
+plot 'Pi_FP32_MWC_xPU_OpenCL_2040_2056_1_1_10000000000_Device1_InMetro_titan' using 1:9 title 'GTX 1080 Ti'
+</code>
+{{ :formation:pimc_2040_2056_gtx1080ti.png?600 |}}
 ==== NBody, a simplistic simulator ====
+The ''NBody.py'' code is an implementation of an N-body Keplerian system on OpenCL devices.
+It is available:
+  * as a file: ''/scratch/AstroSim2017/NBody.py'' on workstations
+  * on the website: [[http://www.cbp.ens-lyon.fr/emmanuel.quemener/documents/Astrosim2017/NBody.py|NBody.py]]
+Launch the code with ''N=2'' for ''1000'' iterations with graphical output:
+<code>
+python NBody.py -n 2 -g -i 1000
+</code>
+{{ :formation:nbody_n2_gpu.png?600 |}}
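For readers who want to see what an N-body step computes before running it on an OpenCL device, here is a minimal pure-Python sketch of a 2D gravitational system. It is not taken from ''NBody.py'': the kick-drift-kick leapfrog integrator, the unit gravitational constant, and all function names are assumptions chosen for illustration.

```python
import math


def accelerations(positions, masses, g=1.0):
    """Newtonian gravitational acceleration on each body (2D)."""
    n = len(positions)
    acc = [[0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dx = positions[j][0] - positions[i][0]
            dy = positions[j][1] - positions[i][1]
            r3 = (dx * dx + dy * dy) ** 1.5
            acc[i][0] += g * masses[j] * dx / r3
            acc[i][1] += g * masses[j] * dy / r3
    return acc


def leapfrog(positions, velocities, masses, dt, steps):
    """Advance the system in place with a kick-drift-kick leapfrog."""
    acc = accelerations(positions, masses)
    for _ in range(steps):
        # Half kick with the old accelerations, then full drift.
        for i in range(len(positions)):
            for k in range(2):
                velocities[i][k] += 0.5 * dt * acc[i][k]
                positions[i][k] += dt * velocities[i][k]
        # Recompute accelerations at the new positions, then half kick.
        acc = accelerations(positions, masses)
        for i in range(len(positions)):
            for k in range(2):
                velocities[i][k] += 0.5 * dt * acc[i][k]


def total_energy(positions, velocities, masses, g=1.0):
    """Kinetic plus pairwise potential energy of the system."""
    kinetic = sum(0.5 * m * (v[0] ** 2 + v[1] ** 2)
                  for m, v in zip(masses, velocities))
    potential = 0.0
    for i in range(len(masses)):
        for j in range(i + 1, len(masses)):
            dx = positions[j][0] - positions[i][0]
            dy = positions[j][1] - positions[i][1]
            potential -= g * masses[i] * masses[j] / math.hypot(dx, dy)
    return kinetic + potential
```

The double loop in ''accelerations'' costs on the order of N² operations per step, which is the part that benefits from running on a parallel device.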
+=== Exercise #11: explore ''NBody.py'' with different devices ===
+=== Exercise #12: explore ''NBody.py'' with steps and iterations ===
+=== Exercise #13: explore ''NBody.py'' with Double Precision ===
 ===== Exploration with production codes =====
 ==== PKDGRAV3 ====

formation/astrosim2017gpu4dummies.1499430149.txt.gz · Last modified: 2017/07/07 14:22 by equemene
