Below are the differences between two revisions of the page.
formation:astrosim2017gpu4dummies [2017/07/07 14:40] equemene |
formation:astrosim2017gpu4dummies [2017/07/10 18:52] (current version) equemene [NBody, a simplistic simulator] |
||
---|---|---|---|
Line 423: | Line 423: | ||
* Using the device selector, increase the number of iterations and the number of redos: | * Using the device selector, increase the number of iterations and the number of redos: | ||
* What happens to the itops values? What is the typical variability of the results? | * What happens to the itops values? What is the typical variability of the results? | ||
- | |||
<code>./PiXPU.py -d 0 -b 1 -e 32 -i 1000000000 -r 10</code> | <code>./PiXPU.py -d 0 -b 1 -e 32 -i 1000000000 -r 10</code> | ||
+ | In this case, we define a gnuplot config file as follows. Adapt it to your files and configuration. | ||
+ | <code> | ||
+ | set xlabel 'Parallel Rate' | ||
+ | set ylabel 'Itops' | ||
+ | plot 'Pi_FP32_MWC_xPU_OpenCL_1_64_1_1_1000000000_Device0_InMetro_titan' using 1:9 title 'CPU with OpenCL' | ||
+ | </code> | ||
+ | {{ :formation:pimc_1_64_cpu.png?600 |}} | ||
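The variability question above can be made quantitative with a coefficient of variation over repeated runs (the ''-r 10'' option asks for 10 redos). The sketch below is only an illustration: the function name and the sample values are invented, and how ''PiXPU.py'' itself aggregates its redos is not described here.

```python
from statistics import mean, stdev


def coefficient_of_variation(samples):
    """Return the sample standard deviation divided by the mean."""
    return stdev(samples) / mean(samples)


# Invented itops values standing in for repeated runs at one parallel rate.
itops_runs = [9.0e8, 1.0e9, 1.1e9]
cv = coefficient_of_variation(itops_runs)
print(f"relative variability: {cv:.1%}")
```

A value of a few percent would mean that differences of the same size between two parallel rates are not significant.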
+ | |||
+ | === Exercise #9: explore ''PiXPU.py'' with large PR on GPU (mostly powers of 2) === | ||
+ | |||
+ | * Explore with ''PR'' from ''2048'' to ''32768'' in steps of ''128'' | ||
+ | * For which ''PR'' is the itops the highest on your device? | ||
To explore on this platform the GPU device (device #1) from 2048 to 32768 as parallel rates with a step of 128 and 1000000000 iterations: <code> | To explore on this platform the GPU device (device #1) from 2048 to 32768 as parallel rates with a step of 128 and 1000000000 iterations: <code> | ||
Line 437: | Line 448: | ||
* ''Pi_FP32_MWC_xPU_OpenCL_2048_32768_1_1_1000000000_Device1_InMetro_titan'' | * ''Pi_FP32_MWC_xPU_OpenCL_2048_32768_1_1_1000000000_Device1_InMetro_titan'' | ||
- | In this case, you can define a gnuplot confi file | + | In this case, you can define a gnuplot config file |
<code> | <code> | ||
set xlabel 'Parallel Rate' | set xlabel 'Parallel Rate' | ||
Line 446: | Line 457: | ||
{{ :formation:pimc_2048_32768_gtx1080ti.png?600 |}} | {{ :formation:pimc_2048_32768_gtx1080ti.png?600 |}} | ||
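The parallel rate with the highest itops can also be read programmatically rather than from the plot. The sketch below assumes the results file is whitespace-separated text with the parallel rate in column 1 and the itops in column 9, as the ''using 1:9'' clause of the gnuplot config suggests. The function names and the sample rows are hypothetical.

```python
import io


def load_pr_itops(lines, pr_col=1, itops_col=9):
    """Parse whitespace-separated rows into a list of (pr, itops) pairs.

    Column numbers are 1-based, as in gnuplot's `using 1:9`.
    Blank lines and lines starting with '#' are skipped.
    """
    rows = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        pr = int(float(fields[pr_col - 1]))
        itops = float(fields[itops_col - 1])
        rows.append((pr, itops))
    return rows


def best_parallel_rate(rows):
    """Return the (pr, itops) pair with the highest itops."""
    return max(rows, key=lambda row: row[1])


# Synthetic stand-in for a results file (invented values, not measurements).
sample = io.StringIO(
    "2048 0 0 0 0 0 0 0 1.0e9\n"
    "2176 0 0 0 0 0 0 0 3.5e9\n"
    "2304 0 0 0 0 0 0 0 2.0e9\n"
)
rows = load_pr_itops(sample)
best_pr, best_itops = best_parallel_rate(rows)
print(best_pr, best_itops)
```

To use it on a real run, pass an open file object instead of the ''io.StringIO'' stand-in, and check the column layout of your own output first.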
+ | === Exercise #10: explore ''PiXPU.py'' around a large ''PR'' === | ||
+ | <code>./PiXPU.py -d 1 -b $((2048-8)) -e $((2048+8)) -i 10000000000 -r 10</code> | ||
+ | * ''Pi_FP32_MWC_xPU_OpenCL_2040_2056_1_1_10000000000_Device1_InMetro_titan'' | ||
+ | * ''Pi_FP32_MWC_xPU_OpenCL_2040_2056_1_1_10000000000_Device1_InMetro_titan.npz'' | ||
+ | |||
+ | In this case, you can define a gnuplot config file | ||
+ | <code> | ||
+ | set xlabel 'Parallel Rate' | ||
+ | set ylabel 'Itops' | ||
+ | plot 'Pi_FP32_MWC_xPU_OpenCL_2040_2056_1_1_10000000000_Device1_InMetro_titan' using 1:9 title 'GTX 1080 Ti' | ||
+ | </code> | ||
+ | |||
+ | {{ :formation:pimc_2040_2056_gtx1080ti.png?600 |}} | ||
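To judge whether one parallel rate stands out from its immediate neighbours, its itops can be compared with the mean of the surrounding values. This is a minimal sketch with a hypothetical helper name and invented numbers; it assumes the ''(PR, itops)'' pairs have already been extracted from the results file.

```python
from statistics import mean


def peak_vs_neighbours(itops_by_pr, center):
    """Return itops at `center` divided by the mean itops of all other PRs."""
    neighbours = [v for pr, v in itops_by_pr.items() if pr != center]
    return itops_by_pr[center] / mean(neighbours)


# Invented values for illustration only, not measurements.
demo = {2046: 2.0, 2047: 2.0, 2048: 4.0, 2049: 2.0, 2050: 2.0}
ratio = peak_vs_neighbours(demo, 2048)
print(ratio)
```

A ratio well above 1 indicates that the central ''PR'' is markedly faster than the values around it.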
==== NBody, a simplistic simulator ==== | ==== NBody, a simplistic simulator ==== | ||
+ | The ''NBody.py'' code is an implementation of an N-body Keplerian system on OpenCL devices. | ||
+ | |||
+ | It is available: | ||
+ | * as a file: ''/scratch/AstroSim2017/NBody.py'' on the workstations | ||
+ | * on the website: [[http://www.cbp.ens-lyon.fr/emmanuel.quemener/documents/Astrosim2017/NBody.py|NBody.py]] | ||
+ | |||
+ | Launch the code with ''N=2'' for ''1000'' iterations and graphical output: | ||
+ | <code> | ||
+ | python NBody.py -n 2 -g -i 1000 | ||
+ | </code> | ||
+ | |||
+ | {{ :formation:nbody_n2_gpu.png?600 |}} | ||
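For readers who want to see the kind of computation such a simulator performs, here is a small pure-Python sketch of a two-body system. It is not taken from ''NBody.py'': the leapfrog integrator, the units (''G = 1''), the time step and all function names are assumptions chosen for illustration, and the real code runs its kernels on OpenCL devices.

```python
import math

G = 1.0  # gravitational constant in arbitrary units (assumption)


def accelerations(pos, mass):
    """Newtonian acceleration on each body due to all the others (2D)."""
    n = len(pos)
    acc = [[0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dx = pos[j][0] - pos[i][0]
            dy = pos[j][1] - pos[i][1]
            r3 = (dx * dx + dy * dy) ** 1.5
            acc[i][0] += G * mass[j] * dx / r3
            acc[i][1] += G * mass[j] * dy / r3
    return acc


def leapfrog_step(pos, vel, mass, dt):
    """Advance the system in place by dt (half kick, drift, half kick)."""
    acc = accelerations(pos, mass)
    for i in range(len(pos)):
        vel[i][0] += 0.5 * dt * acc[i][0]
        vel[i][1] += 0.5 * dt * acc[i][1]
        pos[i][0] += dt * vel[i][0]
        pos[i][1] += dt * vel[i][1]
    acc = accelerations(pos, mass)
    for i in range(len(pos)):
        vel[i][0] += 0.5 * dt * acc[i][0]
        vel[i][1] += 0.5 * dt * acc[i][1]


# Two equal masses on a circular orbit around their common centre of mass.
mass = [1.0, 1.0]
pos = [[-0.5, 0.0], [0.5, 0.0]]
v = math.sqrt(0.5)  # circular-orbit speed for this separation and these masses
vel = [[0.0, -v], [0.0, v]]

for _ in range(1000):
    leapfrog_step(pos, vel, mass, dt=0.001)

separation = math.hypot(pos[1][0] - pos[0][0], pos[1][1] - pos[0][1])
print(separation)
```

On a circular orbit the separation should stay close to its initial value, which gives a simple sanity check when varying the time step or the number of iterations.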
+ | |||
+ | |||
+ | === Exercise #11: explore ''NBody.py'' with different devices === | ||
+ | |||
+ | === Exercise #12: explore ''NBody.py'' with steps and iterations === | ||
+ | |||
+ | === Exercise #13: explore ''NBody.py'' with Double Precision === | ||
===== Exploration with production codes ===== | ===== Exploration with production codes ===== | ||
==== PKDGRAV3 ==== | ==== PKDGRAV3 ==== | ||
+ | |||