Differences

Below are the differences between two revisions of the page.

formation:astrosim2017gpu4dummies [2017/07/07 14:22]
equemene
formation:astrosim2017gpu4dummies [2017/07/10 18:52] (current version)
equemene [NBody, a simplistic simulator]
Line 392: Line 392:
   * ''​Pi_FP32_MWC_xPU_OpenCL_1_1_1_1_01000000_Device0_InMetro_titan.npz''​   * ''​Pi_FP32_MWC_xPU_OpenCL_1_1_1_1_01000000_Device0_InMetro_titan.npz''​
   * ''​Pi_FP32_MWC_xPU_OpenCL_1_1_1_1_01000000_Device0_InMetro_titan''​   * ''​Pi_FP32_MWC_xPU_OpenCL_1_1_1_1_01000000_Device0_InMetro_titan''​
 +
 +=== Exercise #7: explore ''PiXPU.py'' with several simple configurations for ''PR=1'' ===
 +
 +  * Without any parameters (the defaults):
 +    * Which device is selected? How many itops (iterative operations per second) do you reach?
 +  * With only the device parameter (for example ''-d 1'' to select device ''#1''), for each of the available devices:
 +    * What are the performance ratios between the devices? Which one is the most powerful?
 +  * With the device selector, increasing the number of iterations and the number of redos:
 +    * What happens to the itops values? What is the typical variability of the results?
 +
 +<​code>/​scratch/​$USER/​PiXPU.py</​code>​
 +
 +<​code>​
 +/​scratch/​$USER/​PiXPU.py -d 1
 +/​scratch/​$USER/​PiXPU.py -d 2
 +/​scratch/​$USER/​PiXPU.py -d 3
 +</​code>​
 +
 +<​code>​
 +/​scratch/​$USER/​PiXPU.py -d 0 -i 100000000 -r 10
 +/​scratch/​$USER/​PiXPU.py -d 1 -i 100000000 -r 10
 +/​scratch/​$USER/​PiXPU.py -d 2 -i 100000000 -r 10
 +/​scratch/​$USER/​PiXPU.py -d 3 -i 100000000 -r 10
 +</​code>​
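
To make the notion of itops concrete, here is a minimal pure-Python sketch. It assumes that ''PiXPU.py'' estimates Pi by Monte Carlo sampling, which this section does not state explicitly. The names ''estimate_pi'', ''measure_itops'' and ''relative_spread'' are hypothetical and are not taken from ''PiXPU.py''. The sketch runs on the CPU only, so its itops values are not comparable with the OpenCL ones.

```python
import random
import statistics
import time


def estimate_pi(iterations, seed=0):
    """Monte Carlo estimate of Pi: share of random points falling inside the unit quarter disc."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(iterations):
        x = rng.random()
        y = rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / iterations


def measure_itops(iterations, redo):
    """Run the estimate `redo` times and return one itops value per run."""
    rates = []
    for run in range(redo):
        start = time.perf_counter()
        estimate_pi(iterations, seed=run)
        elapsed = time.perf_counter() - start
        rates.append(iterations / elapsed)  # iterations per second
    return rates


def relative_spread(values):
    """Sample standard deviation divided by the mean (coefficient of variation)."""
    return statistics.stdev(values) / statistics.mean(values)


if __name__ == "__main__":
    rates = measure_itops(iterations=200_000, redo=5)
    print("median itops:", statistics.median(rates))
    print("relative spread:", relative_spread(rates))
```

Raising ''iterations'' and ''redo'' in this sketch mirrors the third bullet of the exercise: longer runs usually reduce the relative spread, since start-up costs weigh less.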
 +
 +=== Exercise #8: explore ''PiXPU.py'' by increasing the Parallel Rate ''PR'' ===
 +
 +  * With a PR from ''1'' to ''64'' set by ''-b'' and ''-e'', 1 billion iterations, 10 redos, on the default device:
 +    * How does the elapsed time decrease?
 +  * With the device selector, increasing the number of iterations and the number of redos:
 +    * What happens to the itops values? What is the typical variability of the results?
 +
 +<code>./PiXPU.py -d 0 -b 1 -e 64 -i 1000000000 -r 10</code>
 +
 +In this case, we define a gnuplot config file as follows. Adapt it to your files and configuration.
 +<​code>​
 +set xlabel '​Parallel Rate'
 +set ylabel '​Itops'​
 +plot '​Pi_FP32_MWC_xPU_OpenCL_1_64_1_1_1000000000_Device0_InMetro_titan'​ using 1:9 title 'CPU with OpenCL'​
 +</​code>​
 +
 +{{ :​formation:​pimc_1_64_cpu.png?​600 |}}
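
The gnuplot clause ''using 1:9'' plots column 9 against column 1. A rough Python equivalent is sketched below, assuming the result file is plain whitespace-separated text. The helpers ''read_columns'' and ''relative_gain'' are hypothetical, and the sample rows are placeholders rather than measurements; the meaning of columns 2 to 8 is not assumed here.

```python
# Placeholder rows with 9 columns, only to exercise the parser (not measurements).
SAMPLE = """\
# PR in column 1, itops in column 9
1 0 0 0 0 0 0 0 100.0
2 0 0 0 0 0 0 0 190.0
"""


def read_columns(text, x_col=1, y_col=9):
    """Return (x, y) pairs using 1-based column numbers, like gnuplot's `using`."""
    pairs = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and '#' comment lines
        fields = line.split()
        pairs.append((float(fields[x_col - 1]), float(fields[y_col - 1])))
    return pairs


def relative_gain(pairs):
    """Divide every y value by the y value of the first row."""
    first_y = pairs[0][1]
    return [(x, y / first_y) for x, y in pairs]


if __name__ == "__main__":
    for rate, gain in relative_gain(read_columns(SAMPLE)):
        print(rate, gain)
```

To use it on a real result file, read the file content with ''open(path).read()'' and pass it to ''read_columns''. A gain that grows more slowly than ''PR'' indicates that the elapsed time no longer shrinks proportionally.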
 +
 +=== Exercise #9: explore ''PiXPU.py'' with large PR on GPU (mostly powers of 2) ===
 +
 +  * Explore with ''PR'' from ''2048'' to ''32768'' in steps of 128
 +  * For which ''PR'' is the itops value the highest on your device?
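
The sweep described above can be sketched in Python as follows, assuming both bounds are included. The helpers ''parallel_rates'', ''is_power_of_two'' and ''best_rate'' are hypothetical names, and the ''(PR, itops)'' pairs have to come from your own result file.

```python
def parallel_rates(begin=2048, end=32768, step=128):
    """List the PR values of the sweep, both bounds included (an assumption)."""
    return list(range(begin, end + 1, step))


def is_power_of_two(n):
    """True when n has exactly one bit set."""
    return n > 0 and n & (n - 1) == 0


def best_rate(pairs):
    """Return the (PR, itops) pair with the highest itops."""
    return max(pairs, key=lambda pair: pair[1])


if __name__ == "__main__":
    rates = parallel_rates()
    print(len(rates), "PR values in the sweep")
    print("powers of 2:", [r for r in rates if is_power_of_two(r)])
```

Checking whether the best ''PR'' is a power of 2 is a quick way to test the hint given in the exercise title on your own device.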
  
 To explore on this platform the GPU device (device #1) from 2048 to 32768 as parallel rates with a step of 128 and 1000000000 iterations: <​code>​ To explore on this platform the GPU device (device #1) from 2048 to 32768 as parallel rates with a step of 128 and 1000000000 iterations: <​code>​
Line 401: Line 448:
   * ''​Pi_FP32_MWC_xPU_OpenCL_2048_32768_1_1_1000000000_Device1_InMetro_titan''​   * ''​Pi_FP32_MWC_xPU_OpenCL_2048_32768_1_1_1000000000_Device1_InMetro_titan''​
  
-In this case, you can define a gnuplot ​confi file+In this case, you can define a gnuplot ​config ​file
 <​code>​ <​code>​
 set xlabel '​Parallel Rate' set xlabel '​Parallel Rate'
Line 410: Line 457:
 {{ :​formation:​pimc_2048_32768_gtx1080ti.png?​600 |}} {{ :​formation:​pimc_2048_32768_gtx1080ti.png?​600 |}}
  
 +=== Exercise #10: explore ''PiXPU.py'' around a large ''PR'' ===
 +
 +<​code>​./​PiXPU.py -d 1 -b $((2048-8)) -e $((2048+8)) -i 10000000000 -r 10</​code>​
 +
 +  * ''​Pi_FP32_MWC_xPU_OpenCL_2040_2056_1_1_10000000000_Device1_InMetro_titan''​
 +  * ''​Pi_FP32_MWC_xPU_OpenCL_2040_2056_1_1_10000000000_Device1_InMetro_titan.npz''​
 +
 +In this case, you can define a gnuplot config file
 +<​code>​
 +set xlabel '​Parallel Rate'
 +set ylabel '​Itops'​
 +plot '​Pi_FP32_MWC_xPU_OpenCL_2040_2056_1_1_10000000000_Device1_InMetro_titan'​ using 1:9 title 'GTX 1080 Ti'
 +</​code>​
 +
 +{{ :​formation:​pimc_2040_2056_gtx1080ti.png?​600 |}}
 ==== NBody, a simplistic simulator ==== ==== NBody, a simplistic simulator ====
  
 +The ''NBody.py'' code is an implementation of an N-body Keplerian system on OpenCL devices.
 +
 +It is available:
 +  * as a file: ''/scratch/AstroSim2017/NBody.py'' on the workstations
 +  * on website: [[http://​www.cbp.ens-lyon.fr/​emmanuel.quemener/​documents/​Astrosim2017/​NBody.py|NBody.py]]
 +
 +Launch the code with ''N=2'' for ''1000'' iterations with graphical output:
 +<​code>​
 +python NBody.py -n 2 -g -i 1000 
 +</​code>​
 +
 +{{ :​formation:​nbody_n2_gpu.png?​600 |}}
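
For readers who want to see what such a simulation computes, here is a minimal pure-Python sketch of a gravitational N-body step. It is not the code of ''NBody.py'': it uses a velocity Verlet integrator chosen for illustration, sets the gravitational constant to 1, runs on the CPU, and all function names are hypothetical. The integration scheme actually used by ''NBody.py'' may differ.

```python
import math


def accelerations(positions, masses, g=1.0):
    """Newtonian gravitational acceleration on each body from all the others."""
    n = len(positions)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dist = math.dist(positions[i], positions[j])
            for k in range(3):
                delta = positions[j][k] - positions[i][k]
                acc[i][k] += g * masses[j] * delta / dist**3
    return acc


def step(positions, velocities, masses, dt):
    """Advance the system in place by one velocity Verlet step."""
    acc = accelerations(positions, masses)
    for i in range(len(positions)):
        for k in range(3):
            velocities[i][k] += 0.5 * dt * acc[i][k]  # first half kick
            positions[i][k] += dt * velocities[i][k]  # drift
    acc = accelerations(positions, masses)
    for i in range(len(positions)):
        for k in range(3):
            velocities[i][k] += 0.5 * dt * acc[i][k]  # second half kick


def total_energy(positions, velocities, masses, g=1.0):
    """Kinetic plus potential energy, useful as a sanity check."""
    kinetic = sum(0.5 * m * sum(v * v for v in vel)
                  for m, vel in zip(masses, velocities))
    potential = 0.0
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            potential -= g * masses[i] * masses[j] / math.dist(positions[i], positions[j])
    return kinetic + potential


if __name__ == "__main__":
    # Two equal masses on a circular orbit around their centre of mass.
    masses = [1.0, 1.0]
    positions = [[-0.5, 0.0, 0.0], [0.5, 0.0, 0.0]]
    speed = math.sqrt(0.5)
    velocities = [[0.0, -speed, 0.0], [0.0, speed, 0.0]]
    before = total_energy(positions, velocities, masses)
    for _ in range(1000):
        step(positions, velocities, masses, dt=0.01)
    print("energy drift:", total_energy(positions, velocities, masses) - before)
```

The double loop in ''accelerations'' costs on the order of N squared operations per step, which is the kind of workload that benefits from running on OpenCL devices.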
 +
 +
 +=== Exercise #11: explore ''NBody.py'' with different devices ===
 +
 +=== Exercise #12: explore ''NBody.py'' with steps and iterations ===
 +
 +=== Exercise #13: explore ''NBody.py'' with Double Precision ===
  
 ===== Exploration with production codes ===== ===== Exploration with production codes =====
  
 ==== PKDGRAV3 ==== ==== PKDGRAV3 ====
 +
  
formation/astrosim2017gpu4dummies.1499430149.txt.gz · Last modified: 2017/07/07 14:22 by equemene