Below are the differences between two revisions of the page.
developpement:activites:integration:xeonphi4wheezy [2014/04/24 08:28] equemene [Installing the MPSS packages] |
developpement:activites:integration:xeonphi4wheezy [2015/01/07 10:04] (current version) |
||
---|---|---|---|
Line 221: | Line 221: | ||
</code> | </code> | ||
- | Run the following command: ''micctrl --initdefaults'', which produces the following output | + | Run the following command: ''micctrl \-\-initdefaults'', which produces the following output |
<code> | <code> | ||
[Warning] mic0: Generating compat network config file. This will be removed in the 3.2 release | [Warning] mic0: Generating compat network config file. This will be removed in the 3.2 release | ||
Line 287: | Line 287: | ||
Check the number of processors with ''cat /proc/cpuinfo | grep ^processor | wc -l'', which returns ''244''. | Check the number of processors with ''cat /proc/cpuinfo | grep ^processor | wc -l'', which returns ''244''. | ||
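The same check can be scripted. The sketch below is only an illustration: it assumes text in the usual ''/proc/cpuinfo'' layout, and the function name `count_processors` and the sample string are hypothetical, not part of the original page.

```python
import re


def count_processors(cpuinfo_text):
    """Count lines starting with 'processor', like `grep ^processor | wc -l`."""
    return len(re.findall(r"^processor", cpuinfo_text, flags=re.MULTILINE))


# Hypothetical three-entry sample in /proc/cpuinfo layout, for illustration only.
sample = "processor\t: 0\nmodel name\t: x\n\nprocessor\t: 1\n\nprocessor\t: 2\n"
print(count_processors(sample))
```

On the card itself, passing the contents of ''/proc/cpuinfo'' to this function should give the same number as the shell pipeline above.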
+ | ==== Installing the OpenCL component ==== | ||
- | ===== History ===== | + | === Installing the standard OpenCL components === |
+ | <code> | ||
+ | apt-get install amd-clinfo amd-libopencl1 amd-opencl-icd | ||
+ | </code> | ||
- | ==== A rough start and a system impossible to get to grips with ==== | + | === Checking that OpenCL works with clinfo === |
- | At the end of October 2013, Intel made a Xeon Phi 7120P available to Emmanuel Quémener of the Centre Blaise Pascal. | + | <code> |
+ | Number of platforms: 1 | ||
+ | Platform Profile: FULL_PROFILE | ||
+ | Platform Version: OpenCL 1.2 AMD-APP (938.2) | ||
+ | Platform Name: AMD Accelerated Parallel Processing | ||
+ | Platform Vendor: Advanced Micro Devices, Inc. | ||
+ | Platform Extensions: cl_khr_icd cl_amd_event_callback cl_amd_offline_devices | ||
- | The Intel Xeon Phi 7120P is a PCI Express card about the size of an Nvidia GTX Titan or Nvidia Tesla card. It requires two 6-pin power connectors, two PCI slots, a PCI-Express 16x port, and a chassis long enough to hold it. | + | Platform Name: AMD Accelerated Parallel Processing |
+ | Number of devices: 1 | ||
+ | Device Type: CL_DEVICE_TYPE_CPU | ||
+ | Device ID: 4098 | ||
+ | Board name: | ||
+ | Max compute units: 32 | ||
+ | Max work items dimensions: 3 | ||
+ | Max work items[0]: 1024 | ||
+ | Max work items[1]: 1024 | ||
+ | Max work items[2]: 1024 | ||
+ | Max work group size: 1024 | ||
+ | Preferred vector width char: 16 | ||
+ | Preferred vector width short: 8 | ||
+ | Preferred vector width int: 4 | ||
+ | Preferred vector width long: 2 | ||
+ | Preferred vector width float: 4 | ||
+ | Preferred vector width double: 0 | ||
+ | Native vector width char: 16 | ||
+ | Native vector width short: 8 | ||
+ | Native vector width int: 4 | ||
+ | Native vector width long: 2 | ||
+ | Native vector width float: 4 | ||
+ | Native vector width double: 0 | ||
+ | Max clock frequency: 1200Mhz | ||
+ | Address bits: 64 | ||
+ | Max memory allocation: 16898804736 | ||
+ | Image support: Yes | ||
+ | Max number of images read arguments: 128 | ||
+ | Max number of images write arguments: 8 | ||
+ | Max image 2D width: 8192 | ||
+ | Max image 2D height: 8192 | ||
+ | Max image 3D width: 2048 | ||
+ | Max image 3D height: 2048 | ||
+ | Max image 3D depth: 2048 | ||
+ | Max samplers within kernel: 16 | ||
+ | Max size of kernel argument: 4096 | ||
+ | Alignment (bits) of base address: 1024 | ||
+ | Minimum alignment (bytes) for any datatype: 128 | ||
+ | Single precision floating point capability | ||
+ | Denorms: Yes | ||
+ | Quiet NaNs: Yes | ||
+ | Round to nearest even: Yes | ||
+ | Round to zero: Yes | ||
+ | Round to +ve and infinity: Yes | ||
+ | IEEE754-2008 fused multiply-add: Yes | ||
+ | Cache type: Read/Write | ||
+ | Cache line size: 64 | ||
+ | Cache size: 32768 | ||
+ | Global memory size: 67595218944 | ||
+ | Constant buffer size: 65536 | ||
+ | Max number of constant args: 8 | ||
+ | Local memory type: Global | ||
+ | Local memory size: 32768 | ||
+ | Kernel Preferred work group size multiple: 1 | ||
+ | Error correction support: 0 | ||
+ | Unified memory for Host and Device: 1 | ||
+ | Profiling timer resolution: 1 | ||
+ | Device endianess: Little | ||
+ | Available: Yes | ||
+ | Compiler available: Yes | ||
+ | Execution capabilities: | ||
+ | Execute OpenCL kernels: Yes | ||
+ | Execute native function: Yes | ||
+ | Queue properties: | ||
+ | Out-of-Order: No | ||
+ | Profiling : Yes | ||
+ | Platform ID: 0x7f926ed1bce0 | ||
+ | Name: Intel(R) Xeon(R) CPU E5-2680 0 @ 2.70GHz | ||
+ | Vendor: GenuineIntel | ||
+ | Device OpenCL C version: OpenCL C 1.2 | ||
+ | Driver version: 2.0 (sse2,avx) | ||
+ | Profile: FULL_PROFILE | ||
+ | Version: OpenCL 1.2 AMD-APP (938.2) | ||
+ | Extensions: cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_device_fission cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_popcnt | ||
+ | </code> | ||
+ | |||
+ | Only the CPUs are detected. | ||
+ | |||
+ | === Installing the Python OpenCL components === | ||
+ | |||
+ | <code> | ||
+ | apt-get install python-pyopencl python-pyopencl-doc | ||
+ | </code> | ||
+ | |||
+ | === Checking that it works === | ||
+ | |||
+ | <code> | ||
+ | python /usr/share/doc/python-pyopencl-doc/examples/benchmark-all.py | ||
+ | </code> | ||
+ | |||
+ | <code> | ||
+ | ('Execution time of test without OpenCL: ', 7.673499822616577, 's') | ||
+ | =============================================================== | ||
+ | ('Platform name:', 'AMD Accelerated Parallel Processing') | ||
+ | ('Platform profile:', 'FULL_PROFILE') | ||
+ | ('Platform vendor:', 'Advanced Micro Devices, Inc.') | ||
+ | ('Platform version:', 'OpenCL 1.2 AMD-APP (938.2)') | ||
+ | --------------------------------------------------------------- | ||
+ | ('Device name:', 'Intel(R) Xeon(R) CPU E5-2680 0 @ 2.70GHz') | ||
+ | ('Device type:', 'CPU') | ||
+ | ('Device memory: ', 64463, 'MB') | ||
+ | ('Device max clock speed:', 1200, 'MHz') | ||
+ | ('Device compute units:', 32) | ||
+ | Execution time of test: 0.00112901 s | ||
+ | Results OK | ||
+ | root@grizzly:~# ^C | ||
+ | root@grizzly:~# python /usr/share/doc/python-pyopencl-doc/examples/benchmark-all.py | ||
+ | ('Execution time of test without OpenCL: ', 7.43899393081665, 's') | ||
+ | =============================================================== | ||
+ | ('Platform name:', 'AMD Accelerated Parallel Processing') | ||
+ | ('Platform profile:', 'FULL_PROFILE') | ||
+ | ('Platform vendor:', 'Advanced Micro Devices, Inc.') | ||
+ | ('Platform version:', 'OpenCL 1.2 AMD-APP (938.2)') | ||
+ | --------------------------------------------------------------- | ||
+ | ('Device name:', 'Intel(R) Xeon(R) CPU E5-2680 0 @ 2.70GHz') | ||
+ | ('Device type:', 'CPU') | ||
+ | ('Device memory: ', 64463, 'MB') | ||
+ | ('Device max clock speed:', 1200, 'MHz') | ||
+ | ('Device compute units:', 32) | ||
+ | Execution time of test: 0.000632971 s | ||
+ | Results OK | ||
+ | </code> | ||
+ | |||
+ | Intel publishes its own [[http://software.intel.com/en-us/vcsource/tools/opencl-sdk-xe|OpenCL releases]]. | ||
+ | |||
+ | The latest version at the time of writing is 3.2.1.16712. | ||
+ | |||
+ | ==== Downloading the latest version ==== | ||
+ | |||
+ | <code> | ||
+ | cd /root | ||
+ | wget http://registrationcenter.intel.com/irc_nas/3809/intel_sdk_for_ocl_applications_xe_2013_r3_sdk_3.2.1.16712_x64.tgz | ||
+ | |||
+ | tar xzf intel_sdk_for_ocl_applications_xe_2013_r3_sdk_3.2.1.16712_x64.tgz | ||
+ | cd /root/intel_sdk_for_ocl_applications_xe_2013_r3_sdk_3.2.1.16712_x64 | ||
+ | </code> | ||
+ | |||
+ | ==== Converting the RPMs to DEBs ==== | ||
+ | |||
+ | <code> | ||
+ | ls *.rpm | xargs -I '{}' alien --scripts '{}' | ||
+ | </code> | ||
+ | |||
+ | <code> | ||
+ | opencl-1.2-base_3.2.1.16712-2_amd64.deb generated | ||
+ | opencl-1.2-devel_3.2.1.16712-2_amd64.deb generated | ||
+ | opencl-1.2-intel-cpu_3.2.1.16712-2_amd64.deb generated | ||
+ | opencl-1.2-intel-devel_3.2.1.16712-2_amd64.deb generated | ||
+ | </code> | ||
+ | |||
+ | ==== Installing the packages ==== | ||
+ | |||
+ | <code> | ||
+ | dpkg -i *.deb | ||
+ | </code> | ||
+ | |||
+ | A link is created between ''/opt/intel/opencl-1.2-3.2.1.16712'' and ''/etc/alternatives/opencl-intel-runtime''. | ||
+ | |||
+ | <code> | ||
+ | echo /etc/alternatives/opencl-intel-runtime/lib64 >> /etc/ld.so.conf.d/mic.conf | ||
+ | echo /etc/alternatives/opencl-intel-runtime/libmic >> /etc/ld.so.conf.d/mic.conf | ||
+ | ldconfig | ||
+ | |||
+ | clinfo | grep "Device Type" | ||
+ | </code> | ||
+ | |||
+ | It is no longer necessary to set up the links for the ICD. | ||
+ | |||
+ | Three devices are detected: | ||
+ | <code> | ||
+ | Device Type: CL_DEVICE_TYPE_CPU | ||
+ | Device Type: CL_DEVICE_TYPE_CPU | ||
+ | Device Type: CL_DEVICE_TYPE_ACCELERATOR | ||
+ | </code> | ||
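The device list can also be tallied by a script instead of being read by eye. The sketch below is a hedged illustration: it assumes `clinfo` prints one `Device Type:` line per device with the standard OpenCL constant names, and the function name `count_device_types` and the sample text are hypothetical.

```python
from collections import Counter


def count_device_types(clinfo_text):
    """Tally the values of the 'Device Type:' lines found in clinfo output."""
    counts = Counter()
    for line in clinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Device Type":
            counts[value.strip()] += 1
    return counts


# Illustrative sample modelled on the three lines listed above.
sample = """\
  Device Type:    CL_DEVICE_TYPE_CPU
  Device Type:    CL_DEVICE_TYPE_CPU
  Device Type:    CL_DEVICE_TYPE_ACCELERATOR
"""
print(count_device_types(sample))
```

In practice the text would come from capturing the output of `clinfo`, for example with `subprocess.check_output(["clinfo"])` decoded to a string.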
+ | |||
+ | Running the Python OpenCL test ''python /usr/share/doc/python-pyopencl-doc/examples/benchmark-all.py'':<code> | ||
+ | ('Execution time of test without OpenCL: ', 7.768199920654297, 's') | ||
+ | =============================================================== | ||
+ | ('Platform name:', 'AMD Accelerated Parallel Processing') | ||
+ | ('Platform profile:', 'FULL_PROFILE') | ||
+ | ('Platform vendor:', 'Advanced Micro Devices, Inc.') | ||
+ | ('Platform version:', 'OpenCL 1.2 AMD-APP (938.2)') | ||
+ | --------------------------------------------------------------- | ||
+ | ('Device name:', 'Intel(R) Xeon(R) CPU E5-2680 0 @ 2.70GHz') | ||
+ | ('Device type:', 'CPU') | ||
+ | ('Device memory: ', 64463, 'MB') | ||
+ | ('Device max clock speed:', 1200, 'MHz') | ||
+ | ('Device compute units:', 32) | ||
+ | Execution time of test: 0.000990689 s | ||
+ | Results OK | ||
+ | =============================================================== | ||
+ | ('Platform name:', 'Intel(R) OpenCL') | ||
+ | ('Platform profile:', 'FULL_PROFILE') | ||
+ | ('Platform vendor:', 'Intel(R) Corporation') | ||
+ | ('Platform version:', 'OpenCL 1.2 LINUX') | ||
+ | --------------------------------------------------------------- | ||
+ | ('Device name:', ' Intel(R) Xeon(R) CPU E5-2680 0 @ 2.70GHz') | ||
+ | ('Device type:', 'CPU') | ||
+ | ('Device memory: ', 64463, 'MB') | ||
+ | ('Device max clock speed:', 2700, 'MHz') | ||
+ | ('Device compute units:', 32) | ||
+ | /usr/lib/python2.7/dist-packages/pyopencl/__init__.py:36: CompilerWarning: Non-empty compiler output encountered. Set the environment variable PYOPENCL_COMPILER_OUTPUT=1 to see more. | ||
+ | "to see more.", CompilerWarning) | ||
+ | Execution time of test: 0.00065297 s | ||
+ | Results OK | ||
+ | =============================================================== | ||
+ | ('Platform name:', 'Intel(R) OpenCL') | ||
+ | ('Platform profile:', 'FULL_PROFILE') | ||
+ | ('Platform vendor:', 'Intel(R) Corporation') | ||
+ | ('Platform version:', 'OpenCL 1.2 LINUX') | ||
+ | --------------------------------------------------------------- | ||
+ | ('Device name:', 'Intel(R) Many Integrated Core Acceleration Card') | ||
+ | ('Device type:', 'ACCELERATOR') | ||
+ | ('Device memory: ', 5772, 'MB') | ||
+ | ('Device max clock speed:', 1100, 'MHz') | ||
+ | ('Device compute units:', 240) | ||
+ | Execution time of test: 0.00400693 s | ||
+ | Results OK | ||
+ | </code> | ||
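To compare the devices, the timings reported in this last run can be turned into ratios against the time measured without OpenCL. This is only a sketch: the numbers are copied from the output above, while the dictionary keys and the function name `speedups` are illustrative. The two AMD runs shown earlier on this page differ by almost a factor of two, so such ratios are best read as orders of magnitude.

```python
# Timings in seconds, copied from the benchmark output of the last run.
BASELINE_S = 7.768199920654297  # test without OpenCL
OPENCL_TIMES_S = {
    "AMD platform, Xeon E5-2680 CPU": 0.000990689,
    "Intel platform, Xeon E5-2680 CPU": 0.00065297,
    "Intel platform, MIC accelerator": 0.00400693,
}


def speedups(baseline_s, times_s):
    """Return baseline / device time for every device."""
    return {name: baseline_s / t for name, t in times_s.items()}


for name, ratio in speedups(BASELINE_S, OPENCL_TIMES_S).items():
    print("%-35s x%.0f" % (name, ratio))
```

With runs this short (around a millisecond), a longer or repeated benchmark would be needed before drawing conclusions about the relative performance of the accelerator.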
+ | ===== History ===== | ||
- | The first step was to find a host machine to house it temporarily and run the first tests. | ||