Obsolete documentation: this page describes a former system and must not be used at present. To use machine learning at the CBP, follow this link: [[ressources:miniconda4cbp|Using or installing Miniconda at the Centre Blaise Pascal]]

====== TensorFlow ======

A first [[https://www.tensorflow.org/|TensorFlow]] environment was installed at the Centre Blaise Pascal in spring 2019. It relies on the Anaconda3-2019.03 distribution installed in the ''/opt/anaconda3'' folder.

A second [[https://www.tensorflow.org/|TensorFlow]] environment was installed at the Centre Blaise Pascal in autumn 2019. It relies on the Anaconda3-2019.10 distribution installed in the ''/opt/anaconda3-2019.10'' folder.

===== Operating configuration =====

Two TensorFlow environments are installed in two different Anaconda3 distributions:

  * the TensorFlow environment installed in Anaconda3-2019.03 provides TensorFlow 1.12
  * the TensorFlow environment installed in Anaconda3-2019.10 provides TensorFlow 2.0

To load the TensorFlow 1.12 environment in the standard CBP SIDUS:

<code>
source /etc/tensorflow.init
</code>

To load the TensorFlow 2.0 environment in the standard CBP SIDUS:

<code>
source /etc/tensorflow2.init
</code>

Once the environment is activated, the command prompt is prefixed with ''(base)''. For example, the user ''einstein'' on the machine ''ascenseur'' gets the following prompt:

<code>
(base) einstein@ascenseur:~$
</code>

===== Example for TensorFlow 1.12 =====

The following example, provided by the [[https://www.tensorflow.org/|official site]], is a quick way to check that the environment works.
It must be run from the Python interpreter:

<code python>
import tensorflow as tf

# Build a constant string tensor in the default graph
hello = tf.constant('Hello, TensorFlow!')
# Creating the session initializes the available devices
sess = tf.Session()
# Evaluate the tensor and print its value
print(sess.run(hello))
</code>

When the third statement runs (creation of the session), the environment detects the graphics cards that may be used:

<code>
2019-06-11 18:03:17.928752: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX
2019-06-11 18:03:17.938098: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 1995195000 Hz
2019-06-11 18:03:17.938873: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x562f3dba5a30 executing computations on platform Host. Devices:
2019-06-11 18:03:17.938924: I tensorflow/compiler/xla/service/service.cc:158] StreamExecutor device (0): ,
2019-06-11 18:03:18.167596: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-06-11 18:03:18.203546: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-06-11 18:03:18.205179: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x562f3dc62f80 executing computations on platform CUDA. Devices:
2019-06-11 18:03:18.205294: I tensorflow/compiler/xla/service/service.cc:158] StreamExecutor device (0): GeForce RTX 2080, Compute Capability 7.5
2019-06-11 18:03:18.205385: I tensorflow/compiler/xla/service/service.cc:158] StreamExecutor device (1): GeForce GT 730, Compute Capability 3.5
2019-06-11 18:03:18.206798: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 0 with properties:
name: GeForce RTX 2080 major: 7 minor: 5 memoryClockRate(GHz): 1.71
pciBusID: 0000:04:00.0
totalMemory: 7.77GiB freeMemory: 7.65GiB
2019-06-11 18:03:18.207245: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 1 with properties:
name: GeForce GT 730 major: 3 minor: 5 memoryClockRate(GHz): 0.9015
pciBusID: 0000:03:00.0
totalMemory: 1.95GiB freeMemory: 1.90GiB
2019-06-11 18:03:18.207374: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1497] Ignoring visible gpu device (device: 1, name: GeForce GT 730, pci bus id: 0000:03:00.0, compute capability: 3.5) with Cuda multiprocessor count: 2. The minimum required count is 8. You can adjust this requirement with the env var TF_MIN_GPU_MULTIPROCESSOR_COUNT.
2019-06-11 18:03:18.207456: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
2019-06-11 18:03:18.210563: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-06-11 18:03:18.210632: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 0 1
2019-06-11 18:03:18.210687: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0: N N
2019-06-11 18:03:18.210734: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 1: N N
2019-06-11 18:03:18.211630: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 7439 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080, pci bus id: 0000:04:00.0, compute capability: 7.5)
</code>

In the example above, two cards are detected: a GeForce GT 730 with 1.95 GiB of memory and a GeForce RTX 2080 with 7.77 GiB of memory (7.65 GiB free). The log also shows that the GT 730 is ignored because it has only 2 multiprocessors while at least 8 are required, so only the RTX 2080 is added as a TensorFlow device.

The last statement confirms that the session works as expected by displaying the following (under Python 3 the value may appear as the bytes literal ''b'Hello, TensorFlow!'''):

<code>
Hello, TensorFlow!
</code>

A wide variety of online [[https://www.tensorflow.org/tutorials/|tutorials]] can be used to check the installation further.
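The startup log above is dense, and the two lines that matter most are the one reporting an ignored GPU and the one reporting a created TensorFlow device. The sketch below shows one way to pull that information out of a captured log using only the standard library. The function name ''summarize_gpu_log'' and the regular expressions are illustrative assumptions based on the two log lines quoted from this page; other TensorFlow versions may word these messages differently.

```python
import re

# Two lines copied from the TensorFlow 1.12 startup log shown on this page.
SAMPLE_LOG = """\
2019-06-11 18:03:18.207374: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1497] Ignoring visible gpu device (device: 1, name: GeForce GT 730, pci bus id: 0000:03:00.0, compute capability: 3.5) with Cuda multiprocessor count: 2. The minimum required count is 8. You can adjust this requirement with the env var TF_MIN_GPU_MULTIPROCESSOR_COUNT.
2019-06-11 18:03:18.211630: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 7439 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080, pci bus id: 0000:04:00.0, compute capability: 7.5)
"""

# Hypothetical patterns, written against the message wording above.
IGNORED = re.compile(r"Ignoring visible gpu device \(device: (\d+), name: ([^,]+),")
CREATED = re.compile(
    r"Created TensorFlow device \(\S+ with (\d+) MB memory\) -> "
    r"physical GPU \(device: (\d+), name: ([^,]+),"
)


def summarize_gpu_log(log_text):
    """Return (usable, ignored) lists of GPUs found in TensorFlow log text."""
    usable, ignored = [], []
    for line in log_text.splitlines():
        match = CREATED.search(line)
        if match:
            usable.append({
                "device": int(match.group(2)),
                "name": match.group(3),
                "memory_mb": int(match.group(1)),
            })
            continue
        match = IGNORED.search(line)
        if match:
            ignored.append({"device": int(match.group(1)), "name": match.group(2)})
    return usable, ignored


usable, ignored = summarize_gpu_log(SAMPLE_LOG)
print("usable:", usable)
print("ignored:", ignored)
```

On the sample lines, the RTX 2080 is reported as usable and the GT 730 as ignored, which matches the reading of the log given above.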
===== Example for TensorFlow 2.0 =====

The previous example does not work with TensorFlow 2.0, because ''tf.Session'' is no longer available at the top level of the 2.0 API. Here is a short example that can be used to test a TensorFlow 2.0 installation:

<code python>
from __future__ import absolute_import, division, print_function, unicode_literals

import tensorflow as tf

# Create some tensors
a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
# Multiply the 2x3 matrix by the 3x2 matrix
c = tf.matmul(a, b)

print(c)
</code>

When the third statement runs (creation of the first tensor), the environment detects the graphics cards that may be used:

<code>
2019-11-04 17:58:31.722039: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2019-11-04 17:58:31.749152: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-11-04 17:58:31.749646: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: Tesla V100-PCIE-16GB major: 7 minor: 0 memoryClockRate(GHz): 1.38
pciBusID: 0000:07:00.0
2019-11-04 17:58:31.750304: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2019-11-04 17:58:31.752157: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2019-11-04 17:58:31.753851: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2019-11-04 17:58:31.755073: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2019-11-04 17:58:31.757047: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2019-11-04 17:58:31.758713: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2019-11-04 17:58:31.762588: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2019-11-04 17:58:31.762748: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-11-04 17:58:31.763286: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-11-04 17:58:31.763714: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2019-11-04 17:58:31.763967: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 AVX512F FMA
2019-11-04 17:58:31.768995: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3192495000 Hz
2019-11-04 17:58:31.769414: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55a533b0ac90 executing computations on platform Host. Devices:
2019-11-04 17:58:31.769443: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): Host, Default Version
2019-11-04 17:58:31.769643: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-11-04 17:58:31.770095: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: Tesla V100-PCIE-16GB major: 7 minor: 0 memoryClockRate(GHz): 1.38
pciBusID: 0000:07:00.0
2019-11-04 17:58:31.770124: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2019-11-04 17:58:31.770139: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2019-11-04 17:58:31.770150: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2019-11-04 17:58:31.770164: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2019-11-04 17:58:31.770178: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2019-11-04 17:58:31.770190: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2019-11-04 17:58:31.770202: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2019-11-04 17:58:31.770292: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-11-04 17:58:31.770799: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-11-04 17:58:31.771228: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2019-11-04 17:58:31.771260: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2019-11-04 17:58:31.859125: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-11-04 17:58:31.859173: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165] 0
2019-11-04 17:58:31.859183: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0: N
2019-11-04 17:58:31.859388: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-11-04 17:58:31.859930: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-11-04 17:58:31.860424: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-11-04 17:58:31.860871: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 14961 MB memory) -> physical GPU (device: 0, name: Tesla V100-PCIE-16GB, pci bus id: 0000:07:00.0, compute capability: 7.0)
2019-11-04 17:58:31.862849: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55a534edf7d0 executing computations on platform CUDA. Devices:
2019-11-04 17:58:31.862884: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): Tesla V100-PCIE-16GB, Compute Capability 7.0
2019-11-04 17:58:31.863918: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
tf.Tensor(
[[22. 28.]
 [49. 64.]], shape=(2, 2), dtype=float32)
</code>

The detected **Tesla V100-PCIE-16GB** card can be recognized in the log.

===== Running TensorFlow 1 code with TensorFlow 2 =====

As explained in the [[https://www.tensorflow.org/guide/migrate|TensorFlow documentation]], code written for the previous version can be run on the new one by loading the library as follows:

<code python>
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()
</code>

===== Tips for otherwise unexplained crashes =====

In use, crashes may occur whose first reported error is ''CUDNN_STATUS_INTERNAL_ERROR'':

<code>
2019-09-28 05:35:09.764756: E tensorflow/stream_executor/cuda/cuda_dnn.cc:334] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2019-09-28 05:35:09.766851: E tensorflow/stream_executor/cuda/cuda_dnn.cc:334] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
---
tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
---
</code>

What makes this error disconcerting is that it does not appear on every machine that can run TensorFlow.

One possible solution is to set the environment variable ''TF_FORCE_GPU_ALLOW_GROWTH'' to ''true'':

<code>
export TF_FORCE_GPU_ALLOW_GROWTH=true
</code>

As its name suggests, this option lets TensorFlow grow its GPU memory allocation on demand instead of reserving almost all of the GPU memory at start-up. The drawback is that memory allocated in this way is only released when the execution ends.
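When the job is started from a Python script rather than from an interactive shell, the same variable can be set from within the script. The sketch below is a minimal illustration, assuming the variable is set before TensorFlow is imported so that it is already in place when the GPU is initialized; the helper name ''request_gpu_memory_growth'' is hypothetical.

```python
import os


def request_gpu_memory_growth(environ=os.environ):
    """Set TF_FORCE_GPU_ALLOW_GROWTH=true unless a value was already chosen.

    Returns the value that is in effect after the call.
    """
    return environ.setdefault("TF_FORCE_GPU_ALLOW_GROWTH", "true")


value = request_gpu_memory_growth()
print("TF_FORCE_GPU_ALLOW_GROWTH =", value)

# Import TensorFlow only after this point so that it sees the variable:
# import tensorflow as tf
```

Using ''setdefault'' keeps any value already exported in the shell, so a user who explicitly set the variable to ''false'' is not overridden.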
TensorFlow on GPU requires a GPU with a minimum compute capability of 3.5: of the 50 GPUs available at the CBP, some do not meet this requirement.

===== CBP GPUs validated for TensorFlow =====

  * Kepler circuits: GTX Titan, GTX 780, GTX 780Ti, Tesla K40, Tesla K80
  * Maxwell circuits: GTX 980Ti
  * Pascal circuits: GTX 1060, GTX 1070, GTX 1080, GTX 1080Ti, Tesla P100
  * Turing circuits: Titan RTX, RTX 2080 Ti, RTX 2080 Super
  * Volta circuits: Tesla V100

===== CBP GPUs not validated for TensorFlow =====

  * GT200 circuits: Quadro FX4000, Tesla C1060
  * Fermi circuits: GTX 560Ti, Quadro 4000
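The minimum compute capability of 3.5 mentioned above can be checked with a simple comparison of (major, minor) pairs. The sketch below is illustrative only: the table and the helper ''meets_minimum'' are assumptions introduced here, the first three values are taken from the logs on this page, and the last two are NVIDIA's published compute capabilities for two boards listed as not validated.

```python
# Minimum compute capability stated on this page, as (major, minor).
MIN_COMPUTE_CAPABILITY = (3, 5)

# Illustrative table: the first three entries come from the logs on this page,
# the last two are NVIDIA's published values for boards listed as not validated.
COMPUTE_CAPABILITY = {
    "GeForce RTX 2080": (7, 5),
    "Tesla V100-PCIE-16GB": (7, 0),
    "GeForce GT 730": (3, 5),
    "GTX 560Ti": (2, 1),
    "Tesla C1060": (1, 3),
}


def meets_minimum(capability, minimum=MIN_COMPUTE_CAPABILITY):
    """Compare (major, minor) pairs; tuples compare major first, then minor."""
    return tuple(capability) >= tuple(minimum)


for name, cc in COMPUTE_CAPABILITY.items():
    status = "ok" if meets_minimum(cc) else "too old"
    print(f"{name}: {cc[0]}.{cc[1]} -> {status}")
```

Meeting the compute capability is necessary but not sufficient: in the TensorFlow 1.12 log on this page, the GeForce GT 730 reports compute capability 3.5 yet is ignored because of its multiprocessor count.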