TensorFlow

A first TensorFlow environment was installed in the spring at the Centre Blaise Pascal (CBP). It relies on the Anaconda3-2019.03 distribution installed in /opt/anaconda3

A second TensorFlow environment was installed this autumn at the Centre Blaise Pascal. It relies on the Anaconda3-2019.10 distribution installed in /opt/anaconda3-2019.10

Usage configuration

Two TensorFlow environments are installed in two separate Anaconda3 distributions:

  • the environment installed in Anaconda3-2019.03 provides TensorFlow 1.12
  • the environment installed in Anaconda3-2019.10 provides TensorFlow 2.0

To load the TensorFlow 1.12 environment on the standard CBP SIDUS system:

source /etc/tensorflow.init

To load the TensorFlow 2.0 environment on the standard CBP SIDUS system:

source /etc/tensorflow2.init

Once an environment is active, the shell prompt is prefixed with (base). For example, user einstein on the machine ascenseur will see the following prompt:

(base) einstein@ascenseur:~$
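To confirm from Python which of the two environments is active, a minimal sketch along the following lines may help. It relies only on the standard library. The helper name describe_environment is hypothetical and is not part of the CBP setup, and the expectation that the interpreter prefix matches one of the Anaconda folders above is an assumption about what the init scripts do.

```python
import importlib.util
import sys


def describe_environment():
    """Return the interpreter prefix and whether TensorFlow can be imported."""
    # sys.prefix is the root of the active Python installation; after sourcing
    # an init script it is presumably /opt/anaconda3 or /opt/anaconda3-2019.10.
    prefix = sys.prefix
    # find_spec locates the package without importing it, so no GPU is probed.
    has_tensorflow = importlib.util.find_spec("tensorflow") is not None
    return prefix, has_tensorflow


prefix, has_tensorflow = describe_environment()
print("Python prefix:", prefix)
print("TensorFlow importable:", has_tensorflow)
```

Once TensorFlow is known to be importable, tf.__version__ gives the exact version loaded.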

Example for TensorFlow 1.12

The following example, taken from the official site, gives a quick check that the environment works. It is meant to be typed into the python interpreter:

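# Python
import tensorflow as tf

# Build a constant string tensor in the default graph
hello = tf.constant('Hello, TensorFlow!')
# Creating the session initialises the devices (CPU and GPUs)
sess = tf.Session()
# sess.run returns bytes under Python 3: decode before printing
print(sess.run(hello).decode())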

When the session is created (tf.Session()), the environment detects the graphics cards that can be used:

2019-06-11 18:03:17.928752: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX
2019-06-11 18:03:17.938098: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 1995195000 Hz
2019-06-11 18:03:17.938873: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x562f3dba5a30 executing computations on platform Host. Devices:
2019-06-11 18:03:17.938924: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (0): <undefined>, <undefined>
2019-06-11 18:03:18.167596: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-06-11 18:03:18.203546: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-06-11 18:03:18.205179: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x562f3dc62f80 executing computations on platform CUDA. Devices:
2019-06-11 18:03:18.205294: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (0): GeForce RTX 2080, Compute Capability 7.5
2019-06-11 18:03:18.205385: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (1): GeForce GT 730, Compute Capability 3.5
2019-06-11 18:03:18.206798: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 0 with properties: 
name: GeForce RTX 2080 major: 7 minor: 5 memoryClockRate(GHz): 1.71
pciBusID: 0000:04:00.0
totalMemory: 7.77GiB freeMemory: 7.65GiB
2019-06-11 18:03:18.207245: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 1 with properties: 
name: GeForce GT 730 major: 3 minor: 5 memoryClockRate(GHz): 0.9015
pciBusID: 0000:03:00.0
totalMemory: 1.95GiB freeMemory: 1.90GiB
2019-06-11 18:03:18.207374: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1497] Ignoring visible gpu device (device: 1, name: GeForce GT 730, pci bus id: 0000:03:00.0, compute capability: 3.5) with Cuda multiprocessor count: 2. The minimum required count is 8. You can adjust this requirement with the env var TF_MIN_GPU_MULTIPROCESSOR_COUNT.
2019-06-11 18:03:18.207456: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
2019-06-11 18:03:18.210563: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-06-11 18:03:18.210632: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990]      0 1 
2019-06-11 18:03:18.210687: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0:   N N 
2019-06-11 18:03:18.210734: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 1:   N N 
2019-06-11 18:03:18.211630: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 7439 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080, pci bus id: 0000:04:00.0, compute capability: 7.5)

In the example above, two cards are detected: a GeForce GT 730 with 1.95 GiB of RAM and a GeForce RTX 2080 with 7.77 GiB of RAM (7.65 GiB free). Only the RTX 2080 is used: the GT 730 is ignored because it has 2 multiprocessors, below the minimum of 8.

The last line confirms that the session works as expected by printing:
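When the log is long, the device lines can be extracted automatically. The sketch below is a plain-Python illustration, not a TensorFlow feature: the function summarize_gpus and the two regular expressions are hypothetical helpers, written against the TensorFlow 1.12 log format shown above (the sample lines are copied from it).

```python
import re

# Lines copied from the log above; in practice, pass the captured stderr text.
SAMPLE_LOG = """\
name: GeForce RTX 2080 major: 7 minor: 5 memoryClockRate(GHz): 1.71
totalMemory: 7.77GiB freeMemory: 7.65GiB
name: GeForce GT 730 major: 3 minor: 5 memoryClockRate(GHz): 0.9015
totalMemory: 1.95GiB freeMemory: 1.90GiB
"""

NAME_RE = re.compile(r"^name: (?P<name>.+?) major: (?P<major>\d+) minor: (?P<minor>\d+)")
MEM_RE = re.compile(r"^totalMemory: (?P<total>[\d.]+)GiB freeMemory: (?P<free>[\d.]+)GiB")


def summarize_gpus(log_text):
    """Return one dict per GPU described in a TensorFlow 1.x start-up log."""
    gpus = []
    for line in log_text.splitlines():
        name_match = NAME_RE.match(line)
        if name_match:
            # A "name:" line starts the description of a new device.
            gpus.append({
                "name": name_match.group("name"),
                "compute_capability": "{}.{}".format(
                    name_match.group("major"), name_match.group("minor")),
            })
            continue
        mem_match = MEM_RE.match(line)
        if mem_match and gpus:
            # Memory figures belong to the most recently seen device.
            gpus[-1]["total_gib"] = float(mem_match.group("total"))
            gpus[-1]["free_gib"] = float(mem_match.group("free"))
    return gpus


for gpu in summarize_gpus(SAMPLE_LOG):
    print(gpu)
```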

Hello, TensorFlow!

Many online tutorials are available for further checks of the installation.

Example for TensorFlow 2.0

The previous example does not work with TensorFlow 2.0, where tf.Session is no longer part of the default API. Here is a short example for testing TensorFlow 2.0:

# Python
from __future__ import absolute_import, division, print_function, unicode_literals
import tensorflow as tf
# Create some tensors 
a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]) 
b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]) 
c = tf.matmul(a, b)
print(c)

When the first tensor is created, the environment detects the graphics cards that can be used:

2019-11-04 17:58:31.722039: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2019-11-04 17:58:31.749152: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-11-04 17:58:31.749646: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: 
name: Tesla V100-PCIE-16GB major: 7 minor: 0 memoryClockRate(GHz): 1.38
pciBusID: 0000:07:00.0
2019-11-04 17:58:31.750304: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2019-11-04 17:58:31.752157: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2019-11-04 17:58:31.753851: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2019-11-04 17:58:31.755073: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2019-11-04 17:58:31.757047: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2019-11-04 17:58:31.758713: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2019-11-04 17:58:31.762588: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2019-11-04 17:58:31.762748: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-11-04 17:58:31.763286: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-11-04 17:58:31.763714: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2019-11-04 17:58:31.763967: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 AVX512F FMA
2019-11-04 17:58:31.768995: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3192495000 Hz
2019-11-04 17:58:31.769414: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55a533b0ac90 executing computations on platform Host. Devices:
2019-11-04 17:58:31.769443: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): Host, Default Version
2019-11-04 17:58:31.769643: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-11-04 17:58:31.770095: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: 
name: Tesla V100-PCIE-16GB major: 7 minor: 0 memoryClockRate(GHz): 1.38
pciBusID: 0000:07:00.0
2019-11-04 17:58:31.770124: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2019-11-04 17:58:31.770139: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2019-11-04 17:58:31.770150: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2019-11-04 17:58:31.770164: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2019-11-04 17:58:31.770178: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2019-11-04 17:58:31.770190: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2019-11-04 17:58:31.770202: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2019-11-04 17:58:31.770292: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-11-04 17:58:31.770799: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-11-04 17:58:31.771228: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2019-11-04 17:58:31.771260: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2019-11-04 17:58:31.859125: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-11-04 17:58:31.859173: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0 
2019-11-04 17:58:31.859183: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N 
2019-11-04 17:58:31.859388: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-11-04 17:58:31.859930: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-11-04 17:58:31.860424: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-11-04 17:58:31.860871: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 14961 MB memory) -> physical GPU (device: 0, name: Tesla V100-PCIE-16GB, pci bus id: 0000:07:00.0, compute capability: 7.0)
2019-11-04 17:58:31.862849: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55a534edf7d0 executing computations on platform CUDA. Devices:
2019-11-04 17:58:31.862884: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): Tesla V100-PCIE-16GB, Compute Capability 7.0
2019-11-04 17:58:31.863918: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
tf.Tensor(
[[22. 28.]
 [49. 64.]], shape=(2, 2), dtype=float32)

The Tesla V100-PCIE-16GB card is correctly detected, and the matrix product is printed at the end of the log.
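The printed tensor can be checked by hand: for instance, the top-left entry is 1×1 + 2×3 + 3×5 = 22. The sketch below redoes the whole product in plain Python, without TensorFlow. The matmul helper is written here for illustration only.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows (plain Python)."""
    return [
        # zip(*b) iterates over the columns of b
        [sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
        for row in a
    ]


a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
b = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
c = matmul(a, b)
print(c)  # [[22.0, 28.0], [49.0, 64.0]]
```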

Using TensorFlow 2 to run TensorFlow 1 code

As the TensorFlow documentation explains, code written for the old version can be run on the new one by loading the library as follows:

import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()
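As an illustration, the TensorFlow 1.12 "hello" example can be wrapped with these two lines as sketched below. The function name run_hello is hypothetical, and the ImportError guard is only there so that the script also runs on a machine without TensorFlow (or with a version that lacks the compat.v1 module).

```python
def run_hello():
    """Run the TensorFlow 1.x 'hello' example through the compat.v1 API.

    Returns the decoded string, or None when tensorflow.compat.v1
    cannot be imported.
    """
    try:
        import tensorflow.compat.v1 as tf
    except ImportError:
        return None
    # Switch off eager execution and other 2.x defaults before building a graph.
    tf.disable_v2_behavior()
    hello = tf.constant("Hello, TensorFlow!")
    with tf.Session() as sess:
        # sess.run returns bytes under Python 3, hence the decode().
        return sess.run(hello).decode()


print(run_hello())
```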

Tips for puzzling crashes

In use, crashes may occur whose first reported error is CUDNN_STATUS_INTERNAL_ERROR:

2019-09-28 05:35:09.764756: E tensorflow/stream_executor/cuda/cuda_dnn.cc:334] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2019-09-28 05:35:09.766851: E tensorflow/stream_executor/cuda/cuda_dnn.cc:334] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
---
tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
---

What makes this error disconcerting is that it does not appear on every machine able to run TensorFlow.

One possible workaround is to set the environment variable TF_FORCE_GPU_ALLOW_GROWTH to true:

export TF_FORCE_GPU_ALLOW_GROWTH=true

As its name suggests, this option makes TensorFlow allocate GPU memory progressively, as it is needed, instead of reserving almost all of it at start-up. The drawback is that memory, once allocated, is only released when the process terminates.
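The same variable can also be set from inside a Python script, as in the minimal sketch below. The assumption here is that the assignment runs before TensorFlow is imported, so that the setting is in place when the GPU is initialised.

```python
import os

# Set the variable before "import tensorflow" appears in the script,
# so that it is already defined when the GPU is initialised.
os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"

print(os.environ["TF_FORCE_GPU_ALLOW_GROWTH"])
```

Setting it in the script keeps the workaround attached to the code that needs it, rather than to the shell session.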

TensorFlow on GPU requires a card with a minimum compute capability of 3.5: of the 50 GPUs available at the CBP, some do not meet this requirement.

CBP GPUs validated for TensorFlow

  • Kepler chips: GTX Titan, GTX 780, GTX 780Ti, Tesla K40, Tesla K80
  • Maxwell chips: GTX 980Ti
  • Pascal chips: GTX 1060, GTX 1070, GTX 1080, GTX 1080Ti, Tesla P100
  • Turing chips: RTX Titan, RTX 2080 Ti, RTX 2080 Super
  • Volta chips: Tesla V100

CBP GPUs that cannot be used with TensorFlow

  • GT200 chips: Quadro FX4000, Tesla C1060
  • Fermi chips: GTX 560Ti, Quadro 4000
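The compute capability rule behind these two lists can be sketched as a simple comparison. The table below is illustrative only: the first three entries come from the logs on this page, while the last two are assumptions taken from NVIDIA's published compute capability tables. The names COMPUTE_CAPABILITY and is_supported are hypothetical.

```python
MIN_COMPUTE_CAPABILITY = (3, 5)  # minimum stated on this page

# (major, minor) pairs. First three: read from the logs above.
# Last two: assumed from NVIDIA's public tables, not checked at the CBP.
COMPUTE_CAPABILITY = {
    "GeForce RTX 2080": (7, 5),
    "Tesla V100-PCIE-16GB": (7, 0),
    "GeForce GT 730": (3, 5),
    "Tesla C1060": (1, 3),
    "GeForce GTX 560 Ti": (2, 1),
}


def is_supported(gpu_name):
    """True when the card meets the minimum compute capability."""
    # Tuples compare element by element: major first, then minor.
    return COMPUTE_CAPABILITY[gpu_name] >= MIN_COMPUTE_CAPABILITY


for name in sorted(COMPUTE_CAPABILITY):
    print(name, "supported" if is_supported(name) else "unsupported")
```

Compute capability is necessary but not sufficient: in the TensorFlow 1.12 log above, the GeForce GT 730 passes this test yet is still ignored because it has fewer than 8 multiprocessors.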
ressources/ressources/tensorflow.txt · Last modified: 2019/11/07 16:43 by equemene