If a TensorFlow operation has both CPU and GPU implementations, the GPU device is given priority when the operation is assigned to a device. For example, matmul has both CPU and GPU kernels. On a system with devices cpu:0 and gpu:0, gpu:0 will be selected to run matmul.
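The priority rule can be pictured with a small toy model. This is only a sketch of the rule as stated here, not TensorFlow's actual placement code: the `KERNELS` table, the `CpuOnlyOp` entry, and the `default_device` function are hypothetical names introduced for illustration.

```python
# Toy model of the GPU-first rule; NOT TensorFlow's real placer.
# KERNELS, CpuOnlyOp and default_device are hypothetical, for illustration only.
KERNELS = {
    "MatMul": {"cpu", "gpu"},   # kernels exist for both device types
    "CpuOnlyOp": {"cpu"},       # made-up op that only has a CPU kernel
}


def default_device(op_type, devices):
    """Pick a device for op_type, preferring GPU over CPU when both qualify."""
    supported = KERNELS[op_type]
    for preferred in ("gpu", "cpu"):
        for name in devices:
            if name.split(":")[0] == preferred and preferred in supported:
                return name
    raise ValueError("no available device has a kernel for " + op_type)


devices = ["cpu:0", "gpu:0"]
print(default_device("MatMul", devices))     # gpu:0
print(default_device("CpuOnlyOp", devices))  # cpu:0
```

In this model an op with only a CPU kernel still lands on cpu:0 even though a GPU is present, which is the other half of the same rule.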

Logging device placement

To find out which devices your operations and tensors are assigned to, create the session with the log_device_placement configuration option set to True.

import tensorflow as tf

# Build a graph.
a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
c = tf.matmul(a, b)
# Create a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
# Run the op.
print(sess.run(c))

You should see the following output:

Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: Tesla K40c, pci bus
id: 0000:05:00.0
b: /job:localhost/replica:0/task:0/gpu:0
a: /job:localhost/replica:0/task:0/gpu:0
MatMul: /job:localhost/replica:0/task:0/gpu:0
[[ 22.  28.]
 [ 49.  64.]]
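When a graph has many ops, it can help to turn this log into a lookup table. The sketch below parses placement lines of the shape shown in the sample output above. The line format is assumed from that sample only, and `PLACEMENT_RE` and `parse_placements` are hypothetical helper names, not part of TensorFlow.

```python
import re

# Matches lines such as "MatMul: /job:localhost/replica:0/task:0/gpu:0".
# The format is assumed from the sample output; other versions may differ.
PLACEMENT_RE = re.compile(r"^(?P<op>[\w/]+): (?P<device>/job:\S+)$")


def parse_placements(log_text):
    """Return {op_name: device_string} for every placement line in log_text."""
    placements = {}
    for line in log_text.splitlines():
        match = PLACEMENT_RE.match(line.strip())
        if match:
            placements[match.group("op")] = match.group("device")
    return placements


log = """Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: Tesla K40c, pci bus
id: 0000:05:00.0
b: /job:localhost/replica:0/task:0/gpu:0
a: /job:localhost/replica:0/task:0/gpu:0
MatMul: /job:localhost/replica:0/task:0/gpu:0
[[ 22.  28.]
 [ 49.  64.]]"""

placements = parse_placements(log)
print(placements["MatMul"])  # /job:localhost/replica:0/task:0/gpu:0
```

The device-mapping header lines and the printed matrix do not match the pattern, so only the three op lines end up in the dictionary.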

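Manual device placement

If you would like a particular operation to run on a device of your choice instead of the one selected automatically, you can use with tf.device to create a device context. All operations created within that context are assigned to the device it names.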

# Build a graph.
with tf.device('/cpu:0'):
  a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
  b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
c = tf.matmul(a, b)
# Create a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
# Run the op.
print(sess.run(c))

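You will see that a and b are now assigned to cpu:0. Because c is created outside the with block, MatMul is still placed automatically and runs on gpu:0.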

Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: Tesla K40c, pci bus
id: 0000:05:00.0
b: /job:localhost/replica:0/task:0/cpu:0
a: /job:localhost/replica:0/task:0/cpu:0
MatMul: /job:localhost/replica:0/task:0/gpu:0
[[ 22.  28.]
 [ 49.  64.]]

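Using a single GPU on a multi-GPU system

If you have more than one GPU in your system, the GPU with the lowest ID is selected by default. If you would like to run on a different GPU, specify the preference explicitly: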

# Build a graph.
with tf.device('/gpu:2'):
  a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
  b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
  c = tf.matmul(a, b)
# Create a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
# Run the op.
print(sess.run(c))

If the device you have specified does not exist, you will get an InvalidArgumentError:

InvalidArgumentError: Invalid argument: Cannot assign a device to node 'b':
Could not satisfy explicit device specification '/gpu:2'
   [[Node: b = Const[dtype=DT_FLOAT, value=Tensor<type: float shape: [3,2]
   values: 1 2 3...>, _device="/gpu:2"]()]]

If you would like TensorFlow to automatically choose an existing and supported device to run the operations when the specified one does not exist, set allow_soft_placement to True in the configuration option when creating the session.

# Build a graph.
with tf.device('/gpu:2'):
  a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
  b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
  c = tf.matmul(a, b)
# Create a session with allow_soft_placement and log_device_placement
# set to True.
sess = tf.Session(config=tf.ConfigProto(
      allow_soft_placement=True, log_device_placement=True))
# Run the op.
print(sess.run(c))
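The difference between strict and soft placement can be sketched as a small decision function. This is a toy model only: `resolve_device` is a hypothetical name, the GPU-then-CPU fallback order is an assumption of the sketch, and a plain ValueError stands in for TensorFlow's InvalidArgumentError.

```python
def resolve_device(requested, available, allow_soft_placement=False):
    """Return the device an op would run on in this toy model."""
    if requested in available:
        return requested
    if not allow_soft_placement:
        # Stand-in for InvalidArgumentError in the real library.
        raise ValueError(
            "Could not satisfy explicit device specification '%s'" % requested)
    # Fallback order is an assumption of this sketch: any GPU first, then CPU.
    for prefix in ("/gpu:", "/cpu:"):
        for name in available:
            if name.startswith(prefix):
                return name
    raise ValueError("no devices available")


available = ["/cpu:0", "/gpu:0"]
print(resolve_device("/gpu:2", available, allow_soft_placement=True))  # /gpu:0
try:
    resolve_device("/gpu:2", available)
except ValueError as err:
    print(err)  # Could not satisfy explicit device specification '/gpu:2'
```

With soft placement enabled, logging device placement as in the example above is the way to see which device was actually chosen.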

Using multiple GPUs

If you would like to run TensorFlow on multiple GPUs, you can construct your model in a multi-tower fashion, where each tower is assigned to a different GPU. For example:

# Build a graph.
c = []
for d in ['/gpu:2', '/gpu:3']:
  with tf.device(d):
    a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3])
    b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2])
    c.append(tf.matmul(a, b))
with tf.device('/cpu:0'):
  # Named total rather than sum to avoid shadowing the Python builtin.
  total = tf.add_n(c)
# Create a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
# Run the op.
print(sess.run(total))

You will see the following output:

Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: Tesla K20m, pci bus
id: 0000:02:00.0
/job:localhost/replica:0/task:0/gpu:1 -> device: 1, name: Tesla K20m, pci bus
id: 0000:03:00.0
/job:localhost/replica:0/task:0/gpu:2 -> device: 2, name: Tesla K20m, pci bus
id: 0000:83:00.0
/job:localhost/replica:0/task:0/gpu:3 -> device: 3, name: Tesla K20m, pci bus
id: 0000:84:00.0
Const_3: /job:localhost/replica:0/task:0/gpu:3
Const_2: /job:localhost/replica:0/task:0/gpu:3
MatMul_1: /job:localhost/replica:0/task:0/gpu:3
Const_1: /job:localhost/replica:0/task:0/gpu:2
Const: /job:localhost/replica:0/task:0/gpu:2
MatMul: /job:localhost/replica:0/task:0/gpu:2
AddN: /job:localhost/replica:0/task:0/cpu:0
[[  44.   56.]
 [  98.  128.]]
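The numbers in this output can be checked by hand: each tower computes the same 2x3 by 3x2 product, and the CPU adds the two results element-wise. The pure-Python sketch below repeats that arithmetic without TensorFlow. The `matmul` helper is a hypothetical name written for this check, not a library call.

```python
def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*y)]
            for row in x]


a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]      # shape [2, 3]
b = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]    # shape [3, 2]

towers = [matmul(a, b) for _ in range(2)]   # one product per tower
# Element-wise sum across towers, mirroring what add_n does on the CPU.
total = [[sum(vals) for vals in zip(*rows)] for rows in zip(*towers)]

print(towers[0])  # [[22.0, 28.0], [49.0, 64.0]]
print(total)      # [[44.0, 56.0], [98.0, 128.0]]
```

The per-tower result matches the single-GPU output earlier in the document, and the total is exactly twice that, as expected for two identical towers.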

The cifar10 tutorial is a good example that demonstrates how to train with multiple GPUs.

Original: using_gpu Translation: @lianghyv Proofreading: Wiki

