Cannot load int variable from previous session in tensorflow 1.1
I have read many similar questions and just cannot get this to work properly.
My model trains well and checkpoint files are written every epoch. I want the program to be able to continue from epoch x once reloaded, and to print which epoch it is on with every iteration. I could simply save the data outside of the checkpoint file, but I also wanted to do this to give me confidence that everything else is being stored properly.
Unfortunately, the value in the epoch/global_step variable is always still 0 when I restart.
```python
import tensorflow as tf
import numpy as np
import re
import os
import sys
from glob import glob
# more imports

def extract_number(f):  # used to get the latest checkpoint file
    s = re.findall(r"epoch(\d+).ckpt", f)
    return (int(s[0]) if s else -1, f)

def restore(init_op, sess, saver):  # called to restore or just initialise the model
    list = glob(os.path.join("./params/e*"))
    if list:
        file = max(list, key=extract_number)
        saver.restore(sess, file[:-5])
    sess.run(init_op)
    return

with tf.Graph().as_default() as g:
    # build models

    total_batch = data.train.num_examples / batch_size
    epochLimit = 51

    saver = tf.train.Saver()
    init_op = tf.global_variables_initializer()

    with tf.Session() as sess:
        saver = tf.train.Saver()
        init_op = tf.global_variables_initializer()

        restore(init_op, sess, saver)

        epoch = global_step.eval()
        while epoch < epochLimit:
            total_batch = data.train.num_examples / batch_size
            for i in range(int(total_batch)):
                sys.stdout.flush()
                voxels = newData.eval()
                batch_z = np.random.uniform(-1, 1, [batch_size, z_size]).astype(np.float32)

                sess.run(opt_G, feed_dict={z: batch_z, train: True})
                sess.run(opt_D, feed_dict={input: voxels, z: batch_z, train: True})

                with open("out/loss.csv", 'a') as f:
                    batch_loss_G = sess.run(loss_G, feed_dict={z: batch_z, train: False})
                    batch_loss_D = sess.run(loss_D, feed_dict={input: voxels, z: batch_z, train: False})
                    msgOut = "Epoch: [{0}], i: [{1}], G_Loss[{2:.8f}], D_Loss[{3:.8f}]".format(epoch, i, batch_loss_G, batch_loss_D)
                    print(msgOut)

            epoch = epoch + 1
            sess.run(global_step.assign(epoch))
            saver.save(sess, "params/epoch{0}.ckpt".format(epoch))

            batch_z = np.random.uniform(-1, 1, [batch_size, z_size]).astype(np.float32)
            voxels = sess.run(x_, feed_dict={z: batch_z})
            v = voxels[0].reshape([32, 32, 32]) > 0
            util.save_binvox(v, "out/epoch{0}.vox".format(epoch), 32)
```

I also update the global step variable using assign at the bottom. Any ideas? Any help would be greatly appreciated.
Best answer
Calling sess.run(init_op) after restoring resets all of your variables back to their initial values. Comment out that line and things should work.
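For reference, here is a minimal sketch of what that fix looks like (reusing the question's extract_number helper and params/ layout; restore_or_init is a hypothetical name): init_op is only run when no checkpoint is found, so restored values are never overwritten.

```python
def restore_or_init(sess, saver, init_op):
    # Sketch of the accepted answer's fix: initialise only on a fresh start.
    ckpts = glob(os.path.join("./params/e*"))
    if ckpts:
        latest = max(ckpts, key=extract_number)
        saver.restore(sess, latest[:-5])  # strip the extension to get the checkpoint prefix, as in the question
    else:
        sess.run(init_op)  # no checkpoint found: initialise from scratch
```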
My original code was wrong for several reasons, because I was trying so many things. The first responder, Alexandre Passos, makes a valid point, but I believe what changed the game was also the use of variable scopes (maybe?).
Below is the working updated code if it helps anyone:
```python
import tensorflow as tf
import numpy as np
import re
import os
import sys
from glob import glob
# more imports

def extract_number(f):  # used to get the latest checkpoint file
    s = re.findall(r"epoch(\d+).ckpt", f)
    return (int(s[0]) if s else -1, f)

def restore(sess, saver):  # called to restore or just initialise the model
    list = glob(os.path.join("./params/e*"))
    if list:
        file = max(list, key=extract_number)
        saver.restore(sess, file[:-5])
        return saver, True, sess
    saver = tf.train.Saver()
    init_op = tf.global_variables_initializer()
    sess.run(init_op)
    return saver, False, sess

batch_size = 100
learning_rate = 0.0001
beta1 = 0.5
z_size = 100
save_interval = 1

data = dataset.read()
total_batch = data.train.num_examples / batch_size

def fill_queue():  # running in a separate thread to feed a FIFOQueue
    for i in range(int(total_batch * epochLimit)):
        sess.run(enqueue_op, feed_dict={batch: data.train.next_batch(batch_size)})

with tf.variable_scope("glob"):
    global_step = tf.get_variable(name='global_step', initializer=0, trainable=False)

# build models

epochLimit = 51

saver = tf.train.Saver()

with tf.Session() as sess:
    saver, rstr, sess = restore(sess, saver)

    with tf.variable_scope("glob", reuse=True):
        epocht = tf.get_variable(name='global_step', trainable=False, dtype=tf.int32)
        epoch = epocht.eval()

    while epoch < epochLimit:
        total_batch = data.train.num_examples / batch_size
        for i in range(int(total_batch)):
            sys.stdout.flush()
            voxels = newData.eval()
            batch_z = np.random.uniform(-1, 1, [batch_size, z_size]).astype(np.float32)

            sess.run(opt_G, feed_dict={z: batch_z, train: True})
            sess.run(opt_D, feed_dict={input: voxels, z: batch_z, train: True})

            with open("out/loss.csv", 'a') as f:
                batch_loss_G = sess.run(loss_G, feed_dict={z: batch_z, train: False})
                batch_loss_D = sess.run(loss_D, feed_dict={input: voxels, z: batch_z, train: False})
                msgOut = "Epoch: [{0}], i: [{1}], G_Loss[{2:.8f}], D_Loss[{3:.8f}]".format(epoch, i, batch_loss_G, batch_loss_D)
                print(msgOut)

        epoch = epoch + 1
        sess.run(global_step.assign(epoch))
        saver.save(sess, "params/epoch{0}.ckpt".format(epoch))

        batch_z = np.random.uniform(-1, 1, [batch_size, z_size]).astype(np.float32)
        voxels = sess.run(x_, feed_dict={z: batch_z})
        v = voxels[0].reshape([32, 32, 32]) > 0
        util.save_binvox(v, "out/epoch{0}.vox".format(epoch), 32)
```
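As a sanity check, the save/restore round trip for the counter can be demonstrated in isolation. Below is a minimal, self-contained TF 1.x sketch (the /tmp/step_demo checkpoint path is just an example) showing that a scoped global_step keeps its value across sessions, as long as no initializer is run after the restore:

```python
import tensorflow as tf

with tf.Graph().as_default():
    with tf.variable_scope("glob"):
        global_step = tf.get_variable("global_step", initializer=0, trainable=False)
    saver = tf.train.Saver()

    # First session: initialise, advance the counter, save.
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        sess.run(global_step.assign(7))
        saver.save(sess, "/tmp/step_demo.ckpt")

    # Second session: restore only, with no init_op afterwards.
    with tf.Session() as sess:
        saver.restore(sess, "/tmp/step_demo.ckpt")
        print(global_step.eval())  # prints 7, not 0
```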