Dimensionality of stacked LSTM networks in TensorFlow

Stack Overflow user
Asked 2018-07-09 13:55:52

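While reviewing many similar questions about multidimensional inputs and stacked LSTMs, I did not find an example that lays out the dimensionality of the initial_state placeholder and of the rnn_tuple_state built from it below. The attempted [lstm_num_layers, 2, None, lstm_num_cells, 2] extends the code from these examples (http://monik.in/a-noobs-guide-to-implementing-rnn-lstm-using-tensorflow/ and https://medium.com/@erikhallstrm/using-the-tensorflow-multilayered-lstm-api-f6e7da7bbe40) with an extra feature_dim dimension to account for multiple values at each time step of the features. This does not work: it raises a ValueError because of a dimension mismatch in the tensorflow.nn.dynamic_rnn call.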

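import tensorflow  # TensorFlow 1.x; tensorflow.contrib was removed in 2.x

time_steps = 10
feature_dim = 2
label_dim = 4
lstm_num_layers = 3
lstm_num_cells = 100
dropout_rate = 0.8  # passed as dropout_keep_prob, i.e. the probability of keeping a unit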

# None is to allow for variable size batches
features = tensorflow.placeholder(tensorflow.float32,
                                  [None, time_steps, feature_dim])
labels = tensorflow.placeholder(tensorflow.float32, [None, label_dim])

cell = tensorflow.contrib.rnn.MultiRNNCell(
    [tensorflow.contrib.rnn.LayerNormBasicLSTMCell(
        lstm_num_cells,
        dropout_keep_prob = dropout_rate)] * lstm_num_layers,
    state_is_tuple = True)

# not sure of the dimensionality for the initial state
initial_state = tensorflow.placeholder(
    tensorflow.float32,
    [lstm_num_layers, 2, None, lstm_num_cells, feature_dim])
# which impacts these two lines as well
state_per_layer_list = tensorflow.unstack(initial_state, axis = 0)
rnn_tuple_state = tuple(
    [tensorflow.contrib.rnn.LSTMStateTuple(
        state_per_layer_list[i][0],
        state_per_layer_list[i][1]) for i in range(lstm_num_layers)])

# also not sure if expanding the feature dimensions is correct here
outputs, state = tensorflow.nn.dynamic_rnn(
    cell, tensorflow.expand_dims(features, -1),
    initial_state = rnn_tuple_state)
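
The ValueError described above can be traced with plain shape bookkeeping. The sketch below uses no TensorFlow: the `unstack` helper is a hypothetical stand-in that only mimics how `tensorflow.unstack` and integer indexing drop an axis, and it assumes that each LSTM state tensor (c or h) must have shape [batch, lstm_num_cells] and that the cells expect rank-2 input at each time step.

```python
# Shape bookkeeping only; None stands for the variable batch size.
lstm_num_layers, lstm_num_cells, feature_dim, time_steps = 3, 100, 2, 10


def unstack(shape, axis=0):
    """Shape of each piece obtained by removing `axis` (hypothetical helper)."""
    return shape[:axis] + shape[axis + 1:]


# The attempted placeholder shape from the question.
attempted = [lstm_num_layers, 2, None, lstm_num_cells, feature_dim]

per_layer = unstack(attempted)   # one entry of state_per_layer_list
c_shape = unstack(per_layer)     # state_per_layer_list[i][0]

# Assumption: an LSTM cell state is rank 2, [batch, lstm_num_cells].
expected_state = [None, lstm_num_cells]

# expand_dims(features, -1) appends an axis of size 1 to the inputs.
inputs_shape = [None, time_steps, feature_dim] + [1]

print("state passed per layer:", c_shape)      # rank 3
print("state the cell expects:", expected_state)  # rank 2
print("inputs after expand_dims:", inputs_shape)  # rank 4
```

Under these assumptions both the extra feature_dim axis on the state and the extra axis added by expand_dims produce ranks the cell does not accept.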

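What would help most is an explanation of the general case:

  • Each time step has N values
  • Each time series has S steps
  • Each batch has B sequences
  • Each output has R values
  • The network has L hidden LSTM layers
  • Each layer has M nodes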

So the pseudocode version would be:

# B, S, N, and R are undefined values for the purpose of this question
features = tensorflow.placeholder(tensorflow.float32, [B, S, N])
labels = tensorflow.placeholder(tensorflow.float32, [B, R])
...

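If I could complete this myself, I would not be asking here in the first place. Thanks in advance. Any comments on related best practices are welcome.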


1 Answer

Stack Overflow user

Accepted answer

Answered 2018-07-17 07:52:01

After much trial and error, the following yields a stacked LSTM dynamic_rnn regardless of the dimensionality of the features:

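import tensorflow  # TensorFlow 1.x; tensorflow.contrib was removed in 2.x

time_steps = 10
feature_dim = 2
label_dim = 4
lstm_num_layers = 3
lstm_num_cells = 100
dropout_rate = 0.8  # passed as dropout_keep_prob, i.e. the probability of keeping a unit
learning_rate = 0.001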

features = tensorflow.placeholder(
    tensorflow.float32, [None, time_steps, feature_dim])
labels = tensorflow.placeholder(
    tensorflow.float32, [None, label_dim])

cell_list = []
for _ in range(lstm_num_layers):
    cell_list.append(
        tensorflow.contrib.rnn.LayerNormBasicLSTMCell(lstm_num_cells,
                                                      dropout_keep_prob=dropout_rate))
cell = tensorflow.contrib.rnn.MultiRNNCell(cell_list, state_is_tuple=True)
initial_state = tensorflow.placeholder(
    tensorflow.float32, [lstm_num_layers, 2, None, lstm_num_cells])
state_per_layer_list = tensorflow.unstack(initial_state, axis=0)
rnn_tuple_state = tuple(
    [tensorflow.contrib.rnn.LSTMStateTuple(
        state_per_layer_list[i][0],
        state_per_layer_list[i][1]) for i in range(lstm_num_layers)])
state_series, last_state = tensorflow.nn.dynamic_rnn(
    cell=cell, inputs=features, initial_state=rnn_tuple_state)

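# Reorder to time-major [time_steps, batch, lstm_num_cells],
# then keep only the output of the final time step: [batch, lstm_num_cells].
hidden_layer_output = tensorflow.transpose(state_series, [1, 0, 2])
last_output = tensorflow.gather(hidden_layer_output, int(
    hidden_layer_output.get_shape()[0]) - 1)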

weights = tensorflow.Variable(tensorflow.random_normal(
    [lstm_num_cells, int(labels.get_shape()[1])]))
biases = tensorflow.Variable(tensorflow.constant(
    0.0, shape=[labels.get_shape()[1]]))
predictions = tensorflow.matmul(last_output, weights) + biases
mean_squared_error = tensorflow.reduce_mean(
    tensorflow.square(predictions - labels))
minimize_error = tensorflow.train.RMSPropOptimizer(
    learning_rate).minimize(mean_squared_error)
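
The transpose-and-gather step above selects the output of the last time step for every sequence in the batch. A minimal pure-Python sketch on nested lists (the `transpose_102` helper and the toy values are illustrative and not part of the answer) suggests that this matches slicing the last step directly:

```python
def transpose_102(x):
    """Swap the first two axes of a nested list: [B][S][M] -> [S][B][M]."""
    return [list(step) for step in zip(*x)]


B, S, M = 2, 3, 2  # toy batch size, time steps, cells per layer

# Encode the position in each value: 100 * batch + 10 * step + cell.
state_series = [
    [[100 * b + 10 * s + m for m in range(M)] for s in range(S)]
    for b in range(B)
]

# Same two steps as the answer: go time-major, then take the last index.
time_major = transpose_102(state_series)
last_output = time_major[len(time_major) - 1]

# Direct alternative: last step of every sequence.
direct = [sequence[-1] for sequence in state_series]

assert last_output == direct
print(last_output)  # [[20, 21], [120, 121]]
```

In TensorFlow 1.x the same selection can presumably be written more compactly as `state_series[:, -1, :]`, which avoids the transpose; that variant is a suggestion and not something the answer uses.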

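One rabbit hole along the way came from the examples referenced earlier, which reshape the outputs to suit a classifier rather than a regressor (a regressor is what I was trying to build). Because the stacked-LSTM code does not depend on the feature dimension, it serves as a general template for this use case.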
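
For the general case posed in the question (N values per step, S steps, B sequences per batch, R outputs, L layers, M nodes per layer), the shapes used by the answer can be summarized as follows. This is a sketch in plain Python: `stacked_lstm_shapes` and `zero_state` are hypothetical helper names, the shapes are read off the answer's code, and the [B, S, M] output shape assumes the default batch-major layout of dynamic_rnn.

```python
def stacked_lstm_shapes(B, S, N, R, L, M):
    """Tensor shapes for the answer's graph, keyed by the answer's names."""
    return {
        "features": [B, S, N],
        "labels": [B, R],
        "initial_state": [L, 2, B, M],  # 2 = (c, h) per layer
        "state_series": [B, S, M],      # assumes batch-major outputs
        "last_output": [B, M],
        "weights": [M, R],
        "biases": [R],
        "predictions": [B, R],
    }


def zero_state(L, B, M):
    """Nested list of zeros shaped [L, 2, B, M] for a concrete batch size B."""
    return [[[[0.0] * M for _ in range(B)] for _ in range(2)] for _ in range(L)]


# Values from the answer; None marks the variable batch size.
shapes = stacked_lstm_shapes(B=None, S=10, N=2, R=4, L=3, M=100)
for name, shape in shapes.items():
    print(name, shape)

# Changing N only changes the features shape, not the state or the head.
wider = stacked_lstm_shapes(B=None, S=10, N=7, R=4, L=3, M=100)
changed = [name for name in shapes if shapes[name] != wider[name]]
print(changed)  # ['features']
```

Since initial_state is a placeholder, a value has to be supplied whenever the graph is run; something like `zero_state(L, B, M)` for the actual batch size would be one plausible starting value.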

Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/51254577
