I want to create a neural network with no hidden layer, but where a bias is added to each input and the result passes through a relu before reaching the softmax output layer. The weight associated with each input's bias needs to be trainable. You can also think of this as a neural network with one hidden layer, where each node in the hidden layer is connected to only one input feature. What I am trying to build is the simplest architecture that can learn a threshold function for each input (which the combination of a bias and a relu provides). Each of these thresholded inputs is then summed into the output nodes, which use softmax for multi-class classification. I did consider adding a densely connected hidden layer together with a regularization function that sets all but one of each node's weights to zero, but the problem with that approach is that it would still try to train all the zeroed weights after every update: besides being inefficient, wouldn't that interfere with training the weights that are not set to zero? I know that Keras automatically adds a bias to the output layer (which is fine).
Here is my code in TensorFlow:
inputs = tf.keras.layers.Input(shape=(input_dim,))
outputs = tf.keras.layers.Dense(output_dim, activation='softmax')(inputs)
model = tf.keras.models.Model(inputs=inputs, outputs=outputs)

Posted on 2020-04-21 22:55:46
Your idea would be quite inefficient in Keras, because most modern libraries are built around multiplication-based weight matrices. You can either write a custom layer or use a few hacks.
In any case, suppose you have an input with n dimensions, and you want to add a bias to each input, apply a relu, and train it that way.
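A custom layer along those lines could look roughly like the following minimal sketch (the PerInputBias name is just illustrative, and input_dim / output_dim are taken from the question's snippet):

import tensorflow as tf

class PerInputBias(tf.keras.layers.Layer):
    # learns one trainable bias per input feature and applies relu(x + b)
    def build(self, input_shape):
        self.b = self.add_weight(name='per_input_bias',
                                 shape=(input_shape[-1],),
                                 initializer='zeros',
                                 trainable=True)

    def call(self, inputs):
        return tf.nn.relu(inputs + self.b)

# wiring it into the architecture described in the question
inputs = tf.keras.layers.Input(shape=(input_dim,))
thresholded = PerInputBias()(inputs)
outputs = tf.keras.layers.Dense(output_dim, activation='softmax')(thresholded)
model = tf.keras.models.Model(inputs=inputs, outputs=outputs)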
Alternatively, a naive approach using only built-in layers is to use intermediate branches:
from tensorflow.keras.layers import Input, Dense, Add, Activation
from tensorflow.keras.models import Model

n = 3

ip = Input(shape=(n,))
# branch 1: the frozen all-zero kernel means d1(x) == 0 for every input,
# so d2 outputs only its trainable bias vector, which is then passed through relu
d1 = Dense(n, trainable=False, use_bias=False, kernel_initializer='zeros')(ip)
d2 = Dense(n, trainable=True, use_bias=True)(d1)
d2 = Activation('relu')(d2)
# branch 2: add the relu-ed bias back onto the raw inputs, then apply softmax
add = Add()([ip, d2])
act = Activation('softmax')(add)

model = Model(ip, act)
model.summary()

Model: "model"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input_1 (InputLayer)            [(None, 3)]          0
__________________________________________________________________________________________________
dense (Dense)                   (None, 3)            9           input_1[0][0]
__________________________________________________________________________________________________
dense_1 (Dense)                 (None, 3)            12          dense[0][0]
__________________________________________________________________________________________________
add (Add)                       (None, 3)            0           input_1[0][0]
                                                                 dense_1[0][0]
__________________________________________________________________________________________________
activation (Activation)         (None, 3)            0           add[0][0]
==================================================================================================
Total params: 21
Trainable params: 12
Non-trainable params: 9

Posted on 2020-04-22 08:41:48
After studying the very helpful solution proposed by Zabir Al Nazi, I came up with this modification so that the relu activation is applied to the sum of the bias and the input (rather than to the bias alone):
n = 3

ip = Input(shape=(n,))
# branch 1: frozen all-zero kernel, so d2 outputs only its trainable bias vector
d1 = Dense(n, trainable=False, use_bias=False, kernel_initializer='zeros')(ip)
d2 = Dense(n, trainable=True, use_bias=True)(d1)
# branch 2: add the bias to the raw inputs, apply relu to the sum, then softmax
add = Add()([ip, d2])
add = Activation('relu')(add)
act = Activation('softmax')(add)

model = Model(ip, act)
model.summary()
Model: "model"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input_1 (InputLayer)            [(None, 3)]          0
__________________________________________________________________________________________________
dense (Dense)                   (None, 3)            9           input_1[0][0]
__________________________________________________________________________________________________
dense_1 (Dense)                 (None, 3)            12          dense[0][0]
__________________________________________________________________________________________________
add (Add)                       (None, 3)            0           input_1[0][0]
                                                                 dense_1[0][0]
__________________________________________________________________________________________________
activation (Activation)         (None, 3)            0           add[0][0]
__________________________________________________________________________________________________
activation_1 (Activation)       (None, 3)            0           activation[0][0]
==================================================================================================
Total params: 21
Trainable params: 12
Non-trainable params: 9

https://stackoverflow.com/questions/61352737
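To actually train one of these models for multi-class classification (here the number of classes equals n, since the softmax is applied directly to the n summed values), a minimal sketch with placeholder data could be:

import numpy as np

# x_train / y_train are placeholders: 100 samples with n features, integer class labels in [0, n)
x_train = np.random.rand(100, n)
y_train = np.random.randint(0, n, size=(100,))

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=10, batch_size=32)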