
SOLVED: Problem with Keras input/output shape handling between different layers



I am currently trying to add an attention layer to my LSTM. I know where I can find working examples, but that is not the point of my question. I ended up with this code snippet:

from keras.layers import Input, LSTM, TimeDistributed, Activation, Multiply, Lambda, Dense
from keras.models import Model
import keras.backend as K

inp = Input(shape=(None, 10))
lstm = LSTM(units=50, return_sequences=True)(inp)
att = TimeDistributed(Dense(1, activation="tanh"))(lstm)
att = K.squeeze(att, axis=-1)
att = Activation("softmax")(att)
final = Multiply()([att, lstm])
final = Lambda(lambda x: K.sum(x, axis=1))(final)
out = Dense(units=28, activation="softmax")(final)
model = Model(inp, out)


When I try to run this, I get the following error:

  File "...\model.py", line 56, in create_model
    out = Dense(units=28, activation="softmax")(final)
  File "C:\Users\user\AppData\Local\Continuum\anaconda3\envs\processing\lib\site-packages\keras\engine\base_layer.py", line 431, in __call__
  File "C:\Users\user\AppData\Local\Continuum\anaconda3\envs\processing\lib\site-packages\keras\layers\core.py", line 866, in build
  File "C:\Users\user\AppData\Local\Continuum\anaconda3\envs\processing\lib\site-packages\keras\legacy\interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\user\AppData\Local\Continuum\anaconda3\envs\processing\lib\site-packages\keras\engine\base_layer.py", line 249, in add_weight
    weight = K.variable(initializer(shape),
  File "C:\Users\user\AppData\Local\Continuum\anaconda3\envs\processing\lib\site-packages\keras\initializers.py", line 209, in __call__
    scale /= max(1., float(fan_in + fan_out) / 2)
TypeError: unsupported operand type(s) for +: 'NoneType' and 'int'

When I debug into the __call__ of the VarianceScaling initializer, I can see that the shape argument is (None, 28), which causes the problem. When I add a print statement before the out layer to check the output shape of the final tensor, I get

>>Tensor("lambda_1/Sum:0", shape=(?, 50), dtype=float32)

as expected. So the shape in the __call__ function should actually be (50, 28).


When I create a much simpler example like this:

inp = Input((50,))
out = Dense(units=28, activation="relu")(inp)

I get the same shapes but no error at all. So is the problem above a Keras bug in input/output shape handling, or am I missing something?
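Since the title says SOLVED, here is a sketch of one likely fix, assuming the root cause is the line att = K.squeeze(att, axis=-1): calling a backend function directly on a layer output returns a raw tensor without Keras shape metadata, so downstream layers can no longer infer the input dimension for their weights. Wrapping every backend op in a Lambda layer (and making the broadcast for the multiply explicit with K.expand_dims) keeps shape tracking intact. The sketch uses tensorflow.keras imports; the same idea applies to standalone Keras.

```python
from tensorflow.keras.layers import (Input, LSTM, TimeDistributed, Activation,
                                     Multiply, Lambda, Dense)
from tensorflow.keras.models import Model
import tensorflow.keras.backend as K

inp = Input(shape=(None, 10))
lstm = LSTM(units=50, return_sequences=True)(inp)           # (batch, T, 50)

# One attention score per timestep.
att = TimeDistributed(Dense(1, activation="tanh"))(lstm)    # (batch, T, 1)
# Backend ops wrapped in Lambda layers so Keras keeps the shape metadata.
att = Lambda(lambda x: K.squeeze(x, axis=-1))(att)          # (batch, T)
att = Activation("softmax")(att)                            # (batch, T)
att = Lambda(lambda x: K.expand_dims(x, axis=-1))(att)      # (batch, T, 1)

# Weight the LSTM outputs (broadcast over the feature axis) and sum over time.
final = Multiply()([att, lstm])                             # (batch, T, 50)
final = Lambda(lambda x: K.sum(x, axis=1))(final)           # (batch, 50)

out = Dense(units=28, activation="softmax")(final)
model = Model(inp, out)
```

With the Lambda wrappers in place, the Dense layer sees an input shape of (None, 50) instead of (None, None), so its kernel is built as (50, 28) and the TypeError disappears.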
