Understanding output of model.get_weights() in Keras

To get a feel for how neural networks work, I decided to train a super simple 2-1 net to add up its two inputs. Here's my code.

import tensorflow as tf
import numpy as np

x_train = np.random.rand(10000,2)
y_train = x_train[:,0]+x_train[:,1]

model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(2),
    tf.keras.layers.Dense(1)
])

model.compile(optimizer='adam',
    loss=tf.keras.losses.mean_squared_error,
    metrics=[]
)

model.fit(x_train, y_train, epochs=10)

for i in model.get_weights():
    print(i, '\n')


The loop above prints the following:

[[ 0.8512728   1.555682  ]
 [-0.04689309  1.6778996 ]]

[-0.23842156 -0.06958904]

[[0.09876956]
 [0.583725  ]]

[0.08260487]
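
For reference, the shapes of those four arrays follow directly from the layer sizes (2 inputs → Dense(2) → Dense(1)) and can be checked with:

for w in model.get_weights():
    print(w.shape)   # prints, in order: (2, 2), (2,), (2, 1), (1,)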

The model performs well, with a mean squared error of around 0.0001. However:

  • I don't understand why there are two weight matrices. Why isn't there only one, namely the matrix of weights from the first (input) layer to the second (output) layer?

  • How could I perform the same calculations as my model by hand, given those weights and biases? I thought I knew the basic algebraic theory of neural nets, but now that this first 2×2 weight matrix shows up, I'm not sure how it works. In particular, given an input x, and denoting the weights and biases from the output above as W0, b0, W1 and b1 respectively, I've tried to compute y = W1(W0·x + b0) + b1 (see the NumPy sketch below), but this is nowhere near the output of the net, and y also strongly disagrees with (1 1)·x (the ideal output). I'm therefore completely in the dark as to what's happening behind the scenes.

Please, can someone tell me what I'm missing here? What do W0, b0, W1 and b1 represent?
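
For completeness, here's a minimal NumPy sketch of the computation I attempted (column-vector convention, weights applied on the left; the transpose on W1 is only there so the shapes line up):

import numpy as np

W0, b0, W1, b1 = model.get_weights()

x = np.array([0.25, 0.5])                 # one sample; the ideal output is 0.75

hidden = W0 @ x + b0                      # W0 is (2, 2), b0 is (2,)
y_attempt = W1.T @ hidden + b1            # W1 is (2, 1), hence the transpose

y_model = model.predict(x.reshape(1, 2))  # the net's actual prediction
print(y_attempt, y_model)                 # these disagree, which is my problem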
