Putting the weight matrices together
Let's determine the term $W_{xh} \cdot x_t$ first. Here, $x_t$ is the 4 x 1 one-hot matrix representing the letter w, which we defined earlier, and $W_{xh}$ is the 3 x 4 weight matrix between the input and the hidden layer. The standard rules of matrix multiplication apply here, so the product $W_{xh} \cdot x_t$ is a 3 x 1 matrix.
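The following NumPy sketch illustrates this product. The values in `W_xh` are placeholders chosen purely for illustration (the real values were fixed when we defined the weight matrices earlier), and the letter w is assumed to occupy the first position in the one-hot encoding; only the shapes matter for following the calculation:

```python
import numpy as np

# Placeholder 3 x 4 weight matrix between the input and the hidden layer
# (illustrative values only).
W_xh = np.array([[ 0.1,  0.4, -0.2,  0.3],
                 [ 0.5, -0.1,  0.2, -0.4],
                 [-0.3,  0.2,  0.1,  0.6]])

# 4 x 1 one-hot column vector for the letter w
# (assumed to sit in the first position of the vocabulary).
x_t = np.array([[1.0],
                [0.0],
                [0.0],
                [0.0]])

xh = W_xh @ x_t   # (3 x 4) @ (4 x 1) -> (3 x 1)
print(xh)
```

Because $x_t$ is one-hot, this multiplication simply picks out the column of $W_{xh}$ that corresponds to the letter w.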
Now, we will calculate the term $W_{hh} \cdot h_{t-1} + b$. We will shortly see the significance of the bias term $b$. Since w is the first letter that we are feeding to the network, it does not have any previous state; therefore, we will take $h_{t-1}$ to be a 3 x 1 matrix consisting of zeros:

$$h_{t-1} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$$
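Here is a minimal sketch of this term. The values of `W_hh` and `b` are again placeholders for illustration; since $h_{t-1}$ is all zeros, the result is just the bias vector:

```python
import numpy as np

# Placeholder 3 x 3 hidden-to-hidden weight matrix and 3 x 1 bias
# (illustrative values only).
W_hh = np.array([[ 0.2, -0.1,  0.4],
                 [ 0.3,  0.5, -0.2],
                 [-0.4,  0.1,  0.3]])
b = np.array([[0.1],
              [0.2],
              [0.3]])

# The letter w has no previous state, so h_{t-1} is a 3 x 1 zero matrix.
h_prev = np.zeros((3, 1))

hh = W_hh @ h_prev + b   # (3 x 3) @ (3 x 1) + (3 x 1) -> (3 x 1)
print(hh)                # equals b, because W_hh @ h_prev is all zeros
```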
Note that if we didn't take the bias term, we would have ended up with a matrix consisting of only zeros. We will now add these two matrices, as per equation (1). The result of this addition is again a 3 x 1 matrix, and it is stored in $h_t$ (which in this case is $h_1$).
Following equation (1), all we need to do now is apply the activation function to this matrix to obtain $h_1$.
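Putting the whole step together, the following is a minimal, self-contained sketch of equation (1). The weights are randomly drawn placeholders, and the hyperbolic tangent is assumed as the activation, which is the usual choice for a simple RNN:

```python
import numpy as np

def rnn_step(W_xh, W_hh, b, x_t, h_prev):
    """One step of equation (1): h_t = activation(W_xh @ x_t + W_hh @ h_prev + b).

    tanh is assumed here as the activation; substitute whichever
    activation function equation (1) actually uses.
    """
    a = W_xh @ x_t + W_hh @ h_prev + b   # 3 x 1 pre-activation
    return np.tanh(a)                    # 3 x 1 hidden state h_t

# Placeholder shapes matching the walkthrough: 4 input letters, 3 hidden units.
rng = np.random.default_rng(0)
W_xh = rng.standard_normal((3, 4))   # illustrative values only
W_hh = rng.standard_normal((3, 3))
b = rng.standard_normal((3, 1))

x_t = np.array([[1.0], [0.0], [0.0], [0.0]])  # one-hot vector for w (assumed position)
h_prev = np.zeros((3, 1))                     # no previous state for the first letter

h_1 = rnn_step(W_xh, W_hh, b, x_t, h_prev)
print(h_1.shape)   # (3, 1)
```

The same `rnn_step` call can then be repeated for each subsequent letter, feeding the returned hidden state back in as `h_prev`.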