Exercise Sheet 5: Perceptrons and Hopfield networks

Younesse Kaddar

\[\newcommand{\T}{ {\raise{0.7ex}{\intercal}}} \newcommand{\sign}{\textrm{sign}}\]

1. Perceptrons

Consider the following two sets of two-dimensional patterns:

\[𝒟_{\text{blue}} = \lbrace (1, 2), (1, 3), (3, 4), (2, 4) \rbrace\\ 𝒟_{\text{red}} = \lbrace (4, 0), (6, 2), (2, -3), (6, -2) \rbrace\]

a. Are the two sets linearly separable? If they are, use a perceptron rule to find a readout weight vector that allows a binary neuron to classify the two sets into two different categories.

The two sets are easily seen to be linearly separable: they are separated by the line of equation $y = x - 1.5$, for instance, as shown in the figure below:

Figure 1.a.1. - Plot of the 2D data points, along with a hand-picked separating line of equation $y = x - 1.5$
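As a quick sanity check (a minimal numpy sketch), one can verify that this hand-picked line does put all the blue points on one side and all the red points on the other:

import numpy as np

blue = np.array([[1, 2], [1, 3], [3, 4], [2, 4]])
red  = np.array([[4, 0], [6, 2], [2, -3], [6, -2]])

# Blue points lie strictly above the line y = x - 1.5, red points strictly below it
assert np.all(blue[:, 1] > blue[:, 0] - 1.5)
assert np.all(red[:, 1] < red[:, 0] - 1.5)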

The algorithm using the perceptron rule is as follows (in pseudo-code):

n_trials ← 8

# Color of the points: 0 ≡ blue / 1 ≡ red
Y ← [0, 0, 0, 0, 1, 1, 1, 1]

# Data points
X ← [[1,2],[1,3],[3,4],[2,4],[4,0],[6,2],[2,-3],[6,-2]]

# Initialize the weight vector
w ← (1, 1)

while True:
    # is the weight vector still changing? (to end the loop)
    still_changing ← False

    for 1 ≤ k ≤ n_trials:
        # output ← 1 if the scalar product ⟨w, X[k]⟩ is ≥ 0, else 0
        output ← H(⟨w, X[k]⟩)

        w_new ← w + (Y[k] - output) * X[k]

        if w_new ≠ w:
            still_changing ← True

        w ← w_new

    if not still_changing:
        break

return w

Let us run it on our 2D-data set:

Step|$w$|$k$|$x^{(k)}$|$y^{(k)}$|$⟨w, x^{(k)}⟩$|output|$y^{(k)} -$ output|$w_{\text{new}}$
:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:
 |$(1, 1)$| | | | | | |
$1$|$(1, 1)$|$1$|$(1, 2)$|$0$|$1 \cdot 1 + 1 \cdot 2 = 3$|$1$|$-1$|$(1, 1)-(1, 2) = (0, -1)$
$1$|$(0, -1)$|$2$|$(1, 3)$|$0$|$0 \cdot 1 - 1 \cdot 3 = -3$|$0$|$0$|$(0, -1)$
$1$|$(0, -1)$|$3$|$(3, 4)$|$0$|$0 \cdot 3 - 1 \cdot 4 = -4$|$0$|$0$|$(0, -1)$
$1$|$(0, -1)$|$4$|$(2, 4)$|$0$|$0 \cdot 2 - 1 \cdot 4 = -4$|$0$|$0$|$(0, -1)$
$1$|$(0, -1)$|$5$|$(4, 0)$|$1$|$0 \cdot 4 - 1 \cdot 0 = 0$|$1$|$0$|$(0, -1)$
$1$|$(0, -1)$|$6$|$(6, 2)$|$1$|$0 \cdot 6 - 1 \cdot 2 = -2$|$0$|$1$|$(0, -1)+(6, 2) = (6, 1)$
$1$|$(6, 1)$|$7$|$(2, -3)$|$1$|$6 \cdot 2 - 1 \cdot 3 = 9$|$1$|$0$|$(6, 1)$
$1$|$(6, 1)$|$8$|$(6, -2)$|$1$|$6 \cdot 6 - 1 \cdot 2 = 34$|$1$|$0$|$(6, 1)$
$2$|$(6, 1)$|$9$|$(1, 2)$|$0$|$6 \cdot 1 + 1 \cdot 2 = 8$|$1$|$-1$|$(6, 1)-(1, 2) = (5, -1)$
$2$|$(5, -1)$|$10$|$(1, 3)$|$0$|$5 \cdot 1 - 1 \cdot 3 = 2$|$1$|$-1$|$(5, -1)-(1, 3) = (4, -4)$
$2$|$(4, -4)$|$11$|$(3, 4)$|$0$|$4 \cdot 3 - 4 \cdot 4 = -4$|$0$|$0$|$(4, -4)$
$2$|$(4, -4)$|$12$|$(2, 4)$|$0$|$4 \cdot 2 - 4 \cdot 4 = -8$|$0$|$0$|$(4, -4)$
$2$|$(4, -4)$|$13$|$(4, 0)$|$1$|$4 \cdot 4 - 4 \cdot 0 = 16$|$1$|$0$|$(4, -4)$
$2$|$(4, -4)$|$14$|$(6, 2)$|$1$|$4 \cdot 6 - 4 \cdot 2 = 16$|$1$|$0$|$(4, -4)$
$2$|$(4, -4)$|$15$|$(2, -3)$|$1$|$4 \cdot 2 + 4 \cdot 3 = 20$|$1$|$0$|$(4, -4)$
$2$|$(4, -4)$|$16$|$(6, -2)$|$1$|$4 \cdot 6 + 4 \cdot 2 = 32$|$1$|$0$|$(4, -4)$
$3$|$(4, -4)$|$17$|$(1, 2)$|$0$|$4 \cdot 1 - 4 \cdot 2 = -4$|$0$|$0$|$(4, -4)$
$3$|$(4, -4)$|$18$|$(1, 3)$|$0$|$4 \cdot 1 - 4 \cdot 3 = -8$|$0$|$0$|$(4, -4)$
$3$|$(4, -4)$|$19$|$(3, 4)$|$0$|$4 \cdot 3 - 4 \cdot 4 = -4$|$0$|$0$|$(4, -4)$
$3$|$(4, -4)$|$20$|$(2, 4)$|$0$|$4 \cdot 2 - 4 \cdot 4 = -8$|$0$|$0$|$(4, -4)$
$3$|$(4, -4)$|$21$|$(4, 0)$|$1$|$4 \cdot 4 - 4 \cdot 0 = 16$|$1$|$0$|$(4, -4)$
$3$|$(4, -4)$|$22$|$(6, 2)$|$1$|$4 \cdot 6 - 4 \cdot 2 = 16$|$1$|$0$|$(4, -4)$
$3$|$(4, -4)$|$23$|$(2, -3)$|$1$|$4 \cdot 2 + 4 \cdot 3 = 20$|$1$|$0$|$(4, -4)$
$3$|$(4, -4)$|$24$|$(6, -2)$|$1$|$4 \cdot 6 + 4 \cdot 2 = 32$|$1$|$0$|$(4, -4)$

The algorithm stops at the end of the third step because the readout weight vector $w ≝ (4, -4)$ hasn’t changed throughout this step: all the points have finally been correctly classified.

Therefore, the decision boundary turns out to be the line given by the equation:

\[\underbrace{w_0}_{≝ 4} x + \underbrace{w_1}_{≝ -4} y = 0 ⟺ y = x\]

which can easily be checked on the data points: a point $(x, y)$ is classified as red (resp. blue) if and only if $x ≥ y$ (resp. $x < y$).
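As a sanity check (a minimal numpy sketch, using the same $0/1$ labelling and the convention output $= 1$ iff $⟨w, x⟩ ≥ 0$ as above), the final weight vector $(4, -4)$ classifies every data point correctly:

import numpy as np

X = np.array([[1, 2], [1, 3], [3, 4], [2, 4],
              [4, 0], [6, 2], [2, -3], [6, -2]], dtype=float)
Y = np.array([0, 0, 0, 0, 1, 1, 1, 1])  # 0 ≡ blue / 1 ≡ red

w = np.array([4., -4.])

# Predicted label: 1 (red) iff ⟨w, x⟩ ≥ 0
assert np.array_equal((X.dot(w) >= 0).astype(int), Y)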

Here is the resulting separating line (in green), along with the normalized readout weight vector $w$ (in purple):

Figure 1.a.2. - Plot of the decision boundary $y = x$ (in green) for the two 2D data sets, along with the normalized readout vector $w$ (in purple)

b. Repeat the exercise with the following two sets of three-dimensional patterns:

\[𝒟_{\text{blue}} = \lbrace (2, 1, 2), (3, 2, 3), (4, 2, 4), (3, 1, 2) \rbrace\\ 𝒟_{\text{red}} = \lbrace (1, 4, -1), (2, 3, 0), (1, 2, 0), (0, 1, 0) \rbrace\]

Again, the data sets are seen to be linearly separable (with the $z = 1$ plane for instance):

Figure 1.b.1. - Plot of the 3D data points, along with a hand-picked separating plane of equation $z = 1$

This time, the values are initialized as follows:

n_trials ← 8

# Color of the points: 0 ≡ blue / 1 ≡ red
Y ← [0, 0, 0, 0, 1, 1, 1, 1]

# Data points
X ← [[2,1,2],[3,2,3],[4,2,4],[3,1,2],[1,4,-1],[2,3,0],[1,2,0],[0,1,0]]

# Initialize the weight vector
w ← (1, 1, 1)

Let us run the algorithm on this 3D-data set:

Step|$w$|$k$|$x^{(k)}$|$y^{(k)}$|$⟨w, x^{(k)}⟩$|output|$y^{(k)} -$ output|$w_{\text{new}}$
:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:
 |$(1, 1, 1)$| | | | | | |
$1$|$(1, 1, 1)$|$1$|$(2, 1, 2)$|$0$|$1 \cdot 2 + 1 \cdot 1 + 1 \cdot 2 = 5$|$1$|$-1$|$(1, 1, 1)-(2, 1, 2) = (-1, 0, -1)$
$1$|$(-1, 0, -1)$|$2$|$(3, 2, 3)$|$0$|$-1 \cdot 3 + 0 \cdot 2 - 1 \cdot 3 = -6$|$0$|$0$|$(-1, 0, -1)$
$1$|$(-1, 0, -1)$|$3$|$(4, 2, 4)$|$0$|$-1 \cdot 4 + 0 \cdot 2 - 1 \cdot 4 = -8$|$0$|$0$|$(-1, 0, -1)$
$1$|$(-1, 0, -1)$|$4$|$(3, 1, 2)$|$0$|$-1 \cdot 3 + 0 \cdot 1 - 1 \cdot 2 = -5$|$0$|$0$|$(-1, 0, -1)$
$1$|$(-1, 0, -1)$|$5$|$(1, 4, -1)$|$1$|$-1 \cdot 1 + 0 \cdot 4 + 1 \cdot 1 = 0$|$1$|$0$|$(-1, 0, -1)$
$1$|$(-1, 0, -1)$|$6$|$(2, 3, 0)$|$1$|$-1 \cdot 2 + 0 \cdot 3 - 1 \cdot 0 = -2$|$0$|$1$|$(-1, 0, -1)+(2, 3, 0) = (1, 3, -1)$
$1$|$(1, 3, -1)$|$7$|$(1, 2, 0)$|$1$|$1 \cdot 1 + 3 \cdot 2 - 1 \cdot 0 = 7$|$1$|$0$|$(1, 3, -1)$
$1$|$(1, 3, -1)$|$8$|$(0, 1, 0)$|$1$|$1 \cdot 0 + 3 \cdot 1 - 1 \cdot 0 = 3$|$1$|$0$|$(1, 3, -1)$
$2$|$(1, 3, -1)$|$9$|$(2, 1, 2)$|$0$|$1 \cdot 2 + 3 \cdot 1 - 1 \cdot 2 = 3$|$1$|$-1$|$(1, 3, -1)-(2, 1, 2) = (-1, 2, -3)$
$2$|$(-1, 2, -3)$|$10$|$(3, 2, 3)$|$0$|$-1 \cdot 3 + 2 \cdot 2 - 3 \cdot 3 = -8$|$0$|$0$|$(-1, 2, -3)$
$2$|$(-1, 2, -3)$|$11$|$(4, 2, 4)$|$0$|$-1 \cdot 4 + 2 \cdot 2 - 3 \cdot 4 = -12$|$0$|$0$|$(-1, 2, -3)$
$2$|$(-1, 2, -3)$|$12$|$(3, 1, 2)$|$0$|$-1 \cdot 3 + 2 \cdot 1 - 3 \cdot 2 = -7$|$0$|$0$|$(-1, 2, -3)$
$2$|$(-1, 2, -3)$|$13$|$(1, 4, -1)$|$1$|$-1 \cdot 1 + 2 \cdot 4 + 3 \cdot 1 = 10$|$1$|$0$|$(-1, 2, -3)$
$2$|$(-1, 2, -3)$|$14$|$(2, 3, 0)$|$1$|$-1 \cdot 2 + 2 \cdot 3 - 3 \cdot 0 = 4$|$1$|$0$|$(-1, 2, -3)$
$2$|$(-1, 2, -3)$|$15$|$(1, 2, 0)$|$1$|$-1 \cdot 1 + 2 \cdot 2 - 3 \cdot 0 = 3$|$1$|$0$|$(-1, 2, -3)$
$2$|$(-1, 2, -3)$|$16$|$(0, 1, 0)$|$1$|$-1 \cdot 0 + 2 \cdot 1 - 3 \cdot 0 = 2$|$1$|$0$|$(-1, 2, -3)$
$3$|$(-1, 2, -3)$|$17$|$(2, 1, 2)$|$0$|$-1 \cdot 2 + 2 \cdot 1 - 3 \cdot 2 = -6$|$0$|$0$|$(-1, 2, -3)$
$3$|$(-1, 2, -3)$|$18$|$(3, 2, 3)$|$0$|$-1 \cdot 3 + 2 \cdot 2 - 3 \cdot 3 = -8$|$0$|$0$|$(-1, 2, -3)$
$3$|$(-1, 2, -3)$|$19$|$(4, 2, 4)$|$0$|$-1 \cdot 4 + 2 \cdot 2 - 3 \cdot 4 = -12$|$0$|$0$|$(-1, 2, -3)$
$3$|$(-1, 2, -3)$|$20$|$(3, 1, 2)$|$0$|$-1 \cdot 3 + 2 \cdot 1 - 3 \cdot 2 = -7$|$0$|$0$|$(-1, 2, -3)$
$3$|$(-1, 2, -3)$|$21$|$(1, 4, -1)$|$1$|$-1 \cdot 1 + 2 \cdot 4 + 3 \cdot 1 = 10$|$1$|$0$|$(-1, 2, -3)$
$3$|$(-1, 2, -3)$|$22$|$(2, 3, 0)$|$1$|$-1 \cdot 2 + 2 \cdot 3 - 3 \cdot 0 = 4$|$1$|$0$|$(-1, 2, -3)$
$3$|$(-1, 2, -3)$|$23$|$(1, 2, 0)$|$1$|$-1 \cdot 1 + 2 \cdot 2 - 3 \cdot 0 = 3$|$1$|$0$|$(-1, 2, -3)$
$3$|$(-1, 2, -3)$|$24$|$(0, 1, 0)$|$1$|$-1 \cdot 0 + 2 \cdot 1 - 3 \cdot 0 = 2$|$1$|$0$|$(-1, 2, -3)$

Again, the algorithm stops at the end of the third step because the readout weight vector $w ≝ (-1, 2, -3)$ hasn’t changed throughout this step: all the points have been correctly classified.

The decision boundary is the plane given by the equation:

\[\underbrace{w_0}_{≝ -1} x + \underbrace{w_1}_{≝ 2} y + \underbrace{w_2}_{≝ -3} z = 0 ⟺ z = \frac 2 3 y - \frac 1 3 x\]
Figure 1.b.2. - Plot of the resulting decision plane (in green) for the two 3D data sets
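As in the 2D case, a quick numerical check (a minimal numpy sketch, same conventions as before) confirms that $w = (-1, 2, -3)$ classifies all eight 3D points correctly:

import numpy as np

X = np.array([[2, 1, 2], [3, 2, 3], [4, 2, 4], [3, 1, 2],
              [1, 4, -1], [2, 3, 0], [1, 2, 0], [0, 1, 0]], dtype=float)
Y = np.array([0, 0, 0, 0, 1, 1, 1, 1])  # 0 ≡ blue / 1 ≡ red

w = np.array([-1., 2., -3.])

# Predicted label: 1 (red) iff ⟨w, x⟩ ≥ 0
assert np.array_equal((X.dot(w) >= 0).astype(int), Y)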

c. Optional: Code the perceptron rule in a computer, and use the datasets of exercises a) and b) to check the validity of your answers. Please attach your code and the results obtained.

Here is a Python implementation:

import numpy as np

def perceptron_rule(X, Y):
    nb_trials, dimension = X.shape
    # Initialize the readout weight vector to all ones
    w = np.ones(dimension)
    w_values = [np.copy(w)]

    while True:
        # is the readout vector still changing?
        # (to halt the loop, in case it isn't)
        still_changing = False

        for i in range(nb_trials):
            # Perceptron rule: w ← w + (y - H(⟨w, x⟩)) x, with H(u) = 1 iff u ≥ 0
            w += (Y[i] - int(w.dot(X[i]) >= 0)) * X[i]

            if np.any(w_values[-1] != w):
                still_changing = True
            w_values.append(np.copy(w))

        if not still_changing:
            break

    # Returning all the readout weight vectors,
    # except those of the last step
    # (throughout which the readout vector didn't change)
    return np.array(w_values[:-nb_trials])
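It can be run on both data sets as follows (a usage sketch, assuming the same $0/1$ labelling as above):

# 2D data set of exercise a)
X_2d = np.array([[1, 2], [1, 3], [3, 4], [2, 4],
                 [4, 0], [6, 2], [2, -3], [6, -2]], dtype=float)

# 3D data set of exercise b)
X_3d = np.array([[2, 1, 2], [3, 2, 3], [4, 2, 4], [3, 1, 2],
                 [1, 4, -1], [2, 3, 0], [1, 2, 0], [0, 1, 0]], dtype=float)

Y = np.array([0, 0, 0, 0, 1, 1, 1, 1])  # 0 ≡ blue / 1 ≡ red

perceptron_rule(X_2d, Y)
perceptron_rule(X_3d, Y)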

Here are the outputs:

  • for the 2D data set:

      array([[ 1.,  1.],
         [ 0., -1.],
         [ 0., -1.],
         [ 0., -1.],
         [ 0., -1.],
         [ 0., -1.],
         [ 6.,  1.],
         [ 6.,  1.],
         [ 6.,  1.],
         [ 5., -1.],
         [ 4., -4.],
         [ 4., -4.],
         [ 4., -4.],
         [ 4., -4.],
         [ 4., -4.],
         [ 4., -4.],
         [ 4., -4.]])
    

    which gives the following successive decision boundaries (the successive normalized readout vectors are plotted in purple) throughout the iterations of the main loop:

    Figure 1.c.1. - Plot of the successive decision boundaries throughout the algorithm (with their corresponding normalized readout vectors in purple)
  • for the 3D data set:

      array([[ 1.,  1.,  1.],
         [-1.,  0., -1.],
         [-1.,  0., -1.],
         [-1.,  0., -1.],
         [-1.,  0., -1.],
         [-1.,  0., -1.],
         [ 1.,  3., -1.],
         [ 1.,  3., -1.],
         [ 1.,  3., -1.],
         [-1.,  2., -3.],
         [-1.,  2., -3.],
         [-1.,  2., -3.],
         [-1.,  2., -3.],
         [-1.,  2., -3.],
         [-1.,  2., -3.],
         [-1.,  2., -3.],
         [-1.,  2., -3.]])
    

    which gives the following successive decision planes throughout the iterations of the main loop:

    Figure 1.c.2. - Plot of the successive decision planes throughout the algorithm (the darker the plane, the higher the corresponding iteration number)

which is exactly what we’ve computed by hand.

2. Hopfield networks

a. Consider a recurrent network of five binary neurons. Use the Hopfield rule to determine the synaptic weights of the network so that the pattern $ξ^\ast = (1, -1, -1, 1, -1) ∈ 𝔐_{1, 5}(ℝ)$ is memorized. Show explicitly that $ξ^\ast$ is a fixed point of the dynamics.

According to the Hopfield rule, the weights of the synaptic matrix $W ≝ (w_{ij})_{1 ≤ i, j ≤ 5}$ are to be set to:

\[w_{ij} ≝ \frac 1 5 ξ^\ast_i ξ^\ast_j\]

The resulting synaptic matrix is:

\[W ≝ \frac 1 5 {ξ^\ast}^\T ξ^\ast = \frac 1 5 \begin{pmatrix} 1 & -1 & -1 & 1 & -1 \\ -1 & 1 & 1 & -1 & 1 \\ -1 & 1 & 1 & -1 & 1 \\ 1 & -1 & -1 & 1 & -1 \\ -1 & 1 & 1 & -1 & 1 \\ \end{pmatrix}\]

As a matter of fact:

\[\begin{align*} \sign\left(W {ξ^\ast}^\T\right) & = \sign\left(\frac 1 5 \left({ξ^\ast}^\T ξ^\ast\right) {ξ^\ast}^\T\right) \\ & = \sign\left(\frac 1 5 {ξ^\ast}^\T \left(ξ^\ast {ξ^\ast}^\T\right)\right) &&\text{(because the matrix product is associative)}\\ &= \sign\Bigg(\frac 1 5 {ξ^\ast}^\T \overbrace{\Bigg(\sum\limits_{ i=1 }^5 \underbrace{(ξ^\ast_i)^2}_{= \; 1}\Bigg)}^{= \; 5}\Bigg) \\ &= \sign\left({ξ^\ast}^\T\right) \\ &= {ξ^\ast}^\T &&\text{(because } ξ^\ast ∈ \lbrace -1, 1 \rbrace^5 \text{ )} \end{align*}\]

So ${ξ^\ast}^\T$ ($ξ^\ast$ by abuse of notation) is indeed a fixed point of the dynamics.
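This can also be checked numerically (a minimal numpy sketch, with the same conventions as the code of question d. below):

import numpy as np

xi_ast = np.array([1., -1, -1, 1, -1])

# Hopfield rule for a single pattern: W = (1/5) ξ*ᵀ ξ*
W = np.outer(xi_ast, xi_ast) / 5

# ξ* is a fixed point of the dynamics: sign(W ξ*ᵀ) = ξ*ᵀ
assert np.array_equal(np.sign(W.dot(xi_ast)), xi_ast)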

b. What is the output of the network if the dynamics start from the initial state $ξ = (1, -1, -1, 1, -1)$?

As shown before, if the dynamics start from the initial state $ξ = (1, -1, -1, 1, -1)$, the output is the same state $(1, -1, -1, 1, -1)$, as it is a fixed point.

Based on the teacher’s remark in class, I think the question was rather meant to be about the initial state being $ξ ≝ (-1, 1, 1, -1, 1) = - ξ^\ast$.

In this case, reusing the previous computation:

\[\begin{align*} \sign\left(W ξ^\T\right) & = \sign\left(\frac 1 5 \left({ξ^\ast}^\T ξ^\ast\right) ξ^\T\right) \\ & = \sign\left(\frac 1 5 \left({ξ^\ast}^\T ξ^\ast\right) (-ξ^\ast)^\T\right) \\ & = \sign\left(- \frac 1 5 \left({ξ^\ast}^\T ξ^\ast\right) {ξ^\ast}^\T\right) \\ & = - \; \sign\Bigg(\frac 1 5 {ξ^\ast}^\T \underbrace{\left(ξ^\ast {ξ^\ast}^\T\right)}_{= \; 5}\Bigg) &&\text{(because } \sign \text{ is an odd function)}\\ &= - \; \underbrace{\sign\left({ξ^\ast}^\T\right)}_{\rlap{= \; {ξ^\ast}^\T \text{ (cf. the previous question)}}} \\ &= - {ξ^\ast}^\T \\ &= ξ^\T &&\text{(by definition of } ξ \text{ )} \end{align*}\]

As a result: $ξ^\T ≝ -{ξ^\ast}^\T$ ($ξ$ by abuse of notation) is also a fixed point of the dynamics.

c. Using the Hopfield rule, modify the synaptic weights so that $η^\ast = (1, 1, -1, -1, -1)$ is also memorized. Verify that both $ξ^\ast$ and $η^\ast$ are fixed points of the dynamics.

This time, the Hopfield rule yields:

\[∀ i, j ∈ \lbrace 1, \ldots, 5 \rbrace, \quad w_{ij} ≝ \frac 1 5 (ξ^\ast_i ξ^\ast_j + η^\ast_i η^\ast_j)\]

The resulting synaptic matrix is:

\[W ≝ \frac 1 5 ({ξ^\ast}^\T ξ^\ast + {η^\ast}^\T η^\ast) = \frac 2 5 \begin{pmatrix} 1 & 0 & -1 & 0 & -1 \\ 0 & 1 & 0 & -1 & 0 \\ -1 & 0 & 1 & 0 & 1 \\ 0 & -1 & 0 & 1 & 0 \\ -1 & 0 & 1 & 0 & 1 \\ \end{pmatrix}\]

As a matter of fact, now:

\[\begin{align*} \sign\left(W {ξ^\ast}^\T\right) & = \sign\left(\frac 1 5 \left({ξ^\ast}^\T ξ^\ast + {η^\ast}^\T η^\ast\right) {ξ^\ast}^\T\right) \\ & = \sign\Big(\underbrace{\frac 1 5 \left({ξ^\ast}^\T ξ^\ast\right) {ξ^\ast}^\T}_{\rlap{= \; {ξ^\ast}^\T \; \text{(cf. previous questions)}}} + \frac 1 5 \left({η^\ast}^\T η^\ast\right) {ξ^\ast}^\T\Big)\\ & = \sign\Big({ξ^\ast}^\T + \frac 1 5 \, {η^\ast}^\T \underbrace{\left(η^\ast {ξ^\ast}^\T\right)}_{\rlap{= \; ⟨η^\ast,\, ξ^\ast⟩ \, = \, \sum\limits_{ i = 1}^5 η^\ast_i ξ^\ast_i \, = \, 1}}\Big)\\ &= \sign\Big({ξ^\ast}^\T + \frac 1 5 {η^\ast}^\T\Big) \\ &\overset{⊛}{=} \sign\left({ξ^\ast}^\T\right) &&\text{(because } \frac 1 5 η^\ast ∈ \left\lbrace - \frac 1 5, \frac 1 5 \right\rbrace^5 \text{: it can't change the sign of } ξ^\ast ∈ \lbrace -1, 1 \rbrace^5 \text{ )} \\ &= {ξ^\ast}^\T \end{align*}\]

And likewise:

\[\begin{align*} \sign\left(W {η^\ast}^\T\right) & = \sign\left(\frac 1 5 \left({ξ^\ast}^\T ξ^\ast + {η^\ast}^\T η^\ast\right) {η^\ast}^\T\right) \\ & = \sign\Big(\frac 1 5 \left({ξ^\ast}^\T ξ^\ast\right) {η^\ast}^\T + \frac 1 5 \left({η^\ast}^\T η^\ast\right) {η^\ast}^\T\Big)\\ & = \sign\Big(\frac 1 5 {ξ^\ast}^\T \underbrace{\left(ξ^\ast {η^\ast}^\T\right)}_{= \, ⟨ξ^\ast, \, η^\ast⟩ \, = \, 1} + \frac 1 5 {η^\ast}^\T \underbrace{\left(η^\ast {η^\ast}^\T\right)}_{\rlap{= \, ⟨η^\ast, η^\ast⟩ \, = \, \sum\limits_{ i=1 }^5 (η^\ast_i)^2 \, = \, 5}}\Big)\\ & = \sign\Big(\frac 1 5 \, {ξ^\ast}^\T + {η^\ast}^\T \Big)\\ &\overset{⊛}{=} \sign\left({η^\ast}^\T\right) &&\text{(because } \frac 1 5 ξ^\ast ∈ \left\lbrace - \frac 1 5, \frac 1 5 \right\rbrace^5 \text{: it can't change the sign of } η^\ast ∈ \lbrace -1, 1 \rbrace^5 \text{ )} \\ &= {η^\ast}^\T \end{align*}\]

So ${ξ^\ast}^\T$ and ${η^\ast}^\T$ ($ξ^\ast$ and $η^\ast$ by abuse of notation) are both fixed points of the dynamics.
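Here again, a quick numerical verification (a minimal numpy sketch, with the same $W$ as in the code of question d. below):

import numpy as np

xi_ast = np.array([1., -1, -1, 1, -1])
eta_ast = np.array([1., 1, -1, -1, -1])

# Hopfield rule for the two patterns ξ* and η*
W = (np.outer(xi_ast, xi_ast) + np.outer(eta_ast, eta_ast)) / 5

# Both stored patterns are fixed points of the dynamics
assert np.array_equal(np.sign(W.dot(xi_ast)), xi_ast)
assert np.array_equal(np.sign(W.dot(eta_ast)), eta_ast)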

NB: in the computations, we see that the equality $⊛$ is crucial to have ${ξ^\ast}^\T$ and ${η^\ast}^\T$ be fixed points. This stems from the fact that

\[{ξ^\ast}^\T + \frac 1 5 \, {η^\ast}^\T ⟨η^\ast, ξ^\ast⟩ \quad \text{(resp. } {η^\ast}^\T + \frac 1 5 \, {ξ^\ast}^\T ⟨η^\ast, ξ^\ast⟩ \text{ )}\]

has the same sign as ${ξ^\ast}^\T$ (resp. ${η^\ast}^\T$), which itself comes from the fact that $\vert ⟨η^\ast, ξ^\ast⟩ \vert < 5$ (note that $\vert ⟨η^\ast, ξ^\ast⟩ \vert ≤ 5$ is a consequence of $η^\ast, ξ^\ast ∈ \lbrace -1, 1 \rbrace^5$, but the latter condition is not enough to ensure that $\vert ⟨η^\ast, ξ^\ast⟩ \vert < 5$).

So a sufficient condition for any two patterns $ξ, η ∈ \lbrace -1, 1 \rbrace^N$ to both be fixed points of the dynamics whose synaptic matrix is given by the Hopfield rule is:

\[\vert \underbrace{⟨η, ξ⟩}_{≝ \; \sum\limits_{ i=1 }^N η_i ξ_i } \vert < N\]

i.e. (as $ξ, η ∈ \lbrace -1, 1 \rbrace^N$):

\[\vert ⟨η, ξ⟩ \vert ≠ N\]
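For the two patterns at hand, this condition is indeed satisfied:

\[⟨η^\ast, ξ^\ast⟩ = 1 \cdot 1 + 1 \cdot (-1) + (-1) \cdot (-1) + (-1) \cdot 1 + (-1) \cdot (-1) = 1, \qquad \vert 1 \vert < 5\]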

d. What is the output of the network if the dynamics starts from $ξ= (1, -1, 1, 1, 1)$?

We can check that

\[\sign\left(W ξ^\T\right) = - {η^\ast}^\T\]

which is a fixed point of the dynamics, since $η^\ast$ is a fixed point and inverse patterns are memorized as well.
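Explicitly, with the synaptic matrix $W$ of question c.:

\[W ξ^\T = \frac 2 5 \begin{pmatrix} 1 & 0 & -1 & 0 & -1 \\ 0 & 1 & 0 & -1 & 0 \\ -1 & 0 & 1 & 0 & 1 \\ 0 & -1 & 0 & 1 & 0 \\ -1 & 0 & 1 & 0 & 1 \\ \end{pmatrix} \begin{pmatrix} 1 \\ -1 \\ 1 \\ 1 \\ 1 \end{pmatrix} = \frac 2 5 \begin{pmatrix} -1 \\ -2 \\ 1 \\ 2 \\ 1 \end{pmatrix} \quad \text{so that} \quad \sign\left(W ξ^\T\right) = \begin{pmatrix} -1 \\ -1 \\ 1 \\ 1 \\ 1 \end{pmatrix} = -{η^\ast}^\T\]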

So in one step, the fixed point $- η^\ast$ is reached when starting from the state $ξ = (1, -1, 1, 1, 1)$.

The following code returns a markdown table of the states reached step by step from a given starting state `state`:

import numpy as np
from IPython.display import display, Markdown

N = 5
xi_ast = np.array([1., -1, -1, 1, -1])
eta_ast = np.array([1., 1, -1, -1, -1])

# Hopfield rule for the two patterns ξ* and η*
W = (np.outer(xi_ast, xi_ast) + np.outer(eta_ast, eta_ast))/N

def dynamics(state):
    return np.sign(W.dot(state))

fixed_point_reached = False
state = np.array([1., -1, 1, 1, 1])

table = '''
Step|State|
:--:|:--:|
0|{}
'''.format(state)

step = 1

while not fixed_point_reached:
    new_state = dynamics(state)

    if (new_state == state).all():
        fixed_point_reached = True

    table += "{}|{}\n".format(step, new_state)

    state = new_state
    step += 1

print(table)

# To display the properly formatted table in a Jupyter notebook, for instance:
display(Markdown(table))

With the starting state `state = np.array([1., -1, 1, 1, 1])`, it gives:

Step|State
:--:|:--:
0|[ 1. -1. 1. 1. 1.]
1|[-1. -1. 1. 1. 1.]
2|$\underbrace{[-1. \; -1. \; 1. \; 1. \; 1.]}_{= \, -η^\ast}$
