Tutorial 2: Learning with structure

Tutorial 2: Learning with structure#

Week 2, Day 2: Neuro-Symbolic Methods

By Neuromatch Academy

Content creators: P. Michael Furlong, Chris Eliasmith

Content reviewers: Hlib Solodzhuk, Patrick Mineault, Aakash Agrawal, Alish Dipani, Hossein Rezaei, Yousef Ghanbari, Mostafa Abdollahi

Production editors: Konstantine Tsafatinos, Ella Batty, Spiros Chavlis, Samuele Bolotta, Hlib Solodzhuk

Tutorial Objectives#

Estimated timing of tutorial: 50 minutes

This tutorial will present you with a couple of play-examples on the usage of basic operations of vector symbolic algebras while generalizing to the new knowledge.

Setup#

Install and import feedback gadget#

Notice that exactly the neuromatch branch of sspspace should be installed! Otherwise, some of the functionality (like optimize parameter in the DiscreteSPSpace initialization) won’t work.

Install dependencies#

Imports#

Figure settings#

Plotting functions#

Show code cell source Hide code cell source

# @title Plotting functions

def plot_similarity_matrix(sim_mat, labels, values = False):
    """
    Plot the similarity matrix between vectors.

    Inputs:
    - sim_mat (numpy.ndarray): similarity matrix between vectors.
    - labels (list of str): list of strings which represent concepts.
    - values (bool): True if we would like to plot values of similarity too.
    """
    with plt.xkcd():
        plt.imshow(sim_mat, cmap='Greys')
        plt.colorbar()
        plt.xticks(np.arange(len(labels)), labels, rotation=45, ha="right", rotation_mode="anchor")
        plt.yticks(np.arange(len(labels)), labels)
        if values:
            for x in range(sim_mat.shape[1]):
                for y in range(sim_mat.shape[0]):
                    plt.text(x, y, f"{sim_mat[y, x]:.2f}", fontsize = 8, ha="center", va="center", color="green")
        plt.title('Similarity between vector-symbols')
        plt.xlabel('Symbols')
        plt.ylabel('Symbols')
        plt.show()

def plot_training_and_choice(losses, sims, ant_names, cons_names, action_names):
    """
    Plot loss progression over training as well as predicted similarities for given rules / correct solutions.

    Inputs:
    - losses (list): list of loss values.
    - sims (list): list of similartiy matrices.
    - ant_names (list): list of antecedance names.
    - cons_names (list): list of consequent names.
    - action_names (list): full list of concepts.
    """
    with plt.xkcd():
        plt.subplot(1, len(ant_names) + 1, 1)
        plt.plot(losses)
        plt.xlabel('Training number')
        plt.ylabel('Loss')
        plt.title('Training Error')
        index = 1
        for ant_name, cons_name, sim in zip(ant_names, cons_names, sims):
            index += 1
            plt.subplot(1, len(ant_names) + 1, index)
            plt.bar(range(len(action_names)), sim.flatten())
            plt.gca().set_xticks(range(len(action_names)))
            plt.gca().set_xticklabels(action_names, rotation=90)
            plt.title(f'{ant_name}, not*{cons_name}')

def plot_choice(sims, ant_names, cons_names, action_names):
    """
    Plot predicted similarities for given rules / correct solutions.
    """
    with plt.xkcd():
        index = 0
        for ant_name, cons_name, sim in zip(ant_names, cons_names, sims):
            index += 1
            plt.subplot(1, len(ant_names) + 1, index)
            plt.bar(range(len(action_names)), sim.flatten())
            plt.gca().set_xticks(range(len(action_names)))
            plt.gca().set_xticklabels(action_names, rotation=90)
            plt.ylabel("Similarity")
            plt.title(f'{ant_name}, not*{cons_name}')

Set random seed#

Helper functions#

Show code cell source Hide code cell source

# @title Helper functions

set_seed(42)

symbol_names = ['monarch','heir','male','female']
discrete_space = sspspace.DiscreteSPSpace(symbol_names, ssp_dim=1024, optimize=False)

objs = {n:discrete_space.encode(n) for n in symbol_names}

objs['king'] = objs['monarch'] * objs['male']
objs['queen'] = objs['monarch'] * objs['female']
objs['prince'] = objs['heir'] * objs['male']
objs['princess'] = objs['heir'] * objs['female']

bundle_objs = {n:discrete_space.encode(n) for n in symbol_names}

bundle_objs['king'] = (bundle_objs['monarch'] + bundle_objs['male']).normalize()
bundle_objs['queen'] = (bundle_objs['monarch'] + bundle_objs['female']).normalize()
bundle_objs['prince'] = (bundle_objs['heir'] + bundle_objs['male']).normalize()
bundle_objs['princess'] = (bundle_objs['heir'] + bundle_objs['female']).normalize()

bundle_objs['princess_query'] = (bundle_objs['prince'] - bundle_objs['king']) + bundle_objs['queen']

new_symbol_names = ['dollar', 'peso', 'ottawa', 'mexico-city', 'currency', 'capital']
new_discrete_space = sspspace.DiscreteSPSpace(new_symbol_names, ssp_dim=1024, optimize=False)

new_objs = {n:new_discrete_space.encode(n) for n in new_symbol_names}

cleanup = sspspace.Cleanup(new_objs)

new_objs['canada'] = ((new_objs['currency'] * new_objs['dollar']) + (new_objs['capital'] * new_objs['ottawa'])).normalize()
new_objs['mexico'] = ((new_objs['currency'] * new_objs['peso']) + (new_objs['capital'] * new_objs['mexico-city'])).normalize()

card_states = ['red','blue','odd','even','not','green','prime','implies','ant','relation','cons']
encoder = sspspace.DiscreteSPSpace(card_states, ssp_dim=1024, optimize=False)
vocab = {c:encoder.encode(c) for c in card_states}

for a in ['red','blue','odd','even','green','prime']:
    vocab[f'not*{a}'] = vocab['not'] * vocab[a]

action_names = ['red','blue','odd','even','green','prime','not*red','not*blue','not*odd','not*even','not*green','not*prime']
action_space = np.array([vocab[x] for x in action_names]).squeeze()

rules = [
    (vocab['ant'] * vocab['blue'] + vocab['relation'] * vocab['implies'] + vocab['cons'] * vocab['even']).normalize(),
    (vocab['ant'] * vocab['odd'] + vocab['relation'] * vocab['implies'] + vocab['cons'] * vocab['green']).normalize(),
]

Section 1: Analogies. Part 1#

In this section we will construct a simple analogy using Vector Symbolic Algebras. The question we are going to try and solve is “King is to the queen as the prince is to X.”

Video 1: Analogy 1#

Submit your feedback#

Coding Exercise 1: Royal Relationships#

We’re going to start by considering our vocabulary. We will use the basic discrete concepts of monarch, heir, male, and female.

Let’s create the objects we know about by combinatorially expanding the space:

King is a male monarch
Queen is a female monarch
Prince is a male heir
Princess is a female heir

Complete the missing parts of the code to obtain correct representations of new concepts.

###################################################################
## Fill out the following then remove
raise NotImplementedError("Student exercise: complete correct relations for creating new concepts.")
###################################################################

set_seed(42)

symbol_names = ['monarch','heir','male','female']
discrete_space = sspspace.DiscreteSPSpace(symbol_names, ssp_dim=1024, optimize=False)

objs = {n:discrete_space.encode(n) for n in symbol_names}

objs['king'] = objs['monarch'] * objs['male']
objs['queen'] = objs['monarch'] * ...
objs['prince'] = objs['heir'] * objs['male']
objs['princess'] = ... * objs['female']

Click for solution

Now, we can take an explicit approach. We know that the conversion from king to queen is to unbind male and bind female, so let’s apply that to our prince object and see what we uncover.

At first, in the cell below, let’s recover queen from king by constructing a new query concept, which represents the unbinding of male and the binding of female. Then, let’s see if this new query object bears any similarity to anything in our vocabulary.

###################################################################
## Fill out the following then remove
raise NotImplementedError("Student exercise: complete correct relation for creating `query` object to compare with `queen`.")
###################################################################

objs['queen_query'] = (objs[...] * ~objs[...]) * objs[...]

object_names = list(objs.keys())
sims = np.zeros((len(object_names), len(object_names)))

for name_idx, name in enumerate(object_names):
    for other_idx in range(name_idx, len(object_names)):
        sims[name_idx, other_idx] = sims[other_idx, name_idx] = (objs[name] | objs[object_names[other_idx]]).item()

plot_similarity_matrix(sims, object_names, values = True)

Click for solution

Example output:

The above similarity plot shows that applying that operation successfully converts king to queen. Let’s apply it to ‘prince’ and see what happens. Now, query should represent the princess concept.

objs['princess_query'] = (objs['prince'] * ~objs['male']) * objs['female']

object_names = list(objs.keys())
sims = np.zeros((len(object_names), len(object_names)))

for name_idx, name in enumerate(object_names):
    for other_idx in range(name_idx, len(object_names)):
        sims[name_idx, other_idx] = sims[other_idx, name_idx] = (objs[name] | objs[object_names[other_idx]]).item()

plot_similarity_matrix(sims, object_names, values = True)

../../../_images/15c7a3588a8bcc8a9257de3c87cc9ce4d3e7e196a9daacd48a265074e9f8c0f5.png

Here, we have successfully recovered the princess, completing the analogy.

This approach, however, requires explicit knowledge of the construction of the objects. Let’s see if we can just work with the concepts of ‘king,’ ‘queen,’ and ‘prince’ directly.

In the cell below, construct the princess concept using only king, queen, and prince.

###################################################################
## Fill out the following then remove
raise NotImplementedError("Student exercise: complete correct relation for creating `query` object to compare with `princess`.")
###################################################################

objs['new_princess_query'] = (objs[...] * ~objs[...]) * objs[...]

sims = np.zeros((len(object_names), len(object_names)))

for name_idx, name in enumerate(object_names):
    for other_idx in range(name_idx, len(object_names)):
        sims[name_idx, other_idx] = sims[other_idx, name_idx] = (objs[name] | objs[object_names[other_idx]]).item()

plot_similarity_matrix(sims, object_names, values = True)

Click for solution

Example output:

Again, we see that we have recovered the princess by using our analogy.

That said, the above depends on knowing that the representations are constructed using binding. Can we do something similar through the bundling operation? Let’s try that out.

Reassing concept definitions using bundling operation.

set_seed(42)

bundle_objs = {n:discrete_space.encode(n) for n in symbol_names}

bundle_objs['king'] = (bundle_objs['monarch'] + bundle_objs['male']).normalize()
bundle_objs['queen'] = (bundle_objs['monarch'] + bundle_objs['female']).normalize()
bundle_objs['prince'] = (bundle_objs['heir'] + bundle_objs['male']).normalize()
bundle_objs['princess'] = (bundle_objs['heir'] + bundle_objs['female']).normalize()

But now that we are using an additive model, we need to take a different approach. Instead of unbinding the king and binding the queen, we subtract the king and add the queen to find the princess from the prince.

Complete the code to reflect the updated mechanism.

###################################################################
## Fill out the following then remove
raise NotImplementedError("Student exercise: complete correct relation for creating `query` object to compare with `princess`.")
###################################################################

bundle_objs['princess_query'] = (bundle_objs[...] - bundle_objs[...]) + bundle_objs[...]

Click for solution

object_names = list(bundle_objs.keys())

sims = np.zeros((len(object_names), len(object_names)))

for name_idx, name in enumerate(object_names):
    for other_idx in range(name_idx, len(object_names)):
        sims[name_idx, other_idx] = sims[other_idx, name_idx] = (bundle_objs[name] | bundle_objs[object_names[other_idx]]).item()

plot_similarity_matrix(sims, object_names, values = True)

../../../_images/1f12dcbd376834c41479440a2f6acb51b9bd4982b923f60b9bf0c242ab0a0650.png

This is a messier similarity plot due to the fact that the bundled representations interact with all their constituent parts in the vocabulary. That said, we see that ‘princess’ is still most similar to the query vector.

This approach is more like what we would expect from a word2vec embedding.

Submit your feedback#

Section 2: Analogies. Part 2#

Estimated timing to here from start of tutorial: 15 minutes

In this section, we will construct a database of data structures that describe different countries. Materials are adopted from Kanerva (2010).

Video 2: Analogy 2#

Submit your feedback#

Coding Exercise 2: Dollar of Mexico#

This is going to be a little more involved because to construct the data structure, we are going to need vectors that not only represent values that we are reasoning about, but also vectors that represent different roles data can play. This is sometimes called a slot-filler representation or a key-value representation.

At first, let us define concepts and cleanup object. Then, we will define canada and mexico concepts by integrating the available information together. You will be provided with a canada object and your task is to complete for mexico one. Note that:

We bind currency to the relevant currency for that country (dollar for Canada, peso for Mexico)
We bind capital to the relevant capital for that country (Ottawa for Canada, Mexico City for Mexico)

###################################################################
## Fill out the following then remove
raise NotImplementedError("Student exercise: complete `mexico` concept.")
###################################################################

set_seed(42)

new_symbol_names = ['dollar', 'peso', 'ottawa', 'mexico-city', 'currency', 'capital']
new_discrete_space = sspspace.DiscreteSPSpace(new_symbol_names, ssp_dim=1024, optimize=False)

new_objs = {n:new_discrete_space.encode(n) for n in new_symbol_names}

cleanup = sspspace.Cleanup(new_objs)

new_objs['canada'] = ((new_objs['currency'] * new_objs['dollar']) + (new_objs['capital'] * new_objs['ottawa'])).normalize()
new_objs['mexico'] = ((new_objs['currency'] * ...) + (new_objs['capital'] * ...)).normalize()

Click for solution

We would like to find out Mexico’s currency. Complete the code for constructing a query which will help us do that. Note that we are using a cleanup operation (feel free to remove it and compare the results).

###################################################################
## Fill out the following then remove
raise NotImplementedError("Student exercise: complete `query` concept which will be similar to currency in Mexico.")
###################################################################

objs['peso_query'] = cleanup(~new_objs[...] * new_objs[...] * new_objs['mexico'])

object_names = list(new_objs.keys())
sims = np.zeros((len(object_names), len(object_names)))

for name_idx, name in enumerate(object_names):
    for other_idx in range(name_idx, len(object_names)):
        sims[name_idx, other_idx] = sims[other_idx, name_idx] = (new_objs[name] | new_objs[object_names[other_idx]]).item()

plot_similarity_matrix(sims, object_names)

Click for solution

Example output:

After cleanup, the query vector is the most similar to the peso object in the vocabulary, correctly answering the question.

Note, however, that the similarity is not perfectly equal to 1. This is due to the scale factors applied to the composite vectors canada and mexico, to ensure they remain unit vectors, and due to cross talk. Crosstalk is a symptom of the fact that we are binding and unbinding bundles of vector symbols to produce the resultant query vector. The constituent vectors are not perfectly orthogonal (i.e., having a dot product of zero), and as such, the terms in the bundle interact when we measure similarity between them.

Submit your feedback#

Section 3: Wason Card Task#

Estimated timing to here from start of tutorial: 25 minutes

One of the powerful benefits of using these structured representations is being able to generalize to other circumstances. To demonstrate this, we are going to show you this in a simple task.

Video 3: Wason Card Task Intro#

Submit your feedback#

Coding Exercise 3: Wason Card Task#

We are going to test the generalization property on the Wason Card Task, where a person is told a rule of the form “if the card is even, then the back is blue”, they are then presented with a number of cards with either an odd number, an even number, a red back, or a blue back. The participant is asked which cards they have to flip to determine that the rule is true.

In this case, the participant needs to flip only the even card(s), and any card where the back is not blue, as the rule does not state whether or not odd numbers can have blue backs, and a red-backed card with an even number would violate the rule. We can get this from Boolean logic:

\[ \mathrm{even} \implies \mathrm{blue} \]

which is equal to

\[ \neg \mathrm{even} \vee \mathrm{blue} \]

where \(\neg\) means a logical not and \(\vee\) means logical or. If we want to find cards that violate the rule, then we negate the rule, providing:

\[ \neg (\neg \mathrm{even} \vee \mathrm{blue}) = \mathrm{even} \wedge \neg \mathrm{blue}. \]

Where \(\wedge\) is the logical and. Hence, the cards that can violate the rule are either even or not blue.

At first, we will define all the needed concepts. For all noun concepts, we would also like to have not concept presented in the space; please complete missing code parts.

set_seed(42)

card_states = ['red','blue','odd','even','not','green','prime','implies','ant','relation','cons']
encoder = sspspace.DiscreteSPSpace(card_states, ssp_dim=1024, optimize=False)
vocab = {c:encoder.encode(c) for c in card_states}

###################################################################
## Fill out the following then remove
raise NotImplementedError("Student exercise: complete creating `not x` concepts.")
###################################################################

for a in ['red','blue','odd','even','green','prime']:
    vocab[f'not*{a}'] = vocab[...] * vocab[a]

action_names = ['red','blue','odd','even','green','prime','not*red','not*blue','not*odd','not*even','not*green','not*prime']
action_space = np.array([vocab[x] for x in action_names]).squeeze()

Click for solution

Now, we are going to set up a simple perceptron-style learning rule using the HRR (Holographic Reduced Representations) algebra. We are going to learn a target transformation, \(T\), such that given a learning rule, \(A^{*} = T\circledast R\). Here:

\(R\) is the rule to be learned
\(A^{*}\) is the antecedent value bundled with \(\texttt{not}\) bound with the consequent value. This is because we are trying to learn the cards that can violate the rule.

Rules themselves are going to be composed like the data structures representing different countries in the previous section. ant, relation, and cons are extra concepts that define the structure and which will bind to the specific instances (think of them as anchor concepts which got bound to the specific instances).

If we have a rule, \(X \implies Y\), then we would create the VSA representation:

\[ R = \texttt{ant} \circledast X + \texttt{relation} \circledast \texttt{implies} + \texttt{cons} \circledast Y \]

and the ideal output is:

\[ A^{*} = X + \texttt{not}\circledast Y \]

In the cell below, let us define two rules:

\[\text{blue} \implies \text{even}\]

\[\text{odd} \implies \text{green}\]

###################################################################
## Fill out the following then remove
raise NotImplementedError("Student exercise: complete creating rules as defined above.")
###################################################################

rules = [
    (vocab['ant'] * vocab['blue'] + vocab['relation'] * vocab['implies'] + vocab['cons'] * vocab[...]).normalize(),
    (vocab[...] * vocab[...] + vocab[...] * vocab[...] + vocab[...] * vocab[...]).normalize(),
]

Click for solution

Now, we are ready to derive the transformation! For that, we will iterate through the rules and solutions for a specified number of iterations and update it as the following:

\[ T \leftarrow T - \text{lr} (T - A^{} \circledast \sim R)\]

where \(\text{lr}\) is learning rate constant value. Ultimately, we want \(A^{*} = T\circledast R\), so we unbind \(R\) to recover the desired transform and use the learning rule to update our current estimated transform.

We will also compute loss progression over time and log the loss function between perfect similarity (ones only for antecedence value and not consequent one) and the one we obtain between prediction for current transformation and full action space. Complete the missing parts of the code in the next cell to complete training.

###################################################################
## Fill out the following then remove
raise NotImplementedError("Student exercise: complete training loop.")
###################################################################

num_iters = 500
losses = []
sims = []
lr = 1e-1
ant_names = ["blue", "odd"]
cons_names = ["even", "green"]

transform = np.zeros((1,encoder.ssp_dim))
for i in range(num_iters):
    loss = 0
    for rule, ant_name, cons_name in zip(rules, ant_names, cons_names):

        #perfect similarity
        y_true = np.eye(len(action_names))[action_names.index(ant_name),:] + np.eye(len(action_names))[4+action_names.index(cons_name),:]

        #prediction with current transform (a_hat = transform * rule)
        a_hat = sspspace.SSP(transform) * ...

        #similarity with current transform
        sim_mat = action_space @ a_hat.T

        #cleanup
        y_hat = softmax(sim_mat)

        #true solution (a* = ant_name + not * cons_name)
        a_true = (vocab[ant_name] + vocab['not']*vocab[...]).normalize()

        #calculate loss
        loss += log_loss(y_true, y_hat)

        #update transform (T <- T - lr * (T - A* * (~rule)))
        transform -= (lr) * (... - np.array(... * ~...))
        transform = transform / np.linalg.norm(transform)

        #save predicted similarities if it is last iteration
        if i == num_iters - 1:
            sims.append(sim_mat)

    #save loss
    losses.append(np.copy(loss))

plot_training_and_choice(losses, sims, ant_names, cons_names, action_names)

Click for solution

Example output:

Let’s see what happens when we test it on a new rule it hasn’t seen before. This time, we will use the rule that \(\text{red} \implies \text{prime}\). Your task is to complete the new rule in the cell below and observe the results.

###################################################################
## Fill out the following then remove
raise NotImplementedError("Student exercise: complete new rule and predict for it.")
###################################################################

new_rule = (vocab['ant'] * vocab[...] + vocab['relation'] * ... + vocab['cons'] * vocab[...]).normalize()

#apply transform on new rule to test the generalization of the transform
a_hat = sspspace.SSP(transform) * ...

new_sims = action_space @ a_hat.T
y_hat = softmax(new_sims)

plot_choice([new_sims], ["red"], ["prime"], action_names)

Click for solution

Example output:

Let’s compare how a standard MLP that isn’t aware of the structure in the representation performs. Here, features are going to be the rules and output - solutions. Complete the code below.

###################################################################
## Fill out the following then remove
raise NotImplementedError("Student exercise: complete MLP training.")
###################################################################

#features - rules
X_train = np.array(...).squeeze()

#output - a* for each rule
y_train = np.array([
    (vocab[ant_names[0]] + vocab['not']*vocab[cons_names[0]]).normalize(),
    (vocab[ant_names[1]] + vocab['not']*vocab[cons_names[1]]).normalize(),
]).squeeze()

regr = MLPRegressor(random_state=1, hidden_layer_sizes=(1024,1024), max_iter=1000).fit(..., ...)

a_mlp = regr.predict(new_rule)

mlp_sims = action_space @ a_mlp.T

plot_choice([mlp_sims], ["red"], ["prime"], action_names)

Click for solution

Example output:

As you can see, this model, even though it is a more expressive neural network, simply learns to predict the values it had seen before when presented with a novel stimulus.

Submit your feedback#

Video 4: Wason Card Task Outro#

Submit your feedback#

Summary#

Estimated timing of tutorial: 45 minutes

In this tutorial, we observed three scenarios where we used the basic operations to solve different analogies and engage in structured learning. The next final tutorial will show us how to use structure to impose inductive biases and how to use continuous representations to represent mixtures of discrete and continuous objects.

Tutorial 2: Learning with structure

Contents

Tutorial 2: Learning with structure#

Tutorial Objectives#

Setup#

Install and import feedback gadget#

Install dependencies#

Imports#

Figure settings#

Plotting functions#

Set random seed#

Helper functions#

Section 1: Analogies. Part 1#

Video 1: Analogy 1#

Submit your feedback#

Coding Exercise 1: Royal Relationships#

Submit your feedback#

Section 2: Analogies. Part 2#

Video 2: Analogy 2#

Submit your feedback#

Coding Exercise 2: Dollar of Mexico#

Submit your feedback#

Section 3: Wason Card Task#

Video 3: Wason Card Task Intro#

Submit your feedback#

Coding Exercise 3: Wason Card Task#

Submit your feedback#

Video 4: Wason Card Task Outro#

Submit your feedback#

Summary#