{ "cells": [ { "cell_type": "markdown", "metadata": { "colab_type": "text", "execution": {}, "id": "view-in-github" }, "source": [ "\"Open   \"Open" ] }, { "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "# Tutorial 2: Continual learning\n", "\n", "**Week 2, Day 4: Macro-Learning**\n", "\n", "**By Neuromatch Academy**\n", "\n", "__Content creators:__ Hlib Solodzhuk, Ximeng Mao, Grace Lindsay\n", "\n", "__Content reviewers:__ Aakash Agrawal, Alish Dipani, Hossein Rezaei, Yousef Ghanbari, Mostafa Abdollahi, Hlib Solodzhuk, Ximeng Mao, Samuele Bolotta, Grace Lindsay\n", "\n", "__Production editors:__ Konstantine Tsafatinos, Ella Batty, Spiros Chavlis, Samuele Bolotta, Hlib Solodzhuk\n" ] }, { "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "___\n", "\n", "\n", "# Tutorial Objectives\n", "\n", "*Estimated timing of tutorial: 25 minutes*\n", "\n", "In this tutorial, you will observe how further training on new data or tasks causes forgetting of past tasks. You will also explore how different learning schedules impact performance." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "execution": {}, "tags": [ "remove-input" ] }, "outputs": [], "source": [ "# @markdown\n", "from IPython.display import IFrame\n", "from ipywidgets import widgets\n", "out = widgets.Output()\n", "with out:\n", " print(f\"If you want to download the slides: https://osf.io/download/t36w8/\")\n", " display(IFrame(src=f\"https://mfr.ca-1.osf.io/render?url=https://osf.io/t36w8/?direct%26mode=render%26action=download%26mode=render\", width=730, height=410))\n", "display(out)" ] }, { "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "---\n", "# Setup\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Install and import feedback gadget\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "execution": {}, "tags": [ "hide-input" ] }, "outputs": [], "source": [ "# @title Install and import feedback gadget\n", "\n", "!pip install vibecheck datatops tqdm numpy matplotlib ipywidgets scikit-learn --quiet\n", "\n", "from vibecheck import DatatopsContentReviewContainer\n", "def content_review(notebook_section: str):\n", " return DatatopsContentReviewContainer(\n", " \"\", # No text prompt\n", " notebook_section,\n", " {\n", " \"url\": \"https://pmyvdlilci.execute-api.us-east-1.amazonaws.com/klab\",\n", " \"name\": \"neuromatch_neuroai\",\n", " \"user_key\": \"wb2cxze8\",\n", " },\n", " ).render()\n", "\n", "\n", "feedback_prefix = \"W2D4_T2\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Imports\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "execution": {}, "tags": [ "hide-input" ] }, "outputs": [], "source": [ "# @title Imports\n", "\n", "# working with data\n", "import numpy as np\n", "\n", "# plotting\n", "import matplotlib.pyplot as plt\n", "import logging\n", "from tqdm import tqdm\n", "\n", "# interactive display\n", "import ipywidgets as widgets\n", "\n", "# modeling\n", "from sklearn.neural_network import MLPRegressor\n", "from sklearn.model_selection import train_test_split" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Figure settings\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "execution": {}, "tags": [ "hide-input" ] }, "outputs": [], "source": [ "# @title Figure settings\n", 
"\n", "logging.getLogger('matplotlib.font_manager').disabled = True\n", "\n", "%matplotlib inline\n", "%config InlineBackend.figure_format = 'retina' # perfrom high definition rendering for images and plots\n", "plt.style.use(\"https://raw.githubusercontent.com/NeuromatchAcademy/course-content/main/nma.mplstyle\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Plotting functions\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "execution": {}, "tags": [ "hide-input" ] }, "outputs": [], "source": [ "# @title Plotting functions\n", "\n", "def plot_summer_autumn_predictions(summer_predictions, autumn_predictions):\n", " \"\"\"\n", " Plots the true data (summer and autumn prices) along with the predicted summer and autumn prices using a scatter plot.\n", "\n", " Inputs:\n", " - summer_predictions (numpy.ndarray): Array containing the predicted prices for the summer season.\n", " - autumn_predictions (numpy.ndarray): Array containing the predicted prices for the autumn season.\n", " \"\"\"\n", "\n", " with plt.xkcd():\n", " plt.plot(np.append(summer_days, autumn_days), np.append(summer_prices, autumn_prices), label = \"True Data\")\n", " plt.scatter(autumn_days_test, autumn_predictions, label = \"Autumn Predictions\", marker='o', color='g', zorder=2)\n", " plt.scatter(summer_days_test, summer_predictions, label = \"Summer Predictions\", marker='o', color='r', zorder=2)\n", " plt.xlabel('Week')\n", " plt.ylabel('Price')\n", " plt.legend()\n", " plt.show()\n", "\n", "def plot_performance(num_epochs, summer_r_squared, autumn_r_squared):\n", " \"\"\"\n", " Plots the R-squared values for the summer and autumn seasons during the training process.\n", "\n", " Inputs:\n", " - num_epochs (int): The number of training epochs.\n", " - summer_r_squared (list): List containing the R-squared values for the summer season at each epoch.\n", " - autumn_r_squared (list): List containing the R-squared values for the autumn season at each epoch.\n", " \"\"\"\n", "\n", " print(f\"Summmer final R-squared value is: {summer_r_squared[-1]:.02f}\")\n", " print(f\"Autumn final R-squared value is: {autumn_r_squared[-1]:.02f}\")\n", "\n", "\n", " with plt.xkcd():\n", " plt.plot(np.arange(num_epochs), summer_r_squared, label = \"Summer Fit\")\n", " plt.plot(np.arange(num_epochs), autumn_r_squared, label = \"Autumn Fit\")\n", " plt.xlabel('Epoch')\n", " plt.ylabel('R-squared value')\n", " plt.legend()\n", " plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Set random seed\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "execution": {}, "tags": [ "hide-input" ] }, "outputs": [], "source": [ "# @title Set random seed\n", "\n", "import random\n", "import numpy as np\n", "\n", "def set_seed(seed=None):\n", " if seed is None:\n", " seed = np.random.choice(2 ** 32)\n", " random.seed(seed)\n", " np.random.seed(seed)\n", "\n", "set_seed(seed = 42)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Video 1: Continual learning\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "execution": {}, "tags": [ "remove-input" ] }, "outputs": [], "source": [ "# @title Video 1: Continual learning\n", "\n", "from ipywidgets import widgets\n", "from IPython.display import YouTubeVideo\n", "from IPython.display import IFrame\n", "from IPython.display import display\n", "\n", "class PlayVideo(IFrame):\n", " def __init__(self, id, source, page=1, width=400, height=300, 
**kwargs):\n", " self.id = id\n", " if source == 'Bilibili':\n", " src = f'https://player.bilibili.com/player.html?bvid={id}&page={page}'\n", " elif source == 'Osf':\n", " src = f'https://mfr.ca-1.osf.io/render?url=https://osf.io/download/{id}/?direct%26mode=render'\n", " super(PlayVideo, self).__init__(src, width, height, **kwargs)\n", "\n", "def display_videos(video_ids, W=400, H=300, fs=1):\n", " tab_contents = []\n", " for i, video_id in enumerate(video_ids):\n", " out = widgets.Output()\n", " with out:\n", " if video_ids[i][0] == 'Youtube':\n", " video = YouTubeVideo(id=video_ids[i][1], width=W,\n", " height=H, fs=fs, rel=0)\n", " print(f'Video available at https://youtube.com/watch?v={video.id}')\n", " else:\n", " video = PlayVideo(id=video_ids[i][1], source=video_ids[i][0], width=W,\n", " height=H, fs=fs, autoplay=False)\n", " if video_ids[i][0] == 'Bilibili':\n", " print(f'Video available at https://www.bilibili.com/video/{video.id}')\n", " elif video_ids[i][0] == 'Osf':\n", " print(f'Video available at https://osf.io/{video.id}')\n", " display(video)\n", " tab_contents.append(out)\n", " return tab_contents\n", "\n", "video_ids = [('Youtube', 'owfKdX7UMn8'), ('Bilibili', 'BV1W4421Q79N')]\n", "tab_contents = display_videos(video_ids, W=730, H=410)\n", "tabs = widgets.Tab()\n", "tabs.children = tab_contents\n", "for i in range(len(tab_contents)):\n", " tabs.set_title(i, video_ids[i][0])\n", "display(tabs)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Submit your feedback\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "execution": {}, "tags": [ "hide-input" ] }, "outputs": [], "source": [ "# @title Submit your feedback\n", "content_review(f\"{feedback_prefix}_continual_learning\")" ] }, { "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "---\n", "\n", "# Section 1: Catastrophic forgetting\n", "\n", "In this section, we will discuss the concept of continual learning and argue that training on new data does not guarantee that the model will remember the relationships on which it was trained earlier. \n", "\n", "- Catastrophic forgetting occurs when a neural network, or any machine learning model, forgets the information it previously learned upon learning new data. This is particularly problematic in scenarios where a model needs to perform well across multiple types of data or tasks that evolve over time.\n", "\n", "- Continual learning is an approach in machine learning that aims to mitigate the issues of catastrophic forgetting. The goal is to develop algorithms that can adapt to new data while preserving knowledge about the old data. This is essential for applications where the model must adapt to changes dynamically without losing the ability to perform tasks it was previously trained on." ] }, { "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "## Coding Exercise 1: Fitting new data\n", "\n", "Let's assume now that we want our model to predict not only the summer prices but also the autumn ones. We have already trained the MLP to predict summer months effectively but observed that it performs poorly during the autumn period. Let's try to make the model learn new information about the prices and see whether it can remember both. First, we will need to retrain the model for this tutorial on summer days." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "execution": {} }, "outputs": [], "source": [ "#define variables\n", "A = .005\n", "B = 0.1\n", "phi = 0\n", "C = 1\n", "\n", "#define days (observe that those are not 1, ..., 365 but proxy ones to make model function neat)\n", "days = np.arange(-26, 26 + 1/7, 1/7)\n", "prices = A * days**2 + B * np.sin(np.pi * days + phi) + C\n", "\n", "#take only summer data for intro-training\n", "summer_days = np.expand_dims(days[151:243], 1)\n", "summer_prices = prices[151:243]\n", "\n", "#take autumn data for further training\n", "autumn_days = np.expand_dims(days[243:334], 1)\n", "autumn_prices = prices[243:334]\n", "\n", "#divide summer data into train and test sets\n", "summer_days_train, summer_days_test, summer_prices_train, summer_prices_test = train_test_split(summer_days, summer_prices, random_state = 42)\n", "\n", "#divide autumn data into train and test sets\n", "autumn_days_train, autumn_days_test, autumn_prices_train, autumn_prices_test = train_test_split(autumn_days, autumn_prices, random_state = 42)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "execution": {} }, "outputs": [], "source": [ "#apply normalization for days (we take parameters for whole summer-autumn period)\n", "days_mean, days_std = np.mean(days[151:334]), np.std(days[151:334])\n", "\n", "summer_days_train_norm = (summer_days_train - days_mean) / days_std\n", "summer_days_test_norm = (summer_days_test - days_mean) / days_std\n", "\n", "#notice that normalization is taken from summer parameters as obviously model is going to forget old data if we reassign it (by making normalization the same)\n", "autumn_days_train_norm = (autumn_days_train - days_mean) / days_std\n", "autumn_days_test_norm = (autumn_days_test - days_mean) / days_std\n", "\n", "#define MLP\n", "base_model = MLPRegressor(hidden_layer_sizes=(100, 100), max_iter=10000, random_state = 42, solver = \"lbfgs\") # LBFGS is better to use when there is small amount of data\n", "\n", "#train MLP\n", "base_model.fit(summer_days_train_norm, summer_prices_train)\n", "\n", "#evaluate MLP on test data\n", "print(f\"R-squared value is: {base_model.score(summer_days_test_norm, summer_prices_test):.02f}.\")" ] }, { "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "Now, let's incrementally fit the autumn data to the trained model and monitor the R-squared values for the summer and autumn data test sets during each iteration. In the following code snippet, you are requested to complete further training using the `partial_fit` function, which allows us to train the existing model on new data (it runs through the data only once, and thus, we can control for the number of iterations). As we iterate through the epochs, we can append new R-squared values for each epoch. We will visually explore the performance of this model on both the summer and autumn datasets as well as R-squared values progression over epochs." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "execution": {} }, "outputs": [], "source": [ "###################################################################\n", "## Fill out the following then remove\n", "raise NotImplementedError(\"Student exercise: complete partial fit and calculate r-squared values for both summer and autumn data\")\n", "###################################################################\n", "\n", "# Initial r-squared calculations\n", "summer_r_squared = [base_model.score(summer_days_test_norm, summer_prices_test)]\n", "autumn_r_squared = [base_model.score(autumn_days_test_norm, autumn_prices_test)]\n", "num_epochs = 10\n", "\n", "# Progress bar integration with tqdm\n", "for _ in tqdm(range(num_epochs - 1), desc=\"Training Progress\"):\n", "\n", " # Fit new data for one epoch\n", " base_model.partial_fit(..., ...)\n", "\n", " # Calculate r-squared values on test sets\n", " summer_r_squared.append(base_model.score(..., ...))\n", " autumn_r_squared.append(base_model.score(..., ...))\n", "\n", "model = base_model\n", "plot_performance(num_epochs, summer_r_squared, autumn_r_squared)\n", "\n", "#predict for test sets\n", "summer_prices_predictions = model.predict(summer_days_test_norm)\n", "autumn_prices_predictions = model.predict(autumn_days_test_norm)\n", "\n", "plot_summer_autumn_predictions(summer_prices_predictions, autumn_prices_predictions)" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "execution": {} }, "source": [ "[*Click for solution*](https://github.com/neuromatch/NeuroAI_Course/tree/main/tutorials/W2D4_Macrolearning/solutions/W2D4_Tutorial2_Solution_c3936e0c.py)\n", "\n", "*Example output:*\n", "\n", "Solution hint\n", "\n", "Solution hint\n", "\n" ] }, { "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "Notice how disruptive the change is for R-squared values — even one iteration is enough to drastically alter the performance. The model has learned to perform perfectly on the autumn data, while it completely messes up predictions for the summer days. Indeed, the model forgot the relationships for the old data and lost its predictive power while training on the new dataset. In the next section of the tutorial, we are going to explore a different approach—what if, instead of training sequentially, we train the model on both datasets together?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Submit your feedback\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "execution": {}, "tags": [ "hide-input" ] }, "outputs": [], "source": [ "# @title Submit your feedback\n", "content_review(f\"{feedback_prefix}_fitting_new_data\")" ] }, { "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "---\n", "\n", "# Section 2: Joint training\n", "\n", "Estimated timing to here from start of tutorial: 10 minutes\n", "\n", "In this section, we are going to explore whether joint training on both datasets simultaneously, specified in different formats, improves predictive performance, thus allowing the model to perform well in both summer and autumn." 
] }, { "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "## Coding Exercise 2a: Sequential joint training" ] }, { "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "In this coding exercise, let us take a look at the following setup: we will sample $K$ distinct random training examples from summer data and $K$ random examples from autumn, training the model in total on $2K$ examples.\n", "\n", "In sequential joint training, epochs of each data type are alternated. So, for example, the first epoch will be the $K$ examples from summer data, and then the second will be the $K$ examples from autumn data, then the summer data again, then autumn again, and so on.\n", "\n", "Please complete the partial fits for the corresponding data to implement sequential joint training in the coding exercise. We are using the `partial_fit` function to control the number of times the model \"meets\" data. With every new iteration, the model isn't a new one; it is partially fitted for all previous epochs." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "execution": {} }, "outputs": [], "source": [ "def sequential_training(K: int, num_epochs: int):\n", " \"\"\"\n", " Perform sequential training for a multi-layer perceptron (MLP) regression model.\n", " The function trains the model separately on the summer and autumn data in alternating epochs.\n", "\n", " Inputs:\n", " - K (int): The number of training examples to sample from each season (summer and autumn) in each epoch.\n", " - num_epochs (int): The number of training epochs.\n", "\n", " Returns:\n", " - model (MLPRegressor): The trained MLP regression model.\n", " - summer_r_squared (list): A list containing the R-squared values for the summer season at each epoch.\n", " - autumn_r_squared (list): A list containing the R-squared values for the autumn season at each epoch.\n", " \"\"\"\n", "\n", " model = MLPRegressor(hidden_layer_sizes=(100, 100), max_iter=10000, random_state=42, solver=\"lbfgs\")\n", " summer_r_squared = []\n", " autumn_r_squared = []\n", "\n", " for _ in tqdm(range(num_epochs // 2), desc=\"Training Progress\"):\n", " ###################################################################\n", " ## Fill out the following then remove\n", " raise NotImplementedError(\"Student exercise: sample indices for summer and autumn data and sequentially train on summer and then on autumn data\")\n", " ###################################################################\n", "\n", " # Sample random training examples from summer and autumn\n", " sampled_summer_indices = np.random.choice(np.arange(summer_days_train_norm.shape[0]), size=K, replace=False)\n", " sampled_autumn_indices = np.random.choice(np.arange(autumn_days_train_norm.shape[0]), size=K, replace=False)\n", "\n", " model.partial_fit(..., ...)\n", "\n", " summer_r_squared.append(model.score(summer_days_test_norm, summer_prices_test))\n", " autumn_r_squared.append(model.score(autumn_days_test_norm, autumn_prices_test))\n", "\n", " model.partial_fit(..., ...)\n", "\n", " summer_r_squared.append(model.score(summer_days_test_norm, summer_prices_test))\n", " autumn_r_squared.append(model.score(autumn_days_test_norm, autumn_prices_test))\n", "\n", " return model, summer_r_squared, autumn_r_squared\n", "\n", "set_seed(42)\n", "num_epochs = 100\n", "K = 30\n", "\n", "sequential_training_model, summer_r_squared, autumn_r_squared = sequential_training(K, num_epochs)\n", "\n", "plot_performance(num_epochs, summer_r_squared, autumn_r_squared)\n", "\n", 
"#predict for test sets\n", "summer_prices_predictions = sequential_training_model.predict(summer_days_test_norm)\n", "autumn_prices_predictions = sequential_training_model.predict(autumn_days_test_norm)\n", "\n", "plot_summer_autumn_predictions(summer_prices_predictions, autumn_prices_predictions)" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "execution": {} }, "source": [ "[*Click for solution*](https://github.com/neuromatch/NeuroAI_Course/tree/main/tutorials/W2D4_Macrolearning/solutions/W2D4_Tutorial2_Solution_e2bacfd6.py)\n", "\n", "*Example output:*\n", "\n", "Solution hint\n", "\n", "Solution hint\n", "\n" ] }, { "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "As we can see, this approach performs better than first fully learning from the summer data and then learning from the autumn data. Sequential joint training helps maintain the model's performance across both datasets by continually refreshing its memory with information from both periods. This method prevents the model from completely forgetting the relationships learned from the first dataset while training on the second." ] }, { "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "## Coding Exercise 2b: Interspersed training" ] }, { "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "Now, we will try fully interspersed training. Unlike sequential joint training, in this approach, we will generate epochs that contain both summer and autumn data, exposing the model to both sets equally and simultaneously.\n", "\n", "In this exercise, you are tasked with completing the code snippets that correspond to creating the labels for the interspersed epochs and training the model. This method aims to integrate the data from both periods within each training epoch, which can help in achieving a more balanced and robust model performance across different seasonal datasets." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "execution": {} }, "outputs": [], "source": [ "def interspersed_training(K: int, num_epochs: int):\n", " \"\"\"\n", " Perform interspersed training for a multi-layer perceptron (MLP) regression model.\n", "\n", " Inputs:\n", " - K (int): The number of training examples to sample from each season (summer and autumn) in each epoch.\n", " - num_epochs (int): The number of training epochs.\n", "\n", " Returns:\n", " - model (MLPRegressor): The trained MLP regression model.\n", " - summer_r_squared (list): A list containing the R-squared values for the summer season at each epoch.\n", " - autumn_r_squared (list): A list containing the R-squared values for the autumn season at each epoch.\n", " \"\"\"\n", "\n", " model = MLPRegressor(hidden_layer_sizes=(100, 100), max_iter=10000, random_state = 42, solver = \"lbfgs\")\n", " summer_r_squared = []\n", " autumn_r_squared = []\n", "\n", " for _ in tqdm(range(num_epochs), desc=\"Training Progress\"):\n", "\n", " # Sample random training examples from summer and autumn\n", " sampled_summer_indices = np.random.choice(np.arange(summer_days_train_norm.shape[0]), size=K, replace=False)\n", " sampled_autumn_indices = np.random.choice(np.arange(autumn_days_train_norm.shape[0]), size=K, replace=False)\n", "\n", "\n", " mixed_days_train = np.expand_dims(np.append(autumn_days_train_norm[sampled_autumn_indices], summer_days_train_norm[sampled_summer_indices]), 1)\n", " ###################################################################\n", " ## Fill out the following then remove\n", " raise NotImplementedError(\"Student exercise: make price labels for mixed epochs and train\")\n", " ###################################################################\n", " mixed_prices_train = ...\n", " model.partial_fit(..., ...)\n", "\n", " summer_r_squared.append(model.score(summer_days_test_norm, summer_prices_test))\n", " autumn_r_squared.append(model.score(autumn_days_test_norm, autumn_prices_test))\n", "\n", " return model, summer_r_squared, autumn_r_squared\n", "\n", "set_seed(42)\n", "num_epochs = 50\n", "K = 30\n", "\n", "interspersed_training_model, summer_r_squared, autumn_r_squared = interspersed_training(K, num_epochs)\n", "\n", "plot_performance(num_epochs, summer_r_squared, autumn_r_squared)\n", "\n", "#predict for test sets\n", "summer_prices_predictions = interspersed_training_model.predict(summer_days_test_norm)\n", "autumn_prices_predictions = interspersed_training_model.predict(autumn_days_test_norm)\n", "\n", "plot_summer_autumn_predictions(summer_prices_predictions, autumn_prices_predictions)" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "execution": {} }, "source": [ "[*Click for solution*](https://github.com/neuromatch/NeuroAI_Course/tree/main/tutorials/W2D4_Macrolearning/solutions/W2D4_Tutorial2_Solution_07232036.py)\n", "\n", "*Example output:*\n", "\n", "Solution hint\n", "\n", "Solution hint\n", "\n" ] }, { "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "### Coding Exercise 2 Discussion\n", "\n", "1. Note that the number of epochs is doubled in sequential training mode compared to interspersed mode. Why is this the case?\n", "2. Which training scheduler performed better in this particular example? Why do you think this occurred?" 
] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "execution": {} }, "source": [ "[*Click for solution*](https://github.com/neuromatch/NeuroAI_Course/tree/main/tutorials/W2D4_Macrolearning/solutions/W2D4_Tutorial2_Solution_2ba000ea.py)\n", "\n" ] }, { "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "---\n", "# Summary\n", "\n", "*Estimated timing of tutorial: 25 minutes*\n", "\n", "Here is a summary of what we've learned:\n", "\n", "1. Simply continuing to train on a new data distribution causes catastrophic forgetting\n", "\n", "2. Joint training, wherein different datasets are interspersed to varying degrees, helps fight catastrophic forgetting.\n", "\n", "You can explore more advanced methods of continual learning in the following resources:\n", "\n", "- [Continual Lifelong Learning with Neural Networks: A Review](https://arxiv.org/pdf/1802.07569)\n", "- [ContinualAI](https://www.continualai.org/)\n", "- [A Comprehensive Survey of Continual Learning: Theory, Method and Application](https://arxiv.org/pdf/2302.00487)\n", "- [Brain-inspired replay for continual learning with artificial neural networks](https://www.nature.com/articles/s41467-020-17866-2)" ] } ], "metadata": { "colab": { "collapsed_sections": [], "include_colab_link": true, "name": "W2D4_Tutorial2", "provenance": [], "toc_visible": true }, "kernel": { "display_name": "Python 3", "language": "python", "name": "python3" }, "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.19" } }, "nbformat": 4, "nbformat_minor": 4 }