@NassimElH01
Last active December 14, 2024 10:46
Nassim - Linear_Regression.ipynb ( OPG 1A )
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"provenance": [],
"include_colab_link": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
},
"language_info": {
"name": "python"
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/NassimElH01/0864262cf43f4be38a7bc686ff78c6e3/nassim-linear_regression-ipynb-opg-1a.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"source": [
"### Ensure that we use Python 3.7 or above:"
],
"metadata": {
"id": "cifewyhF9BAO"
}
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "8KOQTcKX8bg2"
},
"outputs": [],
"source": [
"import sys\n",
"\n",
"assert sys.version_info >= (3, 7)"
]
},
{
"cell_type": "markdown",
"source": [
"### Ensure that we use at least Scikit-Learn 1.0.1"
],
"metadata": {
"id": "MdUYfAR79gu8"
}
},
{
"cell_type": "code",
"source": [
"from packaging import version\n",
"import sklearn\n",
"\n",
"assert version.parse(sklearn.__version__) >= version.parse(\"1.0.1\")"
],
"metadata": {
"id": "cEyMjLcU86Dw"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"### Let us set up the fonts in Matplotlib"
],
"metadata": {
"id": "y2xAwFNX-Q0D"
}
},
{
"cell_type": "code",
"source": [
"import matplotlib.pyplot as plt\n",
"\n",
"plt.rc('font', size=12)\n",
"plt.rc('axes', labelsize=14, titlesize=14)\n",
"plt.rc('legend', fontsize=12)\n",
"plt.rc('xtick', labelsize=10)\n",
"plt.rc('ytick', labelsize=10)"
],
"metadata": {
"id": "gwi4DUJV-K1b"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"### Import important libraries 'numpy' and 'pandas'"
],
"metadata": {
"id": "tvKLkp_dHHwf"
}
},
{
"cell_type": "code",
"source": [
"import numpy as np\n",
"import pandas as pd"
],
"metadata": {
"id": "gK4Onnhy-6o8"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"### Make reference to the life satisfaction data file"
],
"metadata": {
"id": "zZoc-LzQHYcB"
}
},
{
"cell_type": "code",
"source": [
"datafile = \"https://github.com/ageron/data/raw/main/lifesat/lifesat.csv\"\n",
"\n",
"##Try with this one later\n",
"##datafile = \"https://raw.githubusercontent.com/jpandersen61/Machine-Learning/main/InjuredandkilledintrafikDK.csv\""
],
"metadata": {
"id": "1EB-X59l_zgg"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"### Exercise(s):"
],
"metadata": {
"id": "GLUiNLM-HqXJ"
}
},
{
"cell_type": "markdown",
"source": [
"Try viewing this file in your internet browser"
],
"metadata": {
"id": "Xd2829tIAimY"
}
},
{
"cell_type": "markdown",
"source": [
"### Load the data file"
],
"metadata": {
"id": "yWx2mhWjIMq4"
}
},
{
"cell_type": "code",
"source": [
"lifesat = pd.read_csv(datafile)\n"
],
"metadata": {
"id": "9g1ETm1Y_Q2c"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"### Exercise(s)"
],
"metadata": {
"id": "JBJfOo9HFmhK"
}
},
{
"cell_type": "markdown",
"source": [
"1. Make a new code cell below and evaluate 'lifesat'"
],
"metadata": {
"id": "IuZ12x0WF6QK"
}
},
{
"cell_type": "code",
"source": [
"lifesat"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 886
},
"id": "PBQsMeE7Goed",
"outputId": "3d8aa509-18bc-4e29-cacb-bf4f32cc5030"
},
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" Country GDP per capita (USD) Life satisfaction\n",
"0 Russia 26456.387938 5.8\n",
"1 Greece 27287.083401 5.4\n",
"2 Turkey 28384.987785 5.5\n",
"3 Latvia 29932.493910 5.9\n",
"4 Hungary 31007.768407 5.6\n",
"5 Portugal 32181.154537 5.4\n",
"6 Poland 32238.157259 6.1\n",
"7 Estonia 35638.421351 5.7\n",
"8 Spain 36215.447591 6.3\n",
"9 Slovenia 36547.738956 5.9\n",
"10 Lithuania 36732.034744 5.9\n",
"11 Israel 38341.307570 7.2\n",
"12 Italy 38992.148381 6.0\n",
"13 United Kingdom 41627.129269 6.8\n",
"14 France 42025.617373 6.5\n",
"15 New Zealand 42404.393738 7.3\n",
"16 Canada 45856.625626 7.4\n",
"17 Finland 47260.800458 7.6\n",
"18 Belgium 48210.033111 6.9\n",
"19 Australia 48697.837028 7.3\n",
"20 Sweden 50683.323510 7.3\n",
"21 Germany 50922.358023 7.0\n",
"22 Austria 51935.603862 7.1\n",
"23 Iceland 52279.728851 7.5\n",
"24 Netherlands 54209.563836 7.4\n",
"25 Denmark 55938.212809 7.6\n",
"26 United States 60235.728492 6.9"
],
"application/vnd.google.colaboratory.intrinsic+json": {
"type": "dataframe",
"variable_name": "lifesat",
"summary": "{\n \"name\": \"lifesat\",\n \"rows\": 27,\n \"fields\": [\n {\n \"column\": \"Country\",\n \"properties\": {\n \"dtype\": \"string\",\n \"num_unique_values\": 27,\n \"samples\": [\n \"Spain\",\n \"United Kingdom\",\n \"Slovenia\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"GDP per capita (USD)\",\n \"properties\": {\n \"dtype\": \"number\",\n \"std\": 9631.452318546564,\n \"min\": 26456.3879381321,\n \"max\": 60235.7284916969,\n \"num_unique_values\": 27,\n \"samples\": [\n 36215.4475907307,\n 41627.129269425,\n 36547.7389559849\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"Life satisfaction\",\n \"properties\": {\n \"dtype\": \"number\",\n \"std\": 0.7656068482934607,\n \"min\": 5.4,\n \"max\": 7.6,\n \"num_unique_values\": 19,\n \"samples\": [\n 5.8,\n 6.1,\n 6.5\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n }\n ]\n}"
}
},
"metadata": {},
"execution_count": 8
}
]
},
{
"cell_type": "markdown",
"source": [
"2. Find out what kind of data structure the life satisfaction data is stored in. This is useful to know when working with the data in Python afterwards: make a new code cell below and evaluate 'type(lifesat)'"
],
"metadata": {
"id": "aDwhNpA3Cyp0"
}
},
{
"cell_type": "code",
"source": [
"type(lifesat)"
],
"metadata": {
"id": "kz_axOl2HSLG",
"outputId": "db8b77b9-deb5-4576-9089-365e576f784b",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 203
}
},
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"pandas.core.frame.DataFrame"
]
},
"metadata": {},
"execution_count": 9
}
]
},
{
"cell_type": "markdown",
"source": [
"### Extract the learning set 'X' and its labels 'y'"
],
"metadata": {
"id": "cj4YKu6oFI3Q"
}
},
{
"cell_type": "code",
"source": [
"X = lifesat[[\"GDP per capita (USD)\"]].values\n",
"y = lifesat[[\"Life satisfaction\"]].values\n",
"\n"
],
"metadata": {
"id": "NtovVwne_pM7"
},
"execution_count": null,
"outputs": []
},
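{
"cell_type": "markdown",
"source": [
"The double brackets in the cell above select a one-column DataFrame, so '.values' returns a two-dimensional array, which is the layout scikit-learn estimators expect for features. A minimal check, assuming the 27-row table shown earlier: both shapes should be (27, 1), while single brackets would give a one-dimensional array."
],
"metadata": {}
},
{
"cell_type": "code",
"source": [
"# Both arrays should have one row per country and a single column\n",
"print(X.shape, y.shape)\n",
"\n",
"# Single brackets select a Series, which yields a one-dimensional array\n",
"print(lifesat[\"GDP per capita (USD)\"].values.shape)"
],
"metadata": {},
"execution_count": null,
"outputs": []
},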
{
"cell_type": "markdown",
"source": [
"### Exercises:"
],
"metadata": {
"id": "HSYpesjdEwCn"
}
},
{
"cell_type": "markdown",
"source": [
"Make a new code cell below and evaluate 'X'\n",
"\n",
"\n",
"\n"
],
"metadata": {
"id": "pcR9Ia3VDmWw"
}
},
{
"cell_type": "code",
"source": [
"X"
],
"metadata": {
"id": "b49rv5CGHeHQ",
"outputId": "ef1ed5cd-97f0-468c-ab7a-1b8cea602178",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"array([[26456.38793813],\n",
" [27287.08340093],\n",
" [28384.98778463],\n",
" [29932.49391006],\n",
" [31007.76840654],\n",
" [32181.15453723],\n",
" [32238.15725928],\n",
" [35638.42135118],\n",
" [36215.44759073],\n",
" [36547.73895598],\n",
" [36732.03474403],\n",
" [38341.30757041],\n",
" [38992.14838075],\n",
" [41627.12926943],\n",
" [42025.61737306],\n",
" [42404.39373816],\n",
" [45856.62562648],\n",
" [47260.80045844],\n",
" [48210.03311134],\n",
" [48697.83702825],\n",
" [50683.32350972],\n",
" [50922.35802345],\n",
" [51935.60386182],\n",
" [52279.72885136],\n",
" [54209.56383573],\n",
" [55938.2128086 ],\n",
" [60235.7284917 ]])"
]
},
"metadata": {},
"execution_count": 11
}
]
},
{
"cell_type": "markdown",
"source": [
"Make a new code cell below and evaluate 'y'"
],
"metadata": {
"id": "vDQPFB9AEMMR"
}
},
{
"cell_type": "code",
"source": [
"y"
],
"metadata": {
"id": "x4tI9BKoHe20",
"outputId": "6780fbd2-558e-4938-a639-7c5a2a229aae",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"array([[5.8],\n",
" [5.4],\n",
" [5.5],\n",
" [5.9],\n",
" [5.6],\n",
" [5.4],\n",
" [6.1],\n",
" [5.7],\n",
" [6.3],\n",
" [5.9],\n",
" [5.9],\n",
" [7.2],\n",
" [6. ],\n",
" [6.8],\n",
" [6.5],\n",
" [7.3],\n",
" [7.4],\n",
" [7.6],\n",
" [6.9],\n",
" [7.3],\n",
" [7.3],\n",
" [7. ],\n",
" [7.1],\n",
" [7.5],\n",
" [7.4],\n",
" [7.6],\n",
" [6.9]])"
]
},
"metadata": {},
"execution_count": 12
}
]
},
{
"cell_type": "markdown",
"source": [
"### Let us plot the life satisfaction data and introduce a model"
],
"metadata": {
"id": "WGVG7gP1EcIx"
}
},
{
"cell_type": "code",
"source": [
"lifesat.plot(kind='scatter', grid=True,\n",
" x=\"GDP per capita (USD)\", y=\"Life satisfaction\")\n",
"plt.axis([23_500, 62_500, 4, 9])\n",
"plt.show()"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 460
},
"id": "uuQq8nUOERM8",
"outputId": "4f4c39ce-94a8-46b5-afcd-4fba9f50bf3d"
},
"execution_count": null,
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
],
"image/png": "\n"
},
"metadata": {}
}
]
},
{
"cell_type": "markdown",
"source": [
"### Exercise(s):"
],
"metadata": {
"id": "JIXI1IL6JvVm"
}
},
{
"cell_type": "markdown",
"source": [
"Define various values of 't0' and 't1' (the thetas) and observe how they affect our model"
],
"metadata": {
"id": "EA7d6lEBJ4OK"
}
},
{
"cell_type": "code",
"source": [
"t0, t1 = 6.75, 6.78e-05\n",
"\n",
"from sklearn import linear_model\n",
"min_gdp = 23_500\n",
"max_gdp = 62_500\n",
"min_life_sat = 4\n",
"max_life_sat = 9\n",
"\n",
"lifesat.plot(kind='scatter', figsize=(5, 3), grid=True,\n",
" x=\"GDP per capita (USD)\", y=\"Life satisfaction\")\n",
"\n",
"X = np.linspace(min_gdp, max_gdp, 1000)\n",
"plt.plot(X, t0 + t1 * X, \"b\")\n",
"\n",
"plt.text(max_gdp - 20_000, min_life_sat + 1.9,\n",
" fr\"$\\theta_0 = {t0:.2f}$\", color=\"b\")\n",
"plt.text(max_gdp - 20_000, min_life_sat + 1.3,\n",
" fr\"$\\theta_1 = {t1 * 1e5:.2f} \\times 10^{{-5}}$\", color=\"b\")\n",
"\n",
"plt.axis([min_gdp, max_gdp, min_life_sat, max_life_sat])\n",
"\n",
"plt.show()"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 321
},
"id": "HXN6ixwFIzii",
"outputId": "8826fbe4-da0c-4cf0-e8f0-bf539c07341b"
},
"execution_count": null,
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": [
"<Figure size 500x300 with 1 Axes>"
],
"image/png": "\n"
},
"metadata": {}
}
]
},
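{
"cell_type": "markdown",
"source": [
"One way to compare choices of 't0' and 't1' is to compute the mean squared error of the resulting line on the data. The sketch below is an illustration and not part of the original exercise: the helper name 'line_mse' and the second candidate pair are assumptions chosen for the example."
],
"metadata": {}
},
{
"cell_type": "code",
"source": [
"import numpy as np\n",
"\n",
"gdp = lifesat[\"GDP per capita (USD)\"].to_numpy()\n",
"sat = lifesat[\"Life satisfaction\"].to_numpy()\n",
"\n",
"def line_mse(theta0, theta1):\n",
"    \"\"\"Mean squared error of the line theta0 + theta1 * gdp (hypothetical helper).\"\"\"\n",
"    return float(np.mean((theta0 + theta1 * gdp - sat) ** 2))\n",
"\n",
"# A lower error means the line follows the data more closely\n",
"for theta0, theta1 in [(6.75, 6.78e-05), (3.75, 6.78e-05)]:\n",
"    print(f\"t0={theta0:.2f}, t1={theta1:.2e}: MSE={line_mse(theta0, theta1):.3f}\")"
],
"metadata": {},
"execution_count": null,
"outputs": []
},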
{
"cell_type": "markdown",
"source": [
"### Establish the model"
],
"metadata": {
"id": "J0iRLACuXH79"
}
},
{
"cell_type": "code",
"source": [
"from sklearn import linear_model\n",
"\n",
"X_features = lifesat[[\"GDP per capita (USD)\"]]\n",
"y_labels = lifesat[[\"Life satisfaction\"]]\n",
"\n",
"lin1 = linear_model.LinearRegression()\n",
"lin1.fit(X_features, y_labels)\n",
"\n",
"t0, t1 = lin1.intercept_[0], lin1.coef_[0][0]\n",
"print(f\"θ0={t0:.2f}, θ1={t1:.2e}\")"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "BylN6risXh1C",
"outputId": "3a2c34bf-8dd5-4085-fe39-95da191b0ba4"
},
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"θ0=3.75, θ1=6.78e-05\n"
]
}
]
},
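{
"cell_type": "markdown",
"source": [
"As a cross-check, the same line can be obtained by ordinary least squares with NumPy. This is a sketch that assumes 'lifesat' is loaded as above; 'np.polyfit' with 'deg=1' returns the slope first and the intercept second, and both should agree with the scikit-learn values up to rounding. The names 'gdp', 'sat', 'slope' and 'intercept' are introduced here for illustration."
],
"metadata": {}
},
{
"cell_type": "code",
"source": [
"import numpy as np\n",
"\n",
"gdp = lifesat[\"GDP per capita (USD)\"].to_numpy()\n",
"sat = lifesat[\"Life satisfaction\"].to_numpy()\n",
"\n",
"# Degree-1 polynomial fit: coefficients come back highest power first\n",
"slope, intercept = np.polyfit(gdp, sat, deg=1)\n",
"print(f\"θ0={intercept:.2f}, θ1={slope:.2e}\")"
],
"metadata": {},
"execution_count": null,
"outputs": []
},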
{
"cell_type": "code",
"source": [
"X_pred = 40000\n",
"y_pred = lin1.predict([[X_pred]])[0,0]"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "i7nan8QdT8ZO",
"outputId": "c7e06e12-5d31-473e-a852-1a159e98af04"
},
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"name": "stderr",
"text": [
"/usr/local/lib/python3.10/dist-packages/sklearn/base.py:493: UserWarning: X does not have valid feature names, but LinearRegression was fitted with feature names\n",
" warnings.warn(\n"
]
}
]
},
{
"cell_type": "code",
"source": [
"y_pred"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "xgXDykJWUgcU",
"outputId": "050649f4-2f8b-455a-af03-b1adbbee64e2"
},
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"6.4606093051133975"
]
},
"metadata": {},
"execution_count": 17
}
]
},
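{
"cell_type": "markdown",
"source": [
"The prediction is the fitted line evaluated at 'X_pred', that is θ0 + θ1 · X_pred. A minimal sketch, assuming 'lin1' and 'X_pred' from the cells above; 'manual_pred' is a name introduced here for illustration, and its value should match 'y_pred' up to floating-point rounding."
],
"metadata": {}
},
{
"cell_type": "code",
"source": [
"# The linear model predicts intercept + slope * x\n",
"manual_pred = lin1.intercept_[0] + lin1.coef_[0][0] * X_pred\n",
"print(manual_pred)"
],
"metadata": {},
"execution_count": null,
"outputs": []
},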
{
"cell_type": "code",
"source": [
"from sklearn import linear_model\n",
"min_gdp = 23_500\n",
"max_gdp = 62_500\n",
"min_life_sat = 4\n",
"max_life_sat = 9\n",
"\n",
"lifesat.plot(kind='scatter', figsize=(5, 3), grid=True,\n",
" x=\"GDP per capita (USD)\", y=\"Life satisfaction\")\n",
"\n",
"X = np.linspace(min_gdp, max_gdp, 1000)\n",
"plt.plot(X, t0 + t1 * X, \"b\")\n",
"\n",
"plt.text(max_gdp - 20_000, min_life_sat + 1.9,\n",
" fr\"$\\theta_0 = {t0:.2f}$\", color=\"b\")\n",
"plt.text(max_gdp - 20_000, min_life_sat + 1.3,\n",
" fr\"$\\theta_1 = {t1 * 1e5:.2f} \\times 10^{{-5}}$\", color=\"b\")\n",
"\n",
"plt.axis([min_gdp, max_gdp, min_life_sat, max_life_sat])\n",
"\n",
"plt.plot([X_pred, X_pred],\n",
" [min_life_sat, y_pred], \"r--\")\n",
"plt.plot(X_pred, y_pred, \"ro\")\n",
"\n",
"plt.show()"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 321
},
"id": "J2vqshOxZMXt",
"outputId": "83afec30-4995-4de4-ac88-293d8eb75451"
},
"execution_count": null,
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": [
"<Figure size 500x300 with 1 Axes>"
],
"image/png": "\n"
},
"metadata": {}
}
]
},
{
"cell_type": "markdown",
"source": [
"### Exercise(s):\n"
],
"metadata": {
"id": "mrttlJ7yPI7m"
}
},
{
"cell_type": "markdown",
"source": [
"1. Try with another dataset https://raw.githubusercontent.com/jpandersen61/Machine-Learning/main/InjuredandkilledintrafikDK.csv"
],
"metadata": {
"id": "rvGBeixgPRTH"
}
},
{
"cell_type": "code",
"source": [
"datafiles = \"https://raw.githubusercontent.com/jpandersen61/Machine-Learning/main/InjuredandkilledintrafikDK.csv\""
],
"metadata": {
"id": "O4NPUCygPYr3"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"source": [
"Injuries = pd.read_csv(datafiles)"
],
"metadata": {
"id": "iiZYvjhHJboV"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"source": [
"Injuries"
],
"metadata": {
"id": "IZY9mc8-JlPg",
"outputId": "4367e168-c2c6-4ec5-f4bd-665c213c9783",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 731
}
},
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" year quantity\n",
"0 2001 8896\n",
"1 2002 9254\n",
"2 2003 8844\n",
"3 2004 7915\n",
"4 2005 6919\n",
"5 2006 6821\n",
"6 2007 7062\n",
"7 2008 6329\n",
"8 2009 5250\n",
"9 2010 4408\n",
"10 2011 4259\n",
"11 2012 3778\n",
"12 2013 3585\n",
"13 2014 3375\n",
"14 2015 3334\n",
"15 2016 3439\n",
"16 2017 3318\n",
"17 2018 3458\n",
"18 2019 3275\n",
"19 2020 2914\n",
"20 2021 2737\n",
"21 2022 2917"
],
"text/html": [
"\n",
" <div id=\"df-0fdcf617-a70f-4c03-abfa-42f004c47cbb\" class=\"colab-df-container\">\n",
" <div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>year</th>\n",
" <th>quantity</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>2001</td>\n",
" <td>8896</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>2002</td>\n",
" <td>9254</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>2003</td>\n",
" <td>8844</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>2004</td>\n",
" <td>7915</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>2005</td>\n",
" <td>6919</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>2006</td>\n",
" <td>6821</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>2007</td>\n",
" <td>7062</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td>2008</td>\n",
" <td>6329</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>2009</td>\n",
" <td>5250</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td>2010</td>\n",
" <td>4408</td>\n",
" </tr>\n",
" <tr>\n",
" <th>10</th>\n",
" <td>2011</td>\n",
" <td>4259</td>\n",
" </tr>\n",
" <tr>\n",
" <th>11</th>\n",
" <td>2012</td>\n",
" <td>3778</td>\n",
" </tr>\n",
" <tr>\n",
" <th>12</th>\n",
" <td>2013</td>\n",
" <td>3585</td>\n",
" </tr>\n",
" <tr>\n",
" <th>13</th>\n",
" <td>2014</td>\n",
" <td>3375</td>\n",
" </tr>\n",
" <tr>\n",
" <th>14</th>\n",
" <td>2015</td>\n",
" <td>3334</td>\n",
" </tr>\n",
" <tr>\n",
" <th>15</th>\n",
" <td>2016</td>\n",
" <td>3439</td>\n",
" </tr>\n",
" <tr>\n",
" <th>16</th>\n",
" <td>2017</td>\n",
" <td>3318</td>\n",
" </tr>\n",
" <tr>\n",
" <th>17</th>\n",
" <td>2018</td>\n",
" <td>3458</td>\n",
" </tr>\n",
" <tr>\n",
" <th>18</th>\n",
" <td>2019</td>\n",
" <td>3275</td>\n",
" </tr>\n",
" <tr>\n",
" <th>19</th>\n",
" <td>2020</td>\n",
" <td>2914</td>\n",
" </tr>\n",
" <tr>\n",
" <th>20</th>\n",
" <td>2021</td>\n",
" <td>2737</td>\n",
" </tr>\n",
" <tr>\n",
" <th>21</th>\n",
" <td>2022</td>\n",
" <td>2917</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>\n",
" <div class=\"colab-df-buttons\">\n",
"\n",
" <div class=\"colab-df-container\">\n",
" <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-0fdcf617-a70f-4c03-abfa-42f004c47cbb')\"\n",
" title=\"Convert this dataframe to an interactive table.\"\n",
" style=\"display:none;\">\n",
"\n",
" <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\" viewBox=\"0 -960 960 960\">\n",
" <path d=\"M120-120v-720h720v720H120Zm60-500h600v-160H180v160Zm220 220h160v-160H400v160Zm0 220h160v-160H400v160ZM180-400h160v-160H180v160Zm440 0h160v-160H620v160ZM180-180h160v-160H180v160Zm440 0h160v-160H620v160Z\"/>\n",
" </svg>\n",
" </button>\n",
"\n",
" <style>\n",
" .colab-df-container {\n",
" display:flex;\n",
" gap: 12px;\n",
" }\n",
"\n",
" .colab-df-convert {\n",
" background-color: #E8F0FE;\n",
" border: none;\n",
" border-radius: 50%;\n",
" cursor: pointer;\n",
" display: none;\n",
" fill: #1967D2;\n",
" height: 32px;\n",
" padding: 0 0 0 0;\n",
" width: 32px;\n",
" }\n",
"\n",
" .colab-df-convert:hover {\n",
" background-color: #E2EBFA;\n",
" box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
" fill: #174EA6;\n",
" }\n",
"\n",
" .colab-df-buttons div {\n",
" margin-bottom: 4px;\n",
" }\n",
"\n",
" [theme=dark] .colab-df-convert {\n",
" background-color: #3B4455;\n",
" fill: #D2E3FC;\n",
" }\n",
"\n",
" [theme=dark] .colab-df-convert:hover {\n",
" background-color: #434B5C;\n",
" box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
" filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
" fill: #FFFFFF;\n",
" }\n",
" </style>\n",
"\n",
" <script>\n",
" const buttonEl =\n",
" document.querySelector('#df-0fdcf617-a70f-4c03-abfa-42f004c47cbb button.colab-df-convert');\n",
" buttonEl.style.display =\n",
" google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
"\n",
" async function convertToInteractive(key) {\n",
" const element = document.querySelector('#df-0fdcf617-a70f-4c03-abfa-42f004c47cbb');\n",
" const dataTable =\n",
" await google.colab.kernel.invokeFunction('convertToInteractive',\n",
" [key], {});\n",
" if (!dataTable) return;\n",
"\n",
" const docLinkHtml = 'Like what you see? Visit the ' +\n",
" '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
" + ' to learn more about interactive tables.';\n",
" element.innerHTML = '';\n",
" dataTable['output_type'] = 'display_data';\n",
" await google.colab.output.renderOutput(dataTable, element);\n",
" const docLink = document.createElement('div');\n",
" docLink.innerHTML = docLinkHtml;\n",
" element.appendChild(docLink);\n",
" }\n",
" </script>\n",
" </div>\n",
"\n",
"\n",
"<div id=\"df-fb1389a6-52a7-40ab-9500-63f1d69bece6\">\n",
" <button class=\"colab-df-quickchart\" onclick=\"quickchart('df-fb1389a6-52a7-40ab-9500-63f1d69bece6')\"\n",
" title=\"Suggest charts\"\n",
" style=\"display:none;\">\n",
"\n",
"<svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
" width=\"24px\">\n",
" <g>\n",
" <path d=\"M19 3H5c-1.1 0-2 .9-2 2v14c0 1.1.9 2 2 2h14c1.1 0 2-.9 2-2V5c0-1.1-.9-2-2-2zM9 17H7v-7h2v7zm4 0h-2V7h2v10zm4 0h-2v-4h2v4z\"/>\n",
" </g>\n",
"</svg>\n",
" </button>\n",
"\n",
"<style>\n",
" .colab-df-quickchart {\n",
" --bg-color: #E8F0FE;\n",
" --fill-color: #1967D2;\n",
" --hover-bg-color: #E2EBFA;\n",
" --hover-fill-color: #174EA6;\n",
" --disabled-fill-color: #AAA;\n",
" --disabled-bg-color: #DDD;\n",
" }\n",
"\n",
" [theme=dark] .colab-df-quickchart {\n",
" --bg-color: #3B4455;\n",
" --fill-color: #D2E3FC;\n",
" --hover-bg-color: #434B5C;\n",
" --hover-fill-color: #FFFFFF;\n",
" --disabled-bg-color: #3B4455;\n",
" --disabled-fill-color: #666;\n",
" }\n",
"\n",
" .colab-df-quickchart {\n",
" background-color: var(--bg-color);\n",
" border: none;\n",
" border-radius: 50%;\n",
" cursor: pointer;\n",
" display: none;\n",
" fill: var(--fill-color);\n",
" height: 32px;\n",
" padding: 0;\n",
" width: 32px;\n",
" }\n",
"\n",
" .colab-df-quickchart:hover {\n",
" background-color: var(--hover-bg-color);\n",
" box-shadow: 0 1px 2px rgba(60, 64, 67, 0.3), 0 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
" fill: var(--button-hover-fill-color);\n",
" }\n",
"\n",
" .colab-df-quickchart-complete:disabled,\n",
" .colab-df-quickchart-complete:disabled:hover {\n",
" background-color: var(--disabled-bg-color);\n",
" fill: var(--disabled-fill-color);\n",
" box-shadow: none;\n",
" }\n",
"\n",
" .colab-df-spinner {\n",
" border: 2px solid var(--fill-color);\n",
" border-color: transparent;\n",
" border-bottom-color: var(--fill-color);\n",
" animation:\n",
" spin 1s steps(1) infinite;\n",
" }\n",
"\n",
" @keyframes spin {\n",
" 0% {\n",
" border-color: transparent;\n",
" border-bottom-color: var(--fill-color);\n",
" border-left-color: var(--fill-color);\n",
" }\n",
" 20% {\n",
" border-color: transparent;\n",
" border-left-color: var(--fill-color);\n",
" border-top-color: var(--fill-color);\n",
" }\n",
" 30% {\n",
" border-color: transparent;\n",
" border-left-color: var(--fill-color);\n",
" border-top-color: var(--fill-color);\n",
" border-right-color: var(--fill-color);\n",
" }\n",
" 40% {\n",
" border-color: transparent;\n",
" border-right-color: var(--fill-color);\n",
" border-top-color: var(--fill-color);\n",
" }\n",
" 60% {\n",
" border-color: transparent;\n",
" border-right-color: var(--fill-color);\n",
" }\n",
" 80% {\n",
" border-color: transparent;\n",
" border-right-color: var(--fill-color);\n",
" border-bottom-color: var(--fill-color);\n",
" }\n",
" 90% {\n",
" border-color: transparent;\n",
" border-bottom-color: var(--fill-color);\n",
" }\n",
" }\n",
"</style>\n",
"\n",
" <script>\n",
" async function quickchart(key) {\n",
" const quickchartButtonEl =\n",
" document.querySelector('#' + key + ' button');\n",
" quickchartButtonEl.disabled = true; // To prevent multiple clicks.\n",
" quickchartButtonEl.classList.add('colab-df-spinner');\n",
" try {\n",
" const charts = await google.colab.kernel.invokeFunction(\n",
" 'suggestCharts', [key], {});\n",
" } catch (error) {\n",
" console.error('Error during call to suggestCharts:', error);\n",
" }\n",
" quickchartButtonEl.classList.remove('colab-df-spinner');\n",
" quickchartButtonEl.classList.add('colab-df-quickchart-complete');\n",
" }\n",
" (() => {\n",
" let quickchartButtonEl =\n",
" document.querySelector('#df-fb1389a6-52a7-40ab-9500-63f1d69bece6 button');\n",
" quickchartButtonEl.style.display =\n",
" google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
" })();\n",
" </script>\n",
"</div>\n",
"\n",
" <div id=\"id_f4f562d2-b940-4ca0-bef0-f611cb27d741\">\n",
" <style>\n",
" .colab-df-generate {\n",
" background-color: #E8F0FE;\n",
" border: none;\n",
" border-radius: 50%;\n",
" cursor: pointer;\n",
" display: none;\n",
" fill: #1967D2;\n",
" height: 32px;\n",
" padding: 0 0 0 0;\n",
" width: 32px;\n",
" }\n",
"\n",
" .colab-df-generate:hover {\n",
" background-color: #E2EBFA;\n",
" box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
" fill: #174EA6;\n",
" }\n",
"\n",
" [theme=dark] .colab-df-generate {\n",
" background-color: #3B4455;\n",
" fill: #D2E3FC;\n",
" }\n",
"\n",
" [theme=dark] .colab-df-generate:hover {\n",
" background-color: #434B5C;\n",
" box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
" filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
" fill: #FFFFFF;\n",
" }\n",
" </style>\n",
" <button class=\"colab-df-generate\" onclick=\"generateWithVariable('Injuries')\"\n",
" title=\"Generate code using this dataframe.\"\n",
" style=\"display:none;\">\n",
"\n",
" <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
" width=\"24px\">\n",
" <path d=\"M7,19H8.4L18.45,9,17,7.55,7,17.6ZM5,21V16.75L18.45,3.32a2,2,0,0,1,2.83,0l1.4,1.43a1.91,1.91,0,0,1,.58,1.4,1.91,1.91,0,0,1-.58,1.4L9.25,21ZM18.45,9,17,7.55Zm-12,3A5.31,5.31,0,0,0,4.9,8.1,5.31,5.31,0,0,0,1,6.5,5.31,5.31,0,0,0,4.9,4.9,5.31,5.31,0,0,0,6.5,1,5.31,5.31,0,0,0,8.1,4.9,5.31,5.31,0,0,0,12,6.5,5.46,5.46,0,0,0,6.5,12Z\"/>\n",
" </svg>\n",
" </button>\n",
" <script>\n",
" (() => {\n",
" const buttonEl =\n",
" document.querySelector('#id_f4f562d2-b940-4ca0-bef0-f611cb27d741 button.colab-df-generate');\n",
" buttonEl.style.display =\n",
" google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
"\n",
" buttonEl.onclick = () => {\n",
" google.colab.notebook.generateWithVariable('Injuries');\n",
" }\n",
" })();\n",
" </script>\n",
" </div>\n",
"\n",
" </div>\n",
" </div>\n"
],
"application/vnd.google.colaboratory.intrinsic+json": {
"type": "dataframe",
"variable_name": "Injuries",
"summary": "{\n \"name\": \"Injuries\",\n \"rows\": 22,\n \"fields\": [\n {\n \"column\": \"year\",\n \"properties\": {\n \"dtype\": \"number\",\n \"std\": 6,\n \"min\": 2001,\n \"max\": 2022,\n \"num_unique_values\": 22,\n \"samples\": [\n 2001,\n 2014,\n 2009\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"quantity\",\n \"properties\": {\n \"dtype\": \"number\",\n \"std\": 2221,\n \"min\": 2737,\n \"max\": 9254,\n \"num_unique_values\": 22,\n \"samples\": [\n 8896,\n 3375,\n 5250\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n }\n ]\n}"
}
},
"metadata": {},
"execution_count": 21
}
]
},
{
"cell_type": "code",
"source": [
"type(Injuries)"
],
"metadata": {
"id": "jj2pz5dXJ2wN",
"outputId": "3c8fd9cd-09dd-409f-d260-c45aa9a17896",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"pandas.core.frame.DataFrame"
]
},
"metadata": {},
"execution_count": 22
}
]
},
{
"cell_type": "code",
"source": [
"x = Injuries[[\"year\"]].values\n",
"Y = Injuries[[\"quantity\"]].values"
],
"metadata": {
"id": "klS08wCWMmTV"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"source": [
"x"
],
"metadata": {
"id": "-2To-gTARXUc",
"outputId": "0035fb03-aa3c-4bdf-80d7-7d24c8084df0",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"array([[2001],\n",
" [2002],\n",
" [2003],\n",
" [2004],\n",
" [2005],\n",
" [2006],\n",
" [2007],\n",
" [2008],\n",
" [2009],\n",
" [2010],\n",
" [2011],\n",
" [2012],\n",
" [2013],\n",
" [2014],\n",
" [2015],\n",
" [2016],\n",
" [2017],\n",
" [2018],\n",
" [2019],\n",
" [2020],\n",
" [2021],\n",
" [2022]])"
]
},
"metadata": {},
"execution_count": 24
}
]
},
{
"cell_type": "code",
"source": [
"Y"
],
"metadata": {
"id": "A_RXK_SXRaU4",
"outputId": "510462c4-b754-4d63-e0f0-26dd9d558305",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"array([[8896],\n",
" [9254],\n",
" [8844],\n",
" [7915],\n",
" [6919],\n",
" [6821],\n",
" [7062],\n",
" [6329],\n",
" [5250],\n",
" [4408],\n",
" [4259],\n",
" [3778],\n",
" [3585],\n",
" [3375],\n",
" [3334],\n",
" [3439],\n",
" [3318],\n",
" [3458],\n",
" [3275],\n",
" [2914],\n",
" [2737],\n",
" [2917]])"
]
},
"metadata": {},
"execution_count": 25
}
]
},
{
"cell_type": "code",
"source": [
"Injuries.plot(kind='scatter', grid=True,\n",
" x=\"year\", y=\"quantity\")\n",
"plt.axis([2000, 2024, 1000, 10000])\n",
"plt.show()"
],
"metadata": {
"id": "6ECRF3jUR3iW",
"outputId": "b9d4d883-50a2-4140-e5c8-919539283641",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 460
}
},
"execution_count": null,
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
],
"image/png": "\n"
},
"metadata": {}
}
]
},
{
"cell_type": "code",
"source": [
"# prompt: Try with another dataset https://raw.githubusercontent.com/jpandersen61/Machine-Learning/main/InjuredandkilledintrafikDK.csv\n",
"\n",
"# Fit a linear regression model to the 'Injuries' dataset.\n",
"# Load the dataset\n",
"datafiles = \"https://raw.githubusercontent.com/jpandersen61/Machine-Learning/main/InjuredandkilledintrafikDK.csv\"\n",
"Injuries = pd.read_csv(datafiles)\n",
"\n",
"# Extract features (X) and labels (y)\n",
"X = Injuries[[\"year\"]].values\n",
"y = Injuries[[\"quantity\"]].values\n",
"\n",
"# Create a linear regression model\n",
"lin_reg = linear_model.LinearRegression()\n",
"lin_reg.fit(X, y)\n",
"\n",
"# Get the model parameters (intercept and slope)\n",
"t0 = lin_reg.intercept_[0]\n",
"t1 = lin_reg.coef_[0][0]\n",
"\n",
"# Print the model parameters\n",
"print(f\"θ0={t0:.2f}, θ1={t1:.2e}\")\n",
"\n",
"# Make predictions\n",
"X_pred = 2025\n",
"y_pred = lin_reg.predict([[X_pred]])[0, 0]\n",
"\n",
"# Plot the data and the regression line\n",
"Injuries.plot(kind='scatter', figsize=(5, 3), grid=True,\n",
" x=\"year\", y=\"quantity\")\n",
"\n",
"# Extend the line and the axes to X_pred so the prediction stays visible\n",
"x_max = max(Injuries['year'].max(), X_pred)\n",
"y_min = min(Injuries['quantity'].min(), y_pred) - 500\n",
"\n",
"X = np.linspace(Injuries['year'].min(), x_max, 1000)\n",
"plt.plot(X, t0 + t1 * X, \"b\")\n",
"\n",
"plt.text(Injuries['year'].max() - 2, Injuries['quantity'].min() + 1000,\n",
"         fr\"$\\theta_0 = {t0:.2f}$\", color=\"b\")\n",
"plt.text(Injuries['year'].max() - 2, Injuries['quantity'].min() + 500,\n",
"         fr\"$\\theta_1 = {t1:.2e}$\", color=\"b\")\n",
"\n",
"plt.axis([Injuries['year'].min() - 1, x_max + 1, y_min, Injuries['quantity'].max() + 500])\n",
"\n",
"plt.plot([X_pred, X_pred],\n",
"         [y_min, y_pred], \"r--\")\n",
"plt.plot(X_pred, y_pred, \"ro\")\n",
"\n",
"plt.show()"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 333
},
"id": "TiFoe21gdLGb",
"outputId": "adca9996-f14b-4e70-bacb-c58a4919955a"
},
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"θ0=651608.68, θ1=-3.21e+02\n"
]
},
{
"output_type": "display_data",
"data": {
"text/plain": [
"<Figure size 500x300 with 1 Axes>"
],
"image/png": "\n"
},
"metadata": {}
}
]
},
{
"cell_type": "markdown",
"source": [
"### Questions"
],
"metadata": {
"id": "xpgWXfOyA23y"
}
},
{
"cell_type": "markdown",
"source": [
"1. How would you define Machine Learning?\n"
],
"metadata": {
"id": "rOIfHyJcA8C8"
}
},
{
"cell_type": "markdown",
"source": [
"Write your answer here: Machine Learning is the study of computer algorithms that improve automatically through experience, that is, by learning from data instead of being explicitly programmed."
],
"metadata": {
"id": "bq3i-aI1BH9Q"
}
},
{
"cell_type": "markdown",
"source": [
"2. Can you name four types of problems where it shines?"
],
"metadata": {
"id": "wMA1G2PPBwpU"
}
},
{
"cell_type": "markdown",
"source": [
"Write your answer here:\n",
"- Stock prices (daily)\n",
"- Volumes sold (weekly)\n",
"- Active users (monthly)\n",
"- Birth/death rates (annually)"
],
"metadata": {
"id": "QAcDslfMBtVj"
}
},
{
"cell_type": "markdown",
"source": [
"3. What is a labeled training set?"
],
"metadata": {
"id": "i0M6EMx8B71g"
}
},
{
"cell_type": "markdown",
"source": [
"Write your answer here: A training set in which every instance comes with its desired output (the label), so the machine can learn the relationship between the features and the labels."
],
"metadata": {
"id": "fipL2K_FBtin"
}
},
{
"cell_type": "markdown",
"source": [
"4. What are the two most common supervised tasks?"
],
"metadata": {
"id": "43FAK9O-CE8K"
}
},
{
"cell_type": "markdown",
"source": [
"Write your answer here:\n",
"- Regression: predicting a numeric value (e.g. a speed or a discount percentage)\n",
"- Classification: predicting a category (e.g. the type of traffic)"
],
"metadata": {
"id": "JjakH8_HBtmV"
}
},
{
"cell_type": "markdown",
"source": [
"5. Can you name four common unsupervised tasks?"
],
"metadata": {
"id": "6_5YsjSzCJes"
}
},
{
"cell_type": "markdown",
"source": [
"Write your answer here:\n",
"- Clustering\n",
"- Dimensionality reduction\n",
"- Anomaly detection (e.g. network intrusion detection)\n",
"- Association rule learning"
],
"metadata": {
"id": "aPFI1nvxBtp4"
}
},
{
"cell_type": "markdown",
"source": [
"6. What type of Machine Learning algorithm would you use to allow\n",
"a robot to walk in various unknown terrains?\n"
],
"metadata": {
"id": "DwIRcR9YCNb5"
}
},
{
"cell_type": "markdown",
"source": [
"Write your answer here: Reinforcement learning"
],
"metadata": {
"id": "E8SE6UvrBtuJ"
}
},
{
"cell_type": "markdown",
"source": [
"7. What type of algorithm would you use to segment your customers\n",
"into multiple groups?"
],
"metadata": {
"id": "ZflRQBbbCRwX"
}
},
{
"cell_type": "markdown",
"source": [
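"Write your answer here: Clustering algorithms (unsupervised learning), if the groups are not known in advance. If the groups are already defined, a classification algorithm (supervised learning) can be used instead."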
],
"metadata": {
"id": "7QcbHPkBBtwI"
}
},
{
"cell_type": "markdown",
"source": [
"8. Would you frame the problem of spam detection as a supervised\n",
"learning problem or an unsupervised learning problem?"
],
"metadata": {
"id": "X0fKgGXsLq6o"
}
},
{
"cell_type": "markdown",
"source": [
"Write your answer here: Supervised. The algorithm is trained on emails that are labeled as spam or not spam."
],
"metadata": {
"id": "nLnPVahlBty_"
}
},
{
"cell_type": "markdown",
"source": [
"9. What is an online learning system?\n"
],
"metadata": {
"id": "n7CHmnsdCeY3"
}
},
{
"cell_type": "markdown",
"source": [
"Write your answer here: An online learning system updates its model incrementally as new data arrives, rather than being trained on a fixed dataset all at once (as is done in batch learning). This approach is particularly useful when data is generated continuously and the system needs to adapt in real time."
],
"metadata": {
"id": "0YRkRS_7Bt2I"
}
},
{
"cell_type": "markdown",
"source": [
"10. What is out-of-core learning?"
],
"metadata": {
"id": "UKkgQ_veCsN_"
}
},
{
"cell_type": "markdown",
"source": [
"Write your answer here: Out-of-core learning is a technique used in machine learning to handle datasets that are too large to fit into a computer's main memory (RAM). Instead of loading the entire dataset into memory at once, out-of-core learning processes the data in smaller chunks that can be managed within the available memory."
],
"metadata": {
"id": "0raVrt27Bt5U"
}
},
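{
"cell_type": "markdown",
"source": [
"A minimal sketch of the chunked, incremental training described in the answers above, assuming scikit-learn's `SGDRegressor` and its `partial_fit` method. The in-memory CSV, the column names `x_col` and `y_col`, and the chunk size of 100 are hypothetical stand-ins for a file that is too large to fit in RAM. A single pass over the data may not fully converge, so the printed parameters are only approximate."
],
"metadata": {}
},
{
"cell_type": "code",
"source": [
"import io\n",
"\n",
"import numpy as np\n",
"import pandas as pd\n",
"from sklearn.linear_model import SGDRegressor\n",
"\n",
"# Hypothetical data: a synthetic CSV held in memory stands in for a large file.\n",
"rng = np.random.default_rng(42)\n",
"x_all = rng.uniform(0, 1, size=1000)\n",
"y_all = 3.0 + 2.0 * x_all + rng.normal(0, 0.1, size=1000)\n",
"csv_text = pd.DataFrame({'x_col': x_all, 'y_col': y_all}).to_csv(index=False)\n",
"\n",
"sgd = SGDRegressor(random_state=42)\n",
"\n",
"# Read the CSV 100 rows at a time and update the model on each chunk,\n",
"# so the full dataset never has to be in memory at once.\n",
"for chunk in pd.read_csv(io.StringIO(csv_text), chunksize=100):\n",
"    sgd.partial_fit(chunk[['x_col']].to_numpy(), chunk['y_col'].to_numpy())\n",
"\n",
"print(f'intercept={sgd.intercept_[0]:.2f}, slope={sgd.coef_[0]:.2f}')"
],
"metadata": {},
"execution_count": null,
"outputs": []
},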
{
"cell_type": "markdown",
"source": [
"11. What type of learning algorithm relies on a similarity measure to\n",
"make predictions?"
],
"metadata": {
"id": "alMcqdhaCt8m"
}
},
{
"cell_type": "markdown",
"source": [
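"Write your answer here: Instance-based learning algorithms, such as k-nearest neighbors (k-NN). They memorize the training instances and use a similarity measure to compare new instances with them."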
],
"metadata": {
"id": "NzHnn7Q5Bt8C"
}
},
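{
"cell_type": "markdown",
"source": [
"A small illustration of a similarity-based prediction, assuming scikit-learn's `KNeighborsRegressor` with its default Euclidean distance. The five rows are copied from the traffic table earlier in this notebook. The value `n_neighbors=3`, the query year 2023 and the variable names are arbitrary choices made for this sketch."
],
"metadata": {}
},
{
"cell_type": "code",
"source": [
"import numpy as np\n",
"from sklearn.neighbors import KNeighborsRegressor\n",
"\n",
"# Five rows from the traffic table: year -> quantity.\n",
"knn_years = np.array([[2018], [2019], [2020], [2021], [2022]])\n",
"knn_quantity = np.array([3458, 3275, 2914, 2737, 2917])\n",
"\n",
"# k=3 is an arbitrary choice for this illustration.\n",
"knn = KNeighborsRegressor(n_neighbors=3)\n",
"knn.fit(knn_years, knn_quantity)\n",
"\n",
"# The three years closest to 2023 are 2020, 2021 and 2022, so the\n",
"# prediction is the mean of their quantities.\n",
"knn_pred = knn.predict([[2023]])[0]\n",
"print(knn_pred, np.mean(knn_quantity[-3:]))"
],
"metadata": {},
"execution_count": null,
"outputs": []
},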
{
"cell_type": "markdown",
"source": [
"12. What is the difference between a model parameter and a learning\n",
"algorithm’s hyperparameter?"
],
"metadata": {
"id": "lpvdcsj2CznS"
}
},
{
"cell_type": "markdown",
"source": [
"Write your answer here: Model parameters are the values that the model learns from the data (e.g. the intercept and slope of a linear model), while hyperparameters are settings that you configure before training begins (e.g. the amount of regularization) and that control how the model learns."
],
"metadata": {
"id": "22JFh-dABt_J"
}
},
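{
"cell_type": "markdown",
"source": [
"To make the distinction concrete, here is a short sketch that assumes scikit-learn's `Ridge` regressor. The toy data and the three `alpha` values are hypothetical. `alpha` is set by hand before training (a hyperparameter), whereas `coef_` and `intercept_` are learned by `fit` (model parameters). A larger `alpha` should shrink the learned slope toward zero."
],
"metadata": {}
},
{
"cell_type": "code",
"source": [
"import numpy as np\n",
"from sklearn.linear_model import Ridge\n",
"\n",
"# Hypothetical toy data, roughly y = 2*x + 1.\n",
"X_ridge = np.array([[0.0], [1.0], [2.0], [3.0], [4.0]])\n",
"y_ridge = np.array([1.1, 2.9, 5.2, 6.8, 9.1])\n",
"\n",
"for alpha in (0.1, 10.0, 1000.0):\n",
"    # alpha is a hyperparameter: chosen before training starts.\n",
"    ridge = Ridge(alpha=alpha).fit(X_ridge, y_ridge)\n",
"    # coef_ and intercept_ are model parameters: learned from the data.\n",
"    print(f'alpha={alpha}: slope={ridge.coef_[0]:.3f}, '\n",
"          f'intercept={ridge.intercept_:.3f}')"
],
"metadata": {},
"execution_count": null,
"outputs": []
},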
{
"cell_type": "markdown",
"source": [
"13. What do model-based learning algorithms search for? What is the\n",
"most common strategy they use to succeed? How do they make\n",
"predictions?"
],
"metadata": {
"id": "fwVjj-WkC4U5"
}
},
{
"cell_type": "markdown",
"source": [
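"Write your answer here: Model-based learning algorithms search for the model parameter values that let the model generalize well to new, unseen data, such as the weights of a linear regression or the split thresholds of a decision tree. The most common strategy is to minimize a cost function that measures how poorly the model fits the training data (plus a penalty for model complexity if the model is regularized). To make a prediction, they feed the new instance's features into the model's prediction function, using the parameter values found during training."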
],
"metadata": {
"id": "c8Q85u6RBuCH"
}
},
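{
"cell_type": "markdown",
"source": [
"The three parts of this question can be sketched on toy data, assuming scikit-learn's `LinearRegression` and mean squared error as the cost function. The data, the helper `mse_cost` and the variable names are hypothetical. The fitted parameters should give a lower cost than nearby parameter values, and a prediction is simply the model function evaluated with those parameters."
],
"metadata": {}
},
{
"cell_type": "code",
"source": [
"import numpy as np\n",
"from sklearn.linear_model import LinearRegression\n",
"\n",
"# Hypothetical toy data, roughly y = 2*x + 1.\n",
"X_toy = np.array([[0.0], [1.0], [2.0], [3.0], [4.0]])\n",
"y_toy = np.array([1.1, 2.9, 5.2, 6.8, 9.1])\n",
"\n",
"def mse_cost(theta0, theta1):\n",
"    \"\"\"Mean squared error of the line theta0 + theta1 * x on the toy data.\"\"\"\n",
"    return np.mean((theta0 + theta1 * X_toy[:, 0] - y_toy) ** 2)\n",
"\n",
"# 1. Search: fit() finds the parameters that minimize the squared error.\n",
"toy_model = LinearRegression().fit(X_toy, y_toy)\n",
"theta0_hat, theta1_hat = toy_model.intercept_, toy_model.coef_[0]\n",
"\n",
"# 2. Strategy: the fitted parameters have a lower cost than nearby values.\n",
"print(mse_cost(theta0_hat, theta1_hat),\n",
"      mse_cost(theta0_hat + 0.5, theta1_hat),\n",
"      mse_cost(theta0_hat, theta1_hat + 0.5))\n",
"\n",
"# 3. Prediction: evaluate the model function with the learned parameters.\n",
"x_new = 5.0\n",
"print(theta0_hat + theta1_hat * x_new, toy_model.predict([[x_new]])[0])"
],
"metadata": {},
"execution_count": null,
"outputs": []
},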
{
"cell_type": "markdown",
"source": [
"14. Can you name four of the main challenges in Machine Learning?"
],
"metadata": {
"id": "wkKbPfjODAj-"
}
},
{
"cell_type": "markdown",
"source": [
"Write your answer here:\n",
"\n",
"- overfitting/underfitting\n",
"- data quality/quantity\n",
"- bias/fairness\n",
"- interpretability/explainability"
],
"metadata": {
"id": "HznUcxBmBuFH"
}
}
]
}