Created
February 2, 2020 10:16
-
-
Save depindersharma/f21b0f98d75fe2d1d172c14363ed9e61 to your computer and use it in GitHub Desktop.
Created on Cognitive Class Labs
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| { | |
| "cells": [ | |
| { | |
| "cell_type": "markdown", | |
| "metadata": { | |
| "button": false, | |
| "deletable": true, | |
| "new_sheet": false, | |
| "run_control": { | |
| "read_only": false | |
| } | |
| }, | |
| "source": [ | |
| "<a href=\"https://www.bigdatauniversity.com\"><img src=\"https://ibm.box.com/shared/static/cw2c7r3o20w9zn8gkecaeyjhgw3xdgbj.png\" width=\"400\" align=\"center\"></a>\n", | |
| "\n", | |
| "<h1><center>Simple Linear Regression</center></h1>\n", | |
| "\n", | |
| "\n", | |
| "<h4>About this Notebook</h4>\n", | |
| "In this notebook, we learn how to use scikit-learn to implement simple linear regression. We download a dataset that is related to fuel consumption and Carbon dioxide emission of cars. Then, we split our data into training and test sets, create a model using training set, evaluate your model using test set, and finally use model to predict unknown value.\n" | |
| ] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "metadata": {}, | |
| "source": [ | |
| "<h1>Table of contents</h1>\n", | |
| "\n", | |
| "<div class=\"alert alert-block alert-info\" style=\"margin-top: 20px\">\n", | |
| " <ol>\n", | |
| " <li><a href=\"#understanding_data\">Understanding the Data</a></li>\n", | |
| " <li><a href=\"#reading_data\">Reading the data in</a></li>\n", | |
| " <li><a href=\"#data_exploration\">Data Exploration</a></li>\n", | |
| " <li><a href=\"#simple_regression\">Simple Regression Model</a></li>\n", | |
| " </ol>\n", | |
| "</div>\n", | |
| "<br>\n", | |
| "<hr>" | |
| ] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "metadata": { | |
| "button": false, | |
| "deletable": true, | |
| "new_sheet": false, | |
| "run_control": { | |
| "read_only": false | |
| } | |
| }, | |
| "source": [ | |
| "### Importing Needed packages" | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": 4, | |
| "metadata": { | |
| "button": false, | |
| "deletable": true, | |
| "new_sheet": false, | |
| "run_control": { | |
| "read_only": false | |
| } | |
| }, | |
| "outputs": [], | |
| "source": [ | |
| "import matplotlib.pyplot as plt\n", | |
| "import pandas as pd\n", | |
| "import pylab as pl\n", | |
| "import numpy as np\n", | |
| "%matplotlib inline" | |
| ] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "metadata": { | |
| "button": false, | |
| "deletable": true, | |
| "new_sheet": false, | |
| "run_control": { | |
| "read_only": false | |
| } | |
| }, | |
| "source": [ | |
| "### Downloading Data\n", | |
| "To download the data, we will use !wget to download it from IBM Object Storage." | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": 3, | |
| "metadata": { | |
| "button": false, | |
| "deletable": true, | |
| "new_sheet": false, | |
| "run_control": { | |
| "read_only": false | |
| } | |
| }, | |
| "outputs": [ | |
| { | |
| "name": "stdout", | |
| "output_type": "stream", | |
| "text": [ | |
| "--2020-01-06 13:00:27-- https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/ML0101ENv3/labs/FuelConsumptionCo2.csv\n", | |
| "Resolving s3-api.us-geo.objectstorage.softlayer.net (s3-api.us-geo.objectstorage.softlayer.net)... 67.228.254.196\n", | |
| "Connecting to s3-api.us-geo.objectstorage.softlayer.net (s3-api.us-geo.objectstorage.softlayer.net)|67.228.254.196|:443... connected.\n", | |
| "HTTP request sent, awaiting response... 200 OK\n", | |
| "Length: 72629 (71K) [text/csv]\n", | |
| "Saving to: ‘FuelConsumption.csv’\n", | |
| "\n", | |
| "FuelConsumption.csv 100%[===================>] 70.93K --.-KB/s in 0.04s \n", | |
| "\n", | |
| "2020-01-06 13:00:27 (1.62 MB/s) - ‘FuelConsumption.csv’ saved [72629/72629]\n", | |
| "\n" | |
| ] | |
| } | |
| ], | |
| "source": [ | |
| "!wget -O FuelConsumption.csv https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/ML0101ENv3/labs/FuelConsumptionCo2.csv" | |
| ] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "metadata": {}, | |
| "source": [ | |
| "__Did you know?__ When it comes to Machine Learning, you will likely be working with large datasets. As a business, where can you host your data? IBM is offering a unique opportunity for businesses, with 10 Tb of IBM Cloud Object Storage: [Sign up now for free](http://cocl.us/ML0101EN-IBM-Offer-CC)" | |
| ] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "metadata": { | |
| "button": false, | |
| "deletable": true, | |
| "new_sheet": false, | |
| "run_control": { | |
| "read_only": false | |
| } | |
| }, | |
| "source": [ | |
| "\n", | |
| "<h2 id=\"understanding_data\">Understanding the Data</h2>\n", | |
| "\n", | |
| "### `FuelConsumption.csv`:\n", | |
| "We have downloaded a fuel consumption dataset, **`FuelConsumption.csv`**, which contains model-specific fuel consumption ratings and estimated carbon dioxide emissions for new light-duty vehicles for retail sale in Canada. [Dataset source](http://open.canada.ca/data/en/dataset/98f1a129-f628-4ce4-b24d-6f16bf24dd64)\n", | |
| "\n", | |
| "- **MODELYEAR** e.g. 2014\n", | |
| "- **MAKE** e.g. Acura\n", | |
| "- **MODEL** e.g. ILX\n", | |
| "- **VEHICLE CLASS** e.g. SUV\n", | |
| "- **ENGINE SIZE** e.g. 4.7\n", | |
| "- **CYLINDERS** e.g 6\n", | |
| "- **TRANSMISSION** e.g. A6\n", | |
| "- **FUEL CONSUMPTION in CITY(L/100 km)** e.g. 9.9\n", | |
| "- **FUEL CONSUMPTION in HWY (L/100 km)** e.g. 8.9\n", | |
| "- **FUEL CONSUMPTION COMB (L/100 km)** e.g. 9.2\n", | |
| "- **CO2 EMISSIONS (g/km)** e.g. 182 --> low --> 0\n" | |
| ] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "metadata": { | |
| "button": false, | |
| "deletable": true, | |
| "new_sheet": false, | |
| "run_control": { | |
| "read_only": false | |
| } | |
| }, | |
| "source": [ | |
| "<h2 id=\"reading_data\">Reading the data in</h2>" | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": 5, | |
| "metadata": { | |
| "button": false, | |
| "deletable": true, | |
| "new_sheet": false, | |
| "run_control": { | |
| "read_only": false | |
| } | |
| }, | |
| "outputs": [ | |
| { | |
| "data": { | |
| "text/html": [ | |
| "<div>\n", | |
| "<style scoped>\n", | |
| " .dataframe tbody tr th:only-of-type {\n", | |
| " vertical-align: middle;\n", | |
| " }\n", | |
| "\n", | |
| " .dataframe tbody tr th {\n", | |
| " vertical-align: top;\n", | |
| " }\n", | |
| "\n", | |
| " .dataframe thead th {\n", | |
| " text-align: right;\n", | |
| " }\n", | |
| "</style>\n", | |
| "<table border=\"1\" class=\"dataframe\">\n", | |
| " <thead>\n", | |
| " <tr style=\"text-align: right;\">\n", | |
| " <th></th>\n", | |
| " <th>MODELYEAR</th>\n", | |
| " <th>MAKE</th>\n", | |
| " <th>MODEL</th>\n", | |
| " <th>VEHICLECLASS</th>\n", | |
| " <th>ENGINESIZE</th>\n", | |
| " <th>CYLINDERS</th>\n", | |
| " <th>TRANSMISSION</th>\n", | |
| " <th>FUELTYPE</th>\n", | |
| " <th>FUELCONSUMPTION_CITY</th>\n", | |
| " <th>FUELCONSUMPTION_HWY</th>\n", | |
| " <th>FUELCONSUMPTION_COMB</th>\n", | |
| " <th>FUELCONSUMPTION_COMB_MPG</th>\n", | |
| " <th>CO2EMISSIONS</th>\n", | |
| " </tr>\n", | |
| " </thead>\n", | |
| " <tbody>\n", | |
| " <tr>\n", | |
| " <th>0</th>\n", | |
| " <td>2014</td>\n", | |
| " <td>ACURA</td>\n", | |
| " <td>ILX</td>\n", | |
| " <td>COMPACT</td>\n", | |
| " <td>2.0</td>\n", | |
| " <td>4</td>\n", | |
| " <td>AS5</td>\n", | |
| " <td>Z</td>\n", | |
| " <td>9.9</td>\n", | |
| " <td>6.7</td>\n", | |
| " <td>8.5</td>\n", | |
| " <td>33</td>\n", | |
| " <td>196</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>1</th>\n", | |
| " <td>2014</td>\n", | |
| " <td>ACURA</td>\n", | |
| " <td>ILX</td>\n", | |
| " <td>COMPACT</td>\n", | |
| " <td>2.4</td>\n", | |
| " <td>4</td>\n", | |
| " <td>M6</td>\n", | |
| " <td>Z</td>\n", | |
| " <td>11.2</td>\n", | |
| " <td>7.7</td>\n", | |
| " <td>9.6</td>\n", | |
| " <td>29</td>\n", | |
| " <td>221</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>2</th>\n", | |
| " <td>2014</td>\n", | |
| " <td>ACURA</td>\n", | |
| " <td>ILX HYBRID</td>\n", | |
| " <td>COMPACT</td>\n", | |
| " <td>1.5</td>\n", | |
| " <td>4</td>\n", | |
| " <td>AV7</td>\n", | |
| " <td>Z</td>\n", | |
| " <td>6.0</td>\n", | |
| " <td>5.8</td>\n", | |
| " <td>5.9</td>\n", | |
| " <td>48</td>\n", | |
| " <td>136</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>3</th>\n", | |
| " <td>2014</td>\n", | |
| " <td>ACURA</td>\n", | |
| " <td>MDX 4WD</td>\n", | |
| " <td>SUV - SMALL</td>\n", | |
| " <td>3.5</td>\n", | |
| " <td>6</td>\n", | |
| " <td>AS6</td>\n", | |
| " <td>Z</td>\n", | |
| " <td>12.7</td>\n", | |
| " <td>9.1</td>\n", | |
| " <td>11.1</td>\n", | |
| " <td>25</td>\n", | |
| " <td>255</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>4</th>\n", | |
| " <td>2014</td>\n", | |
| " <td>ACURA</td>\n", | |
| " <td>RDX AWD</td>\n", | |
| " <td>SUV - SMALL</td>\n", | |
| " <td>3.5</td>\n", | |
| " <td>6</td>\n", | |
| " <td>AS6</td>\n", | |
| " <td>Z</td>\n", | |
| " <td>12.1</td>\n", | |
| " <td>8.7</td>\n", | |
| " <td>10.6</td>\n", | |
| " <td>27</td>\n", | |
| " <td>244</td>\n", | |
| " </tr>\n", | |
| " </tbody>\n", | |
| "</table>\n", | |
| "</div>" | |
| ], | |
| "text/plain": [ | |
| " MODELYEAR MAKE MODEL VEHICLECLASS ENGINESIZE CYLINDERS \\\n", | |
| "0 2014 ACURA ILX COMPACT 2.0 4 \n", | |
| "1 2014 ACURA ILX COMPACT 2.4 4 \n", | |
| "2 2014 ACURA ILX HYBRID COMPACT 1.5 4 \n", | |
| "3 2014 ACURA MDX 4WD SUV - SMALL 3.5 6 \n", | |
| "4 2014 ACURA RDX AWD SUV - SMALL 3.5 6 \n", | |
| "\n", | |
| " TRANSMISSION FUELTYPE FUELCONSUMPTION_CITY FUELCONSUMPTION_HWY \\\n", | |
| "0 AS5 Z 9.9 6.7 \n", | |
| "1 M6 Z 11.2 7.7 \n", | |
| "2 AV7 Z 6.0 5.8 \n", | |
| "3 AS6 Z 12.7 9.1 \n", | |
| "4 AS6 Z 12.1 8.7 \n", | |
| "\n", | |
| " FUELCONSUMPTION_COMB FUELCONSUMPTION_COMB_MPG CO2EMISSIONS \n", | |
| "0 8.5 33 196 \n", | |
| "1 9.6 29 221 \n", | |
| "2 5.9 48 136 \n", | |
| "3 11.1 25 255 \n", | |
| "4 10.6 27 244 " | |
| ] | |
| }, | |
| "execution_count": 5, | |
| "metadata": {}, | |
| "output_type": "execute_result" | |
| } | |
| ], | |
| "source": [ | |
| "df = pd.read_csv(\"FuelConsumption.csv\")\n", | |
| "\n", | |
| "# take a look at the dataset\n", | |
| "df.head()\n", | |
| "\n" | |
| ] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "metadata": { | |
| "button": false, | |
| "deletable": true, | |
| "new_sheet": false, | |
| "run_control": { | |
| "read_only": false | |
| } | |
| }, | |
| "source": [ | |
| "<h2 id=\"data_exploration\">Data Exploration</h2>\n", | |
| "Lets first have a descriptive exploration on our data." | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": 6, | |
| "metadata": { | |
| "button": false, | |
| "deletable": true, | |
| "new_sheet": false, | |
| "run_control": { | |
| "read_only": false | |
| } | |
| }, | |
| "outputs": [ | |
| { | |
| "data": { | |
| "text/html": [ | |
| "<div>\n", | |
| "<style scoped>\n", | |
| " .dataframe tbody tr th:only-of-type {\n", | |
| " vertical-align: middle;\n", | |
| " }\n", | |
| "\n", | |
| " .dataframe tbody tr th {\n", | |
| " vertical-align: top;\n", | |
| " }\n", | |
| "\n", | |
| " .dataframe thead th {\n", | |
| " text-align: right;\n", | |
| " }\n", | |
| "</style>\n", | |
| "<table border=\"1\" class=\"dataframe\">\n", | |
| " <thead>\n", | |
| " <tr style=\"text-align: right;\">\n", | |
| " <th></th>\n", | |
| " <th>MODELYEAR</th>\n", | |
| " <th>ENGINESIZE</th>\n", | |
| " <th>CYLINDERS</th>\n", | |
| " <th>FUELCONSUMPTION_CITY</th>\n", | |
| " <th>FUELCONSUMPTION_HWY</th>\n", | |
| " <th>FUELCONSUMPTION_COMB</th>\n", | |
| " <th>FUELCONSUMPTION_COMB_MPG</th>\n", | |
| " <th>CO2EMISSIONS</th>\n", | |
| " </tr>\n", | |
| " </thead>\n", | |
| " <tbody>\n", | |
| " <tr>\n", | |
| " <th>count</th>\n", | |
| " <td>1067.0</td>\n", | |
| " <td>1067.000000</td>\n", | |
| " <td>1067.000000</td>\n", | |
| " <td>1067.000000</td>\n", | |
| " <td>1067.000000</td>\n", | |
| " <td>1067.000000</td>\n", | |
| " <td>1067.000000</td>\n", | |
| " <td>1067.000000</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>mean</th>\n", | |
| " <td>2014.0</td>\n", | |
| " <td>3.346298</td>\n", | |
| " <td>5.794752</td>\n", | |
| " <td>13.296532</td>\n", | |
| " <td>9.474602</td>\n", | |
| " <td>11.580881</td>\n", | |
| " <td>26.441425</td>\n", | |
| " <td>256.228679</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>std</th>\n", | |
| " <td>0.0</td>\n", | |
| " <td>1.415895</td>\n", | |
| " <td>1.797447</td>\n", | |
| " <td>4.101253</td>\n", | |
| " <td>2.794510</td>\n", | |
| " <td>3.485595</td>\n", | |
| " <td>7.468702</td>\n", | |
| " <td>63.372304</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>min</th>\n", | |
| " <td>2014.0</td>\n", | |
| " <td>1.000000</td>\n", | |
| " <td>3.000000</td>\n", | |
| " <td>4.600000</td>\n", | |
| " <td>4.900000</td>\n", | |
| " <td>4.700000</td>\n", | |
| " <td>11.000000</td>\n", | |
| " <td>108.000000</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>25%</th>\n", | |
| " <td>2014.0</td>\n", | |
| " <td>2.000000</td>\n", | |
| " <td>4.000000</td>\n", | |
| " <td>10.250000</td>\n", | |
| " <td>7.500000</td>\n", | |
| " <td>9.000000</td>\n", | |
| " <td>21.000000</td>\n", | |
| " <td>207.000000</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>50%</th>\n", | |
| " <td>2014.0</td>\n", | |
| " <td>3.400000</td>\n", | |
| " <td>6.000000</td>\n", | |
| " <td>12.600000</td>\n", | |
| " <td>8.800000</td>\n", | |
| " <td>10.900000</td>\n", | |
| " <td>26.000000</td>\n", | |
| " <td>251.000000</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>75%</th>\n", | |
| " <td>2014.0</td>\n", | |
| " <td>4.300000</td>\n", | |
| " <td>8.000000</td>\n", | |
| " <td>15.550000</td>\n", | |
| " <td>10.850000</td>\n", | |
| " <td>13.350000</td>\n", | |
| " <td>31.000000</td>\n", | |
| " <td>294.000000</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>max</th>\n", | |
| " <td>2014.0</td>\n", | |
| " <td>8.400000</td>\n", | |
| " <td>12.000000</td>\n", | |
| " <td>30.200000</td>\n", | |
| " <td>20.500000</td>\n", | |
| " <td>25.800000</td>\n", | |
| " <td>60.000000</td>\n", | |
| " <td>488.000000</td>\n", | |
| " </tr>\n", | |
| " </tbody>\n", | |
| "</table>\n", | |
| "</div>" | |
| ], | |
| "text/plain": [ | |
| " MODELYEAR ENGINESIZE CYLINDERS FUELCONSUMPTION_CITY \\\n", | |
| "count 1067.0 1067.000000 1067.000000 1067.000000 \n", | |
| "mean 2014.0 3.346298 5.794752 13.296532 \n", | |
| "std 0.0 1.415895 1.797447 4.101253 \n", | |
| "min 2014.0 1.000000 3.000000 4.600000 \n", | |
| "25% 2014.0 2.000000 4.000000 10.250000 \n", | |
| "50% 2014.0 3.400000 6.000000 12.600000 \n", | |
| "75% 2014.0 4.300000 8.000000 15.550000 \n", | |
| "max 2014.0 8.400000 12.000000 30.200000 \n", | |
| "\n", | |
| " FUELCONSUMPTION_HWY FUELCONSUMPTION_COMB FUELCONSUMPTION_COMB_MPG \\\n", | |
| "count 1067.000000 1067.000000 1067.000000 \n", | |
| "mean 9.474602 11.580881 26.441425 \n", | |
| "std 2.794510 3.485595 7.468702 \n", | |
| "min 4.900000 4.700000 11.000000 \n", | |
| "25% 7.500000 9.000000 21.000000 \n", | |
| "50% 8.800000 10.900000 26.000000 \n", | |
| "75% 10.850000 13.350000 31.000000 \n", | |
| "max 20.500000 25.800000 60.000000 \n", | |
| "\n", | |
| " CO2EMISSIONS \n", | |
| "count 1067.000000 \n", | |
| "mean 256.228679 \n", | |
| "std 63.372304 \n", | |
| "min 108.000000 \n", | |
| "25% 207.000000 \n", | |
| "50% 251.000000 \n", | |
| "75% 294.000000 \n", | |
| "max 488.000000 " | |
| ] | |
| }, | |
| "execution_count": 6, | |
| "metadata": {}, | |
| "output_type": "execute_result" | |
| } | |
| ], | |
| "source": [ | |
| "# summarize the data\n", | |
| "df.describe()" | |
| ] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "metadata": {}, | |
| "source": [ | |
| "Lets select some features to explore more." | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": 7, | |
| "metadata": { | |
| "button": false, | |
| "deletable": true, | |
| "new_sheet": false, | |
| "run_control": { | |
| "read_only": false | |
| } | |
| }, | |
| "outputs": [ | |
| { | |
| "data": { | |
| "text/html": [ | |
| "<div>\n", | |
| "<style scoped>\n", | |
| " .dataframe tbody tr th:only-of-type {\n", | |
| " vertical-align: middle;\n", | |
| " }\n", | |
| "\n", | |
| " .dataframe tbody tr th {\n", | |
| " vertical-align: top;\n", | |
| " }\n", | |
| "\n", | |
| " .dataframe thead th {\n", | |
| " text-align: right;\n", | |
| " }\n", | |
| "</style>\n", | |
| "<table border=\"1\" class=\"dataframe\">\n", | |
| " <thead>\n", | |
| " <tr style=\"text-align: right;\">\n", | |
| " <th></th>\n", | |
| " <th>ENGINESIZE</th>\n", | |
| " <th>CYLINDERS</th>\n", | |
| " <th>FUELCONSUMPTION_COMB</th>\n", | |
| " <th>CO2EMISSIONS</th>\n", | |
| " </tr>\n", | |
| " </thead>\n", | |
| " <tbody>\n", | |
| " <tr>\n", | |
| " <th>0</th>\n", | |
| " <td>2.0</td>\n", | |
| " <td>4</td>\n", | |
| " <td>8.5</td>\n", | |
| " <td>196</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>1</th>\n", | |
| " <td>2.4</td>\n", | |
| " <td>4</td>\n", | |
| " <td>9.6</td>\n", | |
| " <td>221</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>2</th>\n", | |
| " <td>1.5</td>\n", | |
| " <td>4</td>\n", | |
| " <td>5.9</td>\n", | |
| " <td>136</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>3</th>\n", | |
| " <td>3.5</td>\n", | |
| " <td>6</td>\n", | |
| " <td>11.1</td>\n", | |
| " <td>255</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>4</th>\n", | |
| " <td>3.5</td>\n", | |
| " <td>6</td>\n", | |
| " <td>10.6</td>\n", | |
| " <td>244</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>5</th>\n", | |
| " <td>3.5</td>\n", | |
| " <td>6</td>\n", | |
| " <td>10.0</td>\n", | |
| " <td>230</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>6</th>\n", | |
| " <td>3.5</td>\n", | |
| " <td>6</td>\n", | |
| " <td>10.1</td>\n", | |
| " <td>232</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>7</th>\n", | |
| " <td>3.7</td>\n", | |
| " <td>6</td>\n", | |
| " <td>11.1</td>\n", | |
| " <td>255</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>8</th>\n", | |
| " <td>3.7</td>\n", | |
| " <td>6</td>\n", | |
| " <td>11.6</td>\n", | |
| " <td>267</td>\n", | |
| " </tr>\n", | |
| " </tbody>\n", | |
| "</table>\n", | |
| "</div>" | |
| ], | |
| "text/plain": [ | |
| " ENGINESIZE CYLINDERS FUELCONSUMPTION_COMB CO2EMISSIONS\n", | |
| "0 2.0 4 8.5 196\n", | |
| "1 2.4 4 9.6 221\n", | |
| "2 1.5 4 5.9 136\n", | |
| "3 3.5 6 11.1 255\n", | |
| "4 3.5 6 10.6 244\n", | |
| "5 3.5 6 10.0 230\n", | |
| "6 3.5 6 10.1 232\n", | |
| "7 3.7 6 11.1 255\n", | |
| "8 3.7 6 11.6 267" | |
| ] | |
| }, | |
| "execution_count": 7, | |
| "metadata": {}, | |
| "output_type": "execute_result" | |
| } | |
| ], | |
| "source": [ | |
| "cdf = df[['ENGINESIZE','CYLINDERS','FUELCONSUMPTION_COMB','CO2EMISSIONS']]\n", | |
| "cdf.head(9)" | |
| ] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "metadata": {}, | |
| "source": [ | |
| "we can plot each of these features:" | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": 9, | |
| "metadata": { | |
| "button": false, | |
| "deletable": true, | |
| "new_sheet": false, | |
| "run_control": { | |
| "read_only": false | |
| } | |
| }, | |
| "outputs": [ | |
| { | |
| "data": { | |
| "image/png": "iVBORw0KGgoAAAANSUhEUgAAAX8AAAEICAYAAAC3Y/QeAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy8QZhcZAAAgAElEQVR4nO3dfZxdVX3v8c+X8PygIY6EAIFBTXkJxlYbEYvVUaE8arAIDSIQC+XWgoJGJXgRpFdatBd8hNsbCyVIBKM8BdTyVKYUroCEohACEiRCJCQCQhhEauB3/1hrcOdkn5kzJ3PmnDP7+369zmvOWftp7T17/846a629tiICMzOrlo3anQEzMxt7Dv5mZhXk4G9mVkEO/mZmFeTgb2ZWQQ7+ZmYV5OBvZlZBlQ7+kj4s6S5JA5JWSvqRpHfmabtLWiTpWUnPSbpZ0p8Vlv0jSVdL+rWkpyVdJ2m3wvQvSPp9Xvfg65nC9JC0StLGhbSNJa2WFIW0fknHFT5/TtIjeX0rJH23MG0PSddL+o2kZyQtlnRgntYnaUXN/h8s6U5Jz0t6StICSTsVps/O+fxMzXIrJPXl9xMlXSjpiXycfi7plCb/JdZh6lwjn8/nbk9hvs0kLZX0PyT15vNm45L1fUHSJYXPIeleSRsV0r4o6aL8fnBdg9fQKknXStq3Zr3LJb1Qc719M0+bLemlnLZG0k8lHVyz/LGSHsjn8CpJP5C0zagdyA5U2eAv6VPAV4F/ACYDOwPnAzMlvR64DbgX2BXYAbgSuF7SO/IqJgKLgN3y8ncCV9ds5rsRsXXhNbFm+jPAAYXPBwK/GSLPxwBHAftExNbADOCmwizXADfk/GwHfAJYU2ddHwK+A3wN6AH2AF4EbpW0bWHWp4FTJL2qTra+AmwNvBF4NfAB4OF6+2DdY4hr5FXAtaRzZ9BpwEpgXhOb2gGYNcw8E/M5/8ekc/xKSbNr5nl/zfV2YmHaj/PyE/M+XCZpIoCkd+d9PCIitiGdywub2I/uEhGVe5GC1ABwWJ3p3wZ+WJL+f4Bb6iwzCQjgNfnzF4BLhshDkC6Y7xXSvg/8z/RveSWtHzguv/8m8NU66+vJ65xYZ3ofsCK/F/BL4LM182wE3Af8ff48G7iV9KVyRmG+FUBffn8fcEi7/6d+je6rgWvk1fk8OAh4E6nQ8vo8rTefixuXLLfOdZHnOwV4aHB+4IvARUOtC/g0sArYKH9eTioUleV1NnBr4fOWeZ1vK6zrqnYf87F+VbXk/w5gc1Jpvsy+wPdK0hcCe0vasmTau4AnIuKpEeTjKuBduepkIvDnrP/roeh24GhJn5E0Q9KEwrSngGXAJZIOkTR5iPXsRirFrbOPEfEycDlp/4s+D3xS0qQ6eTpL0kclTRtim9ZdhrxGIuJZ4GPAPwMXAmdGRLO/+K4g/UKdPcJltiOdyw3L18xHgd+TCkAAdwD7STpT0t6SNhvJOrtVVYP/a4AnI2Jtnek9pJ+wtVaSjlmxWoRcT34e8Kma+Q/Pde+Dr5trpv+OVKr+K9LP3kU5rVREXAJ8HNgP+A9gtaS5eVoA7yGVgM4BVkq6pU5AHqyrrbePPcWEiLgHuJ5UQqv1cWABcCJwv6Rlkg4omc+6y3DXCBFxDenLfyPg6xuwrSAVME4fQeB9PP8tFkiuqrne/qYwba/c5vY74H8DH4mI1Xk//hP4S+CtwA+ApySdW1O4GneqGvyfAnrKGqSyJ4EpJelTgJcp1MtLei0pMJ4fEZfWzL8wIiYWXu8pWefFwNH5dfFwGY+IBRGxD6nu8m+Bv5e0X562IiJOjIjXA7sAz9dZ55OF/SnbxydL0k8HPiZp+5r8vBAR/xARf0oKGAuB79X5lWDdY7hrZNAS4IH8q7FpEfFD4FHg+AYX2TH/fbqQdkjN9fatwrTbI7W5bUsqZP15zfZ/FBHvJ32ZzCT9CjmOcayqwf/HpBLAIXWm3wgcVpJ+OKnh6LcAuWH0emBRRJzVZF7+kxRwJ5Pq1xsSEb+PiO8BPyPVudZOf4z0a2S9acCDpPradfYx97g4lHUbkQfX9wDpp/bnhsjTGlLD2VakhnLrXsNdI61wGqnNq6xatdYHgdWkc7lhETEA/B1wlKS3lEx/OSJuAv6d8mtn3Khk8M/1lacD5+X68S0lbSLpAElfBs4E/kzSWZImSdpG0sdJpfNTAHLvl+uA2yJi7gbkJYD3Ax/I7+vKXdYOyvnZKFev7AHcIWnbXGf5hjytB/hr0s/ysm1+Gjgtd+XbIpfo/4XUk+MrdbJwJqm+9JVeS7nb39skbSppc+AkUi+mEV2U1lkauEYasZmkzQuvIeNNRPSTetgdU28eSZMlnQicAZzazC+O3C73L6T9Q9JMSbPyNSRJewLvpuTaGU8qGfwBIuJcUh39acCvgcdI9dZXRcRDwDtJ3cqWk+rBDwX2i4jb8io+CLwN+GhN3+KdC5v5q5ppA5K2K8nLkohY0kC215BK3o+SAuyXgY9FxK3Af5N6RtyY57uP1HVzdp39/y6p2+gnSdU89wNbAHvXa7SOiEdIPaG2KiYD/5rX8TipsfigXMKyLjbUNdLgKgaAFwqv9zawzGmsW48/6BlJz5O+HA4k9UK6sGaea2qutXodOiB1YT1Q0ptJ1bh/Q+pxtAa4BPiniFjQQH67loYpbJqZ2ThU2ZK/mVmVOfibmVWQg7+ZWQU5+JuZVdBwN3CMiZ6enujt7W13Nko9//zzbLXVVsPPWDGdeFwWL178ZES8tt35aFSnnfed9j/ttPxA5+VpQ875jgj+vb293HXXXe3ORqn+/n76+vranY2O04nHRdIvh5+rc3Taed9p/9NOyw90Xp425Jx3tY+ZWQU5+JuZVZCDv5lZBXVEnX836537gxEvs/zsg1qQE6uyZs5D8LlYZS75m9UhaYKk/5J0bf48SdINkh7Kf7ctzHtqfpbBg4NDbJt1Mgd/s/pOApYWPs8FboqIaaRhr+cCSNqd9DCePYD9gfPH+4NArPs5+JuVyE9nO4g09O+gmcD8/H4+fxjrfiZwWUS8mEc+XQbsOVZ5NWuG6/yzenWmc6avZXaT9anW1b4KfBbYppA2OSJWAkTEysLw3Duy7tjvK/jDk6bWIel48tOqJk+eTH9//6hkds70uk9bHFJx+wMDA6OWn9HQafmBzsxTsxz828CNxJ1N0sHA6ohYLKmvkUVK0krHSo+IecA8gBkzZsRo3TDUbAFl+ZF/2H6n3cDUafmBzsxTsxz8zda3N/ABSQcCmwOvknQJsErSlFzqn0J6jCCkkv7UwvI78YcHjJt1JNf5m9WIiFMjYqeI6CU15P57RHyE9ODvwUcMHgNcnd8vAmZJ2kzSrsA04M4xzrbZiLjkb9a4s4GFko4lPUrzMEiP4ZS0kPQozLXACRHxUvuyaTY8B3+zIeSHivfn908B76sz31nAWWOWMbMN5GofM7MKcvA3M6sgB38zswpy8Dczq6BhG3wlTQUuBrYHXgbmRcTXJE0Cvgv0AsuBwyPiN3mZU4FjgZeAT0TEdS3JvZmNOd+kOD40UvJfC8yJiDcCewEn5IGsPMiVmVmXGjb4R8TKiLg7v3+ONMrhjniQKzOzrjWifv6SeoG3AHewgYNctWqAq2bVGxhr8hbND5o1mtp9fGqNpwGuzKqo4eAvaWvgcuDkiFgjlY1llWYtSVtvkKtWDXDVrHoDY82ZvpZz7m3/vXDFAbg6wXga4Mqsihrq7SNpE1LgXxARV+TkVXlwKzzIlZlZdxk2+CsV8S8AlkbEuYVJHuTKzKxLNVKfsTdwFHCvpHty2ufwIFdmZl1r2OAfEbdSXo8PHuTKzKwr+Q5fM7MKcvA3M6sgB38zswpy8DczqyAHfzOzCnLwNzOrIAd/M7MKcvA3M6sgB38zswpy8DczqyAHfzOzCnLwNzOrIAd/M7MKav8jqszMStz7q2frPmFvKMvPPqgFuRl/XPI3M6sgB38zswpy8DczqyAHfzOzCnLwNzOrIAd/M7MKcvA3M6sgB3+zGpKmSrpZ0lJJSySdlNMnSbpB0kP577aFZU6VtEzSg5L2a1/uzRrj4G+2vrXAnIh4I7AXcIKk3YG5wE0RMQ24KX8mT5sF7AHsD5wvaUJbcm7WIAd/sxoRsTIi7s7vnwOWAjsCM4H5ebb5wCH5/Uzgsoh4MSIeAZYBe45trs1GxsM7mA1BUi/wFuAOYHJErIT0BSFpuzzbjsDthcVW5LSy9R0PHA8wefJk+vv7RyWfc6avbWq54vYHBgYayk8z22pmPydvMXbbalSjx6gbOPib1SFpa+By4OSIWCOp7qwlaVE2Y0TMA+YBzJgxI/r6+kYhpzQ1Bg7A8iP/sP3+/n4ayU9T4+0UttOobyy4mnPuHXmIamZbjWr0GHUDV/uYlZC0CSnwL4iIK3LyKklT8vQpwOqcvgKYWlh8J+DxscqrWTOGDf6SLpS0WtJ9hTT3erBxS6mIfwGwNCLOLUxaBByT3x8DXF1InyVpM0m7AtOAO8cqv2bNaKTkfxGpB0ORez3YeLY3cBTwXkn35NeBwNnAvpIeAvbNn4mIJcBC4H7g34ATIuKl9mTdrDHDVqhFxC250atoJtCX388H+oFTKPR6AB6RNNjr4cejk12z1ouIWymvxwd4X51lzgLOalmmzEZZsw2+HdvroVn1ehU02+NgtLX7+NQaT70ezKpotHv7tL3XQ7Pq9WCYM31tUz0ORlsrezA0Yzz1ejCromaj2ipJU3Kpv+N6PfQ22e2tkzWzT36cnZnV02xXT/d6MDPrYsOW/CVdSmrc7ZG0AjiD1MthoaRjgUeBwyD1epA02OthLe71YGbWkRrp7XNEnUnu9WBm1qXa35JpLdNs24fbCszGPw/vYGZWQQ7+ZmYV5OBvZlZBDv5mZhXk4G9mVkEO/mZmFeTgb2ZWQQ7+ZmYV5OBvZlZBDv5mZhXk4G9mVkEO/mZmFeTgb2ZWQQ7+ZmYV5OBvZlZBHs/f1tPIcwDmTF+7zkPv/QwAs+7ikr+ZWQU5+JuZVZCDv5lZBTn4m5lVkIO/mVkFOfibmVWQg7+ZWQV1fD//Rvqcm5nZyHR88Lfu0MyXtG8MM2sfV/uYmVWQS/5mZmOk2WrsVvxKblnJX9L+kh6UtEzS3FZtx6xT+Jy3btKSkr+kCcB5wL7ACuAnkhZFxP2t2J5Zu3XrOV8sidYO1lcljZbIi8eo29usWlXtsyewLCJ+ASDpMmAm0NEXgtkGGLVz3j3cbCwoIkZ/pdKHgP0j4rj8+Sjg7RFxYmGe44Hj88fdgAdHPSOjowd4st2Z6ECdeFx2iYjXtmPDjZzzOb2Tz/tO+592Wn6g8/LU9DnfqpK/StLW+ZaJiHnAvBZtf9RIuisiZrQ7H53Gx2U9w57z0Nnnfaf9TzstP9CZeWpWqxp8VwBTC593Ah5v0bbMOoHPeesqrQr+PwGmSdpV0qbALGBRi7Zl1gl8zltXaUm1T0SslXQicB0wAbgwIpa0YltjoCN/oncAH5eCcXLOd9r/tNPyA52Zp6a0pMHXzMw6m4d3MDOrIAd/M7MKqnTwlzRV0s2SlkpaIumknD5J0g2SHsp/ty0sc2q+ff9BSfu1L/etJ2mCpP+SdG3+7OPS5SQtl3SvpHsk3VUyXZK+nv+XP5P01hbnZ7ecl8HXGkkn18zTJ+nZwjyntyAfF0paLem+Qlrd871m2e4c1iMiKvsCpgBvze+3AX4O7A58GZib0+cCX8rvdwd+CmwG7Ao8DExo93608Ph8CvgOcG3+7OPS5S9gOdAzxPQDgR+R7lvYC7hjDPM2AXiCdONSMb1v8Bxs4bbfBbwVuK+QVnq+l+T5YeB1wKb5Oti93f/nRl6VLvlHxMqIuDu/fw5YCuxIui1/fp5tPnBIfj8TuCwiXoyIR4BlpNv6xx1JOwEHAf9SSK78camAmcDFkdwOTJQ0ZYy2/T7g4Yj45Rht7xURcQvwdE1yvfO96JVhPSLiv4HBYT06XqWDf5GkXuAtwB3A5IhYCekLAtguz7Yj8FhhsRU5bTz6KvBZ4OVCmo9L9wvgekmL81ATtdr5v5wFXFpn2jsk/VTSjyTtMUb5qXe+F3Xtue/gD0jaGrgcODki1gw1a0nauOsrK+lgYHVELG50kZK0cXdcxom9I+KtwAHACZLeVTO9Lf/LfGPcB4DvlUy+m1QV9MfAN4CrWp2fEejac7/ywV/SJqTAvyAirsjJqwZ/6ua/q3N6VW7h3xv4gKTlpJ+x75V0CT4uXS8iHs9/VwNXsn71XLv+lwcAd0fEqtoJEbEmIgby+x8Cm0jqGYM81Tvfi7r23K908Jck4AJgaUScW5i0CDgmB79HgV0lDQCfB+ZIOk5SADOAOwvrWyGpr/B5mqTLJP0692J4SNI3cn36YC+GFYX5+yX9TtLUQto+OR+Dn5dLekHSQOH1zTxtU0nn5HwMSHpE0ldqlt0nv19Ss44BSS9KejkiTgU+AuwMvJrUkHUIqTH383l1xwBXF47XLEmbSdoVmFY8LtYZJG0laZvB98BfAPfVzLYIODr3+tkLeHaw6qPFjqBOlY+k7fO1iqQ9SXHrqTHI0yLSeQ7rnu9F3TusR7tbnNv5At5J+on2M+Ce/DoQeA1wE/B70k/OSYVl/iepBPAS8CzwqsK0FUBffv8GUgPSucBOOW074GRgVv7cB6woLN9POqnnFdL2AZYXPi8H9qmzP2cA/wHsQPo52gsc3eCyW5MavM8s5o1CT4vCcXko/609Lg+Thig+oN3/20555WP+AjBQeH24+H+v+f8fl99/IZ9/xeWeKcwbwBvqbHMKqVCzEngOeAA4E9iD1Bvlp6ReNU/mvD0N/Bupt5ZID6VZk7dxTGG9b0gh45XPewDXA78BngEWAwfmabOBW+scj33y+4vyNj6Uz/tX5/Sv5vT5+fO38+eXgLWkXnkHA0cWjs0LpPapV45X2TlPKpkvyNt7nlRIOZj0xbMyH/PI535P4Xx/BPhOXscOwA8L6zww5+lh4PT8v3sor385cCHQW5j/4Lzd53M+FpBjROHYBXBuzbE7JKdflD/35s+D+7wKOB/YpKFzs90XRye/ak+cmn/OrcA1wBmF9GLwvwS4Zpj197F+8D+DdMG+IaeNJPhfS2q3GNH+5GmXkS7kjcry5tfonUP1ji3rB/9LhlhvafAHJuVtfmcw4JCqJb4GvDl//kYOTu8gje+1Rw5GVxfWc1EOTNcX0mqD/y+Az5B+GW5Kqi58Z542m8aC/4PA5YXpGwO/IvUYm127LlKp/+PAb1m38FHvmBa3N3hs/hXYHtiC9ItjDfChmmP7FPDhQtoXyUF3mP/3IlKB8W15X14NnAAcm6d/KG/vyLz97UlfDsuBbQv7uywfh40L674iH6+L8ufenNeN8+ftgP9iiBhQfFW62mcUfB74pKRJJdP2IbUljNSvgG+RLv6Ruh34lKS/kzR98KfycCR9gnThfjgiXh5ufutonyIVHj4SEcsBIuKxiDgpIn4maRrwd8CREfHjiFgbaQC6Q4H9Jb23sK75wJslvbt2I7nOfVfgWxHx3/l1W0TcOsL8XgPsXbiBan/SL/EnymbO5+eFpMD5uhFu65OkEvKxEfFERLwQEZcCZwHn1FwvXwbOlNTw4Je5SnVfYGZE/CQf22cj4ryIuCCv/xzgixGxIG//CeC4nK9PFlb3BHAvsF9e9yTgzxiiSilSO84NpPtuhuXgP7yrJD1TeP3N4ISIuIdUWj6lZLkeCiewpBPz8gOSvjXMNv8ReP8QXdrq5ekfgS+RShV3Ab+SdEyddQzmay/gH4DDIqL2CUU71GznmVxXbJ1rH+CKIb7E30cqIa/TJhMRj5EKD/sWkn9LOjfOKlnPU6TS6SWSDpE0ucn8/o7cZpQ/Hw1cXG/mHIwHg+VDI9zWvqRfGbXHZiGpfeuPCmlXkEros0ew/n2AO/OxLLNb3s46PZpyfi5n3WMP6Tgcnd/PIrU5vFhv45J2IH1Z3N5IZh38h3dIREwsvGoD9+nAxyRtX5P+FKnuFYCI+GZETCTVZ24y1AYj4tfAN4G/H0meIuKlXMrYG5hIumgvlPTGspXk0tv3gFMj3dBT6/Ga7UyMiOeHyruVKn5Zj6Sb4uE1X7w3N7DMa0h11/X0DDF9ZZ5e9H+BnSUdUEyMVM/wHlJ1xTnASkm35F8WI3UxqZH51cC7Ke/KuZekZ0gFqiOAD0bEsyPcTr19X1mYPihIv+xPl7RZg+tv5NhTZ56yY38l0JePy1Bfik/mY/MrUjvC9xvJrIP/BoqIB0ilhM/VTLoJ+MsNWPU/kS6uP20yXy9ExHmkxrj1fgZK2ohUL3xbRHxjA/Jpwyt+WR9CarQsKwBsQmpwHLSw5ov3PQ1sa51CR4knh5g+hZrn00bEi8D/yi/VTFsRESdGxOuBXUiBZzBANbqP5Kqi1wKnkToXvFCy3O35GPRExF4RceMQ+1hPvX2fUphezNcPSb39ym6GK9PIsafOPGXH/gXgB6Tj0hMRt9VZb08uWG4J3EZqvB+Wg//oOBP4KKm0PegLwJ9LOlfSjvBKSbu0FF4rIp4hlag+22gmJJ2cu49uIWnjXOWzDakRqNYXSA2BxzW6fhs1jwI9SjcXAq90O94F2NChDW4EPpi/3Mv8OzA1d5l8Re5evBep0FLrX0kNlx+st9Fc1XEe8Kac9CjpF8MrXxiStiQ1Spbt4yXAHIao8hkFNwKHlhybw0l36f68ZJnTSD3Ztmxw/XsOduUu8SCpU8hhxcScn0MpP/YXk47Lt4fbeP6yuIh0N/Sw90E4+A/vmpq+8FfWzhBpPJtvA1sV0n5Ouph2An4q6TnSt/Lj/KGv/HC+Rure1mieXiB9YQx24zsBODQiflGyjtNIDWZPlPT33znPs0PJtEMbzLvVERGPkoYR+ZKkrXO1wmdIpeWG6muzTSVtXnhNIHUtfhUwX9IuAJJ2zIWQN+fz8p+BBZL2Uhq5dQ9SnfONZSXqiFhLKiy80rYlaVtJZ0p6g6SNcrD560L+7yDV58/NedsKOJvUFlUW/L9OqvO+ZQT7P1JfIR2bC5TuHdhc0hGk4P6ZXJW1jojoJzW8Dtl2lue9kdTgeqWkP80FsG0k/a2kv87r/zRwmqQP50La9qTxs16V81frP0jHZdhf5/k8Oop0/Q9/H0QjXYL88suv5l7U7y48ldTeMvhFfR2F0SAp7+c/AGyXp0fJa7Cb6A6kHjFP8Id+/mcAW+bpG5EC+TJSgeExUu+WzQvbv4jUK4XCMvfxSnU/W5F6Ay3P+XqC1Fd+x8Iyu+f9epLUB/37wNR626g5PrdS0tVziOPcxzBdPfPnnXM+nyZVU/2E1DunuMw63WiBt1PoXz9MPjYl1QQsy+v/JSm471yYZ2be7vM5H5fWHJe6+0uhyynr9/N/hvRl8bZGzk0/xtHMrIJc7WNmVkEO/mZmDZJ0ZEk72ICkJe3O20i52sfMrIIavnW5lXp6eqK3t7fd2eD5559nq6265wbWbssvtDbPixcvfjIiXtuSlbdAp5z3Zbrx3Bot3bTvG3LOd0Tw7+3t5a671nuW9Jjr7++nr6+v3dloWLflF1qbZ0mj8vg/SZuTuhxuRrpGvh8RZ+TxVb5L6mWxHDg8In6TlzkVOJbUNfcTEXHdcNvplPO+TDeeW6Olm/Z9Q8551/mbre9F4L2Rnhz1J6QBz/YiPcT7poiYRrohZy6ApN1JY6/sQRqY7Pzc596sYzn4m9WIZCB/3CS/Aj/A3saRjqj2Mes0ueS+mDSG/XkRcYekdR7oLan4APvinbl1H+Kt9ND04wEmT55Mf39/i/ZgwwwMDHRs3lqtKvs+LoN/79wfjHiZ5Wcf1IKcWLeKiJeAP5E0kXS7/puGmL3hh3hHxDxgHsCMGTOi3XXL9a6VOdNf4pxbywdwHe/XSjfV+W8IV/uYDSHSAHv9pLp8P8Dexg0Hf7Makl6bS/xI2oL0kI4HqP9Abz/A3rrOuKz2MdtAU0ijYk4gFZAWRsS1kn4MLJR0LGnI4sMAImKJpIXA/aSROU/I1UZmHcvB36xGRPwMeEtJ+lOkxyCWLXMW5Y87NOtIDv4byI3LZtaNXOdvZlZBDv5mZhXk4G9mVkEO/mZmFeTgb2ZWQQ7+ZmYV5OBvZlZBDv5mZhXk4G9mVkEO/mZmFeTgb2ZWQcMGf0lTJd0saamkJZJOyumTJN0g6aH8d9vCMqdKWibpQUn7tXIHzMxs5BoZ2G0tMCci7pa0DbBY0g3AbNLDrM+WNJf0MOtTah5mvQNwo6Q/8hC3ZuODBzMcH4Yt+UfEyoi4O79/DlhKej6pH2ZtZtalRjSks6Re0jjndwAb9DDrVj7Ies70tSNepr+/v6kHNze7rdHQjQ+a7sY8m41HDQd/SVsDlwMnR8QaqeyZ1WnWkrT1HmbdygdZz27mZ+mRfU09uLnZbY2GbnzQdDfmuRs0UxVj1dZQbx9Jm5AC/4KIuCIn+2HWZmZdqpHePgIuAJZGxLmFSX6YtZlZl2qk2mdv4CjgXkn35LTPAWfjh1mbmXWlYYN/RNxKeT0++GHWZmZdyXf4mtXwjY1WBQ7+ZusbvLHxjcBewAn55sW5pBsbpwE35c/U3Ni4P3C+pAltyblZgxz8zWr4xkarghHd5GVWNaN5Y2NeX0tubmzmZsOhTN5idNfZTTf2VeVGRAd/szpG+8ZGaN3Njc3cbDiUOdPXcs69oxceRuvGxrFQlRsRXe1jVsI3Ntp45+BvVsM3NloVuNrHbH2+sdHGPQd/sxq+sdGqwNU+ZmYV5OBvZlZBrvbJeuf+gDnT1456lzkzs07kkr+ZWQU5+JuZVZCDv5lZBTn4m5lVkIO/mVkFOfibmVWQg7+ZWQW5n3+X6C25/2C4+xKWn31QK7NkZl3MJX8zswpy8DczqyAHfzOzCnLwNzOroI5v8C1r6DQzsw3jkr+ZWQUNG/wlXShptaT7CmmTJN0g6aH8d9vCtFMlLZP0oKT9WpVxMzNrXiMl/4uA/WvS5gI3RcQ04Kb8GUm7A7OAPfIy50uaMGq5NTOzUTFs8I+IW4Cna5JnAvPz+/nAIYX0yyLixYh4BFgG7DlKeTUzs1HSbB1q60YAAAZVSURBVIPv5IhYCRARKyVtl9N3BG4vzLcip61H0vHA8QCTJ0+mv7+/dENzpq9tMosjN3mLsdlevX0dSlm+hstvM9tptYGBgY7Ml7VWMx03fId6a412bx+VpEXZjBExD5gHMGPGjOjr6ytd4Vg+VnHO9LWcc2/rO0AtP7JvxMuUHYfh8tvMdlqtv7+fev/rTiLpQuBgYHVEvCmnTQK+C/QCy4HDI+I3edqpwLHAS8AnIuK6NmTbrGHN9vZZJWkKQP67OqevAKYW5tsJeLz57Jm1zUW4rcvGsWaD/yLgmPz+GODqQvosSZtJ2hWYBty5YVk0G3tu67Lxbtg6DkmXAn1Aj6QVwBnA2cBCSccCjwKHAUTEEkkLgfuBtcAJEfFSi/JuNtbGrK1rpEa7rWqs2r+G0q62oaq0Sw0b/CPiiDqT3ldn/rOAszYkU2ZdZtTbukZqtNvGxqr9ayjtarPqlnapDdXxwzuYdZBVkqbkUr/bulqs2aFd3EuoMR7ewaxxbuuyccMlf7MSbuuy8c7B36yE27psvHO1j5lZBbnk3wZ+RoGZtZtL/mZmFeTgb2ZWQQ7+ZmYV5OBvZlZBbvAdx3yHpJnV45K/mVkFueRv6/FTl8zGP5f8zcwqyMHfzKyCHPzNzCrIwd/MrIIc/M3MKsi9fcxsXHFvtca45G9mVkEu+Zt1GA/5bWPBJX8zswpy8DczqyBX+5hZ5RWr2uZMX8vsBqreur2R2CV/M7MKcsnfzKwJ3d6ltGUlf0n7S3pQ0jJJc1u1HbNO4XPeuklLSv6SJgDnAfsCK4CfSFoUEfe3Yntm7eZz3hrRSQ9YalW1z57Asoj4BYCky4CZgC8Ee0W3/2yu4XPeukqrgv+OwGOFzyuAtxdnkHQ8cHz+OCDpwRblpWGfgB7gyXbno1GdlF99qeFZNyjPw2xnl2bXOwqGPeehM8/7Mp10bo21Ttz3Ic77ps/5VgV/laTFOh8i5gHzWrT9pki6KyJmtDsfjeq2/EJ35rlBw57z0JnnfZlx/H8aVlX2vVUNviuAqYXPOwGPt2hbZp3A57x1lVYF/58A0yTtKmlTYBawqEXbMusEPuetq7Sk2ici1ko6EbgOmABcGBFLWrGtUdbxP8drdFt+oTvzPKwuPufrGZf/pwZVYt8VsV61pJmZjXMe3sHMrIIc/M3MKqhywV/SVEk3S1oqaYmkk0rm6ZP0rKR78uv0duS1kJ/lku7NebmrZLokfT0PK/AzSW9tRz4L+dmtcOzukbRG0sk183TUMa4ySRdKWi3pvkLaJEk3SHoo/922nXlslXrxoAr7X7k6f0lTgCkRcbekbYDFwCHF2/Al9QGfjoiD25TNdUhaDsyIiNIbTyQdCHwcOJB0Y9HXImK9G4zaIQ978Cvg7RHxy0J6Hx10jKtM0ruAAeDiiHhTTvsy8HREnJ3HKdo2Ik5pZz5boV48AGYzzve/ciX/iFgZEXfn988BS0l3Z3azmaQLNyLidmBiPqk7wfuAh4uB3zpLRNwCPF2TPBOYn9/PJwXEcWeIeDDu979ywb9IUi/wFuCOksnvkPRTST+StMeYZmx9AVwvaXEeHqBW2dACnfKFNgu4tM60TjrGtq7JEbESUoAEtmtzflquJh6M+/2v7Hj+krYGLgdOjog1NZPvBnaJiIFcpXIVMG2s81iwd0Q8Lmk74AZJD+TS2qCGhhYYa/lmpw8Ap5ZM7rRjbBVWGw+ksktqfKlkyV/SJqR/9IKIuKJ2ekSsiYiB/P6HwCaSesY4m8X8PJ7/rgauJI0gWdSpQwscANwdEatqJ3TaMbb1rBqsOsx/V7c5Py1TJx6M+/2vXPBX+kq/AFgaEefWmWf7PB+S9iQdp6fGLpfr5GWr3BCFpK2AvwDuq5ltEXB07vWzF/Ds4E/WNjuCOlU+nXSMrdQi4Jj8/hjg6jbmpWWGiAfjfv+r2NvnncB/AvcCL+fkzwE7A0TEP+fb9D8GrAVeAD4VEf+vDdlF0utIpX1I1XTfiYizJP1tIb8CvgnsD/wW+GhErNcldCxJ2pLUDvG6iHg2pxXz3DHHuOokXQr0kYYyXgWcQaqGW0i6Lh4FDouI2kbhrjdEPLiDcb7/lQv+ZmZWwWofMzNz8DczqyQHfzOzCnLwNzOrIAd/M7MKcvA3M6sgB38zswr6/2Vb6j24kZMKAAAAAElFTkSuQmCC\n", | |
| "text/plain": [ | |
| "<Figure size 432x288 with 4 Axes>" | |
| ] | |
| }, | |
| "metadata": { | |
| "needs_background": "light" | |
| }, | |
| "output_type": "display_data" | |
| } | |
| ], | |
| "source": [ | |
| "viz = cdf[['CYLINDERS','ENGINESIZE','CO2EMISSIONS','FUELCONSUMPTION_COMB']]\n", | |
| "viz.hist()\n", | |
| "plt.show()" | |
| ] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "metadata": {}, | |
| "source": [ | |
| "Now, lets plot each of these features vs the Emission, to see how linear is their relation:" | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": 10, | |
| "metadata": { | |
| "button": false, | |
| "deletable": true, | |
| "new_sheet": false, | |
| "run_control": { | |
| "read_only": false | |
| } | |
| }, | |
| "outputs": [ | |
| { | |
| "data": { | |
| "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYUAAAEICAYAAACwDehOAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy8QZhcZAAAgAElEQVR4nO3de7SddX3n8fcnJwEJlxJKoLmRUAztnGNrtGeYaXE51hwKg1RkpmpooNBxJpeNI70qMZ2lY1c6TMcL9kIgVkZKtmJW1YFStSURlnbqkgaMQIKUdEhCAkOCQjXEoST5zh/Ps0+enLMvzz7Zz75+Xmudtff+7efZ+3eenOzv/t2+P0UEZmZmANM6XQEzM+seDgpmZjbOQcHMzMY5KJiZ2TgHBTMzG+egYGZm46YX+eKSdgE/BI4AhyNiVNJZwOeBRcAu4F0R8WJ6/BrgPenx74uIv673+meffXYsWrSoqOqbmfWlhx9++IWImF3tuUKDQuoXI+KFzOObgC0RcbOkm9LHH5A0DCwDRoC5wGZJF0bEkVovvGjRIrZu3Vpk3c3M+o6k3bWe60T30ZXAnen9O4F3ZMrvjohXIuJpYCdwUQfqZ2Y2sIoOCgH8jaSHJa1Iy86NiOcA0ttz0vJ5wDOZc/emZWZm1iZFdx9dHBHPSjoHuF/Sd+scqyplk3JwpMFlBcB5553XmlqamRlQcEshIp5Nb/cDXyLpDnpe0hyA9HZ/evheYEHm9PnAs1Vec0NEjEbE6OzZVcdJzMxsigoLCpJOlXR65T7wS8DjwL3Adelh1wH3pPfvBZZJOlnS+cBi4KGi6mdmZpMV2VI4F/hbSd8h+XD/q4j4KnAzcImkp4BL0sdExHZgE7AD+CpwQ72ZR2Y2OMplWLQIpk1LbsvlTteof6mXU2ePjo6Gp6Sa9bdyGVasgEOHjpXNnAkbNsDy5Z2rVy+T9HBEjFZ7ziuazayrrV17fECA5PHatZ2pT79zUDCzrrZnT3PldmIcFMysq9Waee4Z6cVwUDCzrrZuXTKGkDVzZlJureegYGZdbfnyZFB54UKQklsPMhenHQnxzMxOyPLlDgLt4paCmZmNc1AwM7NxDgpmZjbOQcHMzMY5KJiZ2TgHBTMzG+egYGZm4xwUzMxsnIOCmZmNc1AwM7NxDgpmZjau8KAgaUjStyXdlz7+sKR9kralP5dnjl0jaaekJyVdWnTdzMzseO1IiHcj8ARwRqbsExHx0exBkoaBZcAIMBfYLOlC79NsZtY+hbYUJM0H3gb8WY7DrwTujohXIuJpYCdwUZH1MzOz4xXdfXQL8H7g6ITy90p6VNIdkmalZfOAZzLH7E3LzMysTQoLCpKuAPZHxMMTnloPXAAsAZ4DPlY5pcrLRJXXXSFpq6StBw4caGWVzcwGXpEthYuBt0vaBdwNvFXSxoh4PiKORMRR4FMc6yLaCyzInD8feHbii0bEhogYjYjR2bNnF1h9M7PBU1hQiIg1ETE/IhaRDCB/LSKukTQnc9hVwOPp/XuBZZJOlnQ+sBh4qKj6mZnZZJ3YjvMPJS0h6RraBawEiIjtkjYBO4DDwA2eeWRm1l5tWbwWEQ9GxBXp/Wsj4mci4mcj4u0R8VzmuHURcUFE/FREfKUddTOzfMplWLQIpk1LbsvlTtfIitCJloKZ9ZBSCW6/HY5m5hDu3g0rViT3ly/vTL2sGE5zYWY1jYzA+vXHB4SKQ4dg7dr218mK5aBgZlWVSrBjR/1j9uxpT12sfRwUzKyqDRsaH3PeecXXw9rLQcHMqjrSYO7fzJmwbl176mLt46BgZlUNDdV+7tRTk5aEB5n7j4OCmVVVmV000fAwHDzogNCvHBTMBlypBNOng5TclkpJ+a23wurVx1oMQ0PJ4+3bO1dXK54iJuWc6xmjo6OxdevWTlfDrCeNjcGWLdWfW706CQrWnyQ9HBGj1Z5zS8FsANULCJBv5pH1JwcFswFULyBA45lH1r8cFMwGyNhYMnbQSL2ZR9ZZtcaAWsW5j8wGRKMuo6xaM4+sc0qlJOVI1pEjx8paNQbkloJZn6t8s8wbEDzI3H2qBYSsVo4BuaVg1sfmzYNnJ+1fWN3SpbB5c7H1salp9KHfyjEgBwWzPjU2lj8g9PDM9IHQ6EO/lWNA7j4y61N5u4uWLi22HnbiGn3ot3IMqPCgIGlI0rcl3Zc+PkvS/ZKeSm9nZY5dI2mnpCclXVp03cz6UWWHtDzcZdQban3oS60fA2pHS+FG4InM45uALRGxGNiSPkbSMLAMGAEuA26V5IlxZk0oleDaa5Od0epZvTrpMnJA6A21Uo4cPdr6SQGFBgVJ84G3AX+WKb4SuDO9fyfwjkz53RHxSkQ8DewELiqyfmb9olyGs89OZqg0Gh+YO9ezi3rRrbfC4cPJv+/hw8X9GxbdUrgFeD+Q3czv3Ih4DiC9PSctnwc8kzlub1pmZnVUWgff+17jY5cuhX37iq+T9a7CgoKkK4D9EfFw3lOqlE36ziNphaStkrYeOHDghOpo1uvKZbjttsatg4UL3V1k+RQ5JfVi4O2SLgdeA5whaSPwvKQ5EfGcpDnA/vT4vcCCzPnzgUkT6iJiA7ABkiypBdbfrKuNjDTeQxmSwUjvkGZ5FdZSiIg1ETE/IhaRDCB/LSKuAe4FrksPuw64J71/L7BM0smSzgcWAw8VVT+zXtZMQFi1yhviWH6dWLx2M7BJ0nuAPcA7ASJiu6RNwA7gMHBDRDhXo9kEpVK+gPDjPw6f/KQDgjXHm+yY9YhyGVauhJdfrn9cpXXgGUZWS71NdpzmwqwHlEr5BpQB7rrLrQObOqe5MOtiIyPJN/886w8AhocdENqtsoJ82rTktlzudI1OjFsKZl0q72ByxfAwbN9eXH1ssnI5SUFx6FDyePfuYykpejU4u6Vg1qWaCQirVzsgdMLatccCQsWhQ0l5r3JQMOsypVK+LTMh6bLwpjids2dPc+W9wEHBrIs02mGropId88gRB4ROOu+85sp7gYOCWReobJmZJyBMn57MMHIw6Lx162DmzOPLZs7s7RXkHmg267C8rYOKV18tri7WnMpg8tq1SZfReeclAaFXB5nBi9fMOm5oKMmLn1cP/5e1LlFv8Zq7j8w6aN48B4Ru0m9rDqbCQcGsA8bGksHiZyflAT5eZYetCAeEolXWHOzenVzrypqDQQsMDgpmbTZvHmzZ0vi41auL3WHLjtePaw6mwkHBrE0qKSsatQ4ATj3VwaBolX+Pyk+tfa17ec3BVDgomLWB1NwK5dtvL64ug65cTqb15v336OU1B1PhoGBWsFmzmjt+7tzentLYzSrjBkdy7tTS62sOpsJBwaxAIyPw0kv5jx8ehn37iqvPoKs2bjDRwoVJy27hQtiwYfACtBevmRUkb/4iSFoHDgatVSolH+pHjiSzuFasyDc+sGtX4VXraoW1FCS9RtJDkr4jabuk/5qWf1jSPknb0p/LM+eskbRT0pOSLi2qbmZFaiahHSSzjBwQWquySrzSTXTkSPJ4YkqKiYaHi69btyuypfAK8NaIOChpBvC3kr6SPveJiPho9mBJw8AyYASYC2yWdKH3abZeMjaWb7pphdcetF65XDttyI9+lASGal1I3o8iUVhLIRIH04cz0p96/wWuBO6OiFci4mlgJ3BRUfUza7VyOX9AOPNMB4QiVAaSazl6NOlSyo4bbNyY/Fs4ICQKHWiWNCRpG7AfuD8ivpU+9V5Jj0q6Q1JlbsY84JnM6XvTMrOecM01+Y4bHoYXXyy2LoOikl1WSm5Xrqw/kDw0lAwc79qVBIhduwZvILmRQoNCRByJiCXAfOAiSa8D1gMXAEuA54CPpYdX64Wd9F1K0gpJWyVtPXDgQEE1N8uv2TEEfyNtjZNOmjxu8PLL9c+p14qwRFumpEbES8CDwGUR8XwaLI4Cn+JYF9FeYEHmtPnApLWfEbEhIkYjYnT27NkF19ysvrGx/GmvTznFXUatUAnCzaYQ9w51+RQ5+2i2pDPT+6cAY8B3Jc3JHHYV8Hh6/15gmaSTJZ0PLAYeKqp+Zidi5szkgynPGMLSpUkwaDQ/3hprdu8JSP6tNm50QMiryNlHc4A7JQ2RBJ9NEXGfpLskLSHpGtoFrASIiO2SNgE7gMPADZ55ZN2oma4igM2bi6nHINqwofEx06bBggX9s+lNuxUWFCLiUeANVcqvrXPOOmDAFpVbrxgZaS5/ESRdRtY6edJTrFzpVsGJcJoLsxxmzpxaQHCX0dRMnFVUKiXlQ0P1z5sxwwHhRDkomDUwNpYsesrLYwgnptZq5FKp/uyhM8+Ef/7n9tSxnzkomNXRzII0SAY0PYYwNZWtMGsNJG/YkLQCVq8+1mLI7kzntR+tocgxR07SbOA/AYvIjENExH8orGY5jI6OxtatWztZBetjzY4heLrp1FVWIjdqXfkat4akhyNitNpzeQea7wG+AWwGPCPI+t6sWflTXk+blj8/v1WXJ6V1o/EEa428QWFmRHyg0JqYdYmxsXwBwQPJrZMnpbVXI7dH3jGF+7Iprs2KVulfnjYtuS2X2/O+IyP5F6Q5ILROvS0vK+MGnlXUHnmDwo0kgeH/Sfph+vODIitmg6vSv7x7d9KHvHt38rjowJB32qkHk6em1jRTSBaYTdzroLIS+fBhB4R2yhUUIuL0iJgWEa9J758eEWcUXTkbTNX6lw8dSsqLMGtW8kGVZ9rp8LBXxzarXIbTTqs9zRSSazoxpfUgboXZDXLNPgKQ9HbgzenDByPivsJqlZNnH/WnadOqzzKRknTHrdRMyoozz/S0x2aVSnDbbbVnDQ0NJS0Ba696s49ytRQk3UzShbQj/bkxLTNruVr9y/X6nZs1b15zAWHpUgeEvLLdROvX159G6llb3SfvmMLlwCURcUdE3AFclpaZtVyt/uV1LcqKNXMmPDspKXttw8MeQ8hr4mrkRjzNtPs0s6L5zMz9H2t1RcwqiuxfLpWaS1lxyineFKcZebKYZnmaaffJu6L5auBm4AGSHdLeDKyJiLuLrV59HlOwZjXbZeQWQnPyXt9p05zNtJNOeEVzRHxO0oPAvyQJCh+IiP/buiqaFavZPRCcTmFqhobqdx1JsGqVg0E3q9t9JOmn09s3kmyasxd4Bpiblpl1PQeE9qnXHbRwIdx1lwNCt2vUUvgtYAXwsSrPBfDWltfIrEXmzcs/oOyUFa1R+cDfsCFpMQwNJYHCgaB35F6n0PQLS68Bvg6cTBJ8/iIiPiTpLODzJBlXdwHviogX03PWAO8hSbr3voj463rv4TEFq6WZgABuHdhgacU6hXdKOj29/3uSvihp0labE7wCvDUiXg8sAS6T9K+Bm4AtEbEY2JI+RtIwsAwYIZnyemu6v7NZ05oJCHPnFlcPs16Td0rqf4mIH0p6E3ApcCdwW70TInEwfTgj/QngyvR80tt3pPevBO6OiFci4mlgJ3BR7t/EjCShXTNjCHPnwr59xdXHrNfkDQqV+QRvA9ZHxD3ASY1OkjQkaRuwH7g/Ir4FnBsRzwGkt+ekh88jGcSu2JuWmeUiNb8pjgOC2fHyBoV9km4H3gV8WdLJec6NiCMRsQSYD1wk6XV1Dq/2/W5ST6+kFZK2Stp64MCBnNW3dmtn6mupudbBjBkeQ5jopJOOXUcpeWyDKW9QeBfw18BlEfEScBbwu3nfJD3nQZKxguclzQFIb/enh+0FFmROmw9M6hmOiA0RMRoRo7Nnz85bBWujdqa+nsp0U2/ufryTToJXXz2+7NVXHRgGVd6gMAf4q4h4StJbgHcCD9U7QdJsSWem908BxoDvAvcC16WHXUey1Sdp+TJJJ0s6H1jc6D2sO7Ur9XWzAWHGjNa+fy/LtuQmBoSKWuXW3/IGhS8ARyS9Fvg0cD7w2QbnzAEekPQo8PckYwr3kaTLuETSU8Al6WMiYjuwiSQL61eBGyLCORR7UK2tFfNsuZhHsxlOIQkIbiEkuZ+mTYNrrjnWkjPLyrtH89GIOCzp3wG3RMQfS/p2vRMi4lFg0rTViPgesLTGOeuAFuXCtHYql5OWwJ49tTeyb0Xq62bXH4A/+CoqGUzN6skbFF5Nk+L9GvDLaZkb4wYcG0OodBlVCwitSn3tgNC8sbF8+05P5O62wZS3++jXgZ8H1kXE02mf/8biqmW9pNoYAiQpDlqV+rrZGUYRDghwYgHB3W2DqbA0F+3gNBedVSody3FTTau2z3RCu6nLe+1mzvSeyINkymkuJG1Kbx+T9Gjm57F0ANkGVJ4dtloxhtDszlwOCMdvh5nHaac5INgxjcYUbkxvryi6ItY7yuXGA5atGEM46aT8LY1BDwZT6SZyBlOrpm5QyKSj2A0g6YxG51h/K5XgtrpZr5IxhHXrTnwMIa9BHxCdSkDwrnJWS64PeEkrgY8AP+JY6okAfrKgelkXKpeTgFDvW/nQEOzaNfX3GBpqbhzCA6IOCNZaeb/1/w4wEhEvFFkZ625r1zbupjmRjdg9oJxfo0H+iYaG4PDhYutk/SFvUPhHwPtSDbhGK5JXr556/3QzAaHW4rh+NzLSXBbYrBMJ1jZY8gaFNcDfSfoWyeY5AETE+wqplXWNPCuVpWTv3amMIThdRT5TWcldsXSpB5Mtv7xB4Xbga8BjQAtmnlsvyLNSWYJVq9oTEGAwA0K5PLWA0IoBfxs8eYPC4Yj4rUJrYl2n3krlo0eTdQhT/dAplZo7fhC7jJrtLvK4gbVC3qDwgKQVwF9yfPfR9wuplXWFWmMIR4+e2EplDyg3NpXxA48bWCvkDQq/mt6uyZR5SmqfO++8JL1ytfKpckCobqo5iiqGhz1uYK2RKyFeRJxf5ccBoc+tW5esTM6a6krlZhPanXKKA0IeQ0PJrK/t21tbJxtcjXIfvT9z/50TnvuDoipl3WH58mQu/MKFJ5btdCqtg2pjGf2oXG4+ICxcCBs3Jtfp8GG3EKy16mZJlfRIRLxx4v1qjzvBWVJ7g7uMJiuV4Pbbmx+bGR52q8BO3JSzpAKqcb/a44lvukDSA5KekLRd0o1p+Ycl7ZO0Lf25PHPOGkk7JT0p6dIGdbMu12yXEQxOQFi/3gHBulOjgeaocb/a44kOA78dEY9IOh14WNL96XOfiIiPZg+WNAwsA0aAucBmSRd6n+be5GBwPCets17RKCi8XtIPSFoFp6T3SR+/pt6JaYbVSpbVH0p6AphX55Qrgbsj4hXgaUk7gYuAbzb+NaybOCAczwHBeknd7qOIGIqIMyLi9IiYnt6vPM6dsFjSIuANwLfSovemm/XcIWlWWjYPeCZz2l7qBxGbgnIZFi1KFoMtWpQ8bhVvmVldMwFh5sxkENkBwTol7x7NUybpNOALwG9ExA+A9cAFwBKSlsTHKodWOX3SR4akFZK2Stp64MCBgmrdnyppK3bvTj6Md+9OHrciMEwlZYUd79RTvQOadV6hQUHSDJKAUI6ILwJExPMRcSQijgKfIukigqRlsCBz+nxgUsaXiNgQEaMRMTp79uwiq993qqWtOHQoKZ+qqQwmw2C0EPKqrDU4eNABwTqvsKAgScCngSci4uOZ8jmZw64CHk/v3wssk3SypPOBxcBDRdVvENVKW9EoJXYtUw0GgxYQli6t/dzq1V5rYN2lyK01LwauBR6TtC0t+yBwtaQlJF1Du4CVABGxXdImYAfJzKUbPPOotYpIW9GMQQsGFZs3Vx9sPpH9J8yKUnfxWrfz4rXmTEyFDcnAZrP92N4D4fidz4aGkuvqD3jrFSeyeM36SCvSVkxlumk/BoT164+l8j5yJHncbDpws27kloLl5vUHienTq+/t4P0MrFfUaykUOaZgA6pfg0FFrc1+Bm0TIOtP7j7qI6VS8i1WSm5b1Z0x1Wmnva5UOva7V37GxpIWQTW1ys16iVsKfWLi7JZKPzec2ADooHYZVcYNJtqyBebOrb5nsnc+s37glkIfqJeTf8OGqb3moKesqHfdnn02mU5aaRlUFp959pH1AweFPrBqVe3nptLPPaitg2xeqEbX7dZbk0Flb3Rj/cbdRz2uVErSI9TSTD/3II4bQHINb7utf4Kb2YlwUOhxjbqH8vZzTzUg9PoHaa2xg3rqpa0w63UOCj2uXjfH0qXFdWv0ejCoaHbMxfscWL9zUOhxQ0PVA4OU78Nr0DOcNho7WLgQdu1qS1XMuoIHmntItQ1yanUP1Rt8rpjKgHI/BQSoP+YycyasW9e+uph1AweFHlFrg5yLL57a9MhmAsKMGf0XDCpqBdXTTvOGNzaYnPuoRyxaVD3tdbPdG4M63bQeZzy1QVMv95GDQo+YNq36B7QER4/mew2nvDYzcOrsvlBrI5yiNshxQDAbTA4KPWLdumTgMyvvQOisWc2nrHBAMBtMRe7RvEDSA5KekLRd0o1p+VmS7pf0VHo7K3POGkk7JT0p6dKi6taLprpBjgQvvZT/fXq4N9HMWqCwMQVJc4A5EfGIpNOBh4F3ANcD34+ImyXdBMyKiA9IGgY+B1wEzAU2AxfW26d5kMYUmjXo6w/MrLaOjClExHMR8Uh6/4fAE8A84ErgzvSwO0kCBWn53RHxSkQ8DewkCRDWJK8/MLOpasuYgqRFwBuAbwHnRsRzkAQO4Jz0sHnAM5nT9qZlA6EVG+QM6mY4ZtY6hQcFSacBXwB+IyJ+UO/QKmWTvr9KWiFpq6StBw4caFU1O2pk5MQ3gu/nhHbVVnKbWTEKDQqSZpAEhHJEfDEtfj4db6iMO+xPy/cCCzKnzwcm7W8VERsiYjQiRmfPnl1c5dugVEo+6HbsqP78VDfIyaNXuoxqreR2YDArRpGzjwR8GngiIj6eeepe4Lr0/nXAPZnyZZJOlnQ+sBh4qKj6dVolZXO9D+a8G+T08yrltWvh0KHjyw4dSsrNrPWKzJJ6MXAt8JikbWnZB4GbgU2S3gPsAd4JEBHbJW0CdgCHgRvqzTzqdXlaAY02yOnnYFCxZ09z5WZ2YgoLChHxt1QfJwCouk1JRKwD+jov5dhY7f2UJ6q3Qc4gBARIVmxXy/lU1Epus0HnFc1tNDKSPyAMD1dPytbsDKNeGTuo5URWcptZ8xwU2qRcrj2gPNHq1bB9++TyQZxuOtWV3GY2Nd55rQ3y7ANcRMrmXm4hZC1f7iBg1i4OCgXLuzH84cO1n3PKCjNrFweFguWZZbS06rB7YlAGlM2sO3hMoWCN1hoMD8PmzZPLmx1QPuUUBwQzO3EOCgWrt9Zg48bWDChHTF7gZWY2FQ4KBau11mD16uqDp82mb5gxo/k6mZnV4qBQsFtvTQJApcUwNJQ8rjXL6Jpr8r92N22Z6aR1Zv2hsE122qGfNtk56SR49dX8x3fTP1slaV22C2vmTK8nMOtWHdlkx/KTeisgTNz7YdUqJ60z6xcOCh00MtJ7KSsq6y6yez8cPFj9WCetM+s9XqfQIUNDcPRo/uPrrWVol3I530K8CietM+s9bim02bx5SeugmYBw5pnV1zK0U2XcIC8nrTPrTQ4KbTRvHjw7aS+5+pYuhRdfLKY+zai22U1WJVmdk9aZ9TZ3H7VRswGh0+MHWY3GB1atam0yPzPrDLcU2qRUyn/s3LndFRCg/vhAvXUXZtZbityj+Q5J+yU9nin7sKR9kralP5dnnlsjaaekJyVdWlS92q0yfTPPAO20aUkw2Lev+Ho1q9ZmNxs3OiCY9ZMiWwqfAS6rUv6JiFiS/nwZQNIwsAwYSc+5VVKDHYq738Tpm/XMnZvvuE7xZjdmg6HIPZq/LmlRzsOvBO6OiFeApyXtBC4CvllQ9doiT9ps6L6uolq82Y1Z/+vEmMJ7JT2adi/NSsvmAc9kjtmblvWksbHk23Seb/6rVxdfHzOzvNodFNYDFwBLgOeAj6Xl1db1Vv3+LGmFpK2Sth44cKCYWp6AWbNgy5bGxzVKjGdm1gltDQoR8XxEHImIo8CnSLqIIGkZLMgcOh+oOoEzIjZExGhEjM6ePbvYCjehVEpaBy+91PjY1auT7TcdEMys27Q1KEiak3l4FVCZmXQvsEzSyZLOBxYDD7Wzbici7z7Mbh2YWbcrbKBZ0ueAtwBnS9oLfAh4i6QlJF1Du4CVABGxXdImYAdwGLghIrp4Lk5iZAR27Mh//OHDxdXFzKwVipx9dHWV4k/XOX4d0DPZcppNWdENCe3MzBrxiuYpKJebCwjdkNDOzCwPB4UpaGbzmG5JaGdmloeDQhMq+xDv3l3/uMqAcoRbCGbWW5wlNadSCW67rfHq4+Fh2L69PXUyM2s1txRyKJfzBYS5cx0QzKy3OSg0UCrBNdfUDwgLFybZQrsxu6mZWTPcfVTH2FjjlBULF8KuXW2pjplZ4dxSqKFcbhwQJO9DbGb9xUGhhkbTTqVkC0qnkjazfuLuoxoa7Ul8110OCGbWf9xSqKHRnsQOCGbWjxwUaqi2JzEkK5Sd5dTM+pWDQg3V9iTeuNErlM2svw1sUCiVYPr05AN/+vTk8UTLlyfTTY8eTW7dZWRm/W4gB5onrj84cuTYJjnuGjKzQTZwLYV66w82bGhvXczMus3ABYV66w+OdP1eb2ZmxSosKEi6Q9J+SY9nys6SdL+kp9LbWZnn1kjaKelJSZcWVa966w+Ghop6VzOz3lBkS+EzwGUTym4CtkTEYmBL+hhJw8AyYCQ951ZJhXxE11t/sGJFEe9oZtY7CgsKEfF14PsTiq8E7kzv3wm8I1N+d0S8EhFPAzuBi4qol9cfmJnV1u4xhXMj4jmA9PactHwe8EzmuL1pWct5/YGZWW3dMiVVVcqq7mAgaQWwAuC8en1BdSxf7jUHZmbVtLul8LykOQDp7f60fC+wIHPcfODZai8QERsiYjQiRmfPnl1oZc3MBk27g8K9wHXp/euAezLlyySdLOl8YDHwUJvrZmY28ArrPpL0OeAtwNmS9gIfAm4GNkl6D7AHeCdARGyXtAnYARwGbogIrxowM2uzwoJCRFxd46mlNY5fB3gfMzOzDhq4Fc1mZlabIqpO8ukJkg4Au1v4kmcDL7Tw9fqNr099vj71+frU187rszAiqs7U6emg0GqStkbEaKfr0a18ferz9anP11AR/Q0AAAdoSURBVKe+brk+7j4yM7NxDgpmZjbOQeF43lGhPl+f+nx96vP1qa8rro/HFMzMbJxbCmZmNs5BISVpl6THJG2TtLXT9em0ZjdJGjQ1rs+HJe1L/4a2Sbq8k3XsJEkLJD0g6QlJ2yXdmJb7b4i616fjf0PuPkpJ2gWMRoTnUQOS3gwcBP48Il6Xlv0h8P2IuFnSTcCsiPhAJ+vZKTWuz4eBgxHx0U7WrRukCS/nRMQjkk4HHibZP+V6/DdU7/q8iw7/DbmlYFU1uUnSwKlxfSwVEc9FxCPp/R8CT5DskeK/Iepen45zUDgmgL+R9HC6Z4NNVmuTJDvmvZIeTbuXBrJrZCJJi4A3AN/Cf0OTTLg+0OG/IQeFYy6OiDcC/xa4Ie0eMGvGeuACYAnwHPCxzlan8ySdBnwB+I2I+EGn69Ntqlyfjv8NOSikIuLZ9HY/8CUK2iO6x9XaJMmAiHg+Io5ExFHgUwz435CkGSQfeOWI+GJa7L+hVLXr0w1/Qw4KgKRT08EeJJ0K/BLweP2zBlKtTZKM8Q+5iqsY4L8hSQI+DTwRER/PPOW/IWpfn274G/LsI0DST5K0DiDZY+Kz6f4OAyu7SRLwPMkmSf8L2AScR7pJUkQM5GBrjevzFpJmfwC7gJWV/vNBI+lNwDeAx4CjafEHSfrNB/5vqM71uZoO/w05KJiZ2Th3H5mZ2TgHBTMzG+egYGZm4xwUzMxsnIOCmZmNc1AwM7NxDgrWUpKOZNL+bpO0SNL1kv5kwnEPShpN72fTlm+T9Edp+Wck/UqV97hQ0pcl7UxTD2+SdG763JskPSTpu+nPisx5H5Z0SNI5mbKDmftr0zTGj6b1+FeZ+p2dOe4tku5L718vKSQtzTx/VVr2K5nf9UlJ35H0vyX9lKQvpe+xU9I/ZX73X5hwbX5M0p9L+sf0588l/Vj63KL0ff5z5r3/RNL1Df6Nfie9No+ndfq1tPwkSbek7/OUpHskzc+cF5LuyjyeLunAhGtxIP09tkv6C0kz69XFuo+DgrXajyJiSeZnV87zfjFzzvtqHSTpNcBfAesj4rUR8S9I8sXMlvQTwGeBVRHx08CbgJWS3pZ5iReA367yuj8PXAG8MSJ+FhgDnslZ98dIFh1VLAO+M+GY5RHxepLMoP8jIq6KiCXAfwS+kfnd/27CeZ8G/k9EXBARFwBPA3+WeX4/cKOkk/JUVNIq4BLgojTl95sBpU//AXA6cGFELCZZrPjFdPUtwMvA6ySdkj6+BNg34S0+n/4eI8A/A+/OUy/rHg4K1mt+FfhmRPxlpSAiHoiIx4EbgM9kUhK/ALwfuClz/h3AuyWdNeF15wAvRMQrlXMr+bBy+AZwkaQZaYKz1wLbahz79fT5hiS9Fvg54PczxR8BRiVdkD4+AGzhWOqIRj4IlCrJ6SLinyLizvQb/a8DvxkRR9Ln/ifwCvDWzPlfASpB9mrgczXqPh04FXgxZ72sSzgoWKudkukK+VLjw8c9kDnvN+sc9zqSDUmqGany3Na0vOIgSWC4ccJxfwMskPQPkm6V9G+aqHsAm4FLSfYLuLfOsb9M0rLIYxjYVvmQBkjvb+P43+lm4LclDdV7MSX5vU6PiH+s8vRrgT1VMplOvH53A8vSFtvPcizdc8W7JW0jaUGcBfwl1lMcFKzVst1HV6VltXKpZMuz3UefmOJ7q8Z7TSz7I+A6SWeMHxBxkORb+QqSb9+fz/TN53nNu0m6jZZR/dtzOf2wvBj4nfq/xrhav89x5RHxNPAQSStqKq/XzHs9CiwiaSV8ucrxn0+7xX6CJPj9boM6WZdxULB2+B4wcbOQs0j695u1neTDu9ZzoxPKfg7YkS2IiJdIxh5KE8qPRMSDEfEh4L3Av0+fmlj/SXWPiIdIWjFnR8Q/VKnb8jTgvSMi8o5VbAfeIGn8/2l6//UkO3Vl/QHwAer8n05bAS8rSQA50U5gYdqayHojE64fSUvoo9ToOkrfK0haCd6XpMc4KFg7/D1wcToQTDqz5mTyD+RmfRb4hezgsaTLJP0M8KfA9ZKWpOU/Dvx34A+rvM7HgZUkWXFJZwQtzjy/BNid3n8QuDY9bgi4BnigymuuIemzb4mI2Al8G/i9TPHvAY+kz2WP/S7Jh/cVDV72vwF/WmklSTpD0oqIeJlkEPzjlW6odFbSTOBrE17jDuAjEdGoG+xNQLWuKuti0ztdAet/EfG8pBuBL6ffdA8CV6cbiVQ8IKnSd/5oRPxaev92Sbek95+JiJ+XdAVwS1r+KvAocGP6PtcAn0q/8Qq4JTsonanTC+mYR2X84jTgjyWdCRwm+eZcmc76+8B6Sd9JX/OrwMYqr/mVpi9OY+9J67Uzfe9vpmXVrCMJIvWsJ/ld/17SqyTXr7K71xqSFsA/SDoKfBe4KiakUo6IvcAna7z+u5WkhZ4G7AWub1Af6zJOnW1mZuPcfWRmZuPcfWTWhyT9KclMp6xPpmsPzGpy95GZmY1z95GZmY1zUDAzs3EOCmZmNs5BwczMxjkomJnZuP8PgjC7itdQiLQAAAAASUVORK5CYII=\n", | |
| "text/plain": [ | |
| "<Figure size 432x288 with 1 Axes>" | |
| ] | |
| }, | |
| "metadata": { | |
| "needs_background": "light" | |
| }, | |
| "output_type": "display_data" | |
| } | |
| ], | |
| "source": [ | |
| "plt.scatter(cdf.FUELCONSUMPTION_COMB, cdf.CO2EMISSIONS, color='blue')\n", | |
| "plt.xlabel(\"FUELCONSUMPTION_COMB\")\n", | |
| "plt.ylabel(\"Emission\")\n", | |
| "plt.show()" | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": null, | |
| "metadata": { | |
| "button": false, | |
| "collapsed": true, | |
| "deletable": true, | |
| "jupyter": { | |
| "outputs_hidden": true | |
| }, | |
| "new_sheet": false, | |
| "run_control": { | |
| "read_only": false | |
| }, | |
| "scrolled": true | |
| }, | |
| "outputs": [], | |
| "source": [ | |
| "plt.scatter(cdf.ENGINESIZE, cdf.CO2EMISSIONS, color='blue')\n", | |
| "plt.xlabel(\"Engine size\")\n", | |
| "plt.ylabel(\"Emission\")\n", | |
| "plt.show()" | |
| ] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "metadata": {}, | |
| "source": [ | |
| "## Practice\n", | |
| "plot __CYLINDER__ vs the Emission, to see how linear is their relation:" | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": 11, | |
| "metadata": { | |
| "button": false, | |
| "deletable": true, | |
| "new_sheet": false, | |
| "run_control": { | |
| "read_only": false | |
| } | |
| }, | |
| "outputs": [ | |
| { | |
| "data": { | |
| "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYUAAAEHCAYAAABBW1qbAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy8QZhcZAAAdC0lEQVR4nO3dfZBddZ3n8feHbhKJ4vDUsCEh6QwG3AQ1undSsuy4SsLCIku0arXitpqaoaZZGhd0ZlbJZqZkHzJSOzhI7W6HaYUlNfaYSfmwpBhESQfKh3XADkQkQYbMJoQmkbQPMyJhoul8949z+vbt7tude5N77rnd5/OqunXO73t+5/SXS5Jvn6ffTxGBmZkZwGl5J2BmZq3DRcHMzMpcFMzMrMxFwczMylwUzMyszEXBzMzK2rM8uKT9wCvACHAsIkqSzgH+CugE9gMfjIifp/3XAzek/W+JiG9Md/zzzjsvOjs7s0rfzGxW2rlz508ioqPatkyLQuo9EfGTivZtwEBE3CHptrT9KUnLgLXAcuBCYLukSyJiZKoDd3Z2Mjg4mGXuZmazjqQXptqWx+WjNcDmdH0z8L6K+JaIOBoR+4C9wMoc8jMzK6ysi0IA35S0U1J3GrsgIg4BpMvz0/gC4MWKfYfSmJmZNUnWl4+uiIiDks4HHpH0o2n6qkps0hgcaXHpBli0aFFjsjQzMyDjM4WIOJguDwNfI7kc9LKk+QDp8nDafQi4qGL3hcDBKsfsi4hSRJQ6OqreJzEzs5OUWVGQ9HpJZ46uA/8KeAbYBqxLu60DHkjXtwFrJc2VtARYCjyRVX5mZjZZlmcKFwDfkfQDkn/c/zoiHgbuAK6S9DxwVdomInYDW4E9wMPAzdM9eWRmxdDfD52dcNppybK/P++MZjfN5KGzS6VS+JFUs9mrvx+6u+HIkbHYvHnQ1wddXfnlNdNJ2hkRpWrb/EazmbWsDRvGFwRI2hs25JNPEbgomFnLOnCgvridOhcFM2tZUz117qfRs+OiYGYta+PG5B5CpXnzkrhlw0XBzFpWV1dyU3nxYpCSpW8yZ6sZA+KZmZ20ri4XgWbymYKZmZW5KJiZWZmLgpmZlbkomJlZmYuCmZmVuSiYmVmZi4KZmZW5KJiZWZmLgpmZlbkomJlZmYuCmZmVZV4UJLVJekrSg2n7dkkvSdqVfq6t6Lte0l5Jz0m6OuvczMxsvGYMiHcr8CzwxorYXRFxZ2UnScuAtcBy4EJgu6RLPE+zmVnzZHqmIGkh8F7gCzV0XwNsiYijEbEP2AuszDI/MzMbL+vLR58DPgkcnxD/mKSnJd0n6ew0tgB4saLPUBozM7MmyawoSLoOOBwROyds2gRcDKwADgGfHd2lymGiynG7JQ1KGhweHm5kymZmhZflmcIVwPWS9gNbgCslfTEiXo6IkYg4DnyesUtEQ8BFFfsvBA5OPGhE9EVEKSJKHR0dGaZvZlY8mRWFiFgfEQsjopPkBvKOiPiwpPkV3d4PPJOubwPWSporaQmwFHgiq/zMzGyyPKbj/O+SVpBcGtoP3AgQEbslbQX2AMeAm/3kkZlZczXl5bWIeCwirkvXPxIRb4mIt0bE9RFxqKLfxoi4OCIujYivNyM3s1bU3w+dnXDaacmyvz/vjKwo8jhTMLNp9PfDRz8Kx9Nn9l54IWmDJ7C37HmYC7MWc+ONYwVh1PHjSdwsay4KZi3m1Vfri5s1kouCmZmVuSiYmVmZi4KZmZW5KJi1mJtuqi9u1kguCmYtprcXVq0aH1u1KombZf0Oi4uCWYvp74cdO8bHduzwC2yW/Bno7k7eXYlIlt3djf2zoYhJA5HOGKVSKQYHB/NOw6yh5s6FX/1qcnzOHDh6tPn5WOvo7EwKwUSLF8P+/bUfR9LOiChV2+YzBbMWU60gTBe34jhwoL74yXBRMDObIRYtqi9+MlwUzMxmiI0bYd688bF585J4o7gomFlL6+mB9naQkmVPT94Z5aerC/r6knsIUrLs62vsQIkeJdXMWlZPD2zaNNYeGRlrF/UR3a6ubEfL9ZmCmbWsvr764nbqXBTMrGWNTDH34lRxO3WZFwVJbZKekvRg2j5H0iOSnk+XZ1f0XS9pr6TnJF2ddW5m1tra2uqL26lrxpnCrcCzFe3bgIGIWAoMpG0kLQPWAsuBa4BeSf5fb1Zg3d31xe3UZVoUJC0E3gt8oSK8Bticrm8G3lcR3xIRRyNiH7AXWJllfmataO7c+uKz2RVXJE8cVWpvT+KWjazPFD4HfBKonFzwgog4BJAuz0/jC4AXK/oNpTGzQplqKIsiDnGxYQMcOzY+duxYErdsZFYUJF0HHI6InbXuUiU2aWAmSd2SBiUNDg8Pn1KOZq3otCn+Vk4Vn82aMayDjZflH7MrgOsl7Qe2AFdK+iLwsqT5AOnycNp/CLioYv+FwMGJB42IvogoRUSpo6Mjw/TN8nH8eH3x2awZwzrMNMuXJy+ujX6WL2/s8TMrChGxPiIWRkQnyQ3kHRHxYWAbsC7ttg54IF3fBqyVNFfSEmAp8ERW+ZlZ62vGsA4zyfLlsGfP+NiePY0tDHmckN4BXCXpeeCqtE1E7Aa2AnuAh4GbI8JPI1vhnHtuffHZrKsLLr98fOzyy7N9o7eVTSwIJ4qfjKYUhYh4LCKuS9d/GhGrImJpuvxZRb+NEXFxRFwaEV9vRm5mrebuuyffPzjttCReND09MDAwPjYwUOzxj7JWwFtXZq1Pmr5dFB7movlcFMxazK23Th7GYWQkiReNh7kYb9my+uInw0XBrMX89Kf1xWczD3Mx3u7dcMYZ42NnnJHEG8VFwcxaloe5GG/1anjttfGx115L4o3i+RTMrGWNzpnQ15dcMmprSwpCUedSmHjT/UTxk+GiYGYtrbe3uEUgD758ZGZmZS4KZmYzxKpV9cVPhouCmdkMsX375AKwalUSbxQXBTOzGeSSS8YeyW1rS9qN5BvNZmYzRE8PbNo01h4ZGWs36ma8zxTMzGaIZgz74aJgZi2tvx86O5NBATs7k3ZRNWPYD18+MrOW1d+fvKx25EjSfuGFsbeZizh89mmnVZ9sqZGz8vlMwcxa1oYNYwVh1JEjxZ2juRmz8rkomFnL8hzNzeeiYGYty3M0N19mRUHS6yQ9IekHknZL+s9p/HZJL0nalX6urdhnvaS9kp6TdHVWuZnZzPCmN9UXt1OX5Y3mo8CVEfFLSacD35E0OsXmXRFxZ2VnScuAtcBy4EJgu6RLPE+zWXE99lh9cTt1mZ0pROKXafP09BPT7LIG2BIRRyNiH7AXWJlVfmbW+jzz2njNmHQo03sKktok7QIOA49ExOPppo9JelrSfZLOTmMLgBcrdh9KY1YAPT3Q3p7MRdze7onZLeGZ18ZrxqRDmRaFiBiJiBXAQmClpMuATcDFwArgEPDZtHu1qcknnVlI6pY0KGlweHg4o8ytmUZf3R/97W/01X0XBvPMa+P19sJNN40f++immxo734Qiprui08AfJH0aeLXyXoKkTuDBiLhM0nqAiPhMuu0bwO0R8b2pjlkqlWJwcDDTvC177e3VLwe0tcGxY83PJ2+q9utRqkl/XVtKT49nXms0STsjolRtW5ZPH3VIOitdPwNYDfxI0vyKbu8HnknXtwFrJc2VtARYCjyRVX7WOnzd2KbT25v8chCRLF0QspXl00fzgc2S2kiKz9aIeFDSX0haQXJpaD9wI0BE7Ja0FdgDHANu9pNHxdDWNvWZgpk1V2ZFISKeBt5eJf6RafbZCGzMKidrTd3d44cDroybWXP5jWbLXW9v9dmkfJnArPlcFCx3/f3wvQmPE3zve8UeItksLy4KljuPhGnWOmq6pyCpA/g9oLNyn4j43WzSsiLxSJhmraPWG80PAN8GtgN+IsgaatGiZPKUanEza65ai8K8iPhUpplYYV17bfWnj669dnLMzLJV6z2FByuHuDZrpIceqi9uZtmptSjcSlIY/lHSK+nnF1kmZsVR7dLRdHEzy05Nl48i4sysEzEzs/zV/EazpOuBd6XNxyLiwWxSMjOzvNR0+UjSHSSXkPakn1vTmJmZzSK1nilcC6yIiOMAkjYDTwG3ZZWYmZk1Xz1vNJ9Vsf4bjU7EzMzyV+uZwmeApyQ9SjJD2ruA9ZllZWZmuaj16aMvSXoM+C2SovCpiPhxlomZmVnzTXv5SNKb0+U7SCbNGQJeBC5MY2ZmNouc6Ezh94Fu4LNVtgVwZcMzMjOz3ExbFCKiO12+p94DS3od8C1gbvpzvhwRn5Z0DvBXJCOu7gc+GBE/T/dZD9xAMujeLRHxjXp/rpmZnbxa31P4gKQz0/U/kvRVSZOm2pzgKHBlRLwNWAFcI+mdJI+xDkTEUmAgbSNpGbAWWA5cA/Sm8zubmVmT1PpI6h9HxCuS/gVwNbAZuGe6HSLxy7R5evoJYE26P+nyfen6GmBLRByNiH3AXmBlzf8lZmZ2ymotCqNzKLwX2BQRDwBzTrSTpDZJu4DDwCMR8ThwQUQcAkiX56fdF5DcxB41lMbMzKxJai0KL0n6c+CDwEOS5tayb0SMRMQKYCGwUtJl03RXtUNM6iR1SxqUNDg8PFxj+mZmVotai8IHgW8A10TE3wPnAP+x1h+S7vMYyb2ClyXNB0iXh9NuQ8BFFbstBA5WOVZfRJQiotTR0VFrCmZmVoNai8J84K8j4nlJ7wY+ADwx3Q6SOiSdla6fAawGfgRsA9al3daRTPVJGl8raa6kJcDSE/0MMzNrrFqLwleAEUlvAu4FlgB/eYJ95gOPSnoa+D7JPYUHgTuAqyQ9D1yVtomI3cBWklFYHwZujohZPR/06tUgjX1Wr847IzMrOkVMumw/uZP0ZES8Q9Ingdci4n9IeioiTvRYaqZKpVIMDg7mmcJJW70aBgYmx1etgu3bm59PnlTtblKqhj+es46/D8uapJ0RUaq2rdYzhV9L+hDwUWB0cp3TG5FcUVUrCNPFzcyaodai8DvA5cDGiNiXXvP/YnZpmZlZHmodJXUPcEtFex/pvQAzM5s9pi0KkrZGxAcl/ZDx7wyI5KXlt2aanZmZNdWJzhRuTZfXZZ2ImZnl70SjpI4OR/ECgKQ3nmgfq01bG4xUeeC2zUMAmlmOah0l9UZJLwNPAzvTz8x8FrRFdHfXFzcza4Zaf+v/Q2B5RPwky2TMzCxftT6S+nfAkSwTKZp7phh4fKq4mVkz1HqmsB74v5IeJ5k8B4CIuGXqXWw6U72Z6jdWzSxPtRaFPwd2AD8EjmeXjpmZ5anWonAsIn4/00zMzCx3td5TeDSd3Ga+pHNGP5lmZmZmTVfrmcK/S5frK2IB/GZj0zEzszzVOvbRkqwTMTOz/E17+SidP2F0/QMTtv1JVkmZmVk+TnRPYW3F+voJ265pcC5mZpazExUFTbFerT1+o3SRpEclPStpt6Rb0/jtkl6StCv9XFuxz3pJeyU9J+nquv5LzMzslJ3onkJMsV6tPdEx4A8i4klJZwI7JT2SbrsrIu6s7CxpGcmZyXLgQmC7pEtm+zzNZhOdey789KfV42ZZO9GZwtsk/ULSK8Bb0/XR9lum2zEiDkXEk+n6K8CzwIJpdlkDbImIo+kkPnuBlTX/l5jNEnffDXPmjI/NmZPEzbI2bVGIiLaIeGNEnBkR7en6aLvmOZoldQJvBx5PQx+T9LSk+ySdncYWAC9W7DbE9EXEbFbq6oIbbhgbRr2tLWl3deWblxVDrS+vnTRJbwC+Anw8In4BbAIuBlYAh4DPjnatsvukS1TpS3SDkgaHh4czytosP/39sHnz2HwbIyNJu78/37ysGDItCpJOJykI/RHxVYCIeDkiRiLiOPB5xi4RDQEXVey+EDg48ZgR0RcRpYgodXR0ZJm+WS42bIAjE8YkPnIkiZtlLbOiIEnAvcCzEfFnFfH5Fd3eDzyTrm8D1kqaK2kJsBR4Iqv8zFrVgQP1xc0aKcupNa8APgL8UNKuNPafgA9JWkFyaWg/cCNAROyWtBXYQ/Lk0s1+8siKaNEieOGF6nGzrGVWFCLiO1S/T/DQNPtsBDZmlZPZTLBxYzIta+UlpHnzkrhZ1jK/0Wxm9enqgr4+WLwYpGTZ1+enj6w5XBQKrqcH2tuTf3za25O25a+rC/bvh+PHk6ULgjVLlvcUrMX19MCmTWPtkZGxdm9vPjmZWb58plBgfX31xc1s9nNRKLCRKZ7tmipuZrOfi0KBjQ6jUGvczGY/F4UCu/TS+uJmNvu5KBTYnj31xc1s9nNRMDOzMhcFMzMrc1EwM7MyFwUzMytzUTAzszIXBTMzK3NRKLBzz60vbmazn4tCgd19N8yZMz42Z04SN7NiclEosK4uuO++8eP233efh2k2K7Is52i+SNKjkp6VtFvSrWn8HEmPSHo+XZ5dsc96SXslPSfp6qxyszGtMG7/xLOVE8XNLDtZnikcA/4gIv4p8E7gZknLgNuAgYhYCgykbdJta4HlwDVAryQPzVYAN9xQX9zMspNZUYiIQxHxZLr+CvAssABYA2xOu20G3peurwG2RMTRiNgH7AVWZpWftY6Hppi1e6q4mWWnKfcUJHUCbwceBy6IiEOQFA7g/LTbAuDFit2G0pjNcgcO1Bc3s+xkXhQkvQH4CvDxiPjFdF2rxKLK8bolDUoaHB4eblSalqNFi+qLm1l2Mi0Kkk4nKQj9EfHVNPyypPnp9vnA4TQ+BFxUsftC4ODEY0ZEX0SUIqLU0dGRXfLWNBs3wrx542Pz5iVxM2uuLJ8+EnAv8GxE/FnFpm3AunR9HfBARXytpLmSlgBLgSeyys9aR1cXXH75+Njll/vRWLM8ZHmmcAXwEeBKSbvSz7XAHcBVkp4HrkrbRMRuYCuwB3gYuDkiPFtwAfT0wMDA+NjAQBI3s+ZSxKTL9jNGqVSKwcHBvNM4Kap2ByU1g/+XnBR/F2bNJWlnRJSqbfMbzWZmVuaiYGZmZS4KZmZW5qJguVu1qr64mWXHRcFyt3375AKwalUSN7PmclGwlnDJJdCWDn/Y1pa0zaz52vNOwKynBzZtGmuPjIy1e3vzycmsqHymYLnr66svbmbZcVGw3I1M8d76VHEzy46LgpmZlbkomJlZmYuC5W7x4vriZpYdFwXLnedTMGsdLgqWu66u5EmjxYuTEVMXL07ank/BrPn8noK1hK4uFwGzVuAzBTMzK3NRMDOzsiznaL5P0mFJz1TEbpf00oTpOUe3rZe0V9Jzkq7OKi8zM5talmcK9wPXVInfFREr0s9DAJKWAWuB5ek+vZLaMszNzMyqyKwoRMS3gJ/V2H0NsCUijkbEPmAvsDKr3MzMrLo87il8TNLT6eWls9PYAuDFij5DaSwTq1cnjz6OflavzuonmZnNLM0uCpuAi4EVwCHgs2lcVfpGtQNI6pY0KGlweHi47gRWr4aBgfGxgYHmF4a2KS6OTRU3M2uGphaFiHg5IkYi4jjwecYuEQ0BF1V0XQgcnOIYfRFRiohSR0dH3TlMLAgnimflggvqi5uZNUNTi4Kk+RXN9wOjTyZtA9ZKmitpCbAUeKKZuTXbwaolb+q4mVkzZPZGs6QvAe8GzpM0BHwaeLekFSSXhvYDNwJExG5JW4E9wDHg5ojwaPpmZk2WWVGIiA9VCd87Tf+NQOZDoC1bBnv2VI+bmRVd4d5ofvXV+uJmZkVSuKJw4EB9cTOzIilcUVi0qL64mVmRFK4otMqELn5PwcxaUeGKQqtM6HLGGfXFzcyaoZCT7LTChC6//GV9cTOzZijcmYKZmU3NRcHMzMpcFMzMrMxFwczMylwUcvKGN9QXNzNrBheFnNxzD7RPeParvT2Jm5nlxUUhJ11dcP/949+XuP/+/B+VNbNic1HI0Xe/C0NDEJEsv/vdvDMys6Ir5MtrraCnBzZtGmuPjIy1e3vzycnMzGcKOZnq3oHvKZhZnlwUchJRX9zMrBkyKwqS7pN0WNIzFbFzJD0i6fl0eXbFtvWS9kp6TtLVWeVlZmZTy/JM4X7gmgmx24CBiFgKDKRtJC0D1gLL0316Jc3qQaT9noKZtaLMikJEfAv42YTwGmBzur4ZeF9FfEtEHI2IfcBeYGVWubUCv6dgZq2o2fcULoiIQwDp8vw0vgB4saLfUBqbtfyegpm1olZ5JFVVYlVvuUrqBroBFs3wOTRbYV4HM7NKzT5TeFnSfIB0eTiNDwEXVfRbCBysdoCI6IuIUkSUOjo6Mk3WzKxoml0UtgHr0vV1wAMV8bWS5kpaAiwFnmhybmZmhZfZ5SNJXwLeDZwnaQj4NHAHsFXSDcAB4AMAEbFb0lZgD3AMuDkiRrLKzczMqsusKETEh6bYtGqK/huBjVnlY2ZmJ+Y3ms3MrEwxg8dVkDQMvJB3Hg1wHvCTvJNoEf4uxvP3McbfxXin8n0sjoiqT+rM6KIwW0gajIhS3nm0An8X4/n7GOPvYrysvg9fPjIzszIXBTMzK3NRaA19eSfQQvxdjOfvY4y/i/Ey+T58T8HMzMp8pmBmZmUuCjmT1CbpKUkP5p1L3iSdJenLkn4k6VlJl+edU14kfULSbknPSPqSpNflnVMz1TtJ12w2xXfxp+nfk6clfU3SWY36eS4K+bsVeDbvJFrE3cDDEfFm4G0U9HuRtAC4BShFxGVAG8kkVEVyPzVO0lUA9zP5u3gEuCwi3gr8LbC+UT/MRSFHkhYC7wW+kHcueZP0RuBdwL0AEfGriPj7fLPKVTtwhqR2YB5TjBo8W9U5SdesVu27iIhvRsSxtPk3JCNLN4SLQr4+B3wSOJ53Ii3gN4Fh4H+nl9O+IOn1eSeVh4h4CbiTZNDIQ8A/RMQ3882qJUw1SVfR/S7w9UYdzEUhJ5KuAw5HxM68c2kR7cA7gE0R8XbgVYpzeWCc9Fr5GmAJcCHwekkfzjcra0WSNpCMLN3fqGO6KOTnCuB6SfuBLcCVkr6Yb0q5GgKGIuLxtP1lkiJRRKuBfRExHBG/Br4K/POcc2oFU03SVUiS1gHXAV3RwHcLXBRyEhHrI2JhRHSS3ETcERGF/W0wIn4MvCjp0jS0imR+jSI6ALxT0jxJIvkuCnnTfYKpJukqHEnXAJ8Cro+II408dqvM0WwG8B+AfklzgP8H/E7O+eQiIh6X9GXgSZJLA09RsLd565mka7ab4rtYD8wFHkl+b+BvIuLfN+Tn+Y1mMzMb5ctHZmZW5qJgZmZlLgpmZlbmomBmZmUuCmZmVuaiYIUi6Z9I2iLp7yTtkbRD0nFJb6no80lJ90jqrByZsmL7/ZL+bbr+mKTBim0lSY+l6++W9A/psB3PSfpW+ib7aN/bJb0kaVfF56wJ+/1I0p0V+1wg6UFJP0jzfyijr8oKyu8pWGGkL4J9DdgcEWvT2Arg3wC9kt5FMqzEjUAJ+I0aD32+pH8dEdXGn/l2RFxX8bP+j6TXImIg3X5XRNxZuUP63Pm3I+I6SWcAT0n6WkR8F/gvwCMRcXfa9601fwFmNfCZghXJe4BfR8Q9o4GI2BUR/5Vk4LmPAncBt0fEz+s47p8Cf3SiThGxi+Qf9Y/VeuCIeA3YBSxIQ/NJhgQZ3f50HXmanZCLghXJZcBUAxB+HNgIdETEX9R53O8BRyW9p4a+TwJvrmh/ouLS0aMTO6eD4y0FvpWG/hdwr6RHJW2QdGGduZpNy0XBDIiIg8AOYNNJHuK/UcPZAqAJ7bsiYkX6qSwqvy3paeDHwIPp2FBExDdIhhn/PElxeUpSx0nmbDaJi4IVyW7gn02z/TgnObdFROwAXge88wRd305tg9t9O51V6y3ATen9iNGf9bOI+MuI+AjwfZLJicwawkXBimQHMFfS740GJP2WpH/ZoONvJJk0qar0pvAfk1wCqklE/C3wGZIRMZF0paR56fqZwMUkg8OZNYSfPrLCiIiQ9H7gc5JuA/4R2E9yP2Eql6YjU476xDTHf0jS8ITwb0t6imRKzcPALRVPHkFyT6FyyPRqU0zeA/yhpCUkZzr/U9Ixkl/qvhAR358mf7O6eJRUMzMr8+UjMzMrc1EwM7MyFwUzMytzUTAzszIXBTMzK3NRMDOzMhcFMzMrc1EwM7Oy/w/SqRJS2Ndk1wAAAABJRU5ErkJggg==\n", | |
| "text/plain": [ | |
| "<Figure size 432x288 with 1 Axes>" | |
| ] | |
| }, | |
| "metadata": { | |
| "needs_background": "light" | |
| }, | |
| "output_type": "display_data" | |
| } | |
| ], | |
| "source": [ | |
| "plt.scatter(cdf.CYLINDERS, cdf.CO2EMISSIONS, color='blue')\n", | |
| "plt.xlabel(\"CYLINDERS\")\n", | |
| "plt.ylabel(\"Emission\")\n", | |
| "plt.show()\n" | |
| ] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "metadata": {}, | |
| "source": [ | |
| "Double-click __here__ for the solution.\n", | |
| "\n", | |
| "<!-- Your answer is below:\n", | |
| " \n", | |
| "plt.scatter(cdf.CYLINDERS, cdf.CO2EMISSIONS, color='blue')\n", | |
| "plt.xlabel(\"Cylinders\")\n", | |
| "plt.ylabel(\"Emission\")\n", | |
| "plt.show()\n", | |
| "\n", | |
| "-->" | |
| ] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "metadata": { | |
| "button": false, | |
| "deletable": true, | |
| "new_sheet": false, | |
| "run_control": { | |
| "read_only": false | |
| } | |
| }, | |
| "source": [ | |
| "#### Creating train and test dataset\n", | |
| "Train/Test Split involves splitting the dataset into training and testing sets respectively, which are mutually exclusive. After which, you train with the training set and test with the testing set. \n", | |
| "This will provide a more accurate evaluation on out-of-sample accuracy because the testing dataset is not part of the dataset that have been used to train the data. It is more realistic for real world problems.\n", | |
| "\n", | |
| "This means that we know the outcome of each data point in this dataset, making it great to test with! And since this data has not been used to train the model, the model has no knowledge of the outcome of these data points. So, in essence, it is truly an out-of-sample testing.\n", | |
| "\n", | |
| "Lets split our dataset into train and test sets, 80% of the entire data for training, and the 20% for testing. We create a mask to select random rows using __np.random.rand()__ function: " | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": null, | |
| "metadata": { | |
| "button": false, | |
| "collapsed": true, | |
| "deletable": true, | |
| "jupyter": { | |
| "outputs_hidden": true | |
| }, | |
| "new_sheet": false, | |
| "run_control": { | |
| "read_only": false | |
| } | |
| }, | |
| "outputs": [], | |
| "source": [ | |
| "msk = np.random.rand(len(df)) < 0.8\n", | |
| "train = cdf[msk]\n", | |
| "test = cdf[~msk]" | |
| ] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "metadata": { | |
| "button": false, | |
| "deletable": true, | |
| "new_sheet": false, | |
| "run_control": { | |
| "read_only": false | |
| } | |
| }, | |
| "source": [ | |
| "<h2 id=\"simple_regression\">Simple Regression Model</h2>\n", | |
| "Linear Regression fits a linear model with coefficients $\\theta = (\\theta_1, ..., \\theta_n)$ to minimize the 'residual sum of squares' between the independent x in the dataset, and the dependent y by the linear approximation. " | |
| ] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "metadata": { | |
| "button": false, | |
| "deletable": true, | |
| "new_sheet": false, | |
| "run_control": { | |
| "read_only": false | |
| } | |
| }, | |
| "source": [ | |
| "#### Train data distribution" | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": 12, | |
| "metadata": { | |
| "button": false, | |
| "deletable": true, | |
| "new_sheet": false, | |
| "run_control": { | |
| "read_only": false | |
| } | |
| }, | |
| "outputs": [ | |
| { | |
| "ename": "NameError", | |
| "evalue": "name 'train' is not defined", | |
| "output_type": "error", | |
| "traceback": [ | |
| "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", | |
| "\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)", | |
| "\u001b[0;32m<ipython-input-12-c6ba1ff911b5>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mplt\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mscatter\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mtrain\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mENGINESIZE\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mtrain\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mCO2EMISSIONS\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mcolor\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;34m'blue'\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 2\u001b[0m \u001b[0mplt\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mxlabel\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"Engine size\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 3\u001b[0m \u001b[0mplt\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mylabel\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"Emission\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 4\u001b[0m \u001b[0mplt\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mshow\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", | |
| "\u001b[0;31mNameError\u001b[0m: name 'train' is not defined" | |
| ] | |
| } | |
| ], | |
| "source": [ | |
| "plt.scatter(train.ENGINESIZE, train.CO2EMISSIONS, color='blue')\n", | |
| "plt.xlabel(\"Engine size\")\n", | |
| "plt.ylabel(\"Emission\")\n", | |
| "plt.show()" | |
| ] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "metadata": { | |
| "button": false, | |
| "deletable": true, | |
| "new_sheet": false, | |
| "run_control": { | |
| "read_only": false | |
| } | |
| }, | |
| "source": [ | |
| "#### Modeling\n", | |
| "Using sklearn package to model data." | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": 13, | |
| "metadata": { | |
| "button": false, | |
| "deletable": true, | |
| "new_sheet": false, | |
| "run_control": { | |
| "read_only": false | |
| } | |
| }, | |
| "outputs": [ | |
| { | |
| "ename": "NameError", | |
| "evalue": "name 'train' is not defined", | |
| "output_type": "error", | |
| "traceback": [ | |
| "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", | |
| "\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)", | |
| "\u001b[0;32m<ipython-input-13-d6fc4367c750>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0;32mfrom\u001b[0m \u001b[0msklearn\u001b[0m \u001b[0;32mimport\u001b[0m \u001b[0mlinear_model\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 2\u001b[0m \u001b[0mregr\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mlinear_model\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mLinearRegression\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 3\u001b[0;31m \u001b[0mtrain_x\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mnp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0masanyarray\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mtrain\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m'ENGINESIZE'\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 4\u001b[0m \u001b[0mtrain_y\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mnp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0masanyarray\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mtrain\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m'CO2EMISSIONS'\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 5\u001b[0m \u001b[0mregr\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mfit\u001b[0m \u001b[0;34m(\u001b[0m\u001b[0mtrain_x\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mtrain_y\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", | |
| "\u001b[0;31mNameError\u001b[0m: name 'train' is not defined" | |
| ] | |
| } | |
| ], | |
| "source": [ | |
| "from sklearn import linear_model\n", | |
| "regr = linear_model.LinearRegression()\n", | |
| "train_x = np.asanyarray(train[['ENGINESIZE']])\n", | |
| "train_y = np.asanyarray(train[['CO2EMISSIONS']])\n", | |
| "regr.fit (train_x, train_y)\n", | |
| "# The coefficients\n", | |
| "print ('Coefficients: ', regr.coef_)\n", | |
| "print ('Intercept: ',regr.intercept_)" | |
| ] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "metadata": {}, | |
| "source": [ | |
| "As mentioned before, __Coefficient__ and __Intercept__ in the simple linear regression, are the parameters of the fit line. \n", | |
| "Given that it is a simple linear regression, with only 2 parameters, and knowing that the parameters are the intercept and slope of the line, sklearn can estimate them directly from our data. \n", | |
| "Notice that all of the data must be available to traverse and calculate the parameters.\n" | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": null, | |
| "metadata": {}, | |
| "outputs": [], | |
| "source": [] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "metadata": { | |
| "button": false, | |
| "deletable": true, | |
| "new_sheet": false, | |
| "run_control": { | |
| "read_only": false | |
| } | |
| }, | |
| "source": [ | |
| "#### Plot outputs" | |
| ] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "metadata": {}, | |
| "source": [ | |
| "we can plot the fit line over the data:" | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": 14, | |
| "metadata": { | |
| "button": false, | |
| "deletable": true, | |
| "new_sheet": false, | |
| "run_control": { | |
| "read_only": false | |
| } | |
| }, | |
| "outputs": [ | |
| { | |
| "ename": "NameError", | |
| "evalue": "name 'train' is not defined", | |
| "output_type": "error", | |
| "traceback": [ | |
| "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", | |
| "\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)", | |
| "\u001b[0;32m<ipython-input-14-ea27f013dd36>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mplt\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mscatter\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mtrain\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mENGINESIZE\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mtrain\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mCO2EMISSIONS\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mcolor\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;34m'blue'\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 2\u001b[0m \u001b[0mplt\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mplot\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mtrain_x\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mregr\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mcoef_\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m*\u001b[0m\u001b[0mtrain_x\u001b[0m \u001b[0;34m+\u001b[0m \u001b[0mregr\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mintercept_\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m'-r'\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 3\u001b[0m \u001b[0mplt\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mxlabel\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"Engine size\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 4\u001b[0m \u001b[0mplt\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mylabel\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"Emission\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", | |
| "\u001b[0;31mNameError\u001b[0m: name 'train' is not defined" | |
| ] | |
| } | |
| ], | |
| "source": [ | |
| "plt.scatter(train.ENGINESIZE, train.CO2EMISSIONS, color='blue')\n", | |
| "plt.plot(train_x, regr.coef_[0][0]*train_x + regr.intercept_[0], '-r')\n", | |
| "plt.xlabel(\"Engine size\")\n", | |
| "plt.ylabel(\"Emission\")" | |
| ] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "metadata": { | |
| "button": false, | |
| "deletable": true, | |
| "new_sheet": false, | |
| "run_control": { | |
| "read_only": false | |
| } | |
| }, | |
| "source": [ | |
| "#### Evaluation\n", | |
| "we compare the actual values and predicted values to calculate the accuracy of a regression model. Evaluation metrics provide a key role in the development of a model, as it provides insight to areas that require improvement.\n", | |
| "\n", | |
| "There are different model evaluation metrics, lets use MSE here to calculate the accuracy of our model based on the test set: \n", | |
| "<ul>\n", | |
| " <li> Mean absolute error: It is the mean of the absolute value of the errors. This is the easiest of the metrics to understand since it’s just average error.</li>\n", | |
| " <li> Mean Squared Error (MSE): Mean Squared Error (MSE) is the mean of the squared error. It’s more popular than Mean absolute error because the focus is geared more towards large errors. This is due to the squared term exponentially increasing larger errors in comparison to smaller ones.</li>\n", | |
| " <li> Root Mean Squared Error (RMSE): This is the square root of the Mean Square Error. </li>\n", | |
| " <li> R-squared is not error, but is a popular metric for accuracy of your model. It represents how close the data are to the fitted regression line. The higher the R-squared, the better the model fits your data. Best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse).</li>\n", | |
| "</ul>" | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": null, | |
| "metadata": { | |
| "button": false, | |
| "collapsed": true, | |
| "deletable": true, | |
| "jupyter": { | |
| "outputs_hidden": true | |
| }, | |
| "new_sheet": false, | |
| "run_control": { | |
| "read_only": false | |
| }, | |
| "scrolled": true | |
| }, | |
| "outputs": [], | |
| "source": [ | |
| "from sklearn.metrics import r2_score\n", | |
| "\n", | |
| "test_x = np.asanyarray(test[['ENGINESIZE']])\n", | |
| "test_y = np.asanyarray(test[['CO2EMISSIONS']])\n", | |
| "test_y_hat = regr.predict(test_x)\n", | |
| "\n", | |
| "print(\"Mean absolute error: %.2f\" % np.mean(np.absolute(test_y_hat - test_y)))\n", | |
| "print(\"Residual sum of squares (MSE): %.2f\" % np.mean((test_y_hat - test_y) ** 2))\n", | |
| "print(\"R2-score: %.2f\" % r2_score(test_y_hat , test_y) )" | |
| ] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "metadata": { | |
| "button": false, | |
| "deletable": true, | |
| "new_sheet": false, | |
| "run_control": { | |
| "read_only": false | |
| } | |
| }, | |
| "source": [ | |
| "<h2>Want to learn more?</h2>\n", | |
| "\n", | |
| "IBM SPSS Modeler is a comprehensive analytics platform that has many machine learning algorithms. It has been designed to bring predictive intelligence to decisions made by individuals, by groups, by systems – by your enterprise as a whole. A free trial is available through this course, available here: <a href=\"http://cocl.us/ML0101EN-SPSSModeler\">SPSS Modeler</a>\n", | |
| "\n", | |
| "Also, you can use Watson Studio to run these notebooks faster with bigger datasets. Watson Studio is IBM's leading cloud solution for data scientists, built by data scientists. With Jupyter notebooks, RStudio, Apache Spark and popular libraries pre-packaged in the cloud, Watson Studio enables data scientists to collaborate on their projects without having to install anything. Join the fast-growing community of Watson Studio users today with a free account at <a href=\"https://cocl.us/ML0101EN_DSX\">Watson Studio</a>\n", | |
| "\n", | |
| "<h3>Thanks for completing this lesson!</h3>\n", | |
| "\n", | |
| "<h4>Author: <a href=\"https://ca.linkedin.com/in/saeedaghabozorgi\">Saeed Aghabozorgi</a></h4>\n", | |
| "<p><a href=\"https://ca.linkedin.com/in/saeedaghabozorgi\">Saeed Aghabozorgi</a>, PhD is a Data Scientist in IBM with a track record of developing enterprise level applications that substantially increases clients’ ability to turn data into actionable knowledge. He is a researcher in data mining field and expert in developing advanced analytic methods like machine learning and statistical modelling on large datasets.</p>\n", | |
| "\n", | |
| "<hr>\n", | |
| "\n", | |
| "<p>Copyright © 2018 <a href=\"https://cocl.us/DX0108EN_CC\">Cognitive Class</a>. This notebook and its source code are released under the terms of the <a href=\"https://bigdatauniversity.com/mit-license/\">MIT License</a>.</p>" | |
| ] | |
| } | |
| ], | |
| "metadata": { | |
| "kernelspec": { | |
| "display_name": "Python", | |
| "language": "python", | |
| "name": "conda-env-python-py" | |
| }, | |
| "language_info": { | |
| "codemirror_mode": { | |
| "name": "ipython", | |
| "version": 3 | |
| }, | |
| "file_extension": ".py", | |
| "mimetype": "text/x-python", | |
| "name": "python", | |
| "nbconvert_exporter": "python", | |
| "pygments_lexer": "ipython3", | |
| "version": "3.6.7" | |
| }, | |
| "widgets": { | |
| "state": {}, | |
| "version": "1.1.2" | |
| } | |
| }, | |
| "nbformat": 4, | |
| "nbformat_minor": 4 | |
| } |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment