
Regression analysis with Python : learn the art of regression analysis with Python / Luca Massaron, Alberto Boschetti.

Material type: Text
Series: Community experience distilled
Publisher: Birmingham, UK : Packt Publishing, 2016
Description: 1 online resource (1 volume) : illustrations
Content type:
  • text
Media type:
  • computer
Carrier type:
  • online resource
ISBN:
  • 9781783980741
  • 1783980745
DDC classification:
  • 519.5/36 23
LOC classification:
  • QA278.2
Contents:
Cover; Copyright; Credits; About the Authors; About the Reviewers; www.PacktPub.com; Table of Contents; Preface; Chapter 1: Regression -- The Workhorse of Data Science; Regression analysis and data science; Exploring the promise of data science; The challenge; The linear models; What you are going to find in the book; Python for data science; Installing Python; Choosing between Python 2 and Python 3; Step-by-step installation; Installing packages; Package upgrades; Scientific distributions; Introducing Jupyter or IPython; Python packages and functions for linear models; NumPy; SciPy;
Statsmodels; Scikit-learn; Summary; Chapter 2: Approaching Simple Linear Regression; Defining a regression problem; Linear models and supervised learning; Reflecting on predictive variables; Reflecting on response variables; The family of linear models; Preparing to discover simple linear regression; Starting from the basics; A measure of linear relationship; Extending to linear regression; Regressing with StatsModels; The coefficient of determination; Meaning and significance of coefficients; Evaluating the fitted values; Correlation is not causation; Predicting with a regression model;
Regressing with Scikit-learn; Minimizing the cost function; Explaining the reason for using squared errors; Pseudoinverse and other optimization methods; Gradient Descent at work; Summary; Chapter 3: Multiple Regression in Action; Using multiple features; Model building with Statsmodels; Using formulas as an alternative; The correlation matrix; Revisiting gradient descent; Feature scaling; Unstandardizing coefficients; Estimating feature importance; Inspecting standardized coefficients; Comparing models by R-squared; Interaction models; Discovering interactions; Polynomial regression;
Testing linear versus cubic transformation; Going for higher-degree solutions; Introducing underfitting and overfitting; Summary; Chapter 4: Logistic Regression; Defining a classification problem; Formalization of the problem: binary classification; Assessing the classifier's performance; Defining a probability-based approach; More on the logistic and logit functions; Let's see some code; Pros and cons of logistic regression; Revisiting Gradient Descend; Multiclass Logistic Regression; An example; Summary; Chapter 5: Data Preparation; Numeric feature scaling; Mean centering; Standardization;
Normalization; The logistic regression case; Qualitative feature encoding; Dummy coding with Pandas; DictVectorizer and one-hot encoding; Feature hasher; Numeric feature transformation; Observing residuals; Summarizations by binning; Missing data; Missing data imputation; Keeping track of missing values; Outliers; Outliers on the response; Outliers among the predictors; Removing or replacing outliers; Summary; Chapter 6: Achieving Generalization; Checking on out-of-sample data; Testing by sample split; Cross-validation; Bootstrapping; Greedy selection of features; The Madelon dataset
No physical items for this record

Description based on online resource; title from cover (viewed March 23, 2016).

Includes index.

eBooks on EBSCOhost EBSCO eBook Subscription Academic Collection - Worldwide