Regression analysis with Python : learn the art of regression analysis with Python / Luca Massaron, Alberto Boschetti.
Material type: Text
Series: Community experience distilled
Publisher: Birmingham, UK : Packt Publishing, 2016
Description: 1 online resource (1 volume) : illustrations
Content type:
- text
Media type:
- computer
Carrier type:
- online resource
ISBN:
- 9781783980741
- 1783980745
DDC classification:
- 519.5/36 23
LOC classification:
- QA278.2
Description based on online resource; title from cover (viewed March 23, 2016).
Includes index.
Cover; Copyright; Credits; About the Authors; About the Reviewers; www.PacktPub.com; Table of Contents; Preface
Chapter 1: Regression -- The Workhorse of Data Science; Regression analysis and data science; Exploring the promise of data science; The challenge; The linear models; What you are going to find in the book; Python for data science; Installing Python; Choosing between Python 2 and Python 3; Step-by-step installation; Installing packages; Package upgrades; Scientific distributions; Introducing Jupyter or IPython; Python packages and functions for linear models; NumPy; SciPy; Statsmodels; Scikit-learn; Summary
Chapter 2: Approaching Simple Linear Regression; Defining a regression problem; Linear models and supervised learning; Reflecting on predictive variables; Reflecting on response variables; The family of linear models; Preparing to discover simple linear regression; Starting from the basics; A measure of linear relationship; Extending to linear regression; Regressing with StatsModels; The coefficient of determination; Meaning and significance of coefficients; Evaluating the fitted values; Correlation is not causation; Predicting with a regression model; Regressing with Scikit-learn; Minimizing the cost function; Explaining the reason for using squared errors; Pseudoinverse and other optimization methods; Gradient Descent at work; Summary
Chapter 3: Multiple Regression in Action; Using multiple features; Model building with Statsmodels; Using formulas as an alternative; The correlation matrix; Revisiting gradient descent; Feature scaling; Unstandardizing coefficients; Estimating feature importance; Inspecting standardized coefficients; Comparing models by R-squared; Interaction models; Discovering interactions; Polynomial regression; Testing linear versus cubic transformation; Going for higher-degree solutions; Introducing underfitting and overfitting; Summary
Chapter 4: Logistic Regression; Defining a classification problem; Formalization of the problem: binary classification; Assessing the classifier's performance; Defining a probability-based approach; More on the logistic and logit functions; Let's see some code; Pros and cons of logistic regression; Revisiting Gradient Descent; Multiclass Logistic Regression; An example; Summary
Chapter 5: Data Preparation; Numeric feature scaling; Mean centering; Standardization; Normalization; The logistic regression case; Qualitative feature encoding; Dummy coding with Pandas; DictVectorizer and one-hot encoding; Feature hasher; Numeric feature transformation; Observing residuals; Summarizations by binning; Missing data; Missing data imputation; Keeping track of missing values; Outliers; Outliers on the response; Outliers among the predictors; Removing or replacing outliers; Summary
Chapter 6: Achieving Generalization; Checking on out-of-sample data; Testing by sample split; Cross-validation; Bootstrapping; Greedy selection of features; The Madelon dataset
eBooks on EBSCOhost EBSCO eBook Subscription Academic Collection - Worldwide