Last updated: 2021-06-11 13:23:01
Cover
Copyright Page
Preface
About
About the Book
Chapter 1: R for Advanced Analytics
Introduction
Working with Real-World Datasets
Reading Data from Various Data Formats
Write R Markdown Files for Code Reproducibility
Data Structures in R
DataFrame
Data Processing and Transformation
The Apply Family of Functions
Useful Packages
Data Visualization
Line Charts
Histogram
Boxplot
Summary
Chapter 2: Exploratory Analysis of Data
Defining the Problem Statement
Understanding the Science Behind EDA
Exploratory Data Analysis
Univariate Analysis
Exploring Categorical Features
Bivariate Analysis
Studying the Relationship between Two Numeric Variables
Studying the Relationship between a Categorical and a Numeric Variable
Studying the Relationship between Two Categorical Variables
Multivariate Analysis
Validating Insights Using Statistical Tests
Categorical Dependent and Numeric/Continuous Independent Variables
Categorical Dependent and Categorical Independent Variables
Chapter 3: Introduction to Supervised Learning
Summary of the Beijing PM2.5 Dataset
Regression and Classification Problems
Machine Learning Workflow
Regression
Exploratory Data Analysis (EDA)
Classification
Evaluation Metrics
Chapter 4: Regression
Linear Regression
Model Diagnostics
Residual versus Fitted Plot
Normal Q-Q Plot
Scale-Location Plot
Residual versus Leverage
Improving the Model
Quantile Regression
Polynomial Regression
Ridge Regression
LASSO Regression
Elastic Net Regression
Poisson Regression
Cox Proportional-Hazards Regression Model
NCCTG Lung Cancer Data
Chapter 5: Classification
Getting Started with the Use Case
Classification Techniques for Supervised Learning
Logistic Regression
How Does Logistic Regression Work?
Evaluating Classification Models
What Metric Should You Choose?
Evaluating Logistic Regression
Decision Trees
XGBoost
Deep Neural Networks
Choosing the Right Model for Your Use Case
Chapter 6: Feature Selection and Dimensionality Reduction
Feature Engineering
One-Hot Encoding
Log Transformation
Feature Selection
Highly Correlated Variables
Feature Reduction
Variable Clustering
Linear Discriminant Analysis for Feature Reduction
Chapter 7: Model Improvements
Bias-Variance Trade-off
Underfitting and Overfitting
Defining a Sample Use Case
Cross-Validation
Holdout Approach/Validation
K-Fold Cross-Validation
Hold-One-Out Validation
Hyperparameter Optimization
Grid Search Optimization