Regression: Core Knowledge Points

Definition

  • Regression is a statistical method used to predict a continuous target variable from one or more independent variables.

Common Types

  • Linear Regression
    Simple linear regression predicts the target variable by fitting a straight line to a single independent variable.

  • Multiple Regression
    Multiple linear regression extends linear regression by using multiple independent variables.

  • Polynomial Regression
    Fits the data with a polynomial function of the inputs to capture non-linear relationships.

  • Ridge Regression
    Adds an L2 regularization term (a penalty on the squared coefficients) to reduce overfitting.

  • Lasso Regression
    Adds an L1 regularization term (a penalty on the absolute coefficients), which can shrink some coefficients to exactly zero and so performs feature selection.

  • Elastic Net Regression
    Combines the L1 and L2 penalties, and with them the benefits of both.
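
To make the regularization terms above concrete, here is a minimal single-feature sketch in plain Python. The function names (`centered_sums`, `fit_simple_ols`, `fit_simple_ridge`, `fit_simple_lasso`), the penalty strength `alpha`, and the toy data are assumptions made for illustration. The sketch also assumes the penalty is added to the unscaled sum of squared errors and that the intercept is left unpenalized; libraries often scale the objective differently, so their `alpha` values are not directly comparable.

```python
import math


def centered_sums(x, y):
    """Return (x_mean, y_mean, sxy, sxx) for paired samples x and y."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    return x_mean, y_mean, sxy, sxx


def fit_simple_ols(x, y):
    """Ordinary least squares: minimize sum((y - b0 - b1*x)^2)."""
    x_mean, y_mean, sxy, sxx = centered_sums(x, y)
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope


def fit_simple_ridge(x, y, alpha):
    """L2 penalty: minimize sum((y - b0 - b1*x)^2) + alpha * b1^2."""
    x_mean, y_mean, sxy, sxx = centered_sums(x, y)
    slope = sxy / (sxx + alpha)  # the penalty inflates the denominator
    return y_mean - slope * x_mean, slope


def fit_simple_lasso(x, y, alpha):
    """L1 penalty: minimize sum((y - b0 - b1*x)^2) + alpha * |b1|."""
    x_mean, y_mean, sxy, sxx = centered_sums(x, y)
    # Soft-thresholding: the slope is exactly zero once alpha / 2 >= |sxy|.
    shrunk = max(abs(sxy) - alpha / 2, 0.0)
    slope = math.copysign(shrunk, sxy) / sxx
    return y_mean - slope * x_mean, slope


x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]  # toy data lying exactly on y = 2x

print(fit_simple_ols(x, y))          # (0.0, 2.0)
print(fit_simple_ridge(x, y, 10.0))  # (3.0, 1.0)
print(fit_simple_lasso(x, y, 50.0))  # (6.0, 0.0)
```

Each function returns `(intercept, slope)`. On this toy data the L2 penalty shrinks the slope toward zero without reaching it, while a large enough L1 penalty sets it to exactly zero, which is the mechanism behind lasso's feature selection.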

Key Concepts

  • Assumptions
    Linearity of the relationship, independence of the errors, homoscedasticity (constant error variance), and normality of the errors.

  • Residuals
    The differences between actual and predicted values (actual minus predicted).

  • R-squared
    Proportion of the variance in the target variable that is explained by the model.

  • Adjusted R-squared
    Adjusts R-squared for the number of predictors, so that adding a variable raises it only when the variable improves the fit by more than chance would.

  • Mean Squared Error (MSE)
    Average of the squared residuals, i.e. of the squared differences between actual and predicted values.

  • Root Mean Squared Error (RMSE)
    Square root of the MSE, giving an error metric in the same units as the target variable.

  • Mean Absolute Error (MAE)
    Average of the absolute residuals, i.e. of the absolute differences between actual and predicted values.
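
The metrics above can be sketched in a few lines of plain Python. The function name `regression_metrics`, its `n_predictors` argument, and the toy values are illustrative assumptions. R-squared is computed as 1 - SS_res / SS_tot and adjusted R-squared as 1 - (1 - R²)(n - 1) / (n - p - 1), where n is the number of samples and p the number of predictors.

```python
import math


def regression_metrics(y_true, y_pred, n_predictors):
    """Compute common regression metrics from actual and predicted values."""
    n = len(y_true)
    # Residual = actual minus predicted.
    residuals = [yt - yp for yt, yp in zip(y_true, y_pred)]

    ss_res = sum(r * r for r in residuals)           # residual sum of squares
    y_mean = sum(y_true) / n
    ss_tot = sum((yt - y_mean) ** 2 for yt in y_true)  # total sum of squares

    mse = ss_res / n
    r2 = 1 - ss_res / ss_tot
    # Penalize R-squared for the number of predictors p = n_predictors.
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - n_predictors - 1)

    return {
        "mse": mse,
        "rmse": math.sqrt(mse),
        "mae": sum(abs(r) for r in residuals) / n,
        "r2": r2,
        "adjusted_r2": adj_r2,
    }


y_true = [3, 5, 7, 9]
y_pred = [2, 5, 8, 9]  # hypothetical predictions from a one-predictor model
for name, value in regression_metrics(y_true, y_pred, n_predictors=1).items():
    print(f"{name}: {value:.4f}")
```

For these toy values the residuals are 1, 0, -1, 0, so MSE and MAE are both 0.5, R-squared is 0.9, and adjusted R-squared is about 0.85. The adjusted value is lower because one predictor is charged against only four samples.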

Advantages

  • Easy to understand and interpret
  • Computationally efficient
  • Effective for linear relationships

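Disadvantages

  • Sensitive to outliers, because squared-error loss weights large residuals heavily
  • A plain linear model cannot capture non-linear relationships unless transformed features (e.g. polynomial terms) are added
  • Results depend on the assumptions being satisfied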
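
The first two limitations can be seen on toy data. The sketch below is a rough illustration in plain Python; the helper `fit_line` and all data values are assumptions chosen so that the arithmetic is easy to follow.

```python
def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line through (x, y)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope


# 1) Outlier sensitivity: a single extreme point changes the fit drastically.
x = [1, 2, 3, 4, 5]
clean = [2, 4, 6, 8, 10]          # exactly y = 2x
with_outlier = [2, 4, 6, 8, 30]   # last value replaced by an outlier
print(fit_line(x, clean))         # (0.0, 2.0)
print(fit_line(x, with_outlier))  # (-8.0, 6.0)

# 2) Non-linearity: a straight line in u misses y = u^2 entirely,
#    but a line in the transformed feature u^2 fits it exactly.
u = [-2, -1, 0, 1, 2]
v = [ui ** 2 for ui in u]
u_squared = [ui ** 2 for ui in u]
print(fit_line(u, v))          # (2.0, 0.0)
print(fit_line(u_squared, v))  # (0.0, 1.0)
```

In the first case one point triples the slope. In the second, the straight-line fit has slope zero and simply predicts the mean, while regressing on the squared feature recovers the relationship, which is the idea behind polynomial regression.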

Applications

  • Price Prediction
  • Sales Forecasting
  • Risk Assessment
  • Medical Cost Prediction