Linear regression in Python (95% confidence interval)

The at sign @ can be used in place of .dot() (inner product).

import numpy as np

np.array([1, 2]) @ np.array([2, 3])
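A quick check that @ and .dot() agree for 1-D arrays (for 1-D inputs both return the scalar inner product):

```python
import numpy as np

a = np.array([1, 2])
b = np.array([2, 3])

# For 1-D arrays, @ and .dot() both compute the inner product
print(a @ b)     # 8  (1*2 + 2*3)
print(a.dot(b))  # 8
```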

Simple linear regression

import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([167, 168, 168, 183, 170, 165, 163, 173, 177, 170])
y = np.array([59, 58, 65, 76, 62, 53, 59, 70, 62, 62])

X_tr = X[:, np.newaxis]

reg = LinearRegression().fit(X_tr, y)
print(reg.coef_, reg.intercept_)
# (array([0.88685209]), -88.51959544879895)

# Residuals
pred_y = reg.predict(X_tr)
_e = y - pred_y
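The residuals also give R² directly via R² = 1 − SS_res / SS_tot; a self-contained sketch refitting the same data to show this matches the R-squared reported by the OLS summary further down:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([167, 168, 168, 183, 170, 165, 163, 173, 177, 170])
y = np.array([59, 58, 65, 76, 62, 53, 59, 70, 62, 62])

reg = LinearRegression().fit(X[:, np.newaxis], y)
e = y - reg.predict(X[:, np.newaxis])  # residuals

# R^2 = 1 - SS_res / SS_tot
r2 = 1 - (e @ e) / ((y - y.mean()) @ (y - y.mean()))
print(r2)  # ~0.654, same as reg.score() and the OLS summary
```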

import matplotlib.pyplot as plt

xr = np.array([160, 190])
plt.scatter(X, y)
plt.plot(xr, reg.coef_[0] * xr + reg.intercept_)
import statsmodels.api as sm

x = sm.add_constant(X)
model = sm.OLS(y, x).fit()

# Confidence intervals (lower and upper bound, one row per coefficient)
_conf_int = model.conf_int(alpha=0.05, cols=None)
print(_conf_int)
# array([[-178.1483189 ,    1.109128  ],
#        [   0.36114827,    1.4125559 ]])

# summary
# summary = model.get_prediction(x).summary_frame(alpha=0.05)
model.summary()
                            OLS Regression Results
==============================================================================
Dep. Variable:                      y   R-squared:                       0.654
Model:                            OLS   Adj. R-squared:                  0.611
Method:                 Least Squares   F-statistic:                     15.13
Date:                Sun, 24 Mar 2019   Prob (F-statistic):            0.00461
Time:                        13:09:36   Log-Likelihood:                -27.073
No. Observations:                  10   AIC:                             58.15
Df Residuals:                       8   BIC:                             58.75
Df Model:                           1
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const        -88.5196     38.868     -2.277      0.052    -178.148       1.109
x1             0.8869      0.228      3.890      0.005       0.361       1.413
==============================================================================
Omnibus:                        0.490   Durbin-Watson:                   2.445
Prob(Omnibus):                  0.783   Jarque-Bera (JB):                0.527
Skew:                          -0.279   Prob(JB):                        0.768
Kurtosis:                       2.024   Cond. No.                     5.17e+03
==============================================================================
import seaborn as sns
sns.regplot(x=X, y=y, ci=95);  # keyword args required in recent seaborn

www.statsmodels.org

seaborn.pydata.org