Linear regression in Python (95% confidence intervals)
The at sign `@` apparently works as a replacement for `.dot()` (inner product).
```python
import numpy as np

np.array([1, 2]) @ np.array([2, 3])  # inner product: 1*2 + 2*3 = 8
```
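As a side note (my addition, not in the original memo): `@` is the matrix-multiplication operator from PEP 465, so for 2-D arrays it also agrees with `.dot()`:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
print(np.array_equal(A @ B, A.dot(B)))  # True: @ and .dot() agree for 2-D arrays
```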
Simple (single-variable) regression
```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

X = np.array([167, 168, 168, 183, 170, 165, 163, 173, 177, 170])
y = np.array([59, 58, 65, 76, 62, 53, 59, 70, 62, 62])
X_tr = X[:, np.newaxis]  # reshape (10,) -> (10, 1); sklearn expects 2-D features
reg = LinearRegression().fit(X_tr, y)
print(reg.coef_, reg.intercept_)
# [0.88685209] -88.51959544879895

# residuals
pred_y = reg.predict(X_tr)
_e = y - pred_y

# scatter plot with the fitted line
xr = np.array([160, 190])
plt.scatter(X, y)
plt.plot(xr, reg.coef_ * xr + reg.intercept_)
plt.show()
```
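As a sanity check (my addition), the same slope and intercept fall out of the closed-form least-squares formulas: slope = Cov(X, y) / Var(X), intercept = mean(y) - slope * mean(X).

```python
import numpy as np

X = np.array([167, 168, 168, 183, 170, 165, 163, 173, 177, 170])
y = np.array([59, 58, 65, 76, 62, 53, 59, 70, 62, 62])

# slope = Cov(X, y) / Var(X); use ddof=1 in both so the normalizations cancel
slope = np.cov(X, y, ddof=1)[0, 1] / np.var(X, ddof=1)
intercept = y.mean() - slope * X.mean()
print(slope, intercept)  # approx. 0.88685, -88.51960 (same as sklearn above)
```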
```python
import statsmodels.api as sm

x = sm.add_constant(X)  # prepend a column of 1s for the intercept term
model = sm.OLS(y, x).fit()

# confidence intervals (lower and upper bound for each coefficient)
_conf_int = model.conf_int(alpha=0.05, cols=None)
print(_conf_int)
# array([[-178.1483189 ,    1.109128  ],
#        [   0.36114827,    1.4125559 ]])

# summary
# summary = model.get_prediction(x).summary_frame(alpha=0.05)
model.summary()
```
```
                            OLS Regression Results
==============================================================================
Dep. Variable:                      y   R-squared:                       0.654
Model:                            OLS   Adj. R-squared:                  0.611
Method:                 Least Squares   F-statistic:                     15.13
Date:                Sun, 24 Mar 2019   Prob (F-statistic):            0.00461
Time:                        13:09:36   Log-Likelihood:                -27.073
No. Observations:                  10   AIC:                             58.15
Df Residuals:                       8   BIC:                             58.75
Df Model:                           1
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const        -88.5196     38.868     -2.277      0.052    -178.148       1.109
x1             0.8869      0.228      3.890      0.005       0.361       1.413
==============================================================================
Omnibus:                        0.490   Durbin-Watson:                   2.445
Prob(Omnibus):                  0.783   Jarque-Bera (JB):                0.527
Skew:                          -0.279   Prob(JB):                        0.768
Kurtosis:                       2.024   Cond. No.                     5.17e+03
==============================================================================
```
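The intervals in the `[0.025 0.975]` columns can be reproduced by hand as coef ± t(0.975, df_resid) * std err. A small sketch (my addition, reusing `model` from above):

```python
from scipy import stats

# 95% CI for the slope x1
t_crit = stats.t.ppf(0.975, df=model.df_resid)  # df_resid = 8, t_crit approx. 2.306
lower = model.params[1] - t_crit * model.bse[1]
upper = model.params[1] + t_crit * model.bse[1]
print(lower, upper)  # approx. 0.361, 1.413 (matches the conf_int() row for x1)
```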
- Plotting the confidence interval
```python
import seaborn as sns

sns.regplot(x=X, y=y, ci=95);
```
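The same 95% band can also be drawn without seaborn, via the `get_prediction()` call that is commented out above. A sketch (my addition), reusing `model`, `X`, and `y` from the statsmodels section:

```python
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm

# evaluate the fitted line and its 95% CI on a fine grid
x_grid = np.linspace(160, 190, 100)
pred = model.get_prediction(sm.add_constant(x_grid)).summary_frame(alpha=0.05)

plt.scatter(X, y)
plt.plot(x_grid, pred["mean"])
plt.fill_between(x_grid, pred["mean_ci_lower"], pred["mean_ci_upper"], alpha=0.3)
plt.show()
```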