Brussels. Step 1.c Micro-level Electricity demand model

In [1]:
import datetime; print(datetime.datetime.now())
2018-03-26 02:08:35.436887

Notebook Abstract:

A simple micro-level electricity demand model. The electricity demand model used available micro level data for the estimation of regression coefficients. This regression coefficients are used to define a table model. The electricity table model is used for the construction of a proxy micro level sample data set.

Prior electricity demand model

In [2]:
import statsmodels.api as sm
import pandas as pd
import numpy as np
from smum._scripts.micro import compute_categories, change_index
/usr/lib/python3.6/site-packages/statsmodels-0.8.0-py3.6-linux-x86_64.egg/statsmodels/compat/pandas.py:56: FutureWarning: The pandas.core.datetools module is deprecated and will be removed in a future version. Please use the pandas.tseries module instead.
  from pandas.core import datetools
In [3]:
electricity_data = pd.read_csv('data/elec_data_be.csv', index_col=0)
#formula = "KWH ~ TOTSQMT_EN + CDD30YR + HDD30YR + MONEYPY + NHSLDMEM +\
# C(TYPEHUQ, Treatment(reference='House')) + YEARMADE + ELWARM + ELWATER + ELFOOD"
formula = "KWH ~ TOTSQMT_EN + CDD30YR + HDD30YR + MONEYPY + NHSLDMEM +\
 C(TYPEHUQ) + YEARMADE + ELWARM + ELWATER + ELFOOD"
In [4]:
electricity_data.head()
Out[4]:
TYPEHUQ NWEIGHT HDD65 CDD65 HDD30YR CDD30YR Climate_Region_Pub UR KOWNRENT YEARMADE ... DOLLARLP GALLONFO DOLLARFO GALLONKER BTUKER DOLLARKER BTUWOOD TOTALBTU TOTALDOL TOTSQMT_EN
DOEID
1 House 2471.68 4742.0 1080.0 4953.0 1271.0 4 U Owned 2004 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 63006.0 1315.0 434.775
2 House 8599.17 2662.0 199.0 2688.0 143.0 5 U Rented 1998 ... 0.0 0.0 0.0 0.0 0.0 0.0 20000.0 103460.0 1293.0 254.448
3 Apartment 8969.92 6233.0 505.0 5741.0 829.0 1 U Rented 1965 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 58716.0 1327.0 49.104
4 House 18003.64 6034.0 672.0 5781.0 868.0 1 U Owned 1985 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 76401.0 1398.0 150.939
5 House 5999.61 5388.0 702.0 5313.0 797.0 1 U Owned 1983 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 59809.0 1558.0 177.816

5 rows × 51 columns

In [5]:
model = sm.WLS.from_formula(formula, electricity_data, weights=electricity_data.NWEIGHT)
model_results = model.fit()
#params = model_results.params
#bse = model_results.bse
params = change_index(model_results.params)
bse = change_index(model_results.bse)
In [6]:
model_results.summary()
Out[6]:
WLS Regression Results
Dep. Variable: KWH R-squared: 0.451
Model: WLS Adj. R-squared: 0.450
Method: Least Squares F-statistic: 946.8
Date: Mon, 26 Mar 2018 Prob (F-statistic): 0.00
Time: 02:08:38 Log-Likelihood: -1.1684e+05
No. Observations: 11542 AIC: 2.337e+05
Df Residuals: 11531 BIC: 2.338e+05
Df Model: 10
Covariance Type: nonrobust
coef std err t P>|t| [0.025 0.975]
Intercept -2.996e+04 4347.551 -6.891 0.000 -3.85e+04 -2.14e+04
C(TYPEHUQ)[T.House] 2752.5037 138.890 19.818 0.000 2480.256 3024.751
TOTSQMT_EN 16.7486 0.524 31.943 0.000 15.721 17.776
CDD30YR 1.7152 0.089 19.301 0.000 1.541 1.889
HDD30YR 0.1985 0.041 4.886 0.000 0.119 0.278
MONEYPY 0.0323 0.003 12.110 0.000 0.027 0.038
NHSLDMEM 1037.7339 36.334 28.561 0.000 966.512 1108.955
YEARMADE 12.9729 2.215 5.858 0.000 8.632 17.314
ELWARM 2296.3175 119.387 19.234 0.000 2062.298 2530.337
ELWATER 3660.4636 130.959 27.951 0.000 3403.761 3917.166
ELFOOD 1396.9598 123.022 11.355 0.000 1155.816 1638.104
Omnibus: 8161.961 Durbin-Watson: 1.969
Prob(Omnibus): 0.000 Jarque-Bera (JB): 474206.898
Skew: 2.788 Prob(JB): 0.00
Kurtosis: 33.902 Cond. No. 3.26e+06
In [7]:
elec = pd.concat([params, bse], axis=1)
elec.columns = ['co_mu', 'co_sd']
In [8]:
elec.index
Out[8]:
Index(['Intercept', 'TYPEHUQ_House', 'TOTSQMT_EN', 'CDD30YR', 'HDD30YR',
       'MONEYPY', 'NHSLDMEM', 'YEARMADE', 'ELWARM', 'ELWATER', 'ELFOOD'],
      dtype='object')
In [9]:
elec.loc[:, 'p'] = np.nan
elec.loc[:,'mu'] = np.nan
elec.loc[:,'sd'] = np.nan
elec.loc[:, 'dis'] = 'Normal'
elec.loc[['CDD30YR','HDD30YR'], 'dis'] = 'Deterministic'
elec.loc['MONEYPY', 'dis'] = 'None'
#elec.loc[["TYPEHUQ, Treatmentreference='House'_Apartment", 'ELWARM', 'ELWATER', 'ELFOOD'], 'dis'] = 'None'
elec.loc[["TYPEHUQ_House", 'ELWARM', 'ELWATER', 'ELFOOD'], 'dis'] = 'None'
elec.loc[['NHSLDMEM', 'YEARMADE'], 'dis'] = 'None'
elec.loc[:,'ub'] = np.inf
elec.loc[:,'lb'] = 0
elec.loc['NHSLDMEM', 'lb'] = 1
elec.loc['NHSLDMEM', 'ub'] = 8
elec.loc['YEARMADE', 'lb'] = 1800
elec.loc['YEARMADE', 'ub'] = 2035
elec.loc['Intercept', 'lb'] = -np.inf
elec.loc['Intercept', 'p'] = elec.loc['Intercept', 'co_mu']
elec.loc['Intercept', 'dis'] = 'Deterministic'
elec.loc['Intercept', 'co_mu'] = np.nan
elec.loc['Intercept', 'co_sd'] = np.nan
In [10]:
elec.index = [
    'e_Intercept',
    'e_ConstructionType',
    'e_sqm',
    'e_CDD',
    'e_HDD',
    'e_Income',
    'e_HHSize',
    'e_ConstructionYear',
    'e_ELWARM',
    'e_ELWATER',
    'e_ELFOOD'
]
In [11]:
skip = [
    'e_ELWARM',
    'e_ELWATER',
    'e_ELFOOD'
]
elec = elec.loc[[i for i in elec.index if i not in skip]]
In [12]:
elec.to_csv('data/table_elec.csv')
In [13]:
elec
Out[13]:
co_mu co_sd p mu sd dis ub lb
e_Intercept NaN NaN -29960.626896 NaN NaN Deterministic inf -inf
e_ConstructionType 2752.503665 138.889881 NaN NaN NaN None inf 0.000000
e_sqm 16.748580 0.524321 NaN NaN NaN Normal inf 0.000000
e_CDD 1.715248 0.088868 NaN NaN NaN Deterministic inf 0.000000
e_HDD 0.198506 0.040625 NaN NaN NaN Deterministic inf 0.000000
e_Income 0.032290 0.002666 NaN NaN NaN None inf 0.000000
e_HHSize 1037.733889 36.334410 NaN NaN NaN None 8.000000 1.000000
e_ConstructionYear 12.972898 2.214585 NaN NaN NaN None 2035.000000 1800.000000