Day 47

Aug 10, 2017

Very brief post today as I am really short on time.

Our team met at AHS today, but we quickly realized that without access to the logged power data, there’s not much we can do to start integrating the predictions.

I spent today starting to move the prediction process into its own class. My goal is to make the prediction process as clean and simple as possible. This is actually my first time using Python classes, so it’s probably not very elegant. Nonetheless, I’ve made some progress on a few of the internal functions:

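import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
# egz (used below for rolling_window) is a helper module not shown in this post.

class rf_model:
    def __init__(self, data, td_input, td_gap, td_output, time_attrs):
        self.data = data                      # series whose index has a fixed freq
        self.n = len(data)
        self.data_freq = data.index.freq      # sampling interval of the data
        self.td_input = td_input              # time span of history used as input
        self.td_gap = td_gap                  # time span between input and output
        self.td_output = td_output            # time span to predict
        self.time_attrs = time_attrs          # index attributes to use as features
        self.rf = RandomForestRegressor(n_estimators=10)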

    def _time_features(self, attrs):
        # One column per requested attribute of the data's index.
        df_f = pd.DataFrame(index=self.data.index)
        for attr in attrs:
            df_f[attr] = getattr(df_f.index, attr)
        return df_f

    def _index_windows(self):
        ixs = np.arange(self.n)
        # Convert each time span into a number of samples.
        input_size = int(self.td_input / self.data_freq)
        gap_size = int(self.td_gap / self.data_freq)
        output_size = int(self.td_output / self.data_freq)
        # Windows span input + gap + output and advance by one output length.
        ix_windows = egz.rolling_window(ixs,
                                        input_size + gap_size + output_size,
                                        output_size)
        # Split each window column-wise and discard the gap indices.
        X_ixs, _, y_ixs = np.split(ix_windows,
                                   [input_size, input_size + gap_size], 1)
        return X_ixs, y_ixs

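    def _training_arrays(self):
        X_ixs, y_ixs = self._index_windows()
        # .values replaces as_matrix(), which newer pandas versions no longer provide.
        time_feat = self._time_features(self.time_attrs).values
        # Keep one historical value per hour from each input window.
        step = int(pd.Timedelta(hours=1) / self.data_freq)
        X_hist = np.array([self.data[w] for w in X_ixs])[:, ::step]
        # Time features at the first timestamp of each output window.
        X_time = np.array([time_feat[w] for w in y_ixs[:, 0]])
        X = np.concatenate((X_hist, X_time), axis=1)
        y = np.array([self.data[w] for w in y_ixs])
        return X, y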
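
The rolling_window helper from egz is not shown in this post. Judging only from how it is called above, it appears to take an array, a window length, and a step, and to return one window per row. Under that assumption, a rough stand-in might look like the sketch below. The name rolling_window_sketch and the toy sizes are hypothetical and used only for illustration.

```python
import numpy as np

def rolling_window_sketch(a, window, step):
    """Hypothetical stand-in: each row is a length-`window` slice of `a`,
    with consecutive rows starting `step` elements apart."""
    a = np.asarray(a)
    starts = np.arange(0, len(a) - window + 1, step)
    # Broadcasting: each start offset plus 0..window-1 gives one row of positions.
    return a[starts[:, None] + np.arange(window)]

# Toy example: 10 samples, input of 3, gap of 1, output of 2.
ixs = np.arange(10)
windows = rolling_window_sketch(ixs, window=6, step=2)

# Same column-wise split as in _index_windows: input, gap (discarded), output.
X_ixs, _, y_ixs = np.split(windows, [3, 4], axis=1)
```

Stepping by the output length means the output windows tile the series without overlapping, while the input windows are free to overlap.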

You can pass in a list of time attributes to use as features, and the object pulls them from the training data’s index and combines them with the downsampled historical values.
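
To make the time-attribute idea concrete, here is a small standalone sketch of the same getattr approach outside the class. The index and the attribute list are made up for illustration; hour and dayofweek are standard attributes of a pandas DatetimeIndex.

```python
import pandas as pd

def time_features(index, attrs):
    """Build one column per named attribute of a DatetimeIndex."""
    df = pd.DataFrame(index=index)
    for attr in attrs:
        df[attr] = getattr(index, attr)
    return df

# Made-up hourly index, purely for illustration.
index = pd.date_range("2017-08-10", periods=4, freq=pd.Timedelta(hours=1))
features = time_features(index, ["hour", "dayofweek"])
```

Any attribute name the index exposes can go in the list, so adding a new time feature is just a matter of adding another string.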
