I am trying to create some rolling-window features after applying a group-by, like below:
import pandas as pd

my_data = [['2020-01-01', 'A' , '15'],
['2020-01-01', 'B' , '150'],
['2020-01-02', 'A' , '1'],
['2020-01-02', 'B' , '10'],
['2020-01-03', 'A' , '4'],
['2020-01-03', 'B' , '40'],
['2020-01-04', 'A' , '5'],
['2020-01-04', 'B' , '50'],
['2020-01-05', 'A' , '13'],
['2020-01-05', 'B' , '130'],
['2020-01-06', 'A' , '2'],
['2020-01-06', 'B' , '20'],
['2020-01-07', 'A' , '14'],
['2020-01-07', 'B' , '140'],
['2020-01-08', 'A' , '8'],
['2020-01-08', 'B' , '80'],
['2020-01-09', 'A' , '6'],
['2020-01-09', 'B' , '60'],
['2020-01-10', 'A' , '4'],
['2020-01-10', 'B' , '40'],
['2020-01-11', 'A' , '3'],
['2020-01-11', 'B' , '30'],
['2020-01-12', 'A' , '5'],
['2020-01-12', 'B' , '50'],
['2020-01-13', 'A' , '12'],
['2020-01-13', 'B' , '120'],
['2020-01-14', 'A' , '3'],
['2020-01-14', 'B' , '33'],
['2020-01-15', 'A' , '53'],
['2020-01-15', 'B' , '23'],
['2020-01-16', 'A' , '34'],
['2020-01-16', 'B' , '26'],
['2020-01-17', 'A' , '98'],
['2020-01-17', 'B' , '24'],
['2020-01-18', 'A' , '90'],
['2020-01-18', 'B' , '902'],
['2020-01-19', 'A' , '42'],
['2020-01-19', 'B' , '40'],
['2020-01-20', 'A' , '1'],
['2020-01-20', 'B' , '4'],
['2020-01-21', 'A' , '23'],
['2020-01-21', 'B' , '98'],
['2020-01-22', 'A' , '23'],
['2020-01-22', 'B' , '10'],
]
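df_1 = pd.DataFrame(my_data, columns=['Date', 'type', 'value'])
# The values above are strings; convert them so that rolling().mean() can aggregate them
df_1['value'] = pd.to_numeric(df_1['value'])
# dt.weekday encodes Monday as 0 and Sunday as 6
df_1['weekday'] = pd.to_datetime(df_1['Date']).dt.weekday
df_1.set_index(['Date'], inplace=True)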
Here I group based on the type and then merge the results back into the original frame:
_grouped = df_1.groupby(["type"])['value'].rolling(2).mean()
df_2d_mean = pd.DataFrame(_grouped)
df_2d_mean = df_2d_mean.rename(columns={"value":"Mean 2D"})
df_2d_mean
result_df = df_1.copy()
result_df = result_df.merge(df_2d_mean, on=['Date', 'type'])
result_df
And the result looks like this:
           type  value  weekday  Mean 2D
Date
2020-01-01    A     15        2      NaN
2020-01-01    B    150        2      NaN
2020-01-02    A      1        3      8.0
2020-01-02    B     10        3     80.0
2020-01-03    A      4        4      2.5
2020-01-03    B     40        4     25.0
2020-01-04    A      5        5      4.5
2020-01-04    B     50        5     45.0
2020-01-05    A     13        6      9.0
2020-01-05    B    130        6     90.0
2020-01-06    A      2        0      7.5
2020-01-06    B     20        0     75.0
....
I would like to create 7 "Mean 2D" features, one per weekday ("Mean 2D Monday", "Mean 2D Tuesday", etc.), each showing the rolling-window mean of only the previous 2 Mondays, the previous 2 Tuesdays, and so on. In other words, I would like to filter on weekday and calculate the rolling-window mean separately for each weekday, within each type.
Could you please suggest how to do this?
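To make the target concrete, here is a minimal sketch of one way this might be done. It rests on several assumptions: the small `df` frame and its values are made up for illustration (not the data above), "previous 2" is taken to include the current row (adding `.shift(1)` inside the lambda would exclude it), and the names `Mean 2D weekday` and `day_names` are hypothetical. The idea is to group by both `type` and `weekday`, so that each rolling window only ever sees rows from the same weekday.

```python
import pandas as pd

# Hypothetical sample: 3 weeks of daily values for two types (not the data above).
dates = pd.date_range("2020-01-01", periods=21, freq="D")
df = pd.DataFrame({
    "Date": dates.repeat(2),
    "type": ["A", "B"] * len(dates),
    "value": list(range(2 * len(dates))),
})
df["weekday"] = df["Date"].dt.weekday  # Monday=0 ... Sunday=6

# Sort so that each (type, weekday) group is in chronological order.
df = df.sort_values(["type", "Date"])

# Rolling mean over the last 2 rows that share the same type AND weekday.
# transform() returns a result aligned with df's index, so no merge is needed.
df["Mean 2D weekday"] = (
    df.groupby(["type", "weekday"])["value"]
      .transform(lambda s: s.rolling(2).mean())
)

# Optionally spread the result into one column per weekday;
# rows that fall on a different weekday are left as NaN.
day_names = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]
for i, name in enumerate(day_names):
    df[f"Mean 2D {name}"] = df["Mean 2D weekday"].where(df["weekday"] == i)
```

Whether the per-weekday columns should stay NaN on other days or be forward-filled within each type depends on how the features will be used; the sketch leaves them as NaN.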