
Pandas Multicolumn Groupby Plotting

Problem: I have a pandas DataFrame that I would like to group by year-month and rule_name. Once grouped, I want to be able to get the counts of each of the rules during

Solution 1:

I was able to resolve this. The following code does the data processing and produces the plots I needed. I am posting it in case it helps someone else. It feels kind of janky, but it does the trick. Any suggestions to improve it would be appreciated.

Thanks SO.

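import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

sns.set()  # apply seaborn's default styling (older seaborn versions did this on import)

# df is the DataFrame from the question, with 'date' and 'rule_name' columns.
# drop_x and rename_y were not shown in the original post. The versions below
# are assumptions about what they did: clean up the suffixes added by pd.merge.
def drop_x(frame):
    # Drop the '_x' columns (the all-rules counts duplicated by the merge).
    frame.drop([c for c in frame.columns if c.endswith('_x')], axis=1, inplace=True)

def rename_y(frame):
    # Strip the '_y' suffix from the per-rule columns.
    frame.rename(columns=lambda c: c[:-2] if c.endswith('_y') else c, inplace=True)

# Row counts per year-month (e.g. '2015-03') across all rules.
# Assumes df['date'] holds datetime values.
df_all = df.groupby(df['date'].map(lambda x: x.strftime('%Y-%m'))).count()
df_all['rule_name_all_count'] = df_all['rule_name']

rule_names = df['rule_name'].unique().tolist()
for i in rule_names:
    print()
    print('dataframe for', i, ':')
    # Row counts per year-month for this rule only.
    df_temp = df[df['rule_name'] == i]
    df_temp = df_temp.groupby(df_temp['date'].map(lambda x: x.strftime('%Y-%m'))).count()
    df_merge = pd.merge(df_all, df_temp, right_index=True, left_index=True, how='left')
    drop_x(df_merge)
    rename_y(df_merge)
    df_merge.drop('date', axis=1, inplace=True)
    # Fraction of the month's rows that belong to this rule.
    df_merge['rule_name_%'] = df_merge['rule_name'].astype(float) / df_merge['rule_name_all_count'].astype(float)
    df_merge = df_merge.fillna(0)

    fig = plt.figure()
    ax = fig.add_subplot(111)
    ax2 = ax.twinx()  # second y-axis for the fraction

    # Plot the count on the left axis and the fraction on the right axis.
    df_merge['rule_name'].plot(ax=ax)
    df_merge['rule_name_%'].plot(ax=ax2)
    plt.show()
    print(df_temp)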
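
Since the answer asks for suggestions, here is one possible simplification, offered as a minimal sketch rather than a drop-in replacement. It groups by the year-month key and rule_name together, so the per-rule loop and the merge are not needed. The sample data and the names year_month, counts and shares are illustrative assumptions; the original DataFrame was not shown.

```python
import pandas as pd

# Hypothetical sample data standing in for the DataFrame from the question.
df = pd.DataFrame({
    'date': pd.to_datetime(['2015-01-05', '2015-01-20', '2015-01-25',
                            '2015-02-03', '2015-02-14']),
    'rule_name': ['rule_a', 'rule_b', 'rule_a', 'rule_a', 'rule_a'],
})

# Year-month key such as '2015-01'; assumes 'date' is a datetime column.
year_month = df['date'].dt.strftime('%Y-%m').rename('year_month')

# One row per month, one column per rule, each cell a row count.
counts = df.groupby([year_month, 'rule_name']).size().unstack(fill_value=0)

# Fraction of each month's rows that belong to each rule (rows sum to 1).
shares = counts.div(counts.sum(axis=1), axis=0)

print(counts)
print(shares)
```

With these two tables, a single rule could be plotted on the twin axes used above, for example counts['rule_a'].plot(ax=ax) and shares['rule_a'].plot(ax=ax2).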

