python - Pandas: apply a groupby function to every column efficiently



In pandas you can apply a groupby function to every column in a dataframe, as in:

pt=df.groupby(['group']).sum().reset_index() 
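For example, with a small hypothetical frame (a miniature of the one used below), the per-group sum collapses each group to a single row:

```python
import pandas as pd

# Hypothetical miniature of the question's dataframe
df = pd.DataFrame({'group': ['w', 'w', 'e'],
                   'a': [0, 1, 5],
                   'b': [1, 0, 5]})

# Sum every non-key column within each group
pt = df.groupby(['group']).sum().reset_index()
print(pt)
```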

Let's say I want to apply a lambda function, lambda x: (0 < x).sum(), to count the cells that have a nonzero value in them, and also include the count of total items in each group. Is there a more efficient way to apply this to all columns, other than repeating the code:

import pandas as pd

df = pd.DataFrame({'group': ['w', 'w', 'w', 'e', 'e', 'e', 'n'],
                   'a': [0, 1, 5, 0, 1, 5, 7],
                   'b': [1, 0, 5, 0, 0, 2, 0],
                   'c': [1, 1, 5, 0, 0, 5, 0],
                   'total': [2, 2, 15, 0, 1, 12, 7]})

# check how many nonzero items are present in each group
grp = df.groupby(['group'])
pt1 = grp['a'].apply(lambda x: (0 < x).sum()).reset_index()
pt2 = grp['b'].apply(lambda x: (0 < x).sum()).reset_index()
pt3 = grp['c'].apply(lambda x: (0 < x).sum()).reset_index()

pct = pd.merge(pt1, pt2, on=['group'])
pct = pd.merge(pt3, pct, on=['group'])

# get the total items and merge with the counts
pt = df.groupby(['group'])['total'].count().reset_index()
pct = pd.merge(pt, pct, on=['group'])

Output:

  group  total  c  a  b
0     e      3  1  2  1
1     n      1  0  1  0
2     w      3  3  2  2

What is a more efficient way to write this for n columns?

The cleanest way I can think of is this:

(df > 0).groupby(df['group']).agg({'a': 'sum', 'b': 'sum', 'c': 'sum', 'total': 'count'})

Out:
         c  total    b    a
group                      
e      1.0      3  1.0  2.0
n      0.0      1  0.0  1.0
w      3.0      3  2.0  2.0

You can sort the columns and cast to int if you want:

((df > 0).groupby(df['group'])
         .agg({'a': 'sum', 'b': 'sum', 'c': 'sum', 'total': 'count'})
         .sort_index(axis=1)
         .astype('int'))

Out:
       a  b  c  total
group                
e      2  1  1      3
n      1  0  0      1
w      2  2  3      3
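To make this scale to n columns without listing each one by hand, the agg dict can be built programmatically. A sketch, assuming that every column except group and total should be counted; note that recent pandas versions refuse to compare the string group column with 0, so the sketch drops it before the comparison:

```python
import pandas as pd

df = pd.DataFrame({'group': ['w', 'w', 'w', 'e', 'e', 'e', 'n'],
                   'a': [0, 1, 5, 0, 1, 5, 7],
                   'b': [1, 0, 5, 0, 0, 2, 0],
                   'c': [1, 1, 5, 0, 0, 5, 0],
                   'total': [2, 2, 15, 0, 1, 12, 7]})

# Build the agg spec once: 'sum' for every value column,
# 'count' for the total column, so nothing is repeated per column.
value_cols = [c for c in df.columns if c not in ('group', 'total')]
spec = {c: 'sum' for c in value_cols}
spec['total'] = 'count'

# Drop the string key before comparing with 0, then group by the
# original key column (aligned on the index).
out = ((df.drop(columns='group') > 0)
       .groupby(df['group'])
       .agg(spec)
       .sort_index(axis=1)
       .astype('int'))
print(out)
```

Summing booleans counts the True entries, so each 'sum' is exactly the nonzero-cell count per group, while 'count' on total gives the group size.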
