Pandas: word frequency by months

Question

I'm exploring a database built like this

So it's basically a collection of YouTube comments that I have started to analyse. I've managed to add a column counting the number of words per comment, as well as another one for n-grams (which I intend to explore later). I've also managed to get a list of the 10 most frequent words for the whole period, but I've been unable to get the word frequency by month: for each month, I would like to get a list of the 10 most frequent words.

Thanks for your help!

Don't paste images as input. Anyway, to get this task done, filter your dataframe by month, then take the top 10 words from the resulting series. – Mohamed Thasin ah, Nov 03 '18 at 18:05

Mohamed Thasin ah · Accepted Answer · 2018-11-03T18:23:50.113

3

You can try this:

import pandas as pd
from collections import Counter

Option-1:

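df = df.set_index(df['at'])  # index by timestamp so Grouper can bin by date
# u is the month label, v is that month's rows (pandas 2.2+ prefers freq="ME")
for u, v in df.groupby(pd.Grouper(freq="M")):
    # flatten the per-comment word lists into one list
    words = sum(v['text'].str.split(' ').values.tolist(), [])
    c = Counter(words)
    print(c.most_common(10))  # list of (word, count) pairs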

Option-2:

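df = df.set_index(df['at'])  # index by timestamp so Grouper can bin by date
# u is the month label, v is that month's rows (pandas 2.2+ prefers freq="ME")
for u, v in df.groupby(pd.Grouper(freq="M")):
    # flatten the per-comment word lists into one list
    words = sum(v['text'].str.split(' ').values.tolist(), [])
    top_words = pd.Series(words).value_counts()[:10]
    print(top_words.index.tolist())  # words only, most frequent first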
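
If you would rather collect the results than print them, a variant along the following lines may help. This is only a sketch: the sample data and the helper name `top_words_by_month` are made up for illustration, and it groups on `dt.to_period("M")` instead of `pd.Grouper`, which is a swapped-in technique rather than what the options above do. It also lowercases the text and skips missing comments, which you may or may not want.

```python
from collections import Counter

import pandas as pd

# Hypothetical sample data; only the column names 'at' and 'text' come from the answer.
df = pd.DataFrame({
    "at": pd.to_datetime(["2018-01-05", "2018-01-20", "2018-02-03"]),
    "text": ["great video great", "great video", "nice nice song"],
})


def top_words_by_month(frame, n=10):
    """Return {'YYYY-MM': [(word, count), ...]} with the n most common words per month."""
    result = {}
    # dt.to_period("M") labels every row with its calendar month
    for month, group in frame.groupby(frame["at"].dt.to_period("M")):
        counter = Counter()
        for comment in group["text"].dropna():  # skip missing comments
            counter.update(comment.lower().split())  # split on any whitespace
        result[str(month)] = counter.most_common(n)
    return result


top = top_words_by_month(df, n=2)
print(top)
```

Updating a single `Counter` per month also avoids building one large list with `sum(..., [])`, which can get slow when a month has many comments.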

edited Nov 03 '18 at 18:23

answered Nov 03 '18 at 18:16


This is great! Both are working very well! Thanks a lot! Just so I can understand it better, what do u and v stand for in your loop? I don't really understand how it works... – Pauline Ziserman Nov 03 '18 at 18:54

@PaulineZiserman - `pd.Grouper(freq="M")` groups your dataframe by month, i.e., each iteration of the loop handles one month of data. `v` contains the filtered dataframe for that month and `u` contains the name of the group. For more details, see https://stackoverflow.com/questions/27405483/how-to-loop-over-grouped-pandas-dataframe – Mohamed Thasin ah Nov 03 '18 at 18:59
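
To see what `u` and `v` hold on each pass, a small experiment like the one below may help. It is a sketch on made-up data; it uses `pd.Grouper(key="at", freq="MS")` (month start) as an assumption, so `u` here is the first day of each month rather than the month-end label that `freq="M"` produces.

```python
import pandas as pd

# Hypothetical sample data; only the column names 'at' and 'text' come from the answer.
df = pd.DataFrame({
    "at": pd.to_datetime(["2018-01-05", "2018-01-20", "2018-02-03"]),
    "text": ["first comment", "second comment", "third comment"],
})

groups = []
# Iterating over a groupby yields (key, sub-DataFrame) pairs
for u, v in df.groupby(pd.Grouper(key="at", freq="MS")):
    # u is a Timestamp naming the month, v holds only that month's rows
    groups.append((u.strftime("%Y-%m"), len(v)))
    print(u, type(v).__name__, v["text"].tolist())
```

The names `u` and `v` are arbitrary; something like `month, month_df` would work the same way and may be easier to read.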
