I want to use the 20 newsgroups dataset to test an algorithm and analyze the significant words for each group.
I am using the copy of the dataset on the website provided by the University of Toronto, but I can't find the corresponding vocabulary file there. Could anyone point me to it?
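
If no such file exists, I suppose a fallback would be to build the vocabulary myself from the raw text. Below is a minimal sketch of what I have in mind, assuming the posts are already loaded as a list of strings. The names `build_vocabulary`, `word_index`, and `min_count` are my own placeholders and not part of the dataset, and the tokenization (lowercase alphabetic runs only) is just a guess that may not match the word ids used in the preprocessed files.

```python
import re
from collections import Counter


def build_vocabulary(docs, min_count=1):
    """Return a sorted list of words that occur at least min_count times.

    docs is assumed to be a list of strings, one per newsgroup post.
    """
    counts = Counter()
    for doc in docs:
        # Hypothetical tokenization: lowercase, keep alphabetic runs only.
        counts.update(re.findall(r"[a-z]+", doc.lower()))
    return sorted(word for word, n in counts.items() if n >= min_count)


def word_index(vocab):
    """Map each vocabulary word to its position in the sorted list."""
    return {word: i for i, word in enumerate(vocab)}
```

With something like this, `word_index(build_vocabulary(docs))` would give a word-to-id mapping, but the ids would only be consistent with counts I compute myself, not with an existing preprocessed matrix.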