I have 4 different dictionaries, d1, d2, d3 and d4, each mapping a term to how often it occurs in my text. Each dictionary corresponds to one entity type, which is why they are kept separate. I can sort each dictionary and find its most frequent terms. Now that I have the most frequent terms for each entity type, I want to find the most frequent terms across all of the entity types.

I can't sort the combined result in the same way, because it is now a list rather than a dictionary, and I also can't simply add the original dictionaries together.

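def t(tokens, tags, label):
    # Count each run of consecutive tokens whose tag equals `label`.
    entities = {}
    in_entity = False
    for token, tag in zip(tokens, tags):
        if tag == label:
            if in_entity:
                # Extend the entity currently being built.
                entity += " " + token
            else:
                # Start a new entity.
                entity = token
                in_entity = True
        elif in_entity:
            # The run has ended, so record it.
            entities[entity] = entities.get(entity, 0) + 1
            in_entity = False
    if in_entity:
        # Record an entity that runs to the end of the input.
        entities[entity] = entities.get(entity, 0) + 1
    return entities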

import operator

# For each entity type, keep the 10 (term, count) pairs with the highest counts.
d1 = t(tokens, ner, "A")
top_1a = sorted(d1.items(), key=operator.itemgetter(1), reverse=True)[:10]
print(top_1a)

d2 = t(tokens, ner, "B")
top_2b = sorted(d2.items(), key=operator.itemgetter(1), reverse=True)[:10]
print(top_2b)

d3 = t(tokens, ner, "C")
top_3c = sorted(d3.items(), key=operator.itemgetter(1), reverse=True)[:10]
print(top_3c)

d4 = t(tokens, ner, "D")
top_4d = sorted(d4.items(), key=operator.itemgetter(1), reverse=True)[:10]
print(top_4d)

All of the above works perfectly and sorts each dictionary into an ordered list of its 10 most frequent terms. Now I want the top 10 across all of these lists.

top_o = top_1a + top_2b + top_3c + top_4d
top_fin = sorted(top_o.items(), key=operator.itemgetter(1), reverse = True) [:10]
print(top_fin)

I've tried that, but because top_o is now a list rather than a dictionary, .items() does not work. top_o prints successfully (as one larger list, with each of the original lists still in its own order), but how do I re-order it?
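For what it's worth, here is a minimal sketch of two ways this might be approached. The sample data is hypothetical and only stands in for the real lists and dictionaries. The first part relies on the fact that a list of (term, count) tuples can be passed to sorted() directly, without .items(). The second part assumes it is acceptable to merge the original dictionaries with collections.Counter, which adds up the counts of any term that appears under more than one entity type.

```python
import operator
from collections import Counter

# Hypothetical sample data standing in for the per-type top-10 lists.
top_1a = [("London", 7), ("Paris", 3)]
top_2b = [("Alice", 9), ("Bob", 2)]
top_3c = [("Acme Corp", 5)]
top_4d = [("Monday", 4)]

# Concatenating the lists gives one list of (term, count) tuples,
# which sorted() accepts as-is; no .items() call is needed.
top_o = top_1a + top_2b + top_3c + top_4d
top_fin = sorted(top_o, key=operator.itemgetter(1), reverse=True)[:10]
print(top_fin)

# Alternative: merge the original dictionaries (hypothetical d1, d2 here).
# Counter addition sums the counts of terms shared between dictionaries.
d1 = {"London": 7, "Paris": 3}
d2 = {"Alice": 9, "Bob": 2}
combined = Counter(d1) + Counter(d2)
top_merged = combined.most_common(10)
print(top_merged)
```

The two approaches differ when the same term is tagged with more than one entity type: the concatenated list keeps one entry per type, while the Counter version reports a single summed count.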

MattSt
bemzoo