
I've searched all over the internet for how to find all the keys in a dictionary that share the same value, where that value is not known in advance. The closest thing that came up was this, but there the values are known.

Say I had a dictionary like this, where the values are totally random and not hardcoded by me.

{'AGAA': 2, 'ATAA': 5, 'AJAA': 2}

How can I identify all the keys with the same value? What would be the most efficient way of doing this? The expected output would be:

['AGAA','AJAA']
Buddy Bob
  • With what same value? The one that occurs with the highest count? – a_guest May 09 '21 at 22:04
  • If the values are hashable, it is easy to create a dictionary whose keys are those values and whose values are lists of the keys that map to them. Using `defaultdict(list)` is an easy way to create such a dictionary. Given such a dictionary, you can easily extract the entries whose lists have a length greater than 1. – John Coleman May 09 '21 at 22:09
  • Well, say `ATAA` had the same value as a new key `AZAA`. Then that would count too. As long as a key's value matches another key's value. – Buddy Bob May 09 '21 at 22:10
  • What would your expected output be? {2:2}? Because value 2 occurred 2 times? – Andreas May 09 '21 at 22:11
  • I provided an output in my question, at the very bottom. – Buddy Bob May 09 '21 at 22:12

2 Answers


The way I would do it is to "invert" the dictionary. By this I mean grouping the keys under each common value. So if you start with:

{'AGAA': 2, 'ATAA': 5, 'AJAA': 2}

You would want to group it so that each original value becomes a key, mapped to the list of original keys that had that value:

{2: ['AGAA', 'AJAA'], 5: ['ATAA']}

After grouping the keys, you can use `max` with `key=len` to pick the largest group. If two groups are the same size, `max` returns the first one it encounters.

Example:

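from collections import defaultdict

data = {'AGAA': 2, 'ATAA': 5, 'AJAA': 2}

# Invert the mapping: each value maps to the list of keys that share it.
grouped = defaultdict(list)
for key, value in data.items():
    grouped[value].append(key)

# Pick the largest group of keys.
max_group = max(grouped.values(), key=len)
print(max_group)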

Outputs:

['AGAA', 'AJAA']

You could also find the value that has the largest group and look up its keys:

max_key = max(grouped, key=lambda k: len(grouped[k]))
print(grouped[max_key])
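
If more than one value is shared (for example the `AZAA` case raised in the comments), the same inverted dictionary can be filtered to keep every group with more than one key, instead of only the largest. This is a minimal sketch, assuming the values are hashable; the helper name `keys_sharing_values` is made up for illustration and the extra `AZAA` entry is hypothetical data.

```python
from collections import defaultdict


def keys_sharing_values(mapping):
    """Return {value: [keys, ...]} for every value held by more than one key."""
    grouped = defaultdict(list)
    for key, value in mapping.items():
        grouped[value].append(key)
    # Keep only the groups that contain at least two keys.
    return {value: keys for value, keys in grouped.items() if len(keys) > 1}


# 'AZAA' is a hypothetical extra key, added so that two values are shared.
data = {'AGAA': 2, 'ATAA': 5, 'AJAA': 2, 'AZAA': 5}
duplicates = keys_sharing_values(data)
print(duplicates)  # {2: ['AGAA', 'AJAA'], 5: ['ATAA', 'AZAA']}
```

Returning a dictionary rather than a flat list keeps the groups separate, so it stays clear which keys share which value.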
flakes
  • This worked flawlessly. Great explanation and efficient code. I've posted a meta-question on ThankYou comments and my return answer was, 'noonono'. I couldn't help it. – Buddy Bob May 09 '21 at 22:21

You can try this, which counts how often each value occurs and keeps the keys whose value occurs more than once:

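from collections import Counter

d = {'AGAA': 2, 'ATAA': 5, 'AJAA': 2}

# Count how often each value occurs and keep the ones seen more than once.
counts = Counter(d.values())
repeated = {value for value, n in counts.items() if n > 1}

# Collect every key whose value is repeated.
out = [key for key, value in d.items() if value in repeated]
print(out)  # ['AGAA', 'AJAA']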
Andreas