1

alright, this question is hard to phrase so I will try to just be as detailed as possible. I hope you guys can follow me.

let's say I have 5 words, A, B, C, D, E and each of them has some synonyms associated with it:

A = [1,2,3,4]
B = [1,5,6,7]
C = [1,9,10,11]
D = [12,5,14,15]
E = [99,99,99,5]

I now want to find the minimum amount of synonyms for all words. So for example, 1 would describe A, B, C, and 5 would describe B, D, E.

so a possible solution could be [1,5]. It is important to note that in this case, B would be described twice and that is okay.

right now I use itertools.product to get all possible combinations, then I just map(set, combinations) and use a dictionary to get back the number of times a word is described. so for the list above I would, for example, get back:

[1,5],{"A":1,"B":2,"C":1,"D":1,"E":1} (and then the same for every other possible combination, but i just take the one with the shortest list of numbers)

the thing is, this quickly explodes when more letters are added. In my real example i am dealing with 20 letters (A,B,C...) and each of them has 20 "synonyms", so making all possible combinations is just impossible.

is there a way to speed this up? thank you

david
  • 97
  • 4
  • You asked *"is there a way to speed this up?"* The accepted answer in the linked Q&A should be faster than your `itertools.product` approach, as it tries them in order of size, so it can stop as soon as it finds one hitting set (instead of finding all of them and then choosing the smallest one). Compare the running time first, and if the linked solution is still not fast enough, read the linked article about the Hitting Set problem (which yours is an instance of). – kaya3 Jun 30 '20 at 16:42
  • thanks for your help i misunderstood somehting there – david Jun 30 '20 at 16:46

0 Answers0