I have bunch of strings which come in pairs For example, these are three pairs. First one has two string "a", "1"
a,1
a,2
b,1
In my project, i need to compress this data. For the above data, compressed data is. (compressed to back uncompressed is cartesian product)
{a} <=> {1,2}
{b} <=> {1}
What is the best algorithm to come up with the optimal compressed data. Optimal here is fewer number of mappings I could write a simple script which creates a hash table with the first string. Even though it gives optimal compressed data in some cases (above case) but it doesnt always. For example,
a,1
a,2
b,1
c,2
Hash table with first string as a key would give me below compressed data
{a} <=> {1,2}
{b} <=> {1}
{c} <=> {2}
But that's not optimal compressed data. I will get below compression if i used second string
{a,b} <=> {1}
{a,c} <=> {2}
What is the best algorithm to compress the data?