4

Given this sample, which finds the elements unique to each list while still being able to tell which list each element came from:

source_list = ["one", "two", "three", "four", "five"]
diff_list = ["zero", "one", "two", "three", "four", "six", "seven"]

source_unique = []
diff_unique = []

for entry in source_list:
    if entry not in diff_list:
        source_unique.append(entry)

for entry in diff_list:
    if entry not in source_list:
        diff_unique.append(entry)

print("Unique elements in source_list: {0}".format(source_unique))
print("Unique elements in diff_list: {0}".format(diff_unique))

###
# Unique elements in source_list: ['five']
# Unique elements in diff_list: ['zero', 'six', 'seven']

Is there a more efficient way to do this than using two additional lists and explicit loops? The main requirement is to be able to tell each element's origin.

Dimitris Fasarakis Hilliard

3 Answers

4

You can convert both lists to sets and take their difference. For s1 - s2 this is O(len(s1)) on average, plus the linear cost of building the sets:

>>> s1, s2 = set(source_list), set(diff_list)
>>> s1.difference(s2)
{'five'}
>>> s2.difference(s1)
{'seven', 'six', 'zero'}

Which can also be written as:

>>> s1 - s2 
{'five'}
>>> s2 - s1
{'seven', 'six', 'zero'}    

If you need lists afterwards, convert the results with list(s1 - s2) and list(s2 - s1) respectively. Note that sets are unordered, so the resulting lists will not necessarily keep the original element order.
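If the original order matters after the set difference, one option is to sort each result by the element's position in the list it came from. This is only a sketch: it assumes the lists are short enough that the repeated `list.index` lookups (each linear in the list length) are acceptable.

```python
source_list = ["one", "two", "three", "four", "five"]
diff_list = ["zero", "one", "two", "three", "four", "six", "seven"]

s1, s2 = set(source_list), set(diff_list)

# Sets are unordered, so restore the order of first appearance
# by sorting on each element's index in its originating list.
source_unique = sorted(s1 - s2, key=source_list.index)
diff_unique = sorted(s2 - s1, key=diff_list.index)

print("Unique elements in source_list: {0}".format(source_unique))
print("Unique elements in diff_list: {0}".format(diff_unique))
```

For large inputs, the list comprehension approach avoids the extra index lookups.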

Or you could get the same result with a list comprehension, converting the other list to a set for fast membership testing with the in operator:

For the source_unique list:

>>> diff_set = set(diff_list)  # build the set once, not on every iteration
>>> source_unique = [v1 for v1 in source_list if v1 not in diff_set]
>>> source_unique
['five']

For the diff_unique list:

>>> source_set = set(source_list)  # build the set once, not on every iteration
>>> diff_unique = [v1 for v1 in diff_list if v1 not in source_set]
>>> diff_unique
['zero', 'six', 'seven']

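This is again O(len(list)) on average, provided the set is built once rather than inside the comprehension's condition (calling set(...) there rebuilds it for every element). Unlike the set difference, it also preserves the original order of the elements.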
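The two comprehensions can be wrapped in a small helper so that each set is built exactly once. The name `split_unique` is hypothetical and used here only for illustration:

```python
def split_unique(source_list, diff_list):
    """Return (elements only in source_list, elements only in diff_list).

    Hypothetical helper: both results keep the order of the input lists.
    """
    # Build each set once so every membership test is O(1) on average.
    source_set = set(source_list)
    diff_set = set(diff_list)
    source_unique = [v for v in source_list if v not in diff_set]
    diff_unique = [v for v in diff_list if v not in source_set]
    return source_unique, diff_unique


source_list = ["one", "two", "three", "four", "five"]
diff_list = ["zero", "one", "two", "three", "four", "six", "seven"]

source_unique, diff_unique = split_unique(source_list, diff_list)
print("Unique elements in source_list: {0}".format(source_unique))
print("Unique elements in diff_list: {0}".format(diff_unique))
```

Because the results are returned as two separate lists, the origin of each element stays unambiguous.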

Dimitris Fasarakis Hilliard
1

You can use sets:

source_list = ["one", "two", "three", "four", "five"]
diff_list = ["zero", "one", "two", "three", "four", "six", "seven"]


print("Unique elements in source_list: {0}".format(set(source_list)-set(diff_list)))
print("Unique elements in diff_list: {0}".format(set(diff_list)-set(source_list)))

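This prints the following (output from Python 2; Python 3 shows the sets as {'five'} and so on, and the element order may vary):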

Unique elements in source_list: set(['five'])
Unique elements in diff_list: set(['seven', 'six', 'zero'])
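If a single structure is preferred over two separate sets, a possible variation is to take the symmetric difference and tag each element with the list it came from. This is a sketch only; the `origin` dictionary and its string labels are assumptions made for illustration.

```python
source_list = ["one", "two", "three", "four", "five"]
diff_list = ["zero", "one", "two", "three", "four", "six", "seven"]

s1, s2 = set(source_list), set(diff_list)

# s1 ^ s2 holds the elements that appear in exactly one of the two sets,
# so membership in s1 is enough to decide the origin of each element.
origin = {
    item: ("source_list" if item in s1 else "diff_list")
    for item in s1 ^ s2
}

for item in sorted(origin):
    print("{0} comes from {1}".format(item, origin[item]))
```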
Heval
  • Oh, this looks super easy using sets! I'd found a similar question, but only about the difference itself; I didn't realize I could take the set difference in both directions! Thanks. –  Aug 07 '16 at 16:20
  • I'm glad it helped :) – Heval Aug 07 '16 at 16:24
  • But I'll accept Jim's answer (if you don't mind) just for completeness of the contents, in case other users encounter the same issue. –  Aug 07 '16 at 16:27
0

You can use sets to do it:

source_list = ["one", "two", "three", "four", "five"]
diff_list = ["zero", "one", "two", "three", "four", "six", "seven"]

source_unique = list(set(source_list) - set(diff_list))
diff_unique = list(set(diff_list) - set(source_list))
Prasanna