This is a somewhat poorly-defined problem. I can think of several non-equivalent definitions of list "subtraction," two of which are already represented: truncating (via slicing) -- a true inverse of concatenation; and filtering, which resembles the definition of "subtraction" (really relative complementation) for sets. For filtering, using a list comprehension over a
with b
converted to a set is the best approach. (I.e. larsvegas's answer.)
But one version that hasn't been considered is the multiset definition of subtraction. Python's Counter
type provides us with a multiset:
>>> from collections import Counter
>>> a = [3, 4, 5]
>>> b = [4, 5]
>>> a_counter = Counter(a)
>>> b_counter = Counter(b)
>>> a_counter
Counter({3: 1, 4: 1, 5: 1})
>>> b_counter
Counter({4: 1, 5: 1})
>>> a_counter - b_counter
Counter({3: 1})
Of course, this doesn't preserve order, but we can fix that by filtering a
based on the result of a_counter - b_counter
:
def subtract_lists(a, b):
multiset_difference = Counter(a) - Counter(b)
result = []
for i in a:
if i in multiset_difference:
result.append(i)
multiset_difference -= Counter((i,))
return result
This has several nice properties. It preserves order; it functions as a true inverse of concatenation; it implements an intuitively consistent version of subtraction on a datatype that can contain duplicates; and it works in linear time.
>>> subtract_lists(a, b)
[3]
>>> subtract_lists([1, 2, 3, 4], [2, 3, 4])
[1]
>>> subtract_lists([1, 2, 3, 4], [2, 4])
[1, 3]
>>> subtract_lists([1, 2, 3, 4, 4, 4], [2, 4])
[1, 3, 4, 4]