3

I need to check whether my tuple contains only None values.

I use this code, but I'm not sure it's good practice:

# coding=utf8
import itertools

def isOnlyNoneValuesTuple(t):
    """
    test if tuple contains only None values
    """
    # ifilter(None, t) keeps only truthy items, so 0 and "" are dropped as well as None
    if not len(tuple(itertools.ifilter(None, t))):
        return True
    else:
        return False

print isOnlyNoneValuesTuple((None,None,None,None,None,None,None,None,None,None,None,None,))
print isOnlyNoneValuesTuple((None,None,None,"val",None,None,None,None,))

Do you know other good practice to test it?

Thank you for your opinions

GeoStoneMarten

4 Answers

16
return all(item is None for item in t)
BrenBarn
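As a minimal sketch, the `all()` one-liner can be dropped into a function like the one in the question. This version assumes Python 3 (the question uses Python 2 syntax), and the name `is_only_none` is only illustrative. `all()` stops at the first item that is not None and returns True for an empty tuple.

```python
def is_only_none(t):
    """Return True if every item in t is None (also True for an empty t)."""
    return all(item is None for item in t)


print(is_only_none((None, None, None)))    # True
print(is_only_none((None, "val", None)))   # False
print(is_only_none((None, 0, "")))         # False: 0 and "" are not None
print(is_only_none(()))                    # True: nothing in it is non-None
```

Because the test is `item is None`, falsy values such as 0 or an empty string count as real values here.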
4
return set(t) == {None}

Although I guess I'd use all() in practice.

RemcoGerlich
  • 4
    Agreed—besides being nice code golf, it may be useful to help understanding for someone who has a clearer idea of what sets mean than how genexprs work… but the fact that it doesn't short-circuit, wastes memory for input with lots of distinct values, and fails if any of the elements aren't hashable means I probably wouldn't use it in practice either. – abarnert Jan 22 '14 at 21:37
  • 2
    And on top of that, it should probably return True for the empty tuple and this doesn't. But showing an alternative can't hurt :-) – RemcoGerlich Jan 22 '14 at 21:40
  • Ah, didn't think of that problem. – abarnert Jan 22 '14 at 21:52
  • It works, but `all` is better, isn't it? I'm parsing a table with a cursor and ignoring rows that contain only None, 0, or empty strings. – GeoStoneMarten Jan 22 '14 at 21:56
  • 2
    Yes, all is better. If you just want to check whether there is anything in the tuple that evaluates as True (None, 0 and "" are all falsy), then it becomes as simple as `if any(t)` – RemcoGerlich Jan 22 '14 at 22:04
  • _all_ is better because _set_ creation may be an expensive operation - especially for a big iterable – volcano Jan 22 '14 at 22:19
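The points raised in these comments (empty input, unhashable items, and the falsy-value variant with `any`) can be illustrated with a short sketch. It assumes Python 3, and the function names are made up for the example.

```python
def only_none_set(t):
    # Set-based check from the answer above; every item must be hashable.
    return set(t) == {None}


def only_none_all(t):
    # Generator-based check; short-circuits and needs no hashing.
    return all(item is None for item in t)


def only_falsy(t):
    # True when no item is truthy, so None, 0 and "" all count as empty.
    return not any(t)


print(only_none_set(()), only_none_all(()))   # False True
print(only_falsy((None, 0, "")))              # True
print(only_none_all((None, 0, "")))           # False

try:
    only_none_set((None, [1, 2]))             # lists are not hashable
except TypeError as exc:
    print("set() failed:", exc)
print(only_none_all((None, [1, 2])))          # False
```

For the row-filtering case described in the comments, where None, 0 and empty strings should all be ignored, `not any(t)` matches the intent more closely than a strict None test.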
4
return t.count(None) == len(t)

and for short lists it is faster than using all:

>>> import timeit
>>> setup = 't = [None, None]*100; t[1] = 1'
>>> timeit.timeit('all(item is None for item in t)', setup=setup)
0.8577961921691895
>>> timeit.timeit('t.count(None) == len(t)', setup=setup)
0.6855478286743164

and all gets slower the later the first non-None element appears:

>>> setup = 't = [None, None]*100; t[100] = 1'
>>> timeit.timeit('all(item is None for item in t)', setup=setup)
8.18800687789917
>>> timeit.timeit('t.count(None) == len(t)', setup=setup)
0.698199987411499

But with big lists, all is faster:

>>> setup = 't = [None, None]*10000; t[100] = 1'
>>> timeit.timeit('t.count(None) == len(t)', setup=setup)
47.24849891662598
>>> timeit.timeit('all(item is None for item in t)', setup=setup)
8.114514112472534

But not always:

>>> setup = 't = [None, None]*10000; t[1000]=1'
>>> timeit.timeit('t.count(None) == len(t)', setup=setup)
47.475088119506836
>>> timeit.timeit('all(item is None for item in t)', setup=setup)
72.77452898025513

My conclusion about the speed of all versus count: it depends heavily on the data. If a very big list is likely to contain only None, don't use all; it is very slow in that case.

ndpu
  • 1
    Except that it's **much slower** if there are any non-`None` values, because it doesn't short-circuit. – abarnert Jan 22 '14 at 21:53
  • I copy data from table to table with an insert cursor; the tables contain between 10,000 and 3 million lines, and there are not many empty lines. I think it's between 10 and 2,000 lines (thanks to MS Excel ...) – GeoStoneMarten Jan 22 '14 at 22:03
  • Try it with N=10000 or N=3000000 instead of N=100 and see which one's faster. With 10000, I get 549ns/loop with `all`, and 69.4us/loop (>100 times slower) with `count`. – abarnert Jan 22 '14 at 22:26
  • @abarnert With N=1000, `count` is still 2x faster, and with N=10000 it is nearly 6x slower than `all` with t[100]=1... that's strange – ndpu Jan 22 '14 at 22:35
  • @ndpu: For N=1000 with the first non-`None` at 100, I get 5.01us vs. 6.89us, which is slower, not 2x faster. But again, who cares whether the crossover point is 1000 or 2000, when the OP has told us that his domain is anywhere from 10000 to 3000000? – abarnert Jan 22 '14 at 22:38
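To reproduce this kind of comparison on your own data sizes, a small harness along the following lines may help. It is a sketch assuming Python 3; the helper name `bench`, the chosen sizes, and the repeat count are illustrative, and absolute timings will differ between machines and Python versions.

```python
import timeit


def bench(n, first_non_none=None, number=200):
    """Time both checks on a list of n items; return (all_seconds, count_seconds)."""
    t = [None] * n
    if first_non_none is not None:
        t[first_non_none] = 1
    all_time = timeit.timeit(lambda: all(item is None for item in t), number=number)
    count_time = timeit.timeit(lambda: t.count(None) == len(t), number=number)
    return all_time, count_time


# (list length, index of the first non-None item, or None for an all-None list)
for n, idx in [(200, 1), (200, 100), (20000, 100), (20000, None)]:
    all_time, count_time = bench(n, idx)
    print("n=%d first non-None at %s: all=%.6fs count=%.6fs"
          % (n, idx, all_time, count_time))
```

The all-None case is the one where `all` has to visit every item, so that row is the most useful one to compare against `count`.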
2

I second BrenBarn's answer; it's very Pythonic. But since you also asked about testing the code, I'll add my two cents.

You can create appropriate data structures by making use of the fact that Python allows you to multiply a list containing a single element by a factor n to get a list that contains the same element n times:

print(isOnlyNoneValuesTuple(tuple(10 * [None])))
print(isOnlyNoneValuesTuple(tuple(3 * [None] + ["val"] + 4 * [None])))
itsjeyd
  • You can multiply tuples too, `tuple(10 * [None])` is the same as `10 * (None,)` (but mind that last comma) – RemcoGerlich Jan 22 '14 at 22:06
  • Thanks, but that's not my question. The tuple is the result of a row cursor. – GeoStoneMarten Jan 22 '14 at 22:08
  • @RemcoGerlich Thanks for mentioning that! I won't add that to my answer though, as I think it's nice to have both versions mentioned here :) – itsjeyd Jan 22 '14 at 22:10
  • @GeoStoneMarten Then what do you mean by "Do you know other good practice *to test it*"? In any case, testing your code is always a good idea ;) And it's common to do so with dummy values... – itsjeyd Jan 22 '14 at 22:13
  • @ndpu explained several cases of good practice. He understood what I meant. Thanks – GeoStoneMarten Jan 23 '14 at 05:38
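The dummy-data idea from this answer, together with the tuple-multiplication tip from the comments, can be sketched as a few assertions. This assumes Python 3, and `is_only_none` is a hypothetical stand-in for whichever implementation is being tested.

```python
def is_only_none(t):
    # Hypothetical stand-in for the function under test.
    return all(item is None for item in t)


all_none = 10 * (None,)                        # note the trailing comma
mixed = 3 * (None,) + ("val",) + 4 * (None,)

assert tuple(10 * [None]) == all_none          # list and tuple forms agree
assert len(mixed) == 8
assert is_only_none(all_none)
assert not is_only_none(mixed)
print("all checks passed")
```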