
I have two lists: a list of selected punctuation marks and a list of tokens.

punc = ['.', '!', '?']

tokens = ['today', 'i', 'went', 'to', 'the', 'park', '.', 'it', 'was', 'great', '!']

How do I get the index of the first punctuation (as defined by the list punc) that appears in tokens?

In the above case, my desired output is index = 6 since the first punctuation that appears is '.'.

mkrieger1
jo_

2 Answers


Here is one solution to your problem:

punc = ['.', '!', '?']

tokens = ['today', 'i', 'went', 'to', 'the', 'park', '.', 'it', 'was', 'great', '!']

for i, element in enumerate(tokens):
    if element in punc:
        print(f"Found {element} at index: {i}")
        break

Here we loop over tokens using enumerate, which yields each element together with its index. On each iteration we check whether the element is in punc. The first time it is, we have found the first punctuation mark, so we print its index and break out of the loop.
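
If you need the index as a value rather than a printed message, the same loop can be wrapped in a small function. This is only a sketch: the name first_punc_index and the choice to return None when nothing matches are assumptions for illustration, not part of the question.

```python
def first_punc_index(tokens, punc):
    """Return the index of the first token that is in punc, or None."""
    punc_set = set(punc)  # a set makes each membership test cheap
    # next() stops at the first match, like the break in the loop above
    return next((i for i, tok in enumerate(tokens) if tok in punc_set), None)


punc = ['.', '!', '?']
tokens = ['today', 'i', 'went', 'to', 'the', 'park', '.', 'it', 'was', 'great', '!']

print(first_punc_index(tokens, punc))  # 6
```

Returning None instead of raising keeps the "no punctuation at all" case easy to test for with an ordinary if.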

Kiraged

You can do it like this with index() on the tokens list:

punc = ['.', '!', '?']

tokens = ['today', 'i', 'went', 'to', 'the', 'park', '.', 'it', 'was', 'great', '!']

for p in punc:
    if p in tokens:
        print(p, tokens.index(p), sep=" index is: ")
    else:
        print(p, 'not found', sep=' ')

For each mark in punc, this code prints the index of its first occurrence in tokens, or 'not found' if it does not appear.

The same thing as a list comprehension (used here only for its printing side effect):

[print(p, tokens.index(p), sep=" index is: ") if p in tokens else print(p, 'not found', sep=' ') for p in punc]

Output:

. index is: 6
! index is: 10
? not found
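
To get the single index the question asks for (the first punctuation mark of any kind in tokens), the per-mark indexes can be reduced with min(). This is a minimal sketch, and the variable names positions and first_index are made up for the example.

```python
punc = ['.', '!', '?']
tokens = ['today', 'i', 'went', 'to', 'the', 'park', '.', 'it', 'was', 'great', '!']

# First occurrence of each mark that actually appears in tokens
positions = [tokens.index(p) for p in punc if p in tokens]

# The smallest of those is the first punctuation mark overall;
# default=None covers the case where no mark appears at all
first_index = min(positions, default=None)

print(first_index)  # 6
```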

If you only want to check the first item of punc rather than the entire list:

print(tokens.index(punc[0]) if punc[0] in tokens else 'not found')

Output:

6

Note that index() raises a ValueError when the element is not in the list:

Exception has occurred: ValueError
'?' is not in list

In your case this can happen for the value '?', which is not present in tokens.

There are two simple ways to handle this:

  • Check that the item is in the list first, e.g. '?' in tokens (this is the clean, readable approach).
  • Wrap the .index() call in a try/except block and handle the ValueError (this is the fast approach, since the list is scanned only once).
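
The try/except option could look roughly like this. The helper name index_or_none and the None return value are assumptions made for the sketch.

```python
def index_or_none(items, value):
    """Return items.index(value), or None if value is not in items."""
    try:
        return items.index(value)
    except ValueError:
        # list.index() raises ValueError when the value is missing
        return None


tokens = ['today', 'i', 'went', 'to', 'the', 'park', '.', 'it', 'was', 'great', '!']

print(index_or_none(tokens, '.'))  # 6
print(index_or_none(tokens, '?'))  # None
```
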
Carlo Zanocco