I have a list of words:
[
'investment',
'property',
'something',
'else',
'vest'
]
I also have a list of strings, like so:
[
'investmentproperty',
'investmentsomethingproperty',
'investmentsomethingelseproperty',
'abcinvestmentproperty',
'investmentabcproperty'
]
Given these two lists, I need to determine which strings are made up entirely of words from the word list, using no more than a given maximum number of words.
In the above example, if the maximum number of words is 3, then only the first two strings match (even though the word 'vest' appears inside 'investment'). The third string is made up of four words, and the last two contain 'abc', which is not in the word list.
This example is simplified: in reality there are many thousands of words and hundreds of thousands of strings, so the solution needs to be performant. None of the strings contain spaces.
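To make the matching rule concrete, here is a minimal reference sketch. Python is an assumption, since the question does not name a language, and the names `fewest_words` and `is_match` are made up for illustration. It is a plain dynamic-programming check meant to pin down the expected results, not a tuned solution.

```python
words = ['investment', 'property', 'something', 'else', 'vest']
strings = [
    'investmentproperty',
    'investmentsomethingproperty',
    'investmentsomethingelseproperty',
    'abcinvestmentproperty',
    'investmentabcproperty',
]

word_set = frozenset(words)
longest = max(len(w) for w in word_set)  # no word can be longer than this


def fewest_words(s):
    """Return the fewest list words that concatenate to s, or None if impossible."""
    inf = float('inf')
    # best[i] = fewest words needed to build the prefix s[:i]
    best = [0] + [inf] * len(s)
    for end in range(1, len(s) + 1):
        # Only look back as far as the longest word in the list.
        for start in range(max(0, end - longest), end):
            if best[start] + 1 < best[end] and s[start:end] in word_set:
                best[end] = best[start] + 1
    return None if best[-1] == inf else best[-1]


def is_match(s, max_words):
    """True if s consists of between 1 and max_words words from the list."""
    n = fewest_words(s)
    return n is not None and 1 <= n <= max_words


matched = [s for s in strings if is_match(s, 3)]
```

With a maximum of 3, `matched` holds only the first two strings. Each string costs roughly its length times the longest word length in set lookups, regardless of how many words are in the list.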
I've tried constructing a regex like so:
^(?:(word1)|(word2)|(word3)){1,3}$
but this is very slow for the number of words in the word list (currently 10,000).
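For reference, a pattern of the shape above could be generated roughly like this. This is a sketch in Python (an assumption, as the question does not say which regex engine is used), with a hypothetical helper `build_pattern`. It drops the per-word capturing groups for simplicity, so it is not identical to the pattern shown.

```python
import re


def build_pattern(words, max_words):
    """Compile ^(?:w1|w2|...){1,max_words}$ from a list of literal words."""
    # re.escape guards against words that contain regex metacharacters.
    alternation = '|'.join(re.escape(w) for w in words)
    return re.compile(rf'^(?:{alternation}){{1,{max_words}}}$')


words = ['investment', 'property', 'something', 'else', 'vest']
pattern = build_pattern(words, 3)

results = {
    s: pattern.match(s) is not None
    for s in ['investmentproperty', 'investmentsomethingelseproperty']
}
```

Here `results` maps the two-word string to `True` and the four-word string to `False`. A backtracking engine may try the alternatives one by one at every position, which is consistent with the slowdown described for a 10,000-word alternation.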
Thanks