I am trying to find a way to look up a single word or multiple words from a list. The user inputs the word(s), and information about that item is fetched from the ItemList, based on the name.

For example:

PriceList[0].name="Black Sheep"
PriceList[1].name="Black Horse"
PriceList[2].name="White Horse"
PriceList[3].name="White Sheep"

are some items in the list, where PriceList is a List<ItemList> and ItemList looks like:

public class ItemList
{
    public int amount { get; set; }
    public string name { get; set; }
    public int buyprice { get; set; }
    public int sellprice { get; set; }
    public int stock { get; set; }
}

This is what I want my code to do:

  • Case 1: User asks for "Black" : Return indices 0, 1.
  • Case 2: User asks for "White" : Return 2, 3.
  • Case 3: User asks for "Horse" : Return 1, 2.
  • Case 4: User asks for "Sheep" : Return 0, 3.
  • Case 5: User asks for "Black Horse" : Return 1.
  • Case 6: User asks for "White Horse" : Return 2.
  • Case 7: User asks for "Whit Horse" : Return 2 but not 1.
  • Case 8: User asks for "Red Horse" : Return 1, 2.

etc.

I currently have:

int nickindex = PriceList.FindIndex(x => x.name.Split().Contains(typeToAdd));

where typeToAdd is the user input string.

However, this only returns one index, and it fails for cases 5 and up.

How can I find all matching indices rather than just the first one? I also need to be able to search for phrases rather than single words. Lastly, I need to search within the words if no exact word match was found (Case 7).

I have looked at "Algorithm to find keywords and keyphrases in a string", but it doesn't help me much.

Any help would be appreciated. Thank you.

marisoy
  • Are you sure your examples are correct? Cases 6, 7, and 8 seem inconsistent to me – Jonas Høgh Mar 27 '14 at 08:14
  • Case 6: If there's a phrase that matches exactly the user's input, return only that but not separate word matches. Case 7: If no exact phrase is found, do word-matching (returns 1, 2). If no exact word is found for a word, search within the words and filter the results. The final result returns the index of "white horse" when "whit horse" is the input, and excludes the "black horse". Case 8: Same as case 7, but since "red" is not found anywhere, as a word or piece of a word, return the phrases that contain the word "horse". e.g. "did you mean one of these?" Hope this clears it up a bit. – marisoy Mar 27 '14 at 20:41

1 Answer

You can use the overload of Select which gives you the index to initialize an anonymous type:

string[] words = "Black Horse".Split();
IEnumerable<int> indices = PriceList
    .Select((pl, index) => new { pl, index })
    .Where(x => words.Intersect(x.pl.name.Split()).Any())
    .Select(x => x.index);

I'm using Enumerable.Intersect to check whether at least one of the words in the input string matches one of the words in the name.
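
Note that the comparison above is case-sensitive, so "black" would not match "Black". If that matters, one option is to pass a StringComparer to Intersect. The following is a minimal, self-contained sketch of that idea; the Item class and the IndicesSharingAWord helper are hypothetical names used only for illustration, not part of the original code:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the question's ItemList class (only the name matters here).
public class Item
{
    public string name { get; set; }
}

public static class WordSearch
{
    // Returns the index of every item whose name shares at least one word
    // with the input, ignoring case.
    public static List<int> IndicesSharingAWord(IEnumerable<Item> items, string input)
    {
        char[] separators = { ' ' };
        string[] words = input.Split(separators, StringSplitOptions.RemoveEmptyEntries);

        return items
            .Select((item, index) => new { item, index })
            .Where(x => words
                .Intersect(
                    x.item.name.Split(separators, StringSplitOptions.RemoveEmptyEntries),
                    StringComparer.OrdinalIgnoreCase)
                .Any())
            .Select(x => x.index)
            .ToList();
    }
}
```

Calling ToList() at the end forces the query to run once, which is convenient if the indices are going to be enumerated several times.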

If you want to order descending by the number of matches:

IEnumerable<int> indices = PriceList
    .Select((pl, index) => new 
    { 
        pl, 
        index, 
        matches = words.Intersect(pl.name.Split()).Count()
    })
    .Where(x => x.matches > 0)
    .OrderByDescending(x => x.matches)
    .Select(x => x.index);

However, this doesn't cover your last cases, since it does not compare the similarity of the words. You could use the Levenshtein distance for this. Your rules for cases 6-8 are also not that clear.
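
To make the Levenshtein suggestion concrete, here is a rough sketch of one possible reading of the rules from the comments. Each input word is tried as an exact word first, then as a substring of a word, then as a fuzzy match within a small edit distance; a word that matches nothing at all is ignored, and the results for the remaining words are intersected. The names FuzzyLookup and FindIndices, the maxDistance default of 2, and the tier order are all assumptions made for illustration, so treat this as a starting point rather than a definitive implementation:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FuzzyLookup
{
    // Classic dynamic-programming Levenshtein distance
    // (insertions, deletions, and substitutions each cost 1).
    public static int Levenshtein(string a, string b)
    {
        var d = new int[a.Length + 1, b.Length + 1];
        for (int i = 0; i <= a.Length; i++) d[i, 0] = i;
        for (int j = 0; j <= b.Length; j++) d[0, j] = j;

        for (int i = 1; i <= a.Length; i++)
        {
            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                d[i, j] = Math.Min(
                    Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1),
                    d[i - 1, j - 1] + cost);
            }
        }
        return d[a.Length, b.Length];
    }

    // Hypothetical helper: indices of the names that match the query under
    // the tiered rules described above. maxDistance is an assumed threshold.
    public static List<int> FindIndices(IList<string> names, string query, int maxDistance = 2)
    {
        char[] separators = { ' ' };

        // Lower-case and split every name once up front.
        List<string[]> nameWords = names
            .Select(n => n.ToLowerInvariant()
                          .Split(separators, StringSplitOptions.RemoveEmptyEntries))
            .ToList();

        List<int> all = Enumerable.Range(0, names.Count).ToList();
        List<int> result = all;
        bool anyMatched = false;

        foreach (string word in query.ToLowerInvariant()
                                     .Split(separators, StringSplitOptions.RemoveEmptyEntries))
        {
            // Tier 1: exact word. Tier 2: substring. Tier 3: within maxDistance edits.
            var tiers = new Func<string, bool>[]
            {
                w => w == word,
                w => w.Contains(word),
                w => Levenshtein(w, word) <= maxDistance
            };

            // Use the first tier that produces any hits for this word.
            List<int> hits = tiers
                .Select(tier => all.Where(i => nameWords[i].Any(tier)).ToList())
                .FirstOrDefault(h => h.Count > 0);

            if (hits == null) continue; // the word matched nothing: ignore it (Case 8)

            result = result.Intersect(hits).ToList();
            anyMatched = true;
        }

        return anyMatched ? result : new List<int>();
    }
}
```

With the four names from the question, FindIndices(names, "Whit Horse") yields index 2 and FindIndices(names, "Red Horse") yields indices 1 and 2. A fixed threshold of 2 is generous for short words, so scaling it with the word length may be a better choice for real data.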

Tim Schmelter
  • I imagine you could do an order by the number of words that match, and then select the top ranked ones out of that? Still wouldn't work for 7 but... – Jeff Mar 27 '14 at 08:17
  • @Jeff: I have edited my answer to show a way to order the result by the number of matches. – Tim Schmelter Mar 27 '14 at 08:21
  • Thank you very much. This is what I was looking for, and it works for most of the cases. I will look at the Levenshtein algorithm for distances. – marisoy Mar 28 '14 at 20:13