1

I'm trying to take two text files and return the longest string that appears in both. I read both text files into ArrayLists, split into individual words. This is my code so far, but I'm wondering how I would return the longest matching String and not just the first one found.

for(int i = 0; i < file1Words.size(); i++)
{
    for(int j = 0; j < file2Words.size(); j++)
    {
        if(file1Words.get(i).equals(file2Words.get(j)))
        {
            matchingString += file1Words.get(i) + " ";
        }

    }
}
Haseeb Waseem
  • Possible duplicate of [longest common substring between 2 HUGE files - out of memory: java heap space](http://stackoverflow.com/questions/23746332/longest-common-substring-between-2-huge-files-out-of-memory-java-heap-space) – Idos Dec 12 '15 at 18:08
  • It is not very clear: you talk about the longest matching string (thanks to @Idos for the link), then about strings (every word?). You split, then you concatenate, but do the matches have to be adjacent? That changes everything for a fast algorithm... If it is for words, a good way is to use a Set... – guillaume girod-vitouchkina Dec 12 '15 at 19:05

6 Answers

2
String longest = "";
for (String s1: file1Words)
    for (String s2: file2Words)
        if (s1.length() > longest.length() && s1.equals(s2)) longest = s1;
Jeremy Goodell
2

If you are looking for better time and space performance than the answers above, you can use the code below.

System.out.println("Start time :" + System.currentTimeMillis());
String longestMatch = "";
for (int i = 0; i < file1Words.size(); i++) {
    String w = file1Words.get(i);
    // Only scan the second list if this word could beat the current best.
    if (w.length() > longestMatch.length()) {
        for (int j = 0; j < file2Words.size(); j++) {
            if (w.length() > longestMatch.length() && w.equals(file2Words.get(j)))
                longestMatch = w;
        }
    }
}
System.out.println("End time :" + System.currentTimeMillis());
R Rajesh
  • Good point, no reason to iterate the second arraylist if the word isn't longer than longestMatch. Note that you could improve performance even more by breaking out of the inner loop when you set longestMatch = w. – Jeremy Goodell Dec 13 '15 at 15:51
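
The early exit suggested in the comment above could look roughly like this. It is only a sketch: the class name EarlyExitSketch and the method wrapper are made up for illustration, and the lists are assumed to hold one word per element, as in the question.

```java
import java.util.Arrays;
import java.util.List;

class EarlyExitSketch {
    // Returns the longest word found in both lists, or "" if there is none.
    static String longestMatch(List<String> file1Words, List<String> file2Words) {
        String longestMatch = "";
        for (String w : file1Words) {
            // Skip the inner scan unless w could beat the current best.
            if (w.length() > longestMatch.length()) {
                for (String candidate : file2Words) {
                    if (w.equals(candidate)) {
                        longestMatch = w;
                        break; // w is confirmed; no need to scan the rest of the second list
                    }
                }
            }
        }
        return longestMatch;
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList("the", "quick", "brown", "fox");
        List<String> b = Arrays.asList("a", "brown", "fox", "jumps");
        System.out.println(longestMatch(a, b));
    }
}
```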
1

I'm not going to give you the code, but I'll help you with the main idea...

You will need a new string variable "curLargestString" to keep track of what is currently the largest string. Declare this outside of your for loops. Now, every time you get two matching words, compare the length of the matching word to the length of the word in "curLargestString". If the new matching word is longer, then set "curLargestString" to the new word. Then, after your for loops have run, return curLargestString.

One more note: be sure to initialize curLargestString with an empty string. This will prevent an error when you call length() on it after you get your first matching word.
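
For readers who want to check their own attempt, the idea above might be sketched as follows. The class name LongestMatchSketch and the method signature are assumptions for illustration; only curLargestString and the two word lists come from the discussion.

```java
import java.util.Arrays;
import java.util.List;

class LongestMatchSketch {
    // Returns the longest word present in both lists, or "" if no word matches.
    static String longestMatch(List<String> file1Words, List<String> file2Words) {
        // Declared outside the loops and initialized to "", so length() is safe
        // before the first match is found.
        String curLargestString = "";
        for (String w1 : file1Words) {
            for (String w2 : file2Words) {
                // Replace the current best only when a longer matching word turns up.
                if (w1.equals(w2) && w1.length() > curLargestString.length()) {
                    curLargestString = w1;
                }
            }
        }
        return curLargestString;
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList("the", "quick", "brown", "fox");
        List<String> b = Arrays.asList("a", "brown", "fox", "jumps");
        System.out.println(longestMatch(a, b)); // prints "brown"
    }
}
```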

bstadt
1

You can use the following code:

import java.util.HashSet;
import java.util.Set;

String matchingString = "";
Set<String> intersection = new HashSet<>(file1Words);
intersection.retainAll(file2Words);  // keep only the words present in both lists

for (String word : intersection)
    if (word.length() > matchingString.length())
        matchingString = word;
Slava Vedenin
1

Assuming your files are small enough to fit in memory, sort them both with a custom comparator that puts longer strings before shorter ones and otherwise sorts lexicographically. Then go through both files in order, advancing only one index at a time (the one pointing to the "smaller" entry of the two under that ordering), and return the first match.
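
A minimal sketch of that sort-and-merge approach, assuming the words are already in two in-memory lists. The names SortedMergeSketch, LONGEST_FIRST, and longestCommon are hypothetical, and the method sorts copies so the callers' lists are left untouched.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class SortedMergeSketch {
    // Longer strings sort first; strings of equal length sort lexicographically.
    static final Comparator<String> LONGEST_FIRST = (a, b) -> {
        if (a.length() != b.length()) {
            return Integer.compare(b.length(), a.length());
        }
        return a.compareTo(b);
    };

    // Returns the longest word present in both lists, or "" if none matches.
    static String longestCommon(List<String> list1, List<String> list2) {
        List<String> s1 = new ArrayList<>(list1);
        List<String> s2 = new ArrayList<>(list2);
        s1.sort(LONGEST_FIRST);
        s2.sort(LONGEST_FIRST);

        int i = 0;
        int j = 0;
        while (i < s1.size() && j < s2.size()) {
            int c = LONGEST_FIRST.compare(s1.get(i), s2.get(j));
            if (c == 0) {
                return s1.get(i); // first match in this order is the longest one
            }
            // Advance only the index whose entry comes earlier in the ordering.
            if (c < 0) {
                i++;
            } else {
                j++;
            }
        }
        return "";
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList("apple", "kiwi");
        List<String> b = Arrays.asList("kiwi", "banana");
        System.out.println(longestCommon(a, b)); // prints "kiwi"
    }
}
```

Sorting costs O(n log n) per list and the merge pass is linear, in contrast to the nested loops, which compare every pair of words.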

Dima
1
private String getLongestString(List<String> list1, List<String> list2) {
    String longestString = "";  // empty rather than null, so length() is safe

    for (String list1String : list1) {
        if (list1String.length() > longestString.length()) {
            for (String list2String : list2) {
                if (list1String.equals(list2String)) {
                    longestString = list1String;
                }
            }
        }
    }
    return longestString;
}