I have a bunch of text files that are encoded in UTF-8. The text inside the files looks like this: \x6c\x69b/\x62\x2f\x6d\x69nd/m\x61x\x2e\x70h\x70

I've copied all these text files into a directory, /convert/.

I need to read each file, convert the encoded literals into characters, and then save the result as filename.converted.txt.

What would be the smartest approach to do this? Is there a function for handling Unicode text that can convert these literals into characters? Should I be using a different programming language for this?
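The core of the problem seems to be turning each \xNN escape back into its character. This is the rough idea I had for that part, using java.util.regex (decodeHexEscapes is just a name I made up, and it assumes every \xNN stands for a single character; if the escapes are actually raw UTF-8 bytes, I guess the bytes would need to be collected and decoded as UTF-8 instead):

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DecodeSketch {
    // Matches a literal backslash, an 'x', and exactly two hex digits.
    private static final Pattern HEX_ESCAPE = Pattern.compile("\\\\x([0-9A-Fa-f]{2})");

    // Replaces every \xNN escape with the character for that hex value.
    // Assumes each escape is a single character, not one byte of a multi-byte sequence.
    static String decodeHexEscapes(String input) {
        Matcher m = HEX_ESCAPE.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            char c = (char) Integer.parseInt(m.group(1), 16);
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(c)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Example from one of my files; should print: lib/b/mind/max.php
        System.out.println(decodeHexEscapes("\\x6c\\x69b/\\x62\\x2f\\x6d\\x69nd/m\\x61x\\x2e\\x70h\\x70"));
    }
}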
This is what I have at the moment:
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;

public class decode {
    public static void main(String[] args) {
        File directory = new File("C:/convert/");
        String[] files = directory.list();
        boolean success = false;

        for (String file : files) {
            System.out.println("Processing \"" + file + "\"");

            //TODO read each file and convert them into characters
            success = true;

            if (success) {
                System.out.println("Successfully converted \"" + file + "\"");
            } else {
                System.out.println("Failed to convert \"" + file + "\"");
            }

            //save file
            if (success) {
                try {
                    FileWriter open = new FileWriter("C:/convert/" + file + ".converted.txt");
                    BufferedWriter write = new BufferedWriter(open);
                    write.write("TODO: write converted text into file");
                    write.close();
                    System.out.println("Successfully saved \"" + file + "\" conversion.");
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
        }
    }
}
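For the reading and saving part, I was also wondering whether I should use java.nio.file instead of FileWriter, since FileWriter writes in the platform default encoding and my files are UTF-8. This is just a guess at how that could look for a single file (example.txt is a placeholder name, and the decoding step is the part I'm asking about above):

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ConvertOneFile {
    public static void main(String[] args) throws IOException {
        Path in = Paths.get("C:/convert/example.txt");              // placeholder input file
        Path out = Paths.get("C:/convert/example.txt.converted.txt");

        // Read the whole file as UTF-8 text.
        String text = new String(Files.readAllBytes(in), StandardCharsets.UTF_8);

        // TODO: decode the \xNN escapes here (the part I'm asking about).
        String converted = text;

        // Write the result explicitly as UTF-8; FileWriter would use the
        // platform default encoding instead.
        Files.write(out, converted.getBytes(StandardCharsets.UTF_8));
    }
}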