1

I have thousands of writings in plain text format moved to a single directory.

Some titles contain spaces, some start with -, some contain single or double quotes, and between them they use basically every other character that is valid in a Windows or Linux filename.

The text itself has a mix of Windows and Linux line endings (that's what they're called, right?).

In Linux/Bash, how do I concatenate all these files (half have no extension, half are .txt) into one file, sorted by modification date, with the filename and file date neatly printed before each file's content?

If you could, please also tell me how to do the same thing in a nested directory structure, this time with the file path printed for each file in addition to the filename and modification date.

I would appreciate this greatly. This is years of my very own writing, and I've been searching and struggling for a few hours now. I'm a writer, not a programmer =)

Thanks for considering.

lakitu

2 Answers

1

If you have some GNU goodies and dos2unix:

find -type f -printf "%T@ %p\0" | sort -nz | while IFS= read -r -d '' l; do f=${l#* }; printf '%s %s\n' "$(date -r "$f")" "$f"; dos2unix < "$f"; echo; done

Should do the job and be 100% safe regarding all the funny filenames you might have. Works recursively. Sorry for the long one-liner but it's bedtime!
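
For readability, here is the same idea spread over several lines, as a rough sketch rather than a drop-in replacement. The function name concat_by_mtime and the header layout are made up for illustration, and GNU sed stands in for dos2unix here (an assumption: GNU find, sort, date and sed are available).

```shell
#!/usr/bin/env bash
# Sketch only: concat_by_mtime is a hypothetical name, not part of the one-liner above.
# Assumes GNU find, sort, date and sed.
concat_by_mtime() {
    local dir=${1:-.} line path
    # find emits "<epoch seconds> <path>" records, each terminated by a NUL byte
    find "$dir" -type f -printf '%T@ %p\0' |
        sort -nz |    # numeric sort, oldest first; -z keeps the NUL separators
        while IFS= read -r -d '' line; do
            path=${line#* }    # drop the timestamp and the space after it
            printf '===== %s =====\n' "${path##*/}"
            printf 'Path:     %s\n' "$path"
            printf 'Modified: %s\n\n' "$(date -r "$path" '+%Y-%m-%d %H:%M:%S')"
            sed 's/\r$//' < "$path"    # strip Windows CRs (GNU sed understands \r)
            printf '\n'
        done
}
```

Called as concat_by_mtime . > ../all.txt, it walks the current directory recursively. Writing the output outside the tree keeps the output file from being picked up by find.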


Edit. Regarding your .fuse_hidden_blahblah file: I have no idea why this file is there or why its content keeps being added to itself recursively. You can safely skip it by asking find to ignore it explicitly:

find \! -name '.fuse_hidden*' -type f -printf "%T@ %p\0" | sort -nz | while IFS= read -r -d '' l; do f=${l#* }; printf '%s %s\n' "$(date -r "$f")" "$f"; dos2unix < "$f"; echo; done

By the way, the content is displayed on the terminal screen. If you want to redirect it into a file mycatedfile.txt, then:

find \! -name 'mycatedfile.txt' \! -name '.fuse_hidden*' -type f -printf "%T@ %p\0" | sort -nz | while IFS= read -r -d '' l; do f=${l#* }; printf '%s %s\n' "$(date -r "$f")" "$f"; dos2unix < "$f"; echo; done > "mycatedfile.txt"
gniourf_gniourf
  • am i doing this wrong? it seems to indefinitely loop. – lakitu Feb 16 '15 at 03:53
  • oh - it is including the ".fuse_hiddenblabahblahblah" temp file of itself in itself. that's what's up. correct that / tell me what i'm doing wrong & i'll give this best answer. thanks! – lakitu Feb 16 '15 at 04:01
  • @lakitu: I have no idea why this happens. I've edited the post to include a way to ignore these files. Hope this helps. – gniourf_gniourf Feb 16 '15 at 13:50
  • @gniourf_gnurf: for whatever reason, the last version wasn't showing up for me, but when i just manually pipe it to a file (adding ">> /file/path/here/filenamehere.txt" after the script invocation), it works beautifully. i might've just typed it wrong, i manually copied it. – lakitu Feb 17 '15 at 08:04
  • in any case i'm awarding this best answer – lakitu Feb 17 '15 at 08:04
0

Using this fantastic answer (to avoid things like parsing ls output) gets you something like this (for a single directory):

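sorthelper=()
for file in *; do
    # We need something that can easily be sorted.
    # Here, we use "<date><filename>", where <date> is always 14 characters wide.
    # Note that this works with any special characters in filenames.
    # Use exactly ONE of the two stat calls below; keep the other commented out.

    # sorthelper+=("$(stat -n -f "%Sm%N" -t "%Y%m%d%H%M%S" -- "$file")") # Mac OS X only
    # or
    sorthelper+=("$(stat --printf "%Y    %n" -- "$file")") # Linux only (10-digit epoch + 4 spaces)
done

sorted=()
# IFS= and -r keep leading whitespace and backslashes in filenames intact
while IFS= read -r -d '' elem; do
    # this strips away the first 14 characters (<date>)
    sorted+=("${elem:14}")
done < <(printf '%s\0' "${sorthelper[@]}" | sort -z)

for file in "${sorted[@]}"; do
    # skip the output file itself in case it is left over from an earlier run
    if [ -f "$file" ] && [ "$file" != Output.txt ]; then
        printf '%s\n' "$file"   # printf, because echo mangles names such as "-n"
        cat -- "$file"
    fi
done > Output.txt   # the redirection must be attached to "done" (no ";" in between)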

For a nested hierarchy, use for file in **; do in shells that support it (bash version 4+ after running shopt -s globstar, and zsh, that I'm aware of), or put the above into a function and call it recursively on directories in the loop (below code entirely untested).

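catall() {
    # $1 is the directory to process (defaults to the current directory)
    local dir=${1:-.} file elem
    declare sorthelper=()
    for file in "$dir"/*; do
        [ -e "$file" ] || continue   # empty directory: the glob did not match anything
        # We need something that can easily be sorted.
        # Here, we use "<date><filename>", where <date> is always 14 characters wide.
        # Note that this works with any special characters in filenames.
        # Use exactly ONE of the two stat calls below; keep the other commented out.

        # sorthelper+=("$(stat -n -f "%Sm%N" -t "%Y%m%d%H%M%S" -- "$file")") # Mac OS X only
        # or
        sorthelper+=("$(stat --printf "%Y    %n" -- "$file")") # Linux only
    done

    declare sorted=()
    while IFS= read -r -d '' elem; do
        # this strips away the first 14 characters (<date>)
        sorted+=("${elem:14}")
    done < <(printf '%s\0' "${sorthelper[@]}" | sort -z)

    for file in "${sorted[@]}"; do
        if [ -f "$file" ]; then
            printf '%s\n' "$file"
            cat -- "$file"
        elif [ -d "$file" ]; then
            # recurse into the subdirectory; "$file" already carries its path
            catall "$file"
        fi
    done
}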

$ catall > Output.txt
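
For comparison, the ** variant mentioned earlier might look roughly like the sketch below. It is only an illustration: the function name catall_globstar is made up, and it assumes bash 4+ plus GNU stat and sort. Unlike the recursive function, it sorts the whole tree as one list instead of directory by directory, and ** skips hidden files unless dotglob is set.

```shell
#!/usr/bin/env bash
# Sketch only: catall_globstar is a hypothetical name. Assumes bash 4+ and GNU stat/sort.
catall_globstar() (
    # The parentheses run the body in a subshell, so cd and shopt do not leak out.
    cd -- "${1:-.}" || exit
    shopt -s globstar nullglob
    for file in ./**; do
        [ -f "$file" ] || continue    # skip directories and other non-regular files
        # emit "<epoch seconds> <path>", NUL-terminated
        printf '%s %s\0' "$(stat -c %Y -- "$file")" "$file"
    done | sort -nz | while IFS= read -r -d '' line; do
        file=${line#* }               # drop the timestamp and the space after it
        printf '%s\n' "$file"
        cat -- "$file"
    done
)
```

As with the other variants, it is safest to send the output somewhere outside the directory being read, for example catall_globstar . > ../Output.txt.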

Edit: As noted in gniourf_gniourf's excellent answer, I failed to account for the varied line endings in your input files. Using dos2unix <"$file" instead of cat "$file" in the above should normalize them, as indicated there.
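
If dos2unix happens not to be installed, a small pure-bash filter could do a similar job. This is just a sketch: strip_cr is a made-up helper name, it only removes a carriage return at the end of each line, and it will be slower than dos2unix on large files.

```shell
#!/usr/bin/env bash
# Sketch only: strip_cr is a hypothetical helper, not a standard command.
# Reads stdin line by line and removes one trailing carriage return per line.
strip_cr() {
    local line
    # "|| [ -n "$line" ]" also handles a last line that lacks a final newline
    while IFS= read -r line || [ -n "$line" ]; do
        printf '%s\n' "${line%$'\r'}"
    done
}
```

It would be used as strip_cr < "$file" in place of cat "$file". Note that it always ends the output with a newline, even when the input file did not.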

Edit again: Hm... just noticed that this doesn't include the modification times in the output. The simplest way to get that into the output is also the costliest (fetch it again at output time) but a solution like what is employed in gniourf_gniourf's answer will work here as well (drop the sorthelper to sorted loop and use the timestamp in the final loop to write it to the file).
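
A rough sketch of that second approach for a single directory, assuming GNU stat, sort and date: the timestamp stays attached to each entry through the sort and is only split off in the final loop. The function name cat_with_dates and the header format are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: cat_with_dates is a hypothetical name. Assumes GNU stat, sort and date.
cat_with_dates() {
    local dir=${1:-.} file entry stamp
    local entries=()
    for file in "$dir"/*; do
        [ -f "$file" ] || continue
        entries+=("$(stat --printf '%Y %n' -- "$file")")    # "<epoch seconds> <path>"
    done
    [ "${#entries[@]}" -gt 0 ] || return 0
    while IFS= read -r -d '' entry; do
        stamp=${entry%% *}    # everything before the first space
        file=${entry#* }      # everything after the first space
        printf '%s  %s\n' "$(date -d "@$stamp" '+%Y-%m-%d %H:%M:%S')" "${file##*/}"
        cat -- "$file"
        printf '\n'
    done < <(printf '%s\0' "${entries[@]}" | sort -nz)
}
```

Sorting numerically on the leading timestamp also removes the need for the fixed 14-character prefix.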

Etan Reisner
  • you guys are great, let me get this onto my (offline) writing computer. thank you much, will be a bit before i can give a confirmed working "best answer." – lakitu Feb 16 '15 at 00:13
  • hey Etan - i copied this but i am getting 'stat: invalid option -- 'n'' as a error in the loop (it is repeated over & over) - is there a workaround if my stat does not have -n as an option? – lakitu Feb 16 '15 at 00:49
  • (i just checked, i have stat version 8.21, if that helps) – lakitu Feb 16 '15 at 00:53
  • Look at the comments on those two lines. They are alternate `stat` call choices. – Etan Reisner Feb 16 '15 at 02:17
  • thank you for the effort Mr Reisner. is appreciated. – lakitu Feb 17 '15 at 08:05