4

I have the output of a checksum used in a unix shell script, and I need only the checksum value and the filename to be displayed.

$ Cksum path/path2/f1.txt | awk '{print $1,$2}'
1237668 path/path2/f1.txt 

However I want the filename without the directory:

1237668 f1.txt 

I tried sed, but that gives me only the filename and not the checksum:

$ Cksum path/path2/f1.txt | sed 's/.*path2//'
/f1.txt
John Kugelman
melony_r

6 Answers

2

Assuming your filenames don't contain spaces, here are sed and awk solutions:

A simpler sed:

cksum path/to/f.txt | sed 's/ .*[/ ]/ /'

878395353 f.txt

This sed command starts matching at the first space and, because .* is greedy, keeps matching through the last / or space on the line. The matched text is then replaced with a single space.


Or a simpler awk using / or space as input field separator:

cksum path/to/f.txt | awk -F '[ /]' '{print $1, $NF}'

878395353 f.txt
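
Both commands also cope with the three-field output (checksum, size, name) that cksum normally prints, since everything between the first space and the last / or space is discarded. A minimal sketch, assuming a simulated cksum line fed in with printf; the function names and the sample values are hypothetical and only there for illustration:

```shell
# strip_dir_sed / strip_dir_awk are hypothetical helper names.
# Both read cksum-style lines on stdin and assume no spaces in file names.

strip_dir_sed() {
  # Replace everything from the first space through the last "/" or space.
  sed 's/ .*[/ ]/ /'
}

strip_dir_awk() {
  # Split on spaces and slashes, then print the first and last fields.
  awk -F '[ /]' '{print $1, $NF}'
}

# Simulated cksum output: checksum, size in bytes, path (illustrative values).
line='878395353 12 path/to/f.txt'
printf '%s\n' "$line" | strip_dir_sed
printf '%s\n' "$line" | strip_dir_awk
```

Both calls should print the checksum followed by f.txt, with the size and the directory part removed.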
anubhava
  • As an aside; with sed and the substitution command, normally the delimiter used is the forward slash `/` but can be changed to another character if it makes the regex/replacement easier (as you have chosen `~`). However, in this case since the `/` is inside the bracket expression, there is no need to change the delimiter, so `sed 's/ .*[/ ]/ /' file` works as well and perhaps is easier on the eye? – potong Aug 18 '21 at 08:41
  • Thank you. I used this: Cksum path1/path2/filename.txt | awk 'BEGIN{FS="/";} { print $1, $NF}' and now I get checksumvalue size filename.txt. I want to get rid of the size, but I am unable to. – melony_r Sep 06 '21 at 12:14
  • I have used cksum path/path2/f1.txt | awk 'BEGIN {print $1,$NF}| awk 'print$1, $3}' AND get the required output..works – melony_r Sep 06 '21 at 12:36
  • You don't need to use 2 awk actually. It can be done in a single awk – anubhava Sep 06 '21 at 12:51
0

Based on man cksum

The cksum utility writes to the standard output three whitespace separated fields for each input file. These fields are a checksum CRC, the total number of octets in the file and the file name.

Using sed you could use 3 capture groups and use group 1 and 3 in the replacement.

([^[:space:]]+) [^[:space:]]+ (.*/)?([^[:space:]]+)

Explanation

  • ([^[:space:]]+) Group 1, match 1+ non whitespace chars
  • [^[:space:]]+ Match 1+ chars other than a whitespace char between spaces
  • (.*/)? Optionally match group 2 matching until the last occurrence of /
  • ([^[:space:]]+) Group 3, match 1+ non whitespace chars

For example:

cksum ./file.txt # --> 3777026118 8 ./file.txt

Using sed

cksum ./file.txt | sed -E 's~([^[:space:]]+) [^[:space:]]+ (.*/)?([^[:space:]]+)~\1 \3~' 

Output

3777026118 file.txt

Using awk, print the first field and the last element obtained by splitting the 3rd field on /:

cksum ./file.txt | awk '
{
  n=split($3,a,"/")
  print $1, a[n]
}'

Output

3777026118 file.txt
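
If file names may contain spaces, $3 holds only the first word of the name. One possible variation is to remove the first two fields from the whole line and then strip the directory part. This is a sketch, assuming cksum separates its fields with single spaces; the function name and the sample line are hypothetical:

```shell
# strip_size_and_dir (hypothetical name): reads cksum-style lines on stdin,
# prints "checksum basename" and keeps spaces inside the file name.
strip_size_and_dir() {
  awk '{
    sum = $1                    # save the checksum before editing $0
    sub(/^[^ ]+ [^ ]+ /, "")    # drop the checksum and size fields
    sub(/.*\//, "")             # drop everything up to the last slash
    print sum, $0
  }'
}

# Simulated cksum line (illustrative values, not real output).
printf '%s\n' '3777026118 8 ./some dir/my file.txt' | strip_size_and_dir
```

File names containing newlines are still not handled, because awk processes its input line by line.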
The fourth bird
  • The Perl extension `\S` might not be available in `sed` even with `-E`. Anyway, simply `'s%[^[:space:]]+/%%'` should work portably, assuming you don't have any file names with spaces in them (which this answer already seems to assume). – tripleee Aug 17 '21 at 12:42
  • @tripleee I see, thanks for pointing that out. I tested this on my Mac and it did not work with the `\S`. Now it does on Ubuntu and Mac using `[^[:space:]]+` – The fourth bird Aug 17 '21 at 12:50
  • Still a bit complex for my taste, but thanks for the fix. The `/g` is still superfluous; you don't expect (and could not have) more than one match per line. – tripleee Aug 17 '21 at 13:02
0

Awk can do this. For example:

awk '{ printf("%d", $1); n=split($2,a,"/"); print(" ", a[n])}'

Tested with:

echo "1237668 path/path2/f1.txt"  | awk  '{ printf("%d", $1); n=split($2,a,"/"); print(" ", a[n])}'

1237668  f1.txt

The first part prints $1 with printf. The second part splits $2 on / and then prints the last element of the resulting array. See:

How to split a delimited string into an array in awk?

how to access last index of array from split function inside awk?

shift
0

Note: I do not know the Cksum command you use. My cksum outputs 3 fields: checksum, size and filename. Adapt the indexes in the following if yours behaves differently.

If your shell is bash, you could use a bash array and basename. If you don't have spaces in your filenames:

$ a=($(cksum path/path2/f1.txt))
$ printf '%s %s\n' "${a[0]}" "$(basename "${a[2]}")"
857691210 f1.txt

If you have spaces in your filenames, adapt the printf parameters:

$ a=($(cksum "path/path2/f 1.txt"))
$ printf '%s %s\n' "${a[0]}" "$(basename "${a[*]:2}")"
857691210 f 1.txt

And if you prefer quoting the filename:

$ printf '%s "%s"\n' "${a[0]}" "$(basename "${a[*]:2}")"
857691210 "f 1.txt"
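
A variation that avoids both the array and basename is to let read split the line and to strip the directory with the `${var##*/}` parameter expansion. This is only a sketch, assuming the usual three-field cksum output; the function name cksum_basename is hypothetical:

```shell
# cksum_basename (hypothetical name): print "checksum basename" for one file.
cksum_basename() {
  cksum "$1" | {
    # read assigns everything after the second field to "path",
    # so spaces inside the file name survive.
    read -r sum size path &&
      printf '%s %s\n' "$sum" "${path##*/}"
  }
}

# Usage: cksum_basename "path/path2/f 1.txt"
```

Because this uses only a pipe, read, and parameter expansion, it should also run in a plain POSIX sh. Leading or trailing spaces in the name and names containing newlines are not preserved.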
Renaud Pacalet
0

You can also use:

cd $(dirname path/path2/f1.txt); cksum $(basename path/path2/f1.txt)

or, to keep the current directory the same:

a=$(pwd); cd $(dirname path/path2/f1.txt); cksum $(basename path/path2/f1.txt); cd $a
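
The same idea can be sketched with a subshell, so that the cd never affects the calling shell and no variable is needed to return to the previous directory. The function name cksum_in_dir is hypothetical, and note that the output still contains the size field:

```shell
# cksum_in_dir (hypothetical name): run cksum from inside the file's directory
# so that the printed name carries no directory part.
cksum_in_dir() {
  (
    # The parentheses start a subshell; the cd is undone when it exits.
    cd "$(dirname "$1")" &&
      cksum "$(basename "$1")"
  )
}

# Usage: cksum_in_dir path/path2/f1.txt
```

Quoting the command substitutions keeps paths with spaces intact, and the && skips cksum if the cd fails.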
Luuk
0

Assuming the / character only occurs in field #2, delete everything after field #2; then remove the directory name:

 Cksum path/path2/f1.txt | 
 sed 's# [^/]*$##;s# .*/# #'
agc