How do I do a one way diff in Linux?

Question

Normal behavior of diff:

Normally, diff will tell you all the differences between two files: anything that is in file A but not in file B, and also everything that is in file B but not in file A. For example:

File A contains:

cat
good dog
one
two

File B contains:

cat
some garbage
one
a whole bunch of garbage
something I don't want to know

If I do a regular diff as follows:

diff A B

the output would be something like:

2c2
< good dog
---
> some garbage
4c4,5
< two
---
> a whole bunch of garbage
> something I don't want to know

What I am looking for:

What I want is just the first part: I want to know everything that is in file A but not in file B, and I want to ignore everything that is in file B but not in file A.

What I want is the command, or series of commands:

???? A B

that produces the output:

2c2
< good dog
4c4,5
< two

I believe a solution could be achieved by piping the output of diff into sed or awk, but I am not familiar enough with those tools to come up with a solution. I basically want to remove all lines that begin with --- and >.

Edit: I edited the example to account for multiple words on a line.

Note: This is a "sub-question" of: Determine list of non-OS packages installed on a RedHat Linux machine

Note: This is similar to, but not the same as, the question asked here (i.e. not a dupe): One-way diff file

This feels like a superuser question. – Lynn Crumbling Jun 24 '14 at 15:05

twalberg · Answer 1 · 2014-08-20T15:08:57.470

An alternative, if your files consist of single-line entities only, and the output order doesn't matter (the question as worded is unclear on this), would be:

comm -23 <(sort A) <(sort B)

comm requires its inputs to be sorted, and the -2 means "don't show me the lines that are unique to the second file", while -3 means "don't show me the lines that are common between the two files".

If you need the "differences" to be presented in the order they occur, though, the above diff / awk solution is ok (although the grep bit isn't really necessary; it could be diff A B | awk '/^</ { $1 = ""; print }').

EDIT: fixed which set of lines to report - I read it backwards originally...
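To make the comm approach concrete, here is a minimal sketch using the sample files from the question. The scratch directory and the A.sorted / B.sorted file names are assumptions for illustration; writing sorted copies should behave the same as the <(sort A) form above and also works in shells without process substitution. LC_ALL=C is set so that sort and comm agree on the collation order.

```shell
# Recreate the question's sample files in a scratch directory (hypothetical paths).
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf '%s\n' 'cat' 'good dog' 'one' 'two' > "$tmp/A"
printf '%s\n' 'cat' 'some garbage' 'one' 'a whole bunch of garbage' \
    "something I don't want to know" > "$tmp/B"

# comm needs both inputs sorted with the same collation, so pin the locale.
LC_ALL=C sort "$tmp/A" > "$tmp/A.sorted"
LC_ALL=C sort "$tmp/B" > "$tmp/B.sorted"

# -2 drops lines unique to B, -3 drops lines common to both:
# only the lines unique to A remain, in sorted order.
only_in_a=$(LC_ALL=C comm -23 "$tmp/A.sorted" "$tmp/B.sorted")
printf '%s\n' "$only_in_a"
```

For the sample files this prints "good dog" and "two", without the "< " markers and without line numbers.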

score 7 · Answer 2 · answered Mar 30 '16 at 18:00

As stated in the comments, one mostly correct answer is

diff A B | grep '^<'

although this would give the output

< good dog
< two

rather than

2c2
< good dog
4c4,5
< two
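If the hunk headers (2c2, 4c4,5) should be kept, as in the output requested in the question, one option is to delete the unwanted lines rather than select the wanted ones. The following is only a sketch, assuming diff's default ("normal") output format; the function name one_way_diff and the scratch paths are made up for illustration.

```shell
# Hypothetical helper: print diff's normal output minus the B-side lines.
# It removes lines starting with ">" and the bare "---" separators.
one_way_diff() {
    diff "$1" "$2" | grep -v -e '^>' -e '^---$'
}

# Recreate the question's sample files in a scratch directory.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf '%s\n' 'cat' 'good dog' 'one' 'two' > "$tmp/A"
printf '%s\n' 'cat' 'some garbage' 'one' 'a whole bunch of garbage' \
    "something I don't want to know" > "$tmp/B"

one_way_diff "$tmp/A" "$tmp/B"
```

For the sample files this prints the four lines 2c2, < good dog, 4c4,5 and < two. Hunks that only add lines from B (headers such as 3a4) still leave their header line behind, because only the > lines themselves are removed.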

1''

score 5 · Accepted Answer · answered Jun 24 '14 at 15:09

diff A B|grep '^<'|awk '{print $2}'

grep '^<' selects the lines that start with <

awk '{print $2}' prints the second whitespace-separated field of each of those lines

leo108

Thank you so much, that put me on the right track. The problem with print $2 is that it ignores any words that come later (e.g. if I put "good dog" in file A vs. "dog"). It turns out that the first part of the command achieves what I want though, i.e. the following command: diff A B | grep '^<' – Jonathan Jun 24 '14 at 16:32

@Jonathan try this: diff A B|grep '^<'|cut -c 3- – leo108 Jun 25 '14 at 05:44
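
The selection and the stripping of the marker can also be folded into one step. As a sketch (assuming diff's default output format, where lines from A are prefixed with "< "; the scratch paths are made up for illustration), sed can select those lines and remove the prefix while leaving multi-word lines intact:

```shell
# Recreate the question's sample files in a scratch directory.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf '%s\n' 'cat' 'good dog' 'one' 'two' > "$tmp/A"
printf '%s\n' 'cat' 'some garbage' 'one' 'a whole bunch of garbage' \
    "something I don't want to know" > "$tmp/B"

# -n turns off automatic printing; s/^< //p prints only the lines on which
# the leading "< " marker was found and removed.
only_in_a=$(diff "$tmp/A" "$tmp/B" | sed -n 's/^< //p')
printf '%s\n' "$only_in_a"
```

This prints "good dog" and "two" for the sample files, which should match what grep '^<' | cut -c 3- produces.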

score 1 · Answer 4 · answered Mar 05 '21 at 15:56

If you are diffing directories and also want to see which files are affected, you can use

diff public_html temp_public_html/ | grep '^[^>]'

to match all lines except those starting with >
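
To illustrate what this filter keeps when comparing directories, here is a small sketch. The directories dirA and dirB and the file names in them are assumptions made up for the example. Note that '^[^>]' also lets the --- separators through; the last command is a variant that drops them as well.

```shell
# Build two small directories to compare (hypothetical names and contents).
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
mkdir "$tmp/dirA" "$tmp/dirB"
printf '%s\n' 'cat' 'good dog' > "$tmp/dirA/pets.txt"
printf '%s\n' 'cat' 'some garbage' > "$tmp/dirB/pets.txt"
printf '%s\n' 'only here' > "$tmp/dirA/extra.txt"

# Keeps the per-file "diff ..." lines, notices about files present in only
# one directory, hunk headers, "<" lines and "---" separators;
# drops only the lines starting with ">".
filtered=$(diff "$tmp/dirA" "$tmp/dirB" | grep '^[^>]')
printf '%s\n' "$filtered"

# Variant that also drops the "---" separators.
diff "$tmp/dirA" "$tmp/dirB" | grep -v -e '^>' -e '^---$'
```

The exact wording of the per-file and "only in one directory" lines depends on the diff implementation and locale, so no fixed output is shown here.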

Pasi Matalamäki
