
Linux Tutorial Series – 70 – The diff command


The diff command shows the differences between two files (Shotts, 2019). I don’t think you will use this command much unless you are a software developer. If you are one, you could use it to compare two versions of a program source file, or to compare two files produced by different programs.

Here is an example of using the diff command:

mislav@mislavovo-racunalo:~/Linux_folder$ cat aba.txt

Abba Money

Money Money

It's the Money

In the Rich Man's World

mislav@mislavovo-racunalo:~/Linux_folder$ cat ab.txt

AB

Ab

aB

ab

mislav@mislavovo-racunalo:~/Linux_folder$ diff aba.txt ab.txt

1,5c1,4

< Abba Money

< Money Money

< It's the Money

< In the Rich Man's World

<

---

> AB

> Ab

> aB

> ab

The output seems a bit confusing. What is this 1,5c1,4? In my opinion, you don’t have to know what that means. What you need to know is that a < marks a line that is in the first file (in this case aba.txt) but missing from the second file (in this case ab.txt), and a > marks a line that is in the second file but missing from the first file (“Understanding of diff output,” n.d.).

Let’s take another example:

mislav@mislavovo-racunalo:~/Linux_folder$ cat ab.txt

AB

Ab

aB

ab

mislav@mislavovo-racunalo:~/Linux_folder$ cat ab2.txt

AB

aB

aB

ab

mislav@mislavovo-racunalo:~/Linux_folder$ diff ab.txt ab2.txt

2c2

< Ab

---

> aB

Again, as I stated, a < means that the line that follows is missing from the second file (in this case ab2.txt) and a > means that the line is missing from the first file (in this case ab.txt). The files differ only in the second line: the second file is missing Ab, while the first file has Ab where the second file has aB.

If you really want to know what 2c2 means, I refer you to the first answer in (“Understanding of diff output,” n.d.). I agree with the second answer in the reference, which is that you don’t need to know what this means. You will most likely use diff rarely, and when you do, you will be able to see the differing lines. That is, in my experience, enough for practical purposes.
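If you want to try the comparison above yourself, here is a minimal sketch. It assumes a POSIX shell with diff and mktemp available; the names workdir, normal_output and diff_status are made up for this illustration. It recreates ab.txt and ab2.txt in a throwaway directory and also records the exit status of diff, which is 0 when the files are identical, 1 when they differ and a value above 1 when something went wrong.

```shell
#!/bin/sh
# Work in a throwaway directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate the two files from the example above.
printf 'AB\nAb\naB\nab\n' > ab.txt
printf 'AB\naB\naB\nab\n' > ab2.txt

# diff exits with 1 when the files differ, so the status is captured
# explicitly instead of letting it look like a failure.
diff_status=0
normal_output=$(diff ab.txt ab2.txt) || diff_status=$?

printf '%s\n' "$normal_output"
echo "exit status: $diff_status"
```

This should print the same four lines as the listing above (2c2, < Ab, ---, > aB), followed by an exit status of 1. The exit status is handy in scripts, where you often only care whether two files differ and not how.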

Two options for diff are useful: the -c option (the context format) and the -u option (the unified format). Both change how diff presents its output. Here is an example of diff -c:

mislav@mislavovo-racunalo:~/Linux_folder$ cat 1264.txt

1

2

6

4

mislav@mislavovo-racunalo:~/Linux_folder$ cat 2345.txt

2

3

4

5

mislav@mislavovo-racunalo:~/Linux_folder$ diff -c 1264.txt 2345.txt

*** 1264.txt 2020-02-05 23:09:02.454637685 +0100

--- 2345.txt 2020-02-05 23:06:15.415147970 +0100

***************

*** 1,4 ****

- 1

2

! 6

4

--- 1,4 ----

2

! 3

4

+ 5

Here is the meaning of the output: First of all, the *** denote the first file, while the --- denote the second file. That means that *** 1,4 **** means lines 1 through 4 of the first file, while --- 1,4 ---- means lines 1 through 4 of the second file. Now for the actual file contents: a - in front of a line means that the line appears in the first file, but not in the second file. A + means that the line appears in the second file, but not in the first file. A ! means that a line (or lines) changed between the first and the second file. A line with no marker is the same in both files (Shotts, 2019).
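The context format example can be reproduced with the sketch below. It assumes GNU diff (some minimal implementations do not support -c) and a POSIX shell; workdir and context_body are names invented for this illustration. The first two lines of the real output contain file names and timestamps that differ on every machine, so the sketch skips them and keeps only the part described above.

```shell
#!/bin/sh
# Work in a throwaway directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate the two files from the example above.
printf '1\n2\n6\n4\n' > 1264.txt
printf '2\n3\n4\n5\n' > 2345.txt

# tail -n +3 drops the two header lines with the timestamps.
# diff exits with 1 because the files differ; "|| true" keeps the
# script going even if the shell is set to stop on errors.
context_body=$(diff -c 1264.txt 2345.txt | tail -n +3) || true

printf '%s\n' "$context_body"
```

The printed block starts with the row of asterisks and then shows the two sections, one per file. Note that every line starts with a two-character marker, so unchanged lines are indented by two spaces.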

Here is an example with the -u option:

mislav@mislavovo-racunalo:~/Linux_folder$ cat 1264.txt

1

2

6

4

mislav@mislavovo-racunalo:~/Linux_folder$ cat 2345.txt

2

3

4

5

mislav@mislavovo-racunalo:~/Linux_folder$ diff -u 1264.txt 2345.txt

--- 1264.txt 2020-02-05 23:09:02.454637685 +0100

+++ 2345.txt 2020-02-05 23:06:15.415147970 +0100

@@ -1,4 +1,4 @@

-1

2

-6

+3

4

+5

A - indicates that a line was removed from the first file and a + indicates that a line was added to it. Lines without a marker are the same in both files. This means that if we were to take the first file, remove the lines with the minuses and add the lines with the pluses, we would get the contents of the second file.
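To check that claim by hand, here is a small sketch that rebuilds the second file from the unified output using only tail, grep and cut. It is an illustration and not a replacement for a real patching tool: it only works here because the whole file fits into a single hunk. The names workdir, hunk_body and rebuilt are invented for this example, and a POSIX shell with diff -u support is assumed.

```shell
#!/bin/sh
# Work in a throwaway directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate the two files from the example above.
printf '1\n2\n6\n4\n' > 1264.txt
printf '2\n3\n4\n5\n' > 2345.txt

# Skip the two header lines and the @@ line, keeping only the hunk body.
# diff exits with 1 because the files differ, hence the "|| true".
hunk_body=$(diff -u 1264.txt 2345.txt | tail -n +4) || true

# Drop the lines marked with -, then strip the one-character marker
# (a space or a +) from the lines that remain.
rebuilt=$(printf '%s\n' "$hunk_body" | grep -v '^-' | cut -c2-)

printf '%s\n' "$rebuilt"
```

The result is the four lines 2, 3, 4 and 5, which is exactly the content of 2345.txt.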

That’s pretty much it. I think I have covered all you need. You won’t be using diff much if you’re not a software developer anyway, and if you are a software developer, this is all you need to know in my opinion.

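In case you are really curious: 1,5c1,4 in the first example is a change command. It says that lines 1 through 5 of the first file have to be changed (the c) to match lines 1 through 4 of the second file. The patch command can read such instructions and apply them to a file. Since I never used the patch command, I won’t cover it. If you ever need to use it, you know it exists and Google is your friend.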

Hope you learned something new!

References

Shotts, W. (2019). The Linux Command Line, Fifth Internet Edition. Retrieved from http://linuxcommand.org/tlcl.php. Pages 319-321

Understanding of diff output. (n.d.). Retrieved February 4, 2020, from https://unix.stackexchange.com/questions/81998/understanding-of-diff-output