Project 24 View File Differences
“I recently made some changes to a file. Can I compare the new file with a copy of the original file to recall the changes I made?”
This project shows you how to compare two files and view their differences in several ways. It shows how to make a patch file to bring an older version of the file up to date and how to merge two sets of changes. It covers the commands diff, sdiff, and patch. On related themes: Project 25 employs the command diff3 to compare two sets of changes against a common ancestor; and Project 26 shows how to sort text files and how to pick out commonalities and differences between sorted text files.
What’s the Difference?
We use the diff command to compare two files, or two versions of the same file, to discover the differences between them. It reports on lines that have been added, deleted, or changed. It accepts binary files too, though for those it normally reports only whether they differ; we’ll stick to text files in this project, because it’s simpler to illustrate the principles involved by using files that can be displayed. Differences can be reported line by line or using a comparative side-by-side view. diff can also compare entire directory structures, reporting the differences between similar files and listing those files that appear in one directory but not the other.
The diff command can create a patch file describing the differences between a newer file and an older file. Applying the patch to the older file brings it up to date, which is useful to keep other people updated without needing to redistribute a potentially large original file.
We also look at the sdiff command, which interactively merges two files and allows the user to select lines from either file to create the new file.
Let’s use diff to compare two similar files. In the following example, file index.php (you might recognize it as a PHP script) was updated from an original saved as index.old.php. diff reports on lines that have been deleted, added, and changed in the new version compared with the old version. To view the changes, simply give the two filenames as arguments to diff, but make sure that the older file is the first argument; otherwise, the senses of added and deleted are reversed.
$ diff index.old.php index.php
22d21
< <img id="photo2" class="abs" src="/images/front/body....
52a52,53
> <a id="box-contact" class="abs" href="tails/contact...
> <a id="box-who" class="abs" href="tails/who.php"></a>
61c62
< ("select who, quote from quotes where id = $quote");
---
> ("SELECT who, quote FROM quotes WHERE id = $quote");
The first line, 22d21, gives the line numbers in the old file (22) and the new file (21) where the first difference was detected. Letter d says that line 22 was deleted from the old version and so is not in the new version; had it survived, it would follow line 21 of the new file. The deleted text follows on the next line, marked with <. Next, 52a52,53 shows that two lines were added (letter a) after line 52 of the old file, where they are now lines 52 and 53 of the new file; the additions are displayed on the two subsequent lines, marked with >. Finally, 61c62 indicates that line 61 was changed (letter c) and that it is now line 62 in the new file; the two lines that follow, separated by ---, display the original line and its replacement.
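To see all three change types on files small enough to check by hand, here is a minimal sketch. The file names old.txt and new.txt and their contents are invented for illustration and are not part of the index.php example. The `|| true` guard is there only because diff exits with status 1 when the files differ, which would stop a script that runs with `set -e`.

```shell
# Hypothetical demo files; names and contents are made up for illustration.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

printf '%s\n' one two three four five > old.txt
printf '%s\n' one three FOUR five six > new.txt

# Older file first, so "deleted" and "added" read the right way round.
# diff exits with status 1 when the files differ, hence "|| true".
report=$(diff old.txt new.txt || true)
printf '%s\n' "$report"
# Prints:
# 2d1
# < two
# 4c3
# < four
# ---
# > FOUR
# 5a5
# > six
```

Swapping the two arguments turns the deletion into an addition and the addition into a deletion, which is why the order of the filenames matters.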
Option -i tells diff to ignore changes that involve only case. Adding it to the previous example causes the changes detected in line 61 to be ignored.
$ diff -i index.old.php index.php
22d21
< <img id="photo2" class="abs" src="/images/front/body....
52a52,53
> <a id="box-contact" class="abs" href="tails/contact...
> <a id="box-who" class="abs" href="tails/who.php"></a>
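The effect of -i is easiest to see on a pair of files that differ in case only. The following sketch uses two invented files, old.sql and new.sql, purely for illustration.

```shell
# Hypothetical demo files that differ only in letter case.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

printf '%s\n' 'select who from quotes' 'limit 1' > old.sql
printf '%s\n' 'SELECT who FROM quotes' 'limit 1' > new.sql

# Without -i, the first line counts as changed.
plain=$(diff old.sql new.sql || true)
# With -i, the two files compare as equal, so diff prints nothing.
folded=$(diff -i old.sql new.sql || true)

printf 'plain diff:\n%s\n' "$plain"
printf 'diff -i printed %d characters\n' "${#folded}"
```

The plain comparison reports a 1c1 change, whereas the -i comparison produces no output at all.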
Option -q (for quiet) causes diff to report whether the files differ but not show the differences.
$ diff -q index.old.php index.php
Files index.old.php and index.php differ
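In a script, the quiet form pairs naturally with diff’s exit status: 0 when no differences are found, 1 when the files differ, and 2 when something went wrong. Here is a small sketch of that idea; the helper function compare and the three file names are hypothetical, chosen only for this illustration.

```shell
# Hypothetical demo files and helper; none of these names come from the project.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

printf 'same\n' > a.txt
printf 'same\n' > b.txt
printf 'different\n' > c.txt

# Report in words whether two files match. The else branch is also taken
# when diff fails outright (exit status 2), for example on a missing file.
compare() {
    if diff -q "$1" "$2" > /dev/null; then
        echo "$1 and $2 are identical"
    else
        echo "$1 and $2 differ"
    fi
}

same_report=$(compare a.txt b.txt)
diff_report=$(compare a.txt c.txt)
quiet_message=$(diff -q a.txt c.txt || true)
printf '%s\n' "$same_report" "$diff_report" "$quiet_message"
# Prints:
# a.txt and b.txt are identical
# a.txt and c.txt differ
# Files a.txt and c.txt differ
```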
Differences Side by Side
diff can report file differences by displaying the two files in two columns, side by side. This visual representation is easier to understand but demands a wider terminal screen. Specify option -y to activate side-by-side mode. You might find it prudent to specify wider columns than the default 64 characters each, using option -W, and to increase the width of the terminal window accordingly. Because a printed book is fixed width, I’ll specify a somewhat-reduced total width of 60 (30 for each column). The GNU-style multi-letter option --suppress-common-lines tells diff to display only changes, not common lines. That shortens the output, but at the cost of the context for the changes.
$ diff -yW60 --suppress-common-lines index.old.php index.php
<img id="photo2" class="ab <
                           > <a id="box-contact" class=
                           > <a id="box-who" class="abs
("select who, q            | ("SELECT who, q
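In the middle gutter, < marks a line found only in the left (old) file, > marks a line found only in the right (new) file, and | marks a line that differs between the two. The sketch below runs both the full and the abbreviated side-by-side listing on two invented files (old.txt and new.txt are assumptions for illustration), assuming a diff that supports -y, as GNU diff does. The exact column spacing depends on the width you choose, so no expected output is shown.

```shell
# Hypothetical demo files; names and contents are made up for illustration.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

printf '%s\n' one two three four five > old.txt
printf '%s\n' one three FOUR five six > new.txt

# Full side-by-side listing, 40 columns wide in total.
full=$(diff -y -W 40 old.txt new.txt || true)
# The same comparison with unchanged lines left out.
brief=$(diff -y -W 40 --suppress-common-lines old.txt new.txt || true)

printf '%s\n' "$full"
echo '----'
printf '%s\n' "$brief"
```

The full listing has one row per line of the merged view, so the unchanged lines give each change its surroundings; the brief listing keeps only the three rows that carry a gutter mark.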
If you wish to compare two sets of similar files, diff will scan two directories, comparing files of the same name and also reporting on files that have no counterparts in the other directory. First, we’ll list the contents of two directories, old and new, and then use diff to compare their contents.
$ ls new old
new:
index.php       new-file.php
old:
index.php       old-file.php
$ diff old/ new/
diff old/index.php new/index.php
22d21
< <img id="photo2" class="abs" src="/images/front/body....
52a52,53
...
Only in new/: new-file.php
Only in old/: old-file.php
To perform a full recursive search and compare directories within directories too, specify the option -r.
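As a rough sketch of a recursive comparison, the following builds two small directory trees and compares them with -r, adding -q so that each differing file gets a one-line report. The directory includes and every file name here are invented for the illustration.

```shell
# Hypothetical directory trees; all names are made up for illustration.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

mkdir -p old/includes new/includes
printf 'version 1\n' > old/includes/header.php
printf 'version 2\n' > new/includes/header.php
printf 'retired\n'   > old/old-file.php
printf 'fresh\n'     > new/new-file.php

# -r descends into includes/; -q keeps the report to one line per file.
report=$(diff -r -q old new || true)
printf '%s\n' "$report"
# The report contains these lines (their order may vary):
# Files old/includes/header.php and new/includes/header.php differ
# Only in new: new-file.php
# Only in old: old-file.php
```

Without -r, diff would not look inside includes at all, so the change to header.php would go unreported.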
Make and Apply Patches
Suppose that your friend has an outdated version of a file that’s several megabytes. Rather than send her an updated copy of the entire file, Unix lets you send her just the changes in the form of a patch, which she can apply to bring her old file up to date. Command diff is used to create a patch, and command patch applies it. Here’s an example in which we patch index.old.php to bring it up to date.
First, we capture the output from diff in a file called patchfile. It’s important to specify option -C3, which tells diff to include three lines of context around each change. patch uses this context to check that it is changing the right lines, and to find them even if they have moved slightly in the recipient’s copy.
$ diff -C3 index.old.php index.php > patchfile
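It can help to look inside a context diff before sending one. In this format, the header names the two files (lines starting *** and ---), and within each hunk a leading - marks a line to delete, + a line to add, and ! a line that changes; unmarked lines are the surrounding context. The sketch below prints a context diff for two invented files, old.txt and new.txt, which are assumptions for illustration only. The header lines include timestamps, so they will differ from run to run.

```shell
# Hypothetical demo files; names and contents are made up for illustration.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

printf '%s\n' one two three four five > old.txt
printf '%s\n' one three FOUR five six > new.txt

# Capture the context diff (three lines of context) and display it.
patch_text=$(diff -C3 old.txt new.txt || true)
printf '%s\n' "$patch_text"
```

Because the files are so short, all three changes fall within one hunk: the old-file half shows "- two" and "! four", and the new-file half shows "! FOUR" and "+ six".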
Next, send the (very small) patch to your friend, who updates her version of the file. (Her copy is probably called index.php, but we’ll continue to use index.old.php to distinguish the two versions.)
$ patch index.old.php patchfile
patching file index.old.php
And to show that it worked (no output from diff means that the two files are now identical):
$ diff index.old.php index.php
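The whole round trip can be rehearsed on throwaway files before you try it on real ones. This is a minimal sketch under stated assumptions: old.txt, new.txt, and friend.txt are invented names, with friend.txt standing in for the recipient’s outdated copy, and the patch command is assumed to be installed.

```shell
# Hypothetical demo files; names and contents are made up for illustration.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

printf '%s\n' one two three four five > old.txt
printf '%s\n' one three FOUR five six > new.txt
cp old.txt friend.txt    # stands in for the friend's outdated copy

# Sender: record the changes as a context diff.
# diff exits with status 1 when the files differ, hence "|| true".
diff -C3 old.txt new.txt > patchfile || true

# Recipient: apply the patch to the outdated copy in place.
patch friend.txt patchfile

# cmp -s is silent and succeeds only if the two files are identical.
if cmp -s friend.txt new.txt; then
    result='friend.txt is up to date'
else
    result='friend.txt still differs'
fi
echo "$result"
```

Note that patch rewrites friend.txt in place, so keep a copy if you might want the old version back.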
Merge with sdiff
If you are unsure which of two files contains your latest changes, or if you want to keep changes to both files in a pair, Unix provides two useful techniques. If both files are descended from a common ancestor, take a look at the diff3 command, demonstrated in Project 25. Otherwise, use sdiff to do an interactive merge of the files, in which you can choose to incorporate differences found in either file. The contents of both files are displayed in a two-column format like what we saw for diff -y.
In the next example, we merge two files into a new file called index.new.php, introduced by option -o. Option -s says to skip (not query or show) common lines, and option -w adjusts the column width. (Rather irritatingly, diff uses -W, whereas sdiff uses -w, so don’t get mixed up!)
Pressing Return at sdiff’s prompt (%) causes it to display a menu of options. In this simple case, we answer r every time to select the changes from index.php. Pressing l would select the changes from index.old.php. The output is written to index.new.php.
$ sdiff -s -w60 -o index.new.php index.old.php index.php
<img id="photo2" class="ab <
%
l:      use the left version
r:      use the right version
e l:    edit then use the left version
e r:    edit then use the right version
e b:    edit then use the left and right versions concatenated
e:      edit a new version
s:      silently include common lines
v:      verbosely include common lines
q:      quit
%r
                           > <a id="box-contact" class=
                           > <a id="box-who" class="abs
%r
("select who, q            | ("SELECT who, q
%r
And to show that it worked:
$ diff index.new.php index.php
$
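If you already know the answers, the merge can be rehearsed without typing at the prompt. The sketch below assumes that sdiff reads its answers from standard input, as GNU sdiff does; other implementations may behave differently. The files old.txt, new.txt, and merged.txt are invented for illustration. The two inputs differ in three separate places, so three answers are supplied, all of them r.

```shell
# Hypothetical demo files; names and contents are made up for illustration.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

printf '%s\n' one two three four five > old.txt
printf '%s\n' one three FOUR five six > new.txt

# Three differing regions, so three answers: take the right-hand side each time.
# sdiff exits with status 1 when the inputs differ, hence "|| true".
printf 'r\nr\nr\n' | sdiff -s -o merged.txt old.txt new.txt > /dev/null || true

# Choosing the right side everywhere should reproduce new.txt exactly.
if cmp -s merged.txt new.txt; then
    result='merged.txt matches new.txt'
else
    result='merged.txt differs from new.txt'
fi
echo "$result"
```

Changing any of the answers to l keeps the old file’s version of that region instead, which is how a genuine merge of two sets of changes would proceed.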