Aug 31 2010

DiffPDF Windows version

Introduction

DiffPDF compares two PDF files, either textually or visually. It is developed by Mark Summerfield, who said: “In theory DiffPDF could be built on Windows, but I haven’t managed to do it, so no Windows executable is available. The problem is that I don’t know how to build the Poppler library on Windows (apparently it isn’t easy!).” So I put in the effort to compile it on Windows. As far as I know, it works on Windows XP and Windows 7 (both 32-bit and 64-bit).

DiffPDF can compare two PDF files. It offers two comparison modes: Text and Appearance.

By default the comparison is of the text on each pair of pages, but comparing the appearance of pages is also supported (for example, if a diagram is changed or if a paragraph is reformatted). It is also possible to compare particular pages or page ranges. For example, if there are two versions of a PDF file, one with pages 1-12 and the other with pages 1-13 because of an extra page having been added as page 4, they can be compared by specifying two page ranges, 1-12 for the first and 1-3, 5-13 for the second. This will make DiffPDF compare pages in the pairs (1, 1), (2, 2), (3, 3), (4, 5), (5, 6), and so on, to (12, 13).
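
The page-range pairing described above can be sketched in a few lines of Python. This is only an illustration of the idea, not DiffPDF's actual implementation: the function names `parse_page_ranges` and `pair_pages` are hypothetical, and the sketch assumes a range specification is a comma-separated list of single pages and `first-last` ranges.

```python
def parse_page_ranges(spec):
    """Expand a spec such as "1-3, 5-13" into a flat list of page numbers."""
    pages = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            first, last = (int(n) for n in part.split("-", 1))
            pages.extend(range(first, last + 1))
        else:
            pages.append(int(part))
    return pages


def pair_pages(spec1, spec2):
    """Pair the n-th selected page of file 1 with the n-th selected page of file 2."""
    return list(zip(parse_page_ranges(spec1), parse_page_ranges(spec2)))


# The example from the text: page 4 of the second file is the extra page.
pairs = pair_pages("1-12", "1-3, 5-13")
print(pairs)
```

Skipping page 4 in the second range is what shifts every later pair by one, so page 4 of the first file is compared with page 5 of the second.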

A couple of example PDF files are provided (boson1.pdf and boson2.pdf) so that you can try it out. PDF files can be loaded from the GUI (by pressing the File #1 and File #2 buttons), or by specifying them on the command line. I use the tool regularly to compare different versions of my books (which are typically 500 or more pages), e.g., comparing a first printing with a second printing, to make sure that only the pages I intended to change have actually been changed.

DiffPDF is licensed under the GNU General Public License v2, an open source license. To build it you will need a modern C++ compiler, the Qt 4 libraries (at least Qt 4.4; 4.6 or later recommended), and the Poppler library; all of these should be available as standard packages on most Linux and BSD systems. The source is diffpdf-1.8.0.tar.gz (40K). Building follows the standard pattern for Qt 4 applications and is explained in the README. Fedora, Debian, and Ubuntu users don’t have to build it manually; they can simply install the binary diffpdf package using their preferred package management tool, but make sure you get at least version 1.8.0!

Screen Shot

Awards

SOFTPEDIA "100% FREE" AWARD

Download

 

It is open source software, but any donation is welcome.

Change log

1.8.0

- Introduced zoning: this can be slow in Characters mode but can provide better accuracy in text modes.

1.7.1

- Improved Characters mode to be as liberal about hyphens as Words mode.

- Minor doc changes that I forgot for 1.7.0.

- Minor GUI bug fixes and changes.

- Minor under the hood efficiency improvements.

1.7.0

- Renamed Text mode to Words mode. This mode is best for alphabetic text (e.g., English).
- Added Characters mode. This mode is best for logographic text (e.g., Chinese and Japanese). This was suggested by Paul Howarth.

1.6.3

- Very minor cosmetic changes.

1.6.2

- Save button is now only enabled if there are changes.
- An improved help window with slightly more information.

1.6.1

- Fixed a subtle bug that caused inaccurate highlight positioning in rare cases.

1.6.0

- In addition to using highlighting, Appearance comparisons can now
also use some composition modes to help make subtle differences more
visible. (This was inspired by feedback from Florian Heiderich.)
- Can now control the square size for appearance comparisons. Using very
small squares can help reveal tiny Appearance differences.
- Can now control the fill opacity for highlighting.
- Extended zoom range from 20% to 800% (was 25% to 400%).

1.5.1

Quite a few false-positives have now been eliminated:
- All hyphens are treated the same now.
- Some weirdly-encoded open/close double-quotes are treated as normal
Unicode open/close double-quotes.
- Improved whitespace ignoring.

1.5.0

Added Save As action and dialog for saving the differences to a PDF file.

1.3.0

Added support for -a, --appearance, -t, --text command line options to
set the initial comparison mode.

1.2.3

Added a mention of comparepdf in the About box.

1.2.2

Tiny change in how command line arguments are handled. This should make things work with both English and non-English locales on Windows.

1.2.1

Minor improvement with focus control for Previous and Next buttons.
Acknowledged Steven Lee for building Windows binaries and Dirk Loss for building Mac OS X binaries in the About box.

1.2.0

Got rid of old text comparison mode.
Various small user interface improvements.
Fixed a subtle bug that meant the combine text highlighting setting
wasn’t saved and restored correctly.


Mar 30, 2011 DiffPDF 1.1.5 released.

Nov 18, 2010: the Windows binary now includes all dependent libraries. Thanks to Fallen Nerd for the comment.

Review

AZ Lynx: Found it here this morning. Now using it at work. One of my bosses sometimes forgets that he has sent us an updated file – sometimes with different filenames. DiffPDF made it simple to compare two 127 page documents and find that, yes, he did it again.

66 Comments

  • By Fallen Nerd, October 30, 2010 @ 10:40 pm

    Can’t run it on WindowsXP SP3:
    “This application has failed to start because libxml2.dll was not found.”

  • By Fallen Nerd, October 30, 2010 @ 11:17 pm

    Finally got it running!
    The missing DLLs are:
    libxml2.dll, iconv.dll, zlib1.dll,
    jpeg62.dll, libpng12.dll
    Links to download them can be found at: http://gnuwin32.sourceforge.net/packages.html

  • By admin, November 10, 2010 @ 11:08 pm

    Thanks for your comment; that was my negligence.

  • By fefelix, February 8, 2011 @ 11:29 pm

    How can i compile poppler for windows?

  • By JC, February 19, 2011 @ 9:29 pm

    Thanks!

    The DiffPDF current version is 1.1.5.

    Any hope for it?

    Thanks in advance!

  • By admin, February 21, 2011 @ 1:02 am

    It seems there is no need to update the compiled binary. Anyway, thanks for the information.

    Slightly changed the .pro file to make it a tiny bit more helpful.
    Added a Help button and separated Help and About information.
    Updated the README for Mac users (thanks to Dirk Loss for the info).

  • By s.m, February 27, 2011 @ 8:55 pm

    Nifty utility. Thank u !!

  • By Hans, April 3, 2011 @ 6:03 pm

    Can this tool print the changes or the changed pages, too?

  • By Joe, April 9, 2011 @ 2:21 am

    Yes, the ability to print the output of this utility would be indispensable. As it is now, it is kind of a neat tool, but not very useful. It REALLY needs a way to be able to print the output for filing or for others to use.

  • By Speci, April 14, 2011 @ 1:52 pm

    Hi,

    DiffPDF is a great tool! We have a special use case with hundreds of people who could need this tool every day. Unfortunately, the PDF files we need to compare take many minutes to load in DiffPDF (although they have just a couple of pages), which makes it useless for us. Do you have any idea why loading these files takes so long? I could send you some files for analysis…

    Regards
    Speci

  • By Jerry B, April 19, 2011 @ 6:13 pm

    Hi

    I just used your App for comparing two 52 page PDF files and got excellent results.

    I do have one suggestion. In the controls section it would be helpful to have two command buttons in addition to the pages drop-down combo box. These buttons I propose to be “Next” and “Previous”. “Next” would give the next comparison file page and “Previous” the previous comparison file page. Going through 52 pages by using the combo box is really tedious. I would not get rid of the combo box as it is needed for access to a single compare file page. I have done this in a couple of applications.

    Once again, the program is great.

    Jerry B

  • By Jerry B, April 20, 2011 @ 10:21 pm

    Hi Mark,

    Thanks for accepting my suggestion. I just tried it out and it works great.

    Jerry B

  • By Make, May 21, 2011 @ 1:00 am

    This version has the following 2 issues.
    1. It cannot detect whether the font is “Bold” or not.
    2. It cannot detect an extra space after a word.
    Can you please fix those?
    Thanks
    Make

  • By admin, May 24, 2011 @ 12:15 am

    I have forwarded the message to the developer. Thanks for your report.

  • By Oisín, June 15, 2011 @ 11:55 pm

    Hi,

    Brilliant, this is going to be really useful when submitting or receiving revised documents.
    Would it be possible to have the diff computed on a whole-document rather than page-by-page basis though?
    As it is, if document B is the same as document A, except there is one extra page added near the start of document B, then all the text after that will appear to differ, even though it’s identical, just on different pages.

  • By Make, June 16, 2011 @ 12:44 pm

    Can you please let me know when can i expect the next release of this tool?

    Thanks
    Make

  • By Rob P, June 21, 2011 @ 5:05 am

    Excellent utility.

    A couple of enhancement suggestions that would help my use cases are when running from the command line to have;
    (1) an option to set Appearance comparison
    (2) an option to write a summary text files with the names of the files compared, with the summary result “Files differ on XX pages (YY pages were compared).” to a file, and to not start the GUI.

    Item (2) is for the use case where I have a lot of PDF’s to compare (~50), and most of these will be exactly the same (~45) and only a small number are different (~5).

  • By admin, June 21, 2011 @ 6:54 am

    sure, and Diffpdf 1.2.2 is ready.

  • By Make, June 21, 2011 @ 8:16 am

    Thanks for the updated version, but it still has the 2 issues I posted about earlier.
    1. It cannot detect whether the font is “Bold” or not in the original and compared PDF.
    2. It cannot detect an extra space after a word. It should highlight a word that has an extra space.
    It would be really helpful if you could fix these issues.
    Thanks a lot.

    regards
    Make

  • By Mark Summerfield, June 21, 2011 @ 4:54 pm

    I’m the author of DiffPDF.

    Regarding the suggestions made above:

    (1) “Joe”: I’d consider adding printing, but what would be required?
    Print all differences (i.e., as 2 pages scaled to fit side-by-side)?
    (2) “Speci”: By all means send me one or two samples that load slowly
    for you.
    (3) “Make”: What you are asking for isn’t possible.
    (4) “Oisín”: If you have a document that only differs in one or a few
    pages you can use page ranges to include/exclude pages from the
    comparison.
    (5) “Rob P”: I’m planning to add a command line option to give the
    choice of appearance or text comparison. I am not planning a non-GUI
    version.

    Please reply to me directly (my email’s on my web site http://www.qtrac.eu)
    since I don’t often read the blog.

  • By admin, June 21, 2011 @ 5:17 pm

    Thanks for Mark’s reply

  • By doortokaos, June 27, 2011 @ 4:55 pm

    Thanks a lot for the windows build.

  • By AlasdairCM, June 27, 2011 @ 10:42 pm

    I have tried to download this, and tried to run it, but my Kaspersky antivirus says that it has
    Hoax.Win32.ArchSMS.ikox
    inside the .exe and won’t let me use it.

  • By admin, June 27, 2011 @ 10:59 pm

    The current version is packed with the latest UPX. Please try the unpacked version, http://soft.rubypdf.com/download/diffpdf/diffpdf-1.2.2-windows-unpack.zip, and let me know if you still get the same issue. By the way, you are the first one to report this issue.

  • By David Luu, July 12, 2011 @ 8:25 am

    Thanks for compiling the Windows version. I was wondering if diffpdf, whether on Windows or other platforms, has a command line (CLI) counterpart? A CLI version would be useful for scripting and automation of diffs of PDFs (for testing or batch jobs).

  • By admin, July 12, 2011 @ 10:33 am

    will support CLI in next release

  • By David Luu, July 16, 2011 @ 10:44 am

    Good to know, what’s the roadmap/timeline for next release?

  • By David Luu, July 18, 2011 @ 2:13 pm

    Just wanted to mention that I found another tool that works similar and has a command line interface (CLI) already. Those who use this tool might want to check out the other as well, which has a Windows executable too. https://github.com/vslavik/diff-pdf, would be ideal to have a tool that merges the best of both tools (that and this).

  • By admin, July 18, 2011 @ 3:00 pm

    I noticed it before, but its author has not updated the source code for a long while. How is it?

  • By Marjorie, July 19, 2011 @ 9:17 pm

    Hello,
    I noticed problems when comparing PDFs that do not have exactly the same layout (for example, if the last line of a page in one document is the first line of the next page in the other document).
    Has anybody noticed the same problem? Do you know of any software that doesn’t have that problem?
    Thanks

  • By Kurt, August 9, 2011 @ 3:48 am

    Is it possible to define an origin point within DiffPDF? I am seeing cases where the PDFs are completely highlighted because one is shifted ever so slightly compared to the other.

  • By Jennifer, September 9, 2011 @ 12:56 am

    So far I’m pretty pleased with DiffPDF. The program is simple, clean and very user friendly, and there are no features lacking. (Of course it is not as deluxe as Adobe’s own tool, but in its own way that’s a plus!) My only complaint would be the false positives that it registered, but these weren’t serious, usually amounting to the wrongful highlight of a single word. I was impressed by the good design and the ease of use.

  • By admin, September 9, 2011 @ 1:25 am

    thanks for your comment, and sorry for my poor English, could you explain more about “My only complaint would be the false positives that it registered”, thanks in advance.

  • By Liz, September 21, 2011 @ 4:17 am

    Thank you so much for this amazing tool. My life just got soooooo much easier!

  • By Jery, September 26, 2011 @ 6:54 am

    Is the source available for the Win version? Can’t get the tar.gz to uncompress. Getting invalid archive directory. Suggestions?

  • By admin, September 26, 2011 @ 10:57 am

    I use 7zip to open it, fine, tar also works fine.

  • By Jerry, September 26, 2011 @ 6:54 pm

    Thanks, 7zip did the trick.

  • By Dave, October 3, 2011 @ 3:54 am

    I just found and used the Win32 1.2.2 build. Thank you! I needed to inspect two versions of a 40-page publication proof to verify that they differed by just one word, and DiffPDF confirmed that change in less than four seconds. Fantastic!

  • By C. L., October 12, 2011 @ 10:43 pm

    I have trouble using the poppler library for Qt4 from here:
    http://people.freedesktop.org/~aacid/docs/qt4/

    Could you provide a download including poppler files so that the directory structure is there for Windows?

    Thanks!

  • By Thinker, October 13, 2011 @ 11:43 am

    May I suggest the command option of having the software start with “Appearance” mode and difference displayed? Thanks for the great tool.

    Can you also include the binaries of the dependent libraries?

  • By Brian, October 16, 2011 @ 5:33 am

    Unfortunately, only works on single pages. Would like to see a Longest Common Subsequence implementation that works on the entire document. Not very useful otherwise.

  • By admin, October 16, 2011 @ 9:28 am

    it works on entire document, please check it again.

  • By admin, October 16, 2011 @ 9:32 am

    Mark develops it and I compile it; I have no time to develop it myself right now. The binary does not depend on any third-party library. Do you mean the libraries needed to compile DiffPDF?

  • By admin, October 16, 2011 @ 9:34 am

    ok, I will provide it later.

  • By Andi, November 7, 2011 @ 5:21 pm

    Many thanks to Mark Summerfield and Steven Lee for providing this great software to compare pdf-files. Easy to use and it saved me a lot of time. Thank you again and keep up the great work!

  • By Carl, November 18, 2011 @ 1:28 am

    First of all thanks for providing such a good program.
    I am trying to compile it in windows but I am having trouble.
    I get the following error: “fatal error LNK1181: cannot open input file ‘poppler-qt4.lib’”.
    I am compiling poppler using Cygwin but the poppler-qt4.lib file is not being generated.
    Do you have any tips?
    Thanks in advance for your help.
    Carl

  • By Tony Hynes, November 19, 2011 @ 8:11 pm

    Hello,

    Just checked the Windows version and this software is excellent.

    Only one issue: there is no option to save the output, yet I have seen in other screenshots of the software that it is provided.

    Has this been removed?

  • By Bernhard, November 21, 2011 @ 4:19 pm

    Santa Claus is really early this year.

    I am looking forward to checking out v1.5 on Windows.

    Any idea when it is available?

  • By admin, November 22, 2011 @ 3:10 am

    It is ready now.

  • By Bernhard, November 22, 2011 @ 4:08 pm

    That was quick.

  • By Philippe, November 22, 2011 @ 4:34 pm

    Thanks to Mark and yourself for both the update and the compilation!

  • By Philippe, November 22, 2011 @ 4:48 pm

    Hi,
    I am not happy with the way the PDF files are saved. When printing a landscape page, it is stretched in the diff output.
    To stop bothering Mark (the author) I’d like to be able to modify and compile the code myself, then propose a patch.

    To ease this process, could you share a little word about the efforts you put in compiling the program on windows ?
    What compiler / dependencies are required ?
    Did you have to adapt some code ?

  • By Rich, November 27, 2011 @ 12:33 pm

    Very useful. Have installed it on OS X. Can you please make a 64-bit Windows executable version available? Thanks

  • By admin, November 29, 2011 @ 1:38 am

    The Win32 version also runs on 64-bit Windows. If necessary, I will compile a 64-bit version.

  • By admin, November 29, 2011 @ 1:42 am

    If you want to compile the Windows version yourself, please try http://mingw-cross-env.nongnu.org/ . I did not modify the source code; I only made a small modification to diffpdf.pro to make it easier to create the makefile.

  • By jayasree, November 29, 2011 @ 2:22 pm

    How can I find the same phrases and identical quotes present in two different PDFs?

  • By Will, December 2, 2011 @ 12:45 am

    This looks like exactly what I need; but I need to call this functionality from an existing .Net application. Does Diffpdf support an API for file comparison?

  • By paul howarth, December 5, 2011 @ 12:51 pm

    I have installed on Windows 7 and it is working well. I am comparing Japanese documents, and you will be pleased to hear that even these display and compare reliably.

    However, character-by-character difference highlighting would be very useful to me. The current ‘text’ comparison seems to assume that everything between two spaces, or an end-of-line (plus other stuff perhaps) is part of the same word, and highlights the entire word when a single mistake is found. The problem is that Japanese usually does not have spaces between words. This usually results in the entire line after a difference being highlighted.

    Sometimes, the ‘appearance’ comparison option can help, but as I am doing translation work, I am mostly interested in the text, and the ‘appearance’ option can highlight a lot of things that I don’t need.

    Any chance of getting this change?

  • By paul howarth, December 5, 2011 @ 2:17 pm

    Oops. Minor typo in my previous message which could cause confusion. “when a single mistake is found” should read “when a single difference is found”..

  • By Geoffrey Coan, December 18, 2011 @ 8:19 am

    An excellent program, really quick and the view on screen is perfect.
    Now for the bad news. When comparing two quite large documents (196 and 210 pages) that are different revisions of the same document I found that after 30 pages or so the comparison was completely screwed up and every single page thereafter was marked as a change.
    It looks like the comparison was just made of p30 to p30/p31 and if the matching text is on the bottom of p31 and spreading onto p32 then it gets confused.
    Is it possible to configure or change the “window” by which the comparison is made?

  • By Guest, January 7, 2012 @ 4:19 am

    At work, we make per-location changes to PDFs, and this tool is perfect to figure out which locations can share the same PDF in production. One thing I noticed that could use improving is to remember the last folder per side. For example, I might view the following PDFs:

    …/Territory 35/Brochure 1.pdf (on the left)
    …/Territory 67/Brochure 1.pdf (on the right)

    After determining these are the same, I move on to the next PDF for comparison. I click on “File #1…”, select “Brochure 2.pdf” in the dialog, then do the same for the other side. Once I click “Open” on the second side, I get a “cannot compare file to itself” error. Confused, I click on “File #1…” again, and notice I’m in the Territory 67 folder, not the Territory 35 folder as I originally was. It only remembered the location from the last time the open dialog popped up period, instead of remembering the location from the last time I selected a file for that side like I had expected.

    I would greatly appreciate it if you could add a checkbox to the Options dialog that read something to the effect of “Remember folder per side” that determines if it remembers one last folder overall (1 total) or one per side (2 total) and is checked by default. In most cases this would be checked, but if e.g. you had to select sequential folders (001, 002, 003, etc.), the end user would uncheck this box so as not to alternate sides.

  • By admin, January 7, 2012 @ 5:35 am

    Please have a look at the command line usage; maybe it can simplify your job.

    Although DiffPDF is a GUI program, if run from a console with two PDF files listed on the command line, DiffPDF will start up and immediately compare them in Words mode, or in Appearance mode if their names are preceded with -a or --appearance on the command line, or in Characters mode if their names are preceded with -c or --character on the command line.
    If you’re specifically looking for a command line PDF comparison tool, e.g., for automated testing, try comparepdf.
    There are also debugging options. Use --debug to be able to see the zones. Use --debug2 and --debug3 to write the texts in the order they are fed to the sequence matcher into temporary files (e.g., /tmp/page1.txt, etc.). The text reordering is done by the TextItems::columnZoneYxOrder() method in the textitem.cpp file: suggestions for improvement are welcome! (Note that when using --debug3 coordinates are output in y, x order.)
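
As a rough illustration of the command line usage quoted above, the following Python sketch builds the argument list for starting DiffPDF on two files in a chosen mode. It rests on a few assumptions: the executable is reachable on the PATH under the name `diffpdf`, and the helper names `diffpdf_command` and `launch` are made up for this example. Only the short options -a and -c mentioned above are used, placed before the file names.

```python
import subprocess

# Mode names are this sketch's own; the flags come from the usage text above.
MODE_FLAGS = {
    "words": [],            # default mode, no option needed
    "appearance": ["-a"],
    "characters": ["-c"],
}


def diffpdf_command(file1, file2, mode="words", exe="diffpdf"):
    """Return the argument list that starts DiffPDF on two PDF files."""
    if mode not in MODE_FLAGS:
        raise ValueError(f"unknown mode: {mode!r}")
    # The mode option precedes the two file names.
    return [exe, *MODE_FLAGS[mode], file1, file2]


def launch(file1, file2, mode="words", exe="diffpdf"):
    """Start the DiffPDF GUI and return immediately (assumes exe is on PATH)."""
    return subprocess.Popen(diffpdf_command(file1, file2, mode, exe))


cmd = diffpdf_command("boson1.pdf", "boson2.pdf", mode="appearance")
print(cmd)
```

Calling `launch(...)` for each pair of files would open one DiffPDF window per pair; since DiffPDF is a GUI program, this does not produce a pass/fail result for scripts.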

  • By Guest, January 10, 2012 @ 6:09 am

    I apologize, I should’ve been clearer when I wrote that. I only need to check each PDF once unless there are changes, in which case I view the changes and try to match it up to an existing PDF. Although it might be possible to automate the process, there’s really no reason to considering we’re not a big company and prefer “late and perfect” over “on-time and nonfunctional”. If I’m comparing two PDF’s and notice the difference between newly checked and pre-existing is, for example, Washington over Colorado respectively, I know to check Washington-based locations for a match and ignore the rest. Other times it’s legal terminology like “civil union” that I know is only used in certain areas, so I ignore everywhere else. It doesn’t really make sense to check one PDF with every other variant if you know after checking against one that the vast majority of them aren’t going to match anyways, hence why it’d be nice to remember directories per side.

    Just so you know, working with PDFs like this is something I only have to do every once in a while, as it’s only when they apply to multiple locations that I need to compare them against each other. Usually we only update one location at a time, or the update applies to a universal document.

  • By Photon, January 18, 2012 @ 8:28 pm

    I closed the window with the zoning controls – how can I get it back? Any chance to have a full screen view of only one of the files instead the two-pane view in a future version? (Nice tool, anyway)

