Extracting PDF Figures as PDFs in Linux

The perennial problem: you have a PDF of a paper, and you want to extract one of its figures to show in a talk or something.

If you’re a normal person, you open it up big in your PDF viewer, take a screenshot, and call it a day. Or your PDF reader just lets you draw a box and save its contents.

But maybe you’re running Linux and don’t have commercial software available; and maybe you want a vector screenshot, not a bitmapped one, so that full detail is preserved and you can rescale the figure freely. Evince doesn’t support draw-a-box-and-save-it, so you’re sunk.

But you’re not! You’ll need xpdf, poppler-utils, and pdfcrop, all widely available. You can draw a box on the page in xpdf, and you can also set up xpdf to run a command when you press a key, with the box coordinates substituted into the command line. Then the pdftocairo utility provided by poppler-utils can re-render the page to PDF while cropping to your selected box. Finally, pdfcrop trims the margins so they look nice.

You need to put this short helper script in your $PATH. It takes the coordinates from xpdf, converts them to the system needed by pdftocairo, then chooses an output filename (fig${n}.pdf) and makes the magic happen. There’s nothing complex about it and it can easily be monkeyed with to fit your preferences.
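
For reference, here is a minimal sketch of what such a helper could look like. It is a reconstruction under assumptions, not necessarily identical to the script referred to above. In particular, I'm assuming that xpdf reports the selection corners in PDF points measured from the bottom-left of the page, that pdftocairo wants whole-number points measured from the top-left, and that pdfinfo prints a "Page N size: W x H pts" line when given -f and -l. The function names (box_to_crop, page_height, next_output_name) are made up for illustration.

```shell
#!/bin/sh
# xpdf-extract-helper.sh (sketch): meant to be called by xpdf as
#   xpdf-extract-helper.sh FILE PAGE X1 Y1 X2 Y2

# Convert two selection corners plus the page height into pdftocairo crop
# arguments, printed as "x y W H" in whole points.
# ASSUMPTION: inputs are non-negative PDF points with the origin at the
# bottom-left of the page; pdftocairo measures -x/-y from the top-left.
box_to_crop() {
    awk -v x1="$1" -v y1="$2" -v x2="$3" -v y2="$4" -v ph="$5" '
        function ceil(v) { return (v == int(v)) ? v : int(v) + 1 }
        BEGIN {
            x1 += 0; y1 += 0; x2 += 0; y2 += 0; ph += 0
            xmin = (x1 < x2) ? x1 : x2; xmax = (x1 < x2) ? x2 : x1
            ymin = (y1 < y2) ? y1 : y2; ymax = (y1 < y2) ? y2 : y1
            # Round outward so the box is never clipped.
            left = int(xmin); top = int(ph - ymax)
            right = ceil(xmax); bottom = ceil(ph - ymin)
            printf "%d %d %d %d\n", left, top, right - left, bottom - top
        }'
}

# Print the height in points of page $2 of file $1.
# ASSUMPTION: pdfinfo output contains "Page N size: W x H pts".
page_height() {
    pdfinfo -f "$2" -l "$2" "$1" | awk '/^Page.*size:/ { print $6; exit }'
}

# Print the first unused name of the form fig${n}.pdf in the current directory.
next_output_name() {
    n=1
    while [ -e "fig${n}.pdf" ]; do n=$((n + 1)); done
    echo "fig${n}.pdf"
}

main() {
    file=$1; page=$2
    ph=$(page_height "$file" "$page")
    [ -n "$ph" ] || { echo "could not determine page size" >&2; return 1; }
    # Split "x y W H" into the positional parameters $1..$4.
    set -- $(box_to_crop "$3" "$4" "$5" "$6" "$ph")
    out=$(next_output_name)
    tmp="${out%.pdf}-uncropped.pdf"
    # Re-render just the selected region as PDF, then trim the margins.
    pdftocairo -pdf -f "$page" -l "$page" -x "$1" -y "$2" -W "$3" -H "$4" \
        "$file" "$tmp" && pdfcrop "$tmp" "$out"
    rm -f "$tmp"
}

# Only do the extraction when invoked with the six arguments xpdf supplies.
if [ "$#" -eq 6 ]; then main "$@"; fi
```

You can try it by hand before wiring it into xpdf, for example with a hypothetical file and made-up coordinates: xpdf-extract-helper.sh paper.pdf 3 100 500 300 200. The output lands in whatever directory the script runs from. If the crop comes out vertically flipped relative to your selection, the coordinate-origin assumption is wrong for your xpdf version and the subtraction from the page height should be dropped.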

Then put this line in ~/.xpdfrc:

bind ctrl-e any "run(xpdf-extract-helper.sh '%f' %p %x %y %X %Y)"

Straightforward enough, I hope. I chose Control-E as the keybinding for “extract”.

Now just open up your PDFs in xpdf, draw boxes, and hit Control-E! Here’s a sample result … converted to a PNG so that it can render inline, which in a way misses the whole point of the exercise. But there’s a little PDF figure behind it, I swear.

McLean et al (2012 ApJ 746 23), fig 2. I assert without proof that I have a PDF version of this image.
