2012 October 15
The perennial problem: you have a PDF of a paper, and you want to extract one of its figures to show in a talk or something.
If you’re a normal person, you open it up big in your PDF viewer, take a screenshot, and call it a day. Or your PDF reader just lets you draw a box and save its contents.
But maybe you’re running Linux and don’t have commercial software available; and maybe you want to make a vector screenshot, not a bitmapped one, so that full detail is preserved and you can rescale the figure freely. Evince doesn’t support draw-a-box-and-save-it, so you’re sunk.
But you’re not! You’ll need xpdf, pdftocairo, and pdfcrop, all widely available. You can draw a box on the page in xpdf, and you can also set up xpdf to run a command when you press a key, with the box coordinates substituted into the command line. Then the pdftocairo utility provided by poppler-utils can rerender the page to PDF while cropping to your selected box, and pdfcrop makes the margins nice.
You need to put this short helper script in your $PATH. It takes the coordinates from xpdf, converts them to the system needed by pdftocairo, then chooses an output filename and makes the magic happen. There’s nothing complex about it and it can easily be monkeyed with to fit your preferences.
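To give a feel for what such a helper involves, here is a minimal sketch. It is not the original script. It assumes that xpdf reports the selection in PDF points with the origin at the bottom-left of the page, while pdftocairo measures its crop box in whole points from the top-left, so the y coordinate has to be flipped using the page height from pdfinfo. The function names and the NAME-extract-N.pdf naming scheme are made up for illustration.

```shell
#!/bin/sh
# Sketch of an xpdf-extract-helper.sh; an illustration, not the original script.
# Usage: xpdf-extract-helper.sh FILE PAGE X1 Y1 X2 Y2

# Convert an xpdf selection box to pdftocairo crop arguments.
# Assumption: xpdf gives points with a bottom-left origin, and
# pdftocairo wants integer points measured from the top-left.
# Args: x1 y1 x2 y2 page_height. Prints "x y width height".
to_cairo_box() {
    awk -v x1="$1" -v y1="$2" -v x2="$3" -v y2="$4" -v h="$5" 'BEGIN {
        left = (x1 < x2) ? x1 : x2;   right = (x1 < x2) ? x2 : x1
        bottom = (y1 < y2) ? y1 : y2; top = (y1 < y2) ? y2 : y1
        # %d truncates, so add 0.5 to round the width and height.
        printf "%d %d %d %d\n", left, h - top, right - left + 0.5, top - bottom + 0.5
    }'
}

# Pick the first unused output name (hypothetical naming scheme).
next_output_name() {
    n=1
    while [ -e "$1-extract-$n.pdf" ]; do
        n=$((n + 1))
    done
    printf '%s\n' "$1-extract-$n.pdf"
}

# Only do the real work when called with the six arguments from xpdf.
if [ "$#" -eq 6 ]; then
    file=$1
    page=$2
    # pdfinfo prints a line like "Page    3 size: 612 x 792 pts (letter)".
    height=$(pdfinfo -f "$page" -l "$page" "$file" |
        awk '/^Page.*size:/ { print $6; exit }')
    set -- $(to_cairo_box "$3" "$4" "$5" "$6" "$height")
    out=$(next_output_name "$(basename "$file" .pdf)")
    raw="${out%.pdf}-raw.pdf"
    # Rerender the selected box as a vector PDF, then trim the margins.
    pdftocairo -pdf -f "$page" -l "$page" -x "$1" -y "$2" -W "$3" -H "$4" \
        "$file" "$raw" &&
        pdfcrop "$raw" "$out" &&
        rm -f "$raw"
fi
```

In this sketch the output lands in whatever directory xpdf was started from; changing that, or the naming scheme, only means editing next_output_name.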
Then put this line in your xpdf configuration file:
bind ctrl-e any "run(xpdf-extract-helper.sh '%f' %p %x %y %X %Y)"
Straightforward enough, I hope. I chose Control-E as the keybinding for “extract”.
Now just open up your PDFs in xpdf, draw boxes, and hit Control-E! Here’s a sample result … converted to a PNG so that it can render inline, so that in a way it misses the whole point of the exercise. But there’s a little PDF figure behind it, I swear.