I have an important PDF from which I need to extract the source image as losslessly as possible (e.g. as PNG). For some reason, the source image seems to be made up of 227 image stripes, and when I extract these, e.g. with
pdfimages -png name.pdf out-
I get the 227 tiny stripes. That is not what I want. Is there a way to get one single image instead? pdfimages -list reports the details of the stripes, and the above pdfimages -png name.pdf out- produces the 227 individual images. One image is e.g. 1604 px wide and 5 px high. As far as I have checked, all images are 5 px high, so the 227 stripes should combine into one single image of 1604 x 1135 px.
Update: I forgot to add that what Ryan J. Yoder wrote below was also my own thought on the issue, namely that the PDF was indeed created by splitting the original image into 227 stripes.
So, if that is the case (pdfimages -list suggests it is), is there a way to automatically create one single image out of the stripes, e.g. by using GraphicsMagick?
Ghostscript can be used to get images of the pages as they appear in a viewer, e.g. for .png images with 300 dpi named out_001.png, out_002.png, … from in.pdf:
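A minimal sketch of such a Ghostscript call, assuming the png16m device (24-bit colour PNG; the device is a guess, and pngalpha or pnggray would be used the same way). The helper name render_pages is hypothetical. Note that this rasterises each page at the chosen resolution rather than extracting the embedded image data, so it is not a lossless extraction.

```shell
# render_pages is a hypothetical helper name, not part of Ghostscript.
# Assumption: png16m (24-bit RGB PNG) is an acceptable output device.
render_pages() {
    # $1 = input PDF
    # $2 = output pattern; %03d is replaced by the page number (001, 002, ...)
    gs -dNOPAUSE -dBATCH -sDEVICE=png16m -r300 -sOutputFile="$2" "$1"
}

# Example call, using the file names from the text:
# render_pages in.pdf 'out_%03d.png'
```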
You could use ImageMagick to 'convert' the PDF to a PNG using the command line:
convert -density 300 page.pdf page.png
or whatever density (DPI) you'd like.
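The approaches above render the whole page at a chosen resolution. To join the stripes that pdfimages already extracted, without resampling them, one option is the -append operator of GraphicsMagick or ImageMagick, which stacks images top to bottom. The following is a sketch under some assumptions: stitch_stripes is a hypothetical helper name, the stripes all share the same width, and the numbering that pdfimages assigns (out-000.png, out-001.png, ...) matches their top-to-bottom order on the page, which is worth checking on the result.

```shell
# stitch_stripes is a hypothetical helper name.
# Usage: stitch_stripes OUTPUT STRIPE...
# Assumption: the stripes are passed in top-to-bottom order.
stitch_stripes() {
    out=$1
    shift
    # -append stacks the inputs vertically (+append would place them side by side).
    if command -v gm >/dev/null 2>&1; then
        gm convert "$@" -append "$out"
    else
        convert "$@" -append "$out"
    fi
}

# Example call, using the names from the question:
# pdfimages -png name.pdf out-
# stitch_stripes combined.png out-*.png
```

With 227 stripes of 1604 x 5 px, the combined image should come out as 1604 x 1135 px; gm identify combined.png (or identify combined.png) reports the actual size.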