I have many JPEG files in a directory, and I want to convert them to PDF and concatenate them together to make a single document.
How can this be done?
I would prefer using the command line, as this process will be faster.
From the imagemagick package, use the convert command. You will get a single PDF containing all the JPG files in the current folder. The option -auto-orient reads each image's EXIF data to rotate the image. Install ImageMagick from the imagemagick package if it is not already present.
Sources: Stack Overflow, ImageMagick options.
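The command line itself seems to have been lost from this answer. Here is a minimal sketch of what such an invocation could look like. The helper name merge_jpgs and the output name pictures.pdf are placeholders of mine, not taken from the answer.

```shell
# Sketch: merge every .jpg in the current directory into one PDF with ImageMagick.
# "pictures.pdf" and the function name are illustrative assumptions.
# On Ubuntu, ImageMagick can be installed with: sudo apt install imagemagick
merge_jpgs() {
    out=${1:-pictures.pdf}
    if ! command -v convert >/dev/null 2>&1; then
        echo "convert not found; install the imagemagick package" >&2
        return 127
    fi
    # -auto-orient rotates each image according to its EXIF orientation tag
    convert ./*.jpg -auto-orient "$out"
}
```

Call it as `merge_jpgs pictures.pdf` from the folder that holds the images.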
Edit: Note that the images will not come out in any particular order unless they are numbered. If you have 10 or more, name them with a zero-padded suffix, filename01.jpg ... filename99.jpg; the leading zeros are required for proper ordering. If you have 100 or more, use 001 ... 999.
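If the files are already named without leading zeros, renaming them by hand gets tedious. A rough sketch of a helper that pads the numbers, assuming names of the form page<number>.jpg (the "page" prefix, the width of 3, and the function name pad_names are all illustrative):

```shell
# Sketch: zero-pad numeric suffixes so a glob such as page*.jpg sorts numerically.
# The "page" prefix and the width of 3 digits are assumptions for illustration.
pad_names() {
    dir=${1:-.}
    for f in "$dir"/page*.jpg; do
        [ -e "$f" ] || continue                  # the glob matched nothing
        n=${f##*/page}                           # strip directory and prefix
        n=${n%.jpg}                              # strip extension
        case $n in
            ''|*[!0-9]*) continue ;;             # skip suffixes that are not purely digits
        esac
        n=$(printf '%s' "$n" | sed 's/^0*//')    # drop leading zeros (avoids octal parsing)
        new=$dir/$(printf 'page%03d.jpg' "${n:-0}")
        # skip files that are already padded, and never overwrite an existing file
        [ -e "$new" ] || mv -- "$f" "$new"
    done
}
```

After `pad_names .`, a file such as page7.jpg becomes page007.jpg, so *.jpg expands in page order.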
Unfortunately, convert changes the image quality before "packing" the image into the PDF. To keep the loss of quality minimal, it is better to put the original .jpg (this works with .png too) into the PDF, and for that you need img2pdf.
I use these commands:
A shorter one-liner solution, also using img2pdf, as suggested in the comments:
Make the PDF:
img2pdf *.jp* --output combined.pdf
(optional) OCR the output PDF
ocrmypdf combined.pdf combined_ocr.pdf
Below are the commands from the original answer, which take more steps and need more tools:
This command makes a pdf file out of every jpg image without loss of either resolution or quality:
ls -1 ./*jpg | xargs -L1 -I {} img2pdf {} -o {}.pdf
This command will concatenate the pdf pages into one document:
pdftk *.pdf cat output combined.pdf
And finally, I add an OCRed text layer that doesn't change the quality of the scan in the pdfs so that they can be searchable:
pypdfocr combined.pdf
An alternative to using pypdfocr:
Worked for me (BUT warning: the +compress option turns off compression, and the resulting PDF will be big!):
or even:
From ubuntuforums.org: the +compress option helps it not to hang. NOTE: +compress turns off compression. The machine I was working on at the time seemed to hang forever without the +compress option (I did not wait forever to find out, though). Your mileage may vary quite a bit! Read the documentation for the -compress option on imagemagick.org, and if you have slow compression or hanging problems, maybe experiment with -compress <type> to find out what works for you.
I'm surprised nobody pointed out pdfjam, which is a super efficient way to merge images/PDFs into a PDF:
Run on your .jpg files, pdfjam creates a PDF in A4 format containing all of them, usually named with -pdfjam.pdf at the end. To force a specific output name, use the --outfile <your output> option. As far as I can see, there is no re-encoding of the file, which makes the command pretty fast compared to convert.
To install pdfjam, I'm not sure what the most efficient way is (it comes automatically with LaTeX), but you can try your distribution's pdfjam package, or maybe texlive-extra-utils, which ships it on Ubuntu.
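The pdfjam command line from this answer did not survive. Here is a sketch of how it might be invoked, using the --outfile option mentioned above. The helper name jam_jpgs and the output name combined.pdf are my own placeholders.

```shell
# Sketch: merge all .jpg files in the current directory into one PDF with pdfjam.
# "combined.pdf" and the function name are illustrative assumptions.
jam_jpgs() {
    out=${1:-combined.pdf}
    if ! command -v pdfjam >/dev/null 2>&1; then
        echo "pdfjam not found; it ships with TeX Live" >&2
        return 127
    fi
    # --outfile sets the output name instead of the default *-pdfjam.pdf
    pdfjam --outfile "$out" ./*.jpg
}
```

Call it as `jam_jpgs combined.pdf` in the image folder.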
Open the jpg or png file with LibreOffice Writer and export it as PDF.
I hope this is a simple way to export to PDF.
The following solution also relies on ImageMagick's convert but is a bit more sophisticated because:
pdfimages -j file.pdf img.) At the moment, this only works with PNG – see the comment by @dma_k below.
Instructions:
Concatenate all your one-page PDF files with PDFtk as follows:
Although convert does the job, it tries to open all the source files together, and if you have a lot of files and do not have a huge amount of RAM, you can run out of it.
So as an alternative you can run the following commands in a terminal while being in the folder where the jpg files are.
This converts each image to a single page pdf, one by one, without overloading the system. Then:
This merges the pdfs into a single pdf and deletes the single page ones.
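The two commands are missing here, so this is a sketch of the approach as described: convert one image at a time, then merge with pdftk and delete the single pages. It is not the author's exact commands. The names run, jpgs_to_pdf, DRY_RUN and merged.pdf are hypothetical, and unlike the description the single-page PDFs go into a temporary directory so that other PDFs in the folder are not swept into the merge.

```shell
# Sketch: convert each JPEG to its own one-page PDF, then merge them with pdftk.
# Set DRY_RUN=1 to print the commands instead of running them.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

jpgs_to_pdf() {
    out=${1:-merged.pdf}            # placeholder default output name
    tmp=$(mktemp -d) || return 1    # single-page PDFs are kept here
    set --                          # reuse "$@" as the ordered list of page PDFs
    for img in ./*.jpg; do
        [ -e "$img" ] || continue   # the glob matched nothing
        page=$tmp/$(basename "$img" .jpg).pdf
        # one convert process per image keeps memory use low
        run convert "$img" "$page" || return 1
        set -- "$@" "$page"
    done
    [ "$#" -gt 0 ] || { echo "no .jpg files found" >&2; return 1; }
    # merge in glob order, then delete the single-page PDFs
    run pdftk "$@" cat output "$out" && rm -rf "$tmp"
}
```

Running `DRY_RUN=1 jpgs_to_pdf merged.pdf` first shows the commands without touching anything.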
You can do that using img2pdf. But sometimes you need the images placed in the document in a particular order: by timestamp, by size, or by name. This one-liner makes that possible:
ls -trQ | tr '\n' ' ' | sed 's/$/\ --output\ mydoc.pdf/' | xargs img2pdf
In place of mydoc.pdf, enter whatever output file name you wish.
Options of the ls command (instead of -tr, use these as needed):
-S, sort by file size, largest first
-t, sort by modification time, newest first
-X, sort alphabetically by entry extension
-r, reverse order while sorting
It is cumbersome, but the file size can explode. To avoid that, you can follow these steps:
a) First, export the *.jpeg files to *.jpg files with GIMP. (Note that .jpeg and .jpg are two extensions for the same format, so this step re-encodes the image rather than changing its type.) Each jpg file would need a small white or black passe-partout (a frame).
b) On Android, I use the app "photocompress" to compress the jpg files to under 300 KB each.
c) Then, back on the Ubuntu desktop, you can edit these files with LibreOffice and create a PDF document from them.
Surely somebody knows how to do a) to c) simply in the terminal?
One side effect: because of the byte size, a recipient using Microsoft software may end up with poster-sized pages, but that is not your fault.