I have some .pdf files that I would like to convert to my preferred reading format of .cbr or .cbz or, if this isn't directly possible, I need to extract all pages from the .pdf as images and then compress them into my format of choice. I have only been able to save pages one at a time with Document Viewer. Obviously, I'd like to do it a little quicker. I have tried pdfsam, pdf shuffler, and pdfmod all with no luck. I am using Ubuntu 11.10.
OK well, I did some more research and although tohuwawohu's method does work, I found it easier to use a program called pdftoppm to achieve what I wanted done. Since I am pretty much a layperson when it comes to using command line apps, I will do my best to explain how I got this to work for me.
Navigate to the folder containing the .pdf you wish to edit and open a terminal there. I did this by using the sample command:
Let's say the file I want to edit is called Sample.pdf. What I want to do is use pdftoppm to create an image file of each page of the .pdf. Several formats can be chosen (see the man pages link above), but I prefer to use .png. The basic command looks like this:
or in the example above:
This command creates an image file of each page in the same folder as the original .pdf file, with names like Sample-01.png, Sample-02.png and so on. I have tried it successfully with the .png and .jpeg formats (the -png and -jpeg options). A -jpg option is apparently not supported.
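As a rough sketch of the step described above, the pdftoppm call can be wrapped in a small shell function. The name pdf_to_png is made up for illustration and is not part of poppler. The sketch assumes pdftoppm from poppler-utils is installed and uses the PDF's name without its extension as the output prefix.

```shell
# pdf_to_png is a hypothetical helper name, not part of poppler.
# General form of the underlying command:
#   pdftoppm -png <input.pdf> <output-prefix>
pdf_to_png() {
    local pdf="$1"
    local prefix="${pdf%.pdf}"    # Sample.pdf -> Sample
    # Writes one PNG per page (Sample-01.png, Sample-02.png, ...);
    # the number of digits depends on the page count.
    pdftoppm -png "$pdf" "$prefix"
}
```

From a terminal opened in the folder that holds the PDF, calling pdf_to_png Sample.pdf runs pdftoppm -png Sample.pdf Sample.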
Then I just use Archive Manager by selecting all the newly-created image files, right-clicking, and choosing "Compress" from the context menu. I then choose the archive format I prefer (in this case .cbz or Comic Book Zip) and create the new archive.
Now I have a shiny new .cbz file called Sample.cbz which I can then view with my Comix reader!
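For anyone who would rather stay in the terminal, the Archive Manager step can probably be reproduced with the zip command, since a .cbz file is an ordinary zip archive with a different extension. This is only a sketch: images_to_cbz is a hypothetical helper name, and it assumes the page images were created with the prefix used above (Sample-01.png and so on).

```shell
# images_to_cbz is a hypothetical helper name.
# A .cbz file is a regular zip archive with a .cbz extension.
images_to_cbz() {
    local prefix="$1"    # e.g. Sample, matching Sample-01.png, Sample-02.png, ...
    # The shell expands the glob in sorted order, so pages are added in sequence.
    zip "$prefix.cbz" "$prefix"-*.png
}
```

Calling images_to_cbz Sample in the folder with the images would produce Sample.cbz and leave the .png files in place.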
Hopefully what I have posted above makes enough sense that someone else can learn from it. If I need to change it in any way please let me know.
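I'm not very familiar with *.cbr / *.cbz, but it seems you'll have to combine two steps: (1) converting each page of the PDF into a separate image file, and (2) packing those images into an archive with the .cbz extension.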
Regarding step 1, you could use ImageMagick's convert command. You can feed convert a PDF comprising multiple pages, and convert will return each page as a single graphics file. I've tested it with a text scanned at 400 dpi, and the following command resulted in nice single JPEGs:
(credits regarding the -quality option: this forum entry)
As a result, you get 000.jpeg, 001.jpeg and so on. Just zip them into a .cbz file, and you're done.
You could even combine both steps by "concatenating" them:
(make sure that there aren't any other JPEGs in your current working directory, since using the code above, zip will move all JPEGs into the cbz file)
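The two steps described in this answer might look roughly like the sketch below. The option values are assumptions chosen to match the description (a 400 dpi scan, numbered .jpeg output, zip moving the images into the archive) and are not necessarily the exact command that was used. pdf_to_cbz_im is a hypothetical helper name.

```shell
# pdf_to_cbz_im is a hypothetical helper name; the option values below are
# assumptions, not necessarily the exact command from the answer.
pdf_to_cbz_im() {
    local pdf="$1"
    local cbz="${pdf%.pdf}.cbz"
    # -density sets the rasterization resolution and must come before the input.
    # %03d numbers the output pages as 000.jpeg, 001.jpeg, ...
    convert -density 400 "$pdf" -quality 100 '%03d.jpeg' || return 1
    # -m moves the files into the archive, deleting the originals,
    # so every .jpeg in the current directory ends up in the .cbz.
    zip -m "$cbz" ./*.jpeg
}
```

Running pdf_to_cbz_im Sample.pdf in an otherwise JPEG-free directory would leave only Sample.pdf and Sample.cbz behind.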
I have written a simple bash script for exactly this purpose. You will need poppler installed, so:
Here is the bash script (save it as convert_to_cbz.sh):
To use the bash script:
Hopefully this will be useful for someone!
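For readers who want a starting point, here is a minimal sketch of a script along these lines. It is not the script from this answer, just one plausible way to do it. It assumes pdftoppm (Ubuntu package poppler-utils) and zip are installed, and the function name pdf_to_cbz is hypothetical.

```shell
#!/bin/bash
# Sketch of a PDF-to-CBZ converter; not the original convert_to_cbz.sh.
# Assumes pdftoppm (Ubuntu package poppler-utils) and zip are installed:
#   sudo apt-get install poppler-utils zip

# pdf_to_cbz is a hypothetical helper name.
pdf_to_cbz() {
    local pdf="$1"
    local name tmp status
    name=$(basename "$pdf" .pdf)      # /path/Sample.pdf -> Sample
    tmp=$(mktemp -d) || return 1      # scratch directory for the page images
    # Render each page as a JPEG, then pack the pages into <name>.cbz
    # in the current directory. -j stores files without their directory path.
    pdftoppm -jpeg "$pdf" "$tmp/page" && zip -j "$name.cbz" "$tmp"/page-*
    status=$?
    rm -rf "$tmp"                     # clean up the temporary images
    return "$status"
}

# Convert every PDF named on the command line.
for pdf in "$@"; do
    pdf_to_cbz "$pdf"
done
```

Saved as convert_to_cbz.sh and made executable with chmod +x convert_to_cbz.sh, it could be run as ./convert_to_cbz.sh Sample.pdf, which would leave Sample.cbz in the current directory.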
Try using calibre to directly convert the .pdf to .cbr or .cbz.
It seems that the easiest way is using Acrobat Pro. File → Export → Image → JPEG will export each page as a single JPG.
If you prefer a CBR file, rar the folder instead of zipping it, then change the extension from .rar to .cbr.
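On Linux, the packing step for a folder of exported JPGs could be sketched as below. pack_comic is a hypothetical helper name, and the sketch assumes the zip and rar command-line tools are installed (rar is a separate, non-free package). Writing the archive directly with a .cbz or .cbr name avoids the rename step.

```shell
# pack_comic is a hypothetical helper name.
# Usage: pack_comic <folder-of-images> <cbz|cbr>
pack_comic() {
    local dir="${1%/}"     # strip a trailing slash, if any
    local format="$2"
    case "$format" in
        # zip -j: store the images without the folder path
        cbz) zip -j "$dir.cbz" "$dir"/* ;;
        # rar a: add files to an archive; -ep: exclude paths from names
        cbr) rar a -ep "$dir.cbr" "$dir"/* ;;
        *)   echo "unknown format: $format" >&2; return 1 ;;
    esac
}
```

For example, pack_comic Export cbz would create Export.cbz from the images in a folder named Export, and pack_comic Export cbr would create Export.cbr.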