Question

Lucian Sasu

Asked: 2010-11-17 07:55:29 +0800 CST

Create pdf from HTML book

772

There are some sites that provide books as HTML pages (e.g., legal stuff).

What can I use to create a PDF book from these pages, based on the already-existing structure?

In Windows there is Adobe Professional (commercial software). I'm guessing that Linux has something free? A solution involving scripting would be OK for me.

8 Answers

Oli · Answer 1 · 2010-11-17T08:08:02+08:00

Calibre is a pretty powerful tool for converting things into ebooks in various formats. Available in a Software Centre near you!

Don't be deceived by its less-than-beautiful UI; it can do a lot.
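Calibre also ships a command-line converter, ebook-convert, so the conversion can be scripted. The following is only a minimal sketch, assuming a local copy of the book with an entry page such as index.html. The function name html_book_to_pdf, the EBOOK_CONVERT variable, and the file names are hypothetical and not part of Calibre.

```shell
# Convert a local HTML file to PDF with Calibre's ebook-convert.
# EBOOK_CONVERT is a hypothetical override hook for this sketch;
# by default the real ebook-convert command is used.
html_book_to_pdf() {
    input=$1
    output=${2:-${input%.html}.pdf}   # default: same name with a .pdf extension
    if [ ! -f "$input" ]; then
        echo "no such file: $input" >&2
        return 1
    fi
    "${EBOOK_CONVERT:-ebook-convert}" "$input" "$output"
}

# Example (file names are placeholders):
#   html_book_to_pdf index.html book.pdf
```

Wrapping the call in a function keeps the default output name and the missing-file check in one place; calling ebook-convert directly with an input and an output file does the same job.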

9

Jacob Peddicord · Answer 2 · 2010-11-17T08:03:22+08:00

The easiest way? File > Print from your browser. Select Print to File as your printer, and it will ask you where you want to save the output. Be sure to select PDF as the format. Hit "Print" and the page will be saved to your drive instead of being printed.

4

Sabacon · Answer 3 · 2010-11-17T08:26:51+08:00

HTMLDOC can be useful; see http://www.htmldoc.org/. It is available from the Software Center. Sadly, the 1.8 version has a problem with Unicode-encoded files, but on many occasions it can still be a saviour. The problem is fixed in the 1.9 development version.

I usually use the wonderful ScrapBook extension for Firefox (http://amb.vis.ne.jp/mozilla/scrapbook/) to capture the web pages, use the editing tools in ScrapBook to fix them up if needed, and then use htmldoc to convert all the pages to PDF.
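The last step of that workflow could be scripted roughly as follows. This is a sketch, assuming the captured pages have been exported as plain .html files. The function name pages_to_pdf_book, the HTMLDOC variable, and the file names are hypothetical; --book and -f are htmldoc's book mode and output-file options.

```shell
# Combine several saved HTML pages into one PDF book with htmldoc.
# Pages are passed in reading order.
# HTMLDOC is a hypothetical override hook for this sketch.
pages_to_pdf_book() {
    output=$1
    shift
    if [ "$#" -eq 0 ]; then
        echo "usage: pages_to_pdf_book OUTPUT.pdf PAGE.html..." >&2
        return 2
    fi
    # --book structures the output by headings and adds a table of contents;
    # -f names the output file.
    "${HTMLDOC:-htmldoc}" --book -f "$output" "$@"
}

# Example (file names are placeholders):
#   pages_to_pdf_book book.pdf chapter1.html chapter2.html
```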

4

SiliconChaos · Answer 4 · 2010-11-17T09:49:09+08:00

I would recommend using OpenOffice/LibreOffice to create the PDF. As a test, I downloaded the Wget manual (all on one page), opened the HTML page in OpenOffice, and clicked the "Export Directly to PDF" button. It created the PDF with an index from the table of contents.

In the past I've found this to be the easiest way to convert HTML pages to PDF. It also allows you to make changes without much effort.

Screenshots:

Wget manual exported to PDF using Open Office
Export Directly to PDF option in Open Office

3

Nichod · Answer 5 · 2010-11-17T11:35:50+08:00

You could try http://www.xhtml2pdf.com/. It's a converter from HTML/XHTML and CSS to PDF, written entirely in Python.

3

frabjous · Answer 6 · 2010-11-17T12:36:14+08:00

I've actually voted for the calibre solution. But here's another you could try. Install AbiWord. It can do conversions between any formats it knows from the command line. To convert all the .html files in a folder to .pdf you could do:

for file in *.html ; do abiword --to=pdf "$file" ; done
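A slightly more defensive variant of the same loop, as a sketch: it skips the unexpanded pattern when a folder contains no .html files and reports files that fail to convert. The function name convert_dir_to_pdf and the ABIWORD variable are hypothetical; the --to=pdf option is the one used above.

```shell
# Convert every .html file in a directory to PDF with AbiWord.
# ABIWORD is a hypothetical override hook for this sketch.
convert_dir_to_pdf() {
    dir=${1:-.}
    count=0
    for file in "$dir"/*.html; do
        # With no matches the pattern stays unexpanded; skip it.
        [ -e "$file" ] || continue
        "${ABIWORD:-abiword}" --to=pdf "$file" || echo "failed: $file" >&2
        count=$((count + 1))
    done
    echo "processed $count file(s)"
}

# Example: convert the files in the current folder.
#   convert_dir_to_pdf .
```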

For higher-level typography (but arguably more complicated), another option would be PrinceXML.

2

loevborg · Answer 7 · 2011-01-08T07:18:58+08:00

Depending on the HTML document to be printed, you might get the best results using pandoc, one of the most versatile HTML-to-LaTeX converters. The resulting .tex file can be turned into a PDF quite easily using xelatex or pdflatex. Lots of options are available if you are willing to delve into LaTeX syntax and packages. This may not work well if embedded images and fancy HTML styles need to be preserved.
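The two steps could look roughly like this. It is a sketch, assuming pandoc and a LaTeX distribution are installed. The function name html_to_pdf_via_latex and the PANDOC and LATEX_ENGINE variables are hypothetical; the options passed to pandoc and xelatex are standard ones.

```shell
# Convert an HTML page to PDF via LaTeX: pandoc writes a standalone .tex
# file, then a LaTeX engine (xelatex by default) compiles it.
# PANDOC and LATEX_ENGINE are hypothetical override hooks for this sketch.
html_to_pdf_via_latex() {
    input=$1
    tex=${input%.html}.tex
    # -s produces a complete document (preamble included), not a fragment.
    "${PANDOC:-pandoc}" -s -f html -t latex -o "$tex" "$input" || return 1
    # Write the PDF and auxiliary files next to the .tex file.
    "${LATEX_ENGINE:-xelatex}" -interaction=nonstopmode \
        -output-directory="$(dirname "$tex")" "$tex"
}

# Example (file name is a placeholder):
#   html_to_pdf_via_latex manual.html
```

Keeping the intermediate .tex file around makes it possible to edit the LaTeX by hand before compiling, which is where the extra typographic control comes from.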

2

Geppettvs D'Constanzo · Answer 8 · 2011-04-16T09:52:44+08:00

In Google Chrome, you can create a PDF file of a whole site by using an extension. I personally use the Web2PDF Converter extension, which makes a PDF in just one click.

(Screenshot of this plugin, provided by the Google extensions web store site.)

Additionally, you can see a PDF I created with this tool by downloading the following file (right-click, Save target as): http://geppettvs.servehttp.com/resources/askubuntu-com.pdf (some browsers, like Google Chrome, may allow you to view it online).

And if you wish to edit the PDFs created by the extension, in order to remove the digital signature it places at the bottom of each page or to remove anything else, take a look at this: Remove text information from a PDF?

Good luck!

1
