Creating PDF files from HTML source code using Python
Many websites do not let their visitors download page content as a PDF file. Some ask users to purchase a premium version of the service for this feature, while others state that they do not offer PDF downloads at all.
Python, however, offers the PDFKit module, which allows for the creation of PDF files. In this tutorial, we will learn how to convert HTML to PDF with the help of the PDFKit module in Python.
Understanding the Python PDFKit module
A PDF file can be created in Python in a number of ways, and PDFKit is regarded as one of the most convenient. PDFKit is a wrapper for the wkhtmltopdf utility, which uses WebKit to convert HTML into PDF. It can render HTML that contains a variety of image formats, HTML forms, and other complex printable documents.
How to Install the Python PDFKit Module?
To install Python modules, we need pip, the package manager that installs Python packages from public repositories such as PyPI. Once pip is available, we can install the PDFKit module by entering the following command in a Windows command prompt (CMD) or a terminal:
Syntax:
$ pip install pdfkit
Next, we need to install the prerequisite of the PDFKit module, which is the wkhtmltopdf program.
For Ubuntu/Debian:
Enter the command shown below:
Command:
$ sudo apt-get install wkhtmltopdf
For Microsoft Windows:
Step 1: Go to https://wkhtmltopdf.org/downloads.html and download the wkhtmltopdf utility to your computer.
Step 2: Add the folder containing the wkhtmltopdf binary files to the PATH environment variable.
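Once PATH has been updated, one quick way to confirm that the wkhtmltopdf binary is reachable is to ask Python where it would find it. The following is a minimal sketch that uses only the standard library; the helper name find_wkhtmltopdf is hypothetical and is not part of PDFKit.

```python
import shutil


def find_wkhtmltopdf():
    """Return the full path of the wkhtmltopdf executable on PATH, or None."""
    # shutil.which searches the directories listed in the PATH variable
    return shutil.which("wkhtmltopdf")


if __name__ == "__main__":
    path = find_wkhtmltopdf()
    if path is None:
        print("wkhtmltopdf was not found on PATH")
    else:
        print("wkhtmltopdf found at:", path)
```

If the helper reports that nothing was found, the PATH entry from Step 2 is the first thing to re-check. On Windows, a terminal opened before the change usually has to be restarted to pick up the new value.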
Verifying the Installation
Once the module has been installed, we can test it by creating a new, empty Python file and adding the following import line to it:
File: verify.py
import pdfkit
Now, save the file, open a terminal, and run the file with the following command:
Syntax:
$ python verify.py
If the program runs without any errors, the module was installed successfully. If an error is raised, try reinstalling the module. It is also advisable to consult the module's official documentation.
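The same check can be made without triggering an ImportError. The sketch below is an illustration rather than part of PDFKit: it uses the standard library's importlib to ask whether a module can be found, and the helper name is_module_available is an assumption.

```python
import importlib.util


def is_module_available(name):
    """Return True if a top-level module called `name` can be located."""
    # find_spec returns None when no such top-level module exists
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    if is_module_available("pdfkit"):
        print("pdfkit is installed")
    else:
        print("pdfkit is missing; try: pip install pdfkit")
```

Note that this only confirms the Python package is present. It does not check that the wkhtmltopdf program itself is installed.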
Let us now get started with PDFKit in Python.
Working with PDFKit in Python
This section walks through several examples of converting HTML to PDF with the PDFKit module in Python.
Programmers can use the from_file() function of the PDFKit module to convert a local HTML file into a PDF file. Let us consider the following example:
Example:
# importing the required module
import pdfkit

# configuring pdfkit to point to our installation of wkhtmltopdf
config = pdfkit.configuration(wkhtmltopdf=r"C:\Program Files\wkhtmltopdf\bin\wkhtmltopdf.exe")

# converting html file to pdf file
pdfkit.from_file('sample.html', 'output.pdf', configuration=config)
Explanation:
In the above code, we have imported the required module. We have then used the configuration() function to create a variable called config, which holds the path of the wkhtmltopdf executable. Finally, we have called the from_file() function of the PDFKit module, passing the path of the HTML file, the location where the PDF file should be saved, and the configuration as its arguments.
As a result, the local HTML file is converted into a PDF file.
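The hard-coded Windows path in the example above only works on machines where wkhtmltopdf is installed in that exact folder. The following sketch shows one possible way to make the conversion more portable: it checks that the input file exists and, when no explicit path is given, looks for the binary on PATH. The helper html_file_to_pdf and its parameters are hypothetical; only pdfkit.configuration() and pdfkit.from_file() come from PDFKit.

```python
import os
import shutil


def html_file_to_pdf(source_path, output_path, wkhtmltopdf_path=None):
    """Convert a local HTML file to PDF and return the output path."""
    # Fail early with a clear message if the input file is missing
    if not os.path.isfile(source_path):
        raise FileNotFoundError(f"HTML file not found: {source_path}")

    # Imported here so the input check above runs even without pdfkit installed
    import pdfkit

    # Use the explicit path if given, otherwise whatever is found on PATH
    binary = wkhtmltopdf_path or shutil.which("wkhtmltopdf")
    if binary is None:
        raise RuntimeError("wkhtmltopdf executable not found")

    config = pdfkit.configuration(wkhtmltopdf=binary)
    pdfkit.from_file(source_path, output_path, configuration=config)
    return output_path
```

A call such as html_file_to_pdf('sample.html', 'output.pdf') would then behave like the example above on any system where wkhtmltopdf is on PATH.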
The from_url() function of the PDFKit module lets us convert the web page at a given URL into a PDF file. Let us consider the following example:
Example:
# importing the required module
import pdfkit

# configuring pdfkit to point to our installation of wkhtmltopdf
config = pdfkit.configuration(wkhtmltopdf=r"C:\Program Files\wkhtmltopdf\bin\wkhtmltopdf.exe")

# converting url to pdf file
pdfkit.from_url('https://www.javatpoint.com/tic-tac-toe-in-python', 'output2.pdf', configuration=config)
Explanation:
In the above code, we have imported the required module. We have then used the configuration() function to create a variable called config, which holds the path of the wkhtmltopdf executable. Finally, we have called the from_url() function of the PDFKit module, passing the URL, the location where the PDF file should be saved, and the configuration as its arguments.
As a result, the web page at the URL is converted into a PDF file.
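Converting a URL can fail for reasons outside the script, such as a mistyped address or a network problem. The sketch below shows one way to guard the call. It assumes that PDFKit reports a missing binary or a failed wkhtmltopdf run as an OSError (IOError), which should be checked against the installed version; the helper url_to_pdf is hypothetical.

```python
from urllib.parse import urlparse


def url_to_pdf(url, output_path, wkhtmltopdf_path):
    """Render the page at `url` to a PDF; return True on success, False on failure."""
    # Reject anything that is not a complete http(s) address
    parts = urlparse(url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"Not an http(s) URL: {url!r}")

    # Imported lazily so the URL check works even without pdfkit installed
    import pdfkit

    try:
        config = pdfkit.configuration(wkhtmltopdf=wkhtmltopdf_path)
        pdfkit.from_url(url, output_path, configuration=config)
    except OSError as err:
        # Assumption: pdfkit surfaces wkhtmltopdf problems as OSError/IOError
        print("Conversion failed:", err)
        return False
    return True
```

Returning a boolean keeps the calling code simple, for example when looping over a list of pages and collecting the ones that could not be converted.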
In addition, the PDFKit module provides the from_string() function, which lets programmers store an HTML string in a PDF file. Consider the following example:
Example:
# importing the required module
import pdfkit

# configuring pdfkit to point to our installation of wkhtmltopdf
config = pdfkit.configuration(wkhtmltopdf=r"C:\Program Files\wkhtmltopdf\bin\wkhtmltopdf.exe")

my_string = """
<html>
<head>
</head>
<body>
<div class="container">
<h1>STRING TO PDF</h1>
<p>This is my sample string that I want to store in a PDF file</p>
</div>
</body>
</html>
"""

# storing string to pdf file
pdfkit.from_string(my_string, 'output3.pdf', configuration=config)
Explanation:
In the above code, we have imported the required module. We have then used the configuration() function to create a variable called config, which holds the path of the wkhtmltopdf executable. Finally, we have called the from_string() function of the PDFKit module, passing the string variable, the location where the PDF file should be saved, and the configuration as its arguments.
As a result, the string is stored in a PDF file.
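When the HTML string is assembled from variable text rather than written by hand, characters such as < and & need to be escaped before the string is passed to from_string(). The following is a minimal sketch using only the standard library; the template and the helper name build_page are assumptions made for illustration.

```python
import html
from string import Template

# A small page template; $title and $body are filled in by build_page
PAGE_TEMPLATE = Template("""<html>
<head><meta charset="utf-8"><title>$title</title></head>
<body>
<div class="container">
<h1>$title</h1>
<p>$body</p>
</div>
</body>
</html>
""")


def build_page(title, body):
    """Return an HTML document with `title` and `body` safely escaped."""
    return PAGE_TEMPLATE.substitute(
        title=html.escape(title),
        body=html.escape(body),
    )


if __name__ == "__main__":
    page = build_page("STRING TO PDF", "Prices < 5 & > 2")
    print(page)
    # With PDFKit installed and `config` defined as in the example above:
    # pdfkit.from_string(page, "output3.pdf", configuration=config)
```

Escaping the inserted text keeps stray markup in the data from changing the structure of the page that wkhtmltopdf renders.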