Do you read books? Do you like listening to audiobooks? Do you wish to create your own audiobook from a PDF? Here's how you can do it.
You can also follow along with the video tutorial!
It's time to code!
Let's get started!
You can find the code at my GitHub Repository.
First, we need to install the necessary libraries. We need two libraries to build an audiobook with Python.
1. PyPDF2
A pure-Python library built as a PDF toolkit. It is capable of extracting document information, splitting documents page by page, merging documents page by page, cropping pages, merging multiple pages into a single page, encrypting and decrypting PDF files, and more!
So open your terminal and run the following command.
pip install PyPDF2
If you wish to know more about it, you can refer to the documentation.
2. pyttsx3
pyttsx3 is a text-to-speech conversion library in Python. Unlike alternative libraries, it works offline, and is compatible with both Python 2 and 3.
So open your terminal and run the following command.
pip install pyttsx3
If you wish to know more about it, you can refer to the documentation.
Now that we have installed the packages, we can import them in our program.
import pyttsx3
import PyPDF2
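If an import fails, the package is probably not installed for the interpreter you are running. Here is a small sketch, not part of the original script, that checks this up front. The helper name missing_modules is my own invention, and it relies only on the standard library's importlib.util.find_spec.

```python
import importlib.util


def missing_modules(names):
    """Return the module names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# The two module names used in this article; an empty list means both are available.
missing = missing_modules(["pyttsx3", "PyPDF2"])
if missing:
    print("Please install:", ", ".join(missing))
```

If anything is printed, run the matching pip install command from above and try again.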
Now we need to open our file for reading and store the file object in book. The name of my PDF file is demo.pdf. rb stands for read binary mode: a PDF is a binary file, so it should not be opened in text mode.
book = open('demo.pdf','rb')
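A plain open call leaves the file open until you close it yourself. As a minimal sketch, assuming you want a quick sanity check before handing the file to PyPDF2, the with statement below closes the file automatically and checks that it starts with the %PDF- signature. The function name looks_like_pdf is hypothetical, and this is only a rough check, not a full validation of the file.

```python
def looks_like_pdf(path):
    """Return True if the file at `path` starts with the PDF signature."""
    # 'rb' opens the file for reading in binary mode; the with-block
    # closes the file automatically when it ends.
    with open(path, "rb") as book:
        return book.read(5) == b"%PDF-"
```

For the file used in this article, you would call it as looks_like_pdf('demo.pdf').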
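Now I will pass book to PyPDF2's PdfFileReader class and store the resulting reader object in pdf_reader. (PdfFileReader is the name used by PyPDF2 releases before 3.0; newer releases call it PdfReader.)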
pdf_reader = PyPDF2.PdfFileReader(book)
Now let's get the number of pages in our PDF by reading the numPages property of pdf_reader and storing it in num_pages.
num_pages = pdf_reader.numPages
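The numPages property was removed in PyPDF2 3.0, where the page count is obtained with len(reader.pages) instead. Here is a hedged sketch of a helper that tries the pages sequence first and falls back to numPages. The name count_pages is my own, and the function only assumes that the reader object has one of those two attributes.

```python
def count_pages(pdf_reader):
    """Return the page count of a reader object.

    Prefers the `pages` sequence; falls back to the older `numPages`
    property if `pages` is not available.
    """
    pages = getattr(pdf_reader, "pages", None)
    if pages is not None:
        return len(pages)
    return pdf_reader.numPages
```

With the reader from this article, num_pages = count_pages(pdf_reader) would replace the line above.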
Now let's initialize pyttsx3 using its init function, store the speech engine in play, and print Playing Audio Book.
play = pyttsx3.init()
print('Playing Audio Book')
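The engine returned by init has a speaking rate that you can read with getProperty('rate') and change with setProperty('rate', value). If the default voice feels too fast for an audiobook, something like the sketch below might help. The helper name slow_down and the default factor of 0.8 are my own choices, not part of the original script.

```python
def slow_down(engine, factor=0.8):
    """Scale the engine's speaking rate by `factor` and return the new rate."""
    current = engine.getProperty("rate")   # current speaking rate
    new_rate = round(current * factor)     # keep the rate an integer
    engine.setProperty("rate", new_rate)
    return new_rate
```

With the engine from this article, you would call slow_down(play) before the loop starts.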
Now, let's run a loop over the pages in our PDF file. One page is retrieved at each iteration.
for num in range(0, num_pages):
    page = pdf_reader.getPage(num)
    data = page.extractText()
    play.say(data)
    play.runAndWait()
Inside the loop, we extract the text from the page using its extractText method and store it in data. Next, we call the say method of play with data to queue the text for speech, and finally we call runAndWait, which speaks the queued text and waits until it has finished.
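The steps above can be wrapped in one function. This is a sketch under a few assumptions, not the exact script from the article: it iterates over the reader's pages sequence, prefers the newer extract_text method (PyPDF2 3.0 removed getPage and extractText) while falling back to extractText, and skips pages that contain no text, such as scanned images. The name read_aloud is hypothetical.

```python
def read_aloud(pdf_reader, engine):
    """Speak every non-empty page of `pdf_reader` through `engine`.

    Returns the number of pages that were queued for speech.
    """
    spoken = 0
    for page in pdf_reader.pages:
        # Prefer the newer extract_text(); fall back to the older extractText().
        extract = getattr(page, "extract_text", None) or page.extractText
        text = extract() or ""
        if text.strip():
            engine.say(text)  # queue this page's text
            spoken += 1
    # Speak everything queued with say() and block until it is done.
    engine.runAndWait()
    return spoken
```

With the objects from this article, the call would be read_aloud(pdf_reader, play). Here runAndWait is called once after all pages are queued; calling it inside the loop, page by page, also works.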
Run the Python script and your audiobook will play.
That's it. We are done. You can find the code at my GitHub Repository.
If you have any queries or suggestions, feel free to reach out to me.
You can connect with me on Twitter.
You should definitely check out my other Blogs:
- Python 3.9: All You need to know
- The Ultimate Python Resource hub
- GitHub CLI 1.0: All you need to know
- Become a Better Programmer
- How to make your own Google Chrome Extension
- You are Important & so is your Mental Health!
See you in my next article. Take care!