Bring Voice to Your Documents

The Adobe PDF Embed API is a powerful way for web developers to incorporate PDF documents into their websites. The biggest benefit is simply having more control over the document viewing experience and being able to integrate it naturally into an existing web page. But the PDF Embed API goes far beyond that, providing an event system that enables a much deeper integration with the document than was previously possible. Today I'm sharing the first of two examples of this, integrating the browser's Web Speech API.
The Web Speech API is a native browser feature that supports both speech synthesis and speech recognition. Today's article focuses on the synthesis aspect, while the next will demonstrate an example of recognition.
Before we continue, let’s talk about browser support. One of the nicest aspects of the modern web is that — for the most part — an overwhelming majority of users can make use of most features. If we check the MDN Web Docs compatibility table for the Speech Synthesis feature, we see that it’s very well supported outside of Opera for Android or older Android Webviews. (Neither of these is supported by the PDF Embed API anyway, so it’s not a concern.)
The SpeechSynthesis API
Working with the SpeechSynthesis API is fairly simple. The most basic example involves creating a new instance of the SpeechSynthesisUtterance object and passing it to the speechSynthesis.speak method. Here's a trivial example in CodePen. I created a simple form field and button to let the user enter some text:
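The markup might look something like this (the element ids are illustrative, not from the original demo):

```html
<input type="text" id="speechText" placeholder="Enter some text">
<button id="speakButton">Speak</button>
```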
And then used a bit of JavaScript to listen for the click event and speak the text in the field:
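A minimal sketch of that JavaScript, assuming the hypothetical element ids above; the guards simply keep the snippet inert outside a browser:

```javascript
// Speak the given text via the browser's SpeechSynthesis API.
// Returns false when the API isn't available (e.g. outside a browser).
function speak(text) {
  if (typeof speechSynthesis === 'undefined' || !text) return false;
  const utterance = new SpeechSynthesisUtterance(text);
  speechSynthesis.speak(utterance);
  return true;
}

// Wire the button up to the form field (browser only).
if (typeof document !== 'undefined') {
  document.querySelector('#speakButton').addEventListener('click', () => {
    speak(document.querySelector('#speechText').value);
  });
}
```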
You can test this out yourself here:
As we said, this is the bare minimum code sample. The API is actually pretty rich and has numerous ways to customize it. You can specify different voices, modify pitch and rate, and even pause the voice mid-speech. Be sure to check the MDN docs for more details about the API and additional examples.
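Those customization hooks can be sketched as follows. The helper function and the voice-matching logic are illustrative, not part of the original demo; the rate and pitch ranges come from the Web Speech API:

```javascript
// Illustrative helper: apply rate, pitch, and an optional named voice to an
// utterance-like object. Voices are passed in so the logic stays testable.
function configureUtterance(utterance, voices, { rate = 1, pitch = 1, voiceName } = {}) {
  utterance.rate = rate;    // 0.1 to 10; 1 is normal speed
  utterance.pitch = pitch;  // 0 to 2; 1 is normal pitch
  if (voiceName) {
    const match = voices.find((v) => v.name === voiceName);
    if (match) utterance.voice = match;
  }
  return utterance;
}

// In the browser (guarded so the sketch loads anywhere):
if (typeof speechSynthesis !== 'undefined') {
  const u = new SpeechSynthesisUtterance('Hello there!');
  configureUtterance(u, speechSynthesis.getVoices(), { rate: 0.9, pitch: 1.2 });
  speechSynthesis.speak(u);
  // speechSynthesis.pause() and speechSynthesis.resume() control playback mid-speech.
}
```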
Supporting Events in the PDF Embed API
For our demo, we're going to add speech to the text selection event: the user selects text, we recognize that action, grab the selected text, and pass it to the SpeechSynthesis API. Working with events in the PDF Embed API library requires a few steps.
First, we need to tell the PDF Embed library to listen for a selection event. Let’s begin by creating an instance of the library:
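Something like the following, where the client id, container div id, and PDF URL are all placeholders you would replace with your own; the guard just keeps the sketch loadable outside a page that includes the Embed API script tag:

```javascript
// Configuration for the document to embed (placeholder URL and file name).
const previewConfig = {
  content: { location: { url: 'https://example.com/sample.pdf' } },
  metaData: { fileName: 'sample.pdf' }
};

// AdobeDC is the global provided by the PDF Embed API script tag.
let pdfPreview;
if (typeof AdobeDC !== 'undefined') {
  const adobeDCView = new AdobeDC.View({
    clientId: 'YOUR_CLIENT_ID', // placeholder: your Embed API credential
    divId: 'pdf-view'           // placeholder: id of the container div
  });
  pdfPreview = adobeDCView.previewFile(previewConfig);
}
```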
The object, pdfPreview, references the embedded PDF.
Next, we need to listen for the event. We do this in two steps. We begin by specifying the events we care about:
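A sketch of that configuration; the conditional is only there so the snippet loads outside a page with the Embed API script, where the AdobeDC global doesn't exist:

```javascript
// PREVIEW_SELECTION_END lives on the AdobeDC global injected by the
// Embed API script tag; the guard keeps this snippet loadable elsewhere.
const eventOptions = {
  listenOn: typeof AdobeDC !== 'undefined'
    ? [AdobeDC.View.Enum.FilePreviewEvents.PREVIEW_SELECTION_END]
    : [],
  enableFilePreviewEvents: true
};
```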
This is done via the enumerated value PREVIEW_SELECTION_END, as well as by enabling events related to the preview of the PDF itself. The next part is writing the handler. One handler is defined for all events, but your code can check the event type to tell which one fired. Since we only listen for one, we don't have to worry about that:
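Registering the handler might look like this; it assumes the AdobeDC.View instance (here named adobeDCView) created when embedding the PDF, and the guard keeps the sketch inert outside the browser:

```javascript
// One handler serves every event listed in listenOn; event.type
// distinguishes them (here only PREVIEW_SELECTION_END is registered).
function onPreviewEvent(event) {
  if (event.type === 'PREVIEW_SELECTION_END') {
    console.log('User finished selecting text');
  }
  return event.type;
}

if (typeof AdobeDC !== 'undefined') {
  adobeDCView.registerCallback(
    AdobeDC.View.Enum.CallbackType.EVENT_LISTENER,
    onPreviewEvent,
    {
      listenOn: [AdobeDC.View.Enum.FilePreviewEvents.PREVIEW_SELECTION_END],
      enableFilePreviewEvents: true
    }
  );
}
```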
You can read more about our supported events in our docs. Alright, so this handles listening for the event, but how do we actually get the selected text?
This is done via the APIs exposed on the pdfPreview object we created earlier. We start by asking for an interface to those APIs:
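In current versions of the Embed API, previewFile() returns a promise that resolves to a viewer object, and that viewer's getAPIs() method hands back the API interface. A small helper, sketched here for clarity:

```javascript
// previewPromise is the promise returned by previewFile(); it resolves
// to a viewer object whose getAPIs() method returns the viewer APIs.
function getViewerAPIs(previewPromise) {
  return previewPromise.then((adobeViewer) => adobeViewer.getAPIs());
}

// Usage in the demo (pdfPreview is the promise from previewFile):
// getViewerAPIs(pdfPreview).then((apis) => { /* use apis here */ });
```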
We can then get selected text like so:
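With the APIs in hand, getSelectedContent() resolves with a descriptor of the current selection. The wrapper function here is illustrative:

```javascript
// Illustrative helper: log and return the selection descriptor,
// e.g. { type: 'text', data: 'the words the user selected' }.
function logSelectedContent(apis) {
  return apis.getSelectedContent().then((result) => {
    console.log(result);
    return result;
  });
}
```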
This method returns a lot of information about the selection, but the actual text will be in the data field. This means we can get the selected text like so:
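For example, with a small helper assumed here for illustration:

```javascript
// Illustrative helper: pull just the text out of the selection payload.
function getSelectedText(apis) {
  return apis.getSelectedContent().then((result) => result.data);
}
```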
Ok, so let’s put it all together!
Letting the PDF speak for itself
Here’s the entire JavaScript portion of our demo. It includes the initial embed of the document itself, the event handler, getting selected text, and passing it to the SpeechSynthesis API:
You can try this yourself here:
What’s next?
Hopefully, you enjoyed this interesting look at combining PDF Embed APIs with web browser APIs. In the next post we’ll take things even further — demonstrating how to use your voice to navigate a PDF!