Jasper needs a configuration file that we call “profile”. In order for Jasper to accurately report local weather conditions, send you text messages, and more, you first need to generate a user profile.
To facilitate the process, run the profile population module that comes packaged with Jasper.
The process is fairly self-explanatory: fill in the requested information, or hit ‘Enter’ to defer at any step. By default, the resulting profile will be stored as a YAML file at ~/.jasper/profile.yml.
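A minimal generated profile might look something like this (the field names shown are illustrative; populate.py writes the actual keys for you):

```yaml
# Illustrative sketch of a generated ~/.jasper/profile.yml
first_name: John
last_name: Doe
phone_number: '512-123-4567'
timezone: US/Eastern
```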
Important: populate.py will request your Gmail password. Of course, this is purely optional and will never leave the device. This password allows Jasper to report on incoming emails and send you text or email notifications, but be aware that the password will be stored as plaintext in profile.yml. The Gmail address can be entered alone (without a password) and will be used to send you notifications if you configure a Mailgun account, as described below.
You need to choose which Speech-To-Text (STT) engine Jasper should use. An STT engine is basically a piece of software that takes recorded speech and transforms it into written text. If you say “foo”, but Jasper understands “bar”, it’s either a problem with your microphone or a bad or misconfigured STT engine. So choosing the right STT engine is crucial for using Jasper correctly. While most speech-recognition tools rely on one single STT engine, Jasper tries to be modular and thus offers a wide variety of STT engines:
Important: Except for PocketSphinx and Julius, all of the above STT engines transfer the microphone data over the internet. If you don’t want Google, Wit.ai and AT&T to be able to listen to everything you say, do not use these STT engines! In that case, use PocketSphinx instead.
We also do not recommend using internet-connected STT engines for passive listening. From privacy and performance standpoints, PocketSphinx and Julius are superior solutions.
Install the required software. Then, locate your FST model (g014b2b.fst) and your Hidden Markov Model directory (hub4wsj_sc_8k). If the paths below are incorrect, add the correct paths to your profile.yml:
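For example, assuming the model files sit in common install locations (the key names below follow the standard Jasper profile layout; adjust the paths to match your system):

```yaml
pocketsphinx:
  fst_model: '../phonetisaurus/g014b2b.fst'
  hmm_dir: '/usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k'
```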
You also need an acoustic model in HTK format. Although VoxForge offers speaker-independent models, you will have to adapt the model and train it with your voice to get good recognition results. Please note that we do not offer support for this step. If you need help, ask in the respective forums. This STT engine also needs a lexicon file that maps words to phonemes; it has to contain all words that Julius shall be able to recognize. A very comprehensive lexicon is the VoxForge Lexicon.
After creating your own acoustic model, you have to specify the paths to your hmmdefs, tiedlist and lexicon files in your profile.yml:
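For example (the key names are assumed from the standard Jasper profile layout; all paths are placeholders for your own files):

```yaml
stt_engine: julius
julius:
  hmmdefs: '/path/to/your/hmmdefs'
  tiedlist: '/path/to/your/tiedlist'
  lexicon: '/path/to/your/lexicon.tgz'
```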
If you encounter errors or warnings like voca_load_htkdict: line 19: triphone "r-d+v" not found, you are trying to recognize words that contain phones that are not in your acoustic model. To be able to recognize these words, you need to train them into your acoustic model.
You need a Google Speech API key to use this. To obtain an API key, join the Chromium Dev group and create a project through the Google Developers Console.
Then select your project. In the sidebar, navigate to “APIs & Auth.” and activate the Speech API. Under “APIs & Auth,” navigate to “Credentials.” Create a new key for public API access.
Add your credentials to your profile.yml:
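For example (the GOOGLE_SPEECH key name under keys is assumed from the standard profile layout; the value is a placeholder):

```yaml
stt_engine: google
keys:
  GOOGLE_SPEECH: 'your_google_speech_api_key'
```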
The AT&T STT engine requires an AT&T app_key/app_secret to be present in profile.yml. Please sign up at http://developer.att.com/apis/speech and create a new app. You can then take the app_key/app_secret and put it into your profile.yml:
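For example (the att-stt section name is assumed from the standard profile layout; the values are placeholders):

```yaml
stt_engine: att
att-stt:
  app_key: 'your_app_key'
  app_secret: 'your_app_secret'
```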
This implementation requires a Wit.ai access token to be present in profile.yml. Please sign up at https://wit.ai and copy your instance token, which can be found under Settings in the Wit console, to your profile.yml:
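For example (the witai-stt section name is assumed from the standard profile layout; the token is a placeholder):

```yaml
stt_engine: witai
witai-stt:
  access_token: 'your_wit_ai_access_token'
```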
A TTS engine does the exact opposite of an STT engine: It takes written text and transforms it into speech. Jasper supports many different TTS engines that differ by voice, intonation, “roboticness” and so on.
osx-tts uses the say command in Mac OS X to synthesize speech. Important: If you’re using Google TTS (or Mary TTS with someone else’s server), everything Jasper says will be sent over the internet. If you don’t want Google or someone else to be able to listen to everything Jasper says to you, do not use these TTS engines!
Install eSpeak and choose espeak-tts as your TTS engine in your profile.yml:
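For example (assuming the profile key is tts_engine, as elsewhere in the standard profile):

```yaml
tts_engine: espeak-tts
```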
Further customization is also possible by tuning the voice, pitch_adjustment and words_per_minute options in your profile.yml:
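For example (the values shown are illustrative defaults, not recommendations):

```yaml
espeak-tts:
  voice: 'default+m3'
  pitch_adjustment: 40
  words_per_minute: 160
```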
You need to install Festival (and Festival’s voices). If you’ve done that, you can set festival-tts as your TTS engine in your profile.yml:
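For example (assuming the profile key is tts_engine, as elsewhere in the standard profile):

```yaml
tts_engine: festival-tts
```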
If you change the default voice of festival, Jasper will use this voice as well.
Install Flite and add it to your profile.yml:
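For example (assuming the engine slug is flite-tts, matching the naming of the other engines):

```yaml
tts_engine: flite-tts
```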
If you want to use another voice (e.g. ‘slt’), specify it in your profile.yml:
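For example (assuming a flite-tts section with a voice option, mirroring the other engines’ option layout):

```yaml
flite-tts:
  voice: 'slt'
```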
To get a list of available voices, run flite -lv on the command line.
Install Pico. Then, you just add it to your profile.yml:
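For example (assuming the engine slug is pico-tts, matching the naming of the other engines):

```yaml
tts_engine: pico-tts
```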
Install the required dependencies for Google TTS. Then set google-tts as your TTS engine in your profile.yml:
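For example (assuming the profile key is tts_engine, as elsewhere in the standard profile):

```yaml
tts_engine: google-tts
```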
Install the required dependencies for accessing Amazon’s Ivona Speech Cloud service. You’ll also need to sign up for free to use their service. Then set ivona-tts as your TTS engine in your profile.yml and also paste your Ivona Speech Cloud keys:
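For example (the ivona-tts section and its key names are assumed from the standard profile layout; the values are placeholders):

```yaml
tts_engine: ivona-tts
ivona-tts:
  access_key: 'your_access_key'
  secret_key: 'your_secret_key'
```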
Simply set mary-tts as your TTS engine in your profile.yml. If you want, you can also change the default server to your own MaryTTS server:
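For example, pointing at a MaryTTS server of your own (the section and option names are assumed from the standard profile layout; host, language and voice are placeholders):

```yaml
tts_engine: mary-tts
mary-tts:
  server: '127.0.0.1'
  port: '59125'
  language: 'en_GB'
  voice: 'dfki-spike'
```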
Note: Currently, the demo server at mary.dfki.de:59129 is not working, so you need to set up your own MaryTTS server (which you can download here).
Make sure that you’re about to run Jasper on a Mac. Look at the casing of your computer: there should be a bitten-apple symbol on it. Then set osx-tts as your TTS engine in your profile.yml:
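For example (assuming the profile key is tts_engine, as elsewhere in the standard profile):

```yaml
tts_engine: osx-tts
```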
If you’d prefer not to enter your Gmail password, you can set up a free Mailgun account that Jasper will use to send you notifications. It’s incredibly painless, and Jasper is already set up for instant Mailgun integration. Note that if you don’t enter your Gmail address, however, Jasper will only be able to send you notifications by text message (as he won’t know your email address).
In slightly more detail:
Edit your profile.yml to read:
```yaml
...
mailgun:
  username: postmaster@sandbox95948.mailgun.org
  password: your_password
```
If you want to use the Weather module, but you don’t live in the US, find out the WMO ID of your local weather station (last column of the table). The WMO ID is a unique number used to identify weather stations and is issued by the World Meteorological Organization (WMO).
Then, add the WMO ID to your profile:
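For example (10410 is just an illustrative station ID; substitute your own):

```yaml
wmo_id: '10410'
```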
If both location and wmo_id are in your profile.yml, the wmo_id takes precedence.
To enable Facebook integration, Jasper requires an API key. Unfortunately, this is a multi-step process that requires manual editing of profile.yml. Fortunately, it’s not particularly difficult.
Take the resulting API key and add it to profile.yml in the following format:
```yaml
...
prefers_email: false
timezone: US/Eastern
keys:
  FB_TOKEN: abcdefghijklmnopqrstuvwxyz
```
Note that similar keys could be added when developing other modules. For example, a Twitter key might be required to create a Twitter module. The goal of the profile is to be straightforward and extensible.
Jasper has the ability to play playlists from your Spotify library. This feature is optional and requires a Spotify Premium account. To configure Spotify on Jasper, just perform the following steps.
Install Mopidy with:
We need to enable IPv6:
Now run sudo vim /root/.asoundrc and insert the following contents:
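The right contents depend on your sound hardware. A common ALSA setup, assuming playback on the first card and capture from a USB microphone on the second (adjust the card numbers for your system), looks roughly like this:

```
pcm.!default {
  type asym
  playback.pcm "hw:0,0"
  capture.pcm "hw:1,0"
}
```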
We need to create the following new file and delete the default startup script:
Now let’s run sudo vim /root/.config/mopidy/mopidy.conf and insert the following:
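As a sketch (section and option names follow Mopidy’s configuration format; the credentials are placeholders for your Spotify Premium account):

```ini
[spotify]
username = your_spotify_username
password = your_spotify_password

[mpd]
hostname = 127.0.0.1
```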
Finally, let’s configure crontab to run Mopidy by running sudo crontab -e and inserting the following entry:
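An entry along these lines should work (the exact invocation is an assumption; adjust the redirection to taste):

```
@reboot mopidy > /dev/null 2>&1
```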
Upon restarting your Jasper, you should be able to issue a “Spotify” command that will enter Spotify mode. For more information on how to use Spotify with your voice, check out the Usage guide.
Having installed and configured Jasper and its required libraries, it is worth taking a moment to understand how they interact and how the code is architected.
The program is organized into a number of different components:
jasper.py is the program that orchestrates all of Jasper. It creates mic, profile, and conversation instances. Next, the conversation instance is fed the mic and profile as inputs, from which it creates a notifier and a brain.
The brain then receives the mic and profile originally descended from main and loads all the interactive components into memory. The brain is essentially the interface between developer-written modules and the core framework. Each module must implement isValid() and handle() functions, as well as define a WORDS = [...] list.
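As a sketch, a hypothetical “time” module could implement that contract like this (the WORDS/isValid/handle names follow the API described above; the mic.say() call is an assumption about the mic interface):

```python
import re
from datetime import datetime

# Words the brain scans for when compiling the recognizer vocabulary.
WORDS = ["TIME"]

def isValid(text):
    """Return True if this module should handle the given input."""
    return bool(re.search(r"\btime\b", text, re.IGNORECASE))

def handle(text, mic, profile):
    """Respond to the input; mic.say() speaks the reply aloud."""
    now = datetime.now().strftime("%I:%M %p")
    mic.say("It is " + now + " right now.")
```

The brain calls isValid() on each module in turn and routes the transcribed input to the first module that claims it.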
To learn more about how Jasper interactive modules work and how to write your own, check out the API guide.
Now that you have fully configured your Jasper software, you’re ready to start using it. Check out the Usage page for next steps.