Building Voice-Enabled Products With Amazon Alexa

With the explosion of Alexa-powered devices popping up everywhere, it isn’t much of a stretch to envision a future full of devices that are eager to have a conversation with us. Devices that are driven by artificial intelligence and woven into the fabric of our everyday lives — standing by to fulfill our every whim.

Businesses everywhere are scrambling to voice-enable their products, so let’s dive into what it takes to actually do that.

Hint: It’s really easy.

This guide will step you through configuring an Amazon Alexa Skill and Lambda function to control the relay attached to an IoT device — specifically, an Intel Edison-powered starter kit. But with some simple modification to the code, you could make Alexa control nearly anything you like — anything digital, that is — which is pretty much everything these days.

Intel Edison with AWS IoT

Note: The purpose of this blog post is to demonstrate voice capabilities, so we’re going to focus exclusively on the Alexa/Lambda portion and assume that you already have a device connected to AWS IoT.

Looking for an on-ramp?

This is a how-to guide intended for developers or tech-savvy entrepreneurs and business managers looking for a proven entry point into A.I.-powered business systems.

The End Result

When you’re done with this guide, you’ll be able to operate the relay connected to an IoT device simply by telling Alexa to make it happen.

And because we’re controlling a simple electrical relay, you can make that relay operate nearly anything, from a household light switch to a robotic servo to an electrical motor — the only limit is your imagination.

If you don’t have an Echo or other Alexa-powered device, you can use echosim.io instead.

Screenshot echosim.io home page

Here are some of the highlights of what we’re building:

  • Voice control: Provides a natural language interface for controlling apps and devices
  • Serverless execution: Provides a way to run computing functions without a full server

But this is just a starting point; the architecture can be extended in any number of ways. So play around with it and have fun!

How It Works

Tip: If you’re feeling ambitious, Amazon offers a separate Alexa Voice Service that allows you to embed Alexa into any device — i.e. create your own Echo. But we’ll save that for another day.

Smart Home building blocks Intel Edison AWS IoT workflow

For this demo we’ll use two of Amazon’s cloud-based services (plus AWS IoT to manage the device):

  • Alexa Skills Kit: Interprets the spoken request and hands it to our code as a structured intent
  • AWS Lambda: Runs the function that turns that intent into an AWS IoT device shadow update

Together, this structure provides a highly scalable and extensible starting point for voice-enabling your products.

And with direct access to all of the other AWS services at your fingertips, the sky’s the limit when it comes to how much you can expand and integrate this system to better serve your customers.

What You’ll Need

Right off the bat, let’s get the initial requirements covered.

Create an AWS account.

If you don’t already have an AWS account, go ahead and set one up.

You’ll also need an Amazon Developer account.

Verify user permissions.

And if you aren’t using an administrator-level user account for AWS, you’ll need to make sure your account has full control over the following services:

  • Lambda,
  • Identity & Access Management (IAM), and
  • AWS IoT

Without the correct permissions, you won’t be able to create the necessary services and connections.

Connect the device.

If you don’t already have a device configured on AWS IoT, you can point the Lambda function at something else, or simply run tests: AWS IoT’s Device Shadow feature keeps a virtual copy of a device’s state, so you can exercise the whole pipeline without any hardware attached (see the sketch below).
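
If you’d like to poke at the shadow without hardware, here’s a minimal stand-alone sketch (not part of the Lambda function) that reads the shadow of a Thing named MyEdison using the aws-sdk for Node.js. It assumes the same IoT endpoint you’ll copy in the next step and is only meant for local verification:

'use strict';

// Minimal stand-alone script: read the device shadow so you can verify state
// changes without physical hardware. Assumes the aws-sdk (v2) package, a Thing
// named "MyEdison", and the IoT endpoint you'll copy in the next step.
var AWS = require('aws-sdk');

AWS.config.region = 'us-east-1';

var iotData = new AWS.IotData({endpoint: 'XXXXXXXXXXXXXX.iot.us-east-1.amazonaws.com'});

iotData.getThingShadow({thingName: 'MyEdison'}, function (err, data) {
    if (err) {
        console.log(err); // e.g. ResourceNotFoundException if no shadow exists yet
    }
    else {
        // data.payload is a JSON string containing the desired and reported state
        console.log(JSON.parse(data.payload));
    }
});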

Copy the device’s endpoint.

There’s one bit of information you’ll need from your IoT “Thing” — the endpoint identifier.

Amazon IoT endpoint page screenshot

You can find it by going to the IoT Registry, opening up your Thing, and clicking on the “Interact” tab.

Copy just the endpoint prefix, not the entire string; you’ll need it in the next step.
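
Prefer the SDK to the console? A quick sketch like the one below (assuming you have the aws-sdk for Node.js installed and credentials configured) should print the same endpoint:

'use strict';

// Fetch the account's IoT endpoint programmatically instead of copying it
// from the console. The part before the first dot is the prefix.
var AWS = require('aws-sdk');

AWS.config.region = 'us-east-1';

var iot = new AWS.Iot();

iot.describeEndpoint({}, function (err, data) {
    if (err) {
        console.log(err);
    }
    else {
        console.log(data.endpointAddress); // e.g. "XXXXXXXXXXXXXX.iot.us-east-1.amazonaws.com"
    }
});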

So let’s get to it…

Note: In case you’re wondering where these settings and code came from, they’re adapted from Amazon’s Color Expert blueprint example.

Step 1: Create the Lambda Function

Go to your Lambda Dashboard.

Amazon Lambda dashboard screenshot

Click the “Get Started Now” button.

Amazon Lambda select blueprint screenshot

Type “alexa” into the search box and then click the “alexa-skills-kit-color-expert” blueprint for Node.js.

Amazon Lambda configure triggers screenshot

Then on the next screen, pick “Alexa Skills Kit” from the triggers dropdown (if it isn’t already selected) and hit “Next.”

Amazon Lambda configure function page screenshot

Now we need to configure the function.

To start, choose a name (e.g. edisonRelay). Then create a new Role and enter a name (e.g. edisonRelay). Select anything you like for the Policy templates; we’re going to detach it later.

Delete the default contents of the Lambda function code box and paste in the code below:

Lambda function


'use strict';

/* ----------------------- IoT Configuration -------------------------------- */

var config = {};

config.IOT_BROKER_ENDPOINT = "XXXXXXXXXXXXXX.iot.us-east-1.amazonaws.com".toLowerCase();

config.IOT_BROKER_REGION = "us-east-1";

config.IOT_THING_NAME = "MyEdison";

// Load AWS SDK libraries
var AWS = require('aws-sdk');

AWS.config.region = config.IOT_BROKER_REGION;

// Initialize client for IoT
var iotData = new AWS.IotData({endpoint: config.IOT_BROKER_ENDPOINT});

/* -------------------- end: IoT Configuration ------------------------------ */


/* ------------ Helpers that build all of the responses --------------------- */

function buildSpeechletResponse(title, output, repromptText, shouldEndSession) {

    return {
        outputSpeech: {
            type: 'PlainText',
            text: output,
        },
        card: {
            type: 'Simple',
            title: `SessionSpeechlet - ${title}`,
            content: `SessionSpeechlet - ${output}`,
        },
        reprompt: {
            outputSpeech: {
                type: 'PlainText',
                text: repromptText,
            },
        },
        shouldEndSession,
    };

}

function buildResponse(sessionAttributes, speechletResponse) {

    return {
        version: '1.0',
        sessionAttributes,
        response: speechletResponse,
    };

}

/* ---------- end: Helpers that build all of the responses ------------------ */


/* ----------- Functions that control the skill's behavior ------------------ */

function getWelcomeResponse(callback) {

    // If we wanted to initialize the session to have some attributes we could add those here.
    const sessionAttributes = {};
    const cardTitle = 'Welcome';
    const speechOutput = 'Welcome to the Alexa powered products demo. ' +
        'Please tell me if you want the light on or off by saying, turn the light on';
    // If the user either does not reply to the welcome message or says something that is not understood, they will be prompted again with this text.
    const repromptText = 'Please tell me if you want the light on or off by saying, ' +
        'turn the light on';
    const shouldEndSession = false;

    callback(sessionAttributes, buildSpeechletResponse(cardTitle, speechOutput, repromptText, shouldEndSession));

}

function handleSessionEndRequest(callback) {

    const cardTitle = 'Session Ended';
    const speechOutput = 'Thank you for using the Alexa powered products demo. Have a nice day!';
    // Setting this to true ends the session and exits the skill.
    const shouldEndSession = true;

    callback({}, buildSpeechletResponse(cardTitle, speechOutput, null, shouldEndSession));

}

function createFavoriteRelayStatusAttributes(desiredRelayStatus) {

    return {desiredRelayStatus,};

}

/**
 * Sets the relay state in the session and prepares the speech to reply to the user.
 */
function setRelayStatusInSession(intent, session, callback) {

    const cardTitle = intent.name;
    const desiredRelayStatusSlot = intent.slots.Status;
    var shadowRelayStatus = false;
    let repromptText = '';
    let sessionAttributes = {};
    const shouldEndSession = false;
    let speechOutput = '';

    if (desiredRelayStatusSlot) {

        const desiredRelayStatus = desiredRelayStatusSlot.value;
        sessionAttributes = createFavoriteRelayStatusAttributes(desiredRelayStatus);
        speechOutput = "The light has been turned " + desiredRelayStatus;
        repromptText = "You can ask me if the light is on or off by saying, is the light on or off?";
        
        /*
         * Update AWS IoT
        */
        // Determine relay position within shadow
        if (desiredRelayStatus === 'on') {shadowRelayStatus = true;}
        var payloadObj={ "state": { "desired": { "RelayState": shadowRelayStatus } } };

        //Prepare the parameters of the update call
        var paramsUpdate = {

          "thingName" : config.IOT_THING_NAME,
          "payload" : JSON.stringify(payloadObj)

        };

        // Update IoT Device Shadow
        iotData.updateThingShadow(paramsUpdate, function(err, data) {

          if (err){
            console.log(err); // Handle any errors
          }
          else {
            console.log(data);
          }

        });

    }
    else {

        speechOutput = "I'm not sure if you want the light on or off. Please try again.";
        repromptText = "I'm not sure if you want the light on or off. You can tell me if you " +
            'want the light on or off by saying, turn the light on';

    }

    callback(sessionAttributes, buildSpeechletResponse(cardTitle, speechOutput, repromptText, shouldEndSession));

}

function getRelayStatusFromSession(intent, session, callback) {

    let desiredRelayStatus;
    const repromptText = null;
    const sessionAttributes = {};
    let shouldEndSession = false;
    let speechOutput = '';

    if (session.attributes) {
        desiredRelayStatus = session.attributes.desiredRelayStatus;
    }

    if (desiredRelayStatus) {
        speechOutput = `You turned the light ${desiredRelayStatus}. Congratulations!`;
        shouldEndSession = true;
    }
    else {
        speechOutput = "I'm not sure if you want the light on or off, you can say, turn the light " +
            ' on';
    }

    // Setting repromptText to null signifies that we do not want to reprompt the user.
    // If the user does not respond or says something that is not understood, the session
    // will end.
    callback(sessionAttributes, buildSpeechletResponse(intent.name, speechOutput, repromptText, shouldEndSession));

}

/* --------- end: Functions that control the skill's behavior --------------- */


/* ----------------------------- Events ------------------------------------- */

/**
 * Called when the session starts.
 */
function onSessionStarted(sessionStartedRequest, session) {
    console.log(`onSessionStarted requestId=${sessionStartedRequest.requestId}, sessionId=${session.sessionId}`);
}

/**
 * Called when the user launches the skill without specifying what they want.
 */
function onLaunch(launchRequest, session, callback) {

    console.log(`onLaunch requestId=${launchRequest.requestId}, sessionId=${session.sessionId}`);

    // Dispatch to your skill's launch.
    getWelcomeResponse(callback);

}

/**
 * Called when the user specifies an intent for this skill.
 */
function onIntent(intentRequest, session, callback) {

    console.log(`onIntent requestId=${intentRequest.requestId}, sessionId=${session.sessionId}`);

    const intent = intentRequest.intent;
    const intentName = intentRequest.intent.name;

    // Dispatch to your skill's intent handlers
    if (intentName === 'RelayStatusIsIntent') {setRelayStatusInSession(intent, session, callback);}
    else if (intentName === 'WhatsRelayStatusIntent') {getRelayStatusFromSession(intent, session, callback);}
    else if (intentName === 'AMAZON.HelpIntent') {getWelcomeResponse(callback);}
    else if (intentName === 'AMAZON.StopIntent' || intentName === 'AMAZON.CancelIntent') {handleSessionEndRequest(callback);}
    else {throw new Error('Invalid intent');}

}

/**
 * Called when the user ends the session.
 * Is not called when the skill returns shouldEndSession=true.
 */
function onSessionEnded(sessionEndedRequest, session) {

    console.log(`onSessionEnded requestId=${sessionEndedRequest.requestId}, sessionId=${session.sessionId}`);
    // Add cleanup logic here

}

/* --------------------------- end: Events ---------------------------------- */


/* -------------------------- Main handler ---------------------------------- */

// Route the incoming request based on type (LaunchRequest, IntentRequest, etc.) The JSON body of the request is provided in the event parameter.
exports.handler = (event, context, callback) => {

    try {

        console.log(`event.session.application.applicationId=${event.session.application.applicationId}`);

        /**
         * Uncomment this if statement and populate with your skill's application ID to
         * prevent someone else from configuring a skill that sends requests to this function.
         */
        /*
        if (event.session.application.applicationId !== 'amzn1.echo-sdk-ams.app.[unique-value-here]') {
             callback('Invalid Application ID');
        }
        */

        if (event.session.new) {
            onSessionStarted({ requestId: event.request.requestId }, event.session);
        }

        if (event.request.type === 'LaunchRequest') {
            onLaunch(event.request,
                event.session,
                (sessionAttributes, speechletResponse) => {
                    callback(null, buildResponse(sessionAttributes, speechletResponse));
                });
        }
        else if (event.request.type === 'IntentRequest') {
            onIntent(event.request,
                event.session,
                (sessionAttributes, speechletResponse) => {
                    callback(null, buildResponse(sessionAttributes, speechletResponse));
                });
        }
        else if (event.request.type === 'SessionEndedRequest') {
            onSessionEnded(event.request, event.session);
            callback();
        }

    }
    catch (err) {callback(err);}

};

/* ----------------------- end: Main handler -------------------------------- */

Replace XXXXXXXXXXXXXX (IOT_BROKER_ENDPOINT) with the IoT endpoint prefix you copied in What You’ll Need.

It wouldn’t be a bad idea to give the code a quick once-over as well to make sure the copy/paste didn’t distort the syntax. Also, double-check that IOT_BROKER_REGION matches your AWS region and that IOT_THING_NAME matches your device’s name.
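
For example, if your endpoint prefix were a1b2c3d4example (a made-up value), the configuration block at the top of the function would read:

// Hypothetical values -- substitute your own endpoint prefix, region and Thing name
config.IOT_BROKER_ENDPOINT = "a1b2c3d4example.iot.us-east-1.amazonaws.com".toLowerCase();

config.IOT_BROKER_REGION = "us-east-1";

config.IOT_THING_NAME = "MyEdison";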

Amazon Lambda review page screenshot

Give your Lambda settings a quick review on the next screen.

Bump up the Role permissions.

Now we need to make sure Lambda has the correct permissions to access your AWS IoT instance.

Amazon Lambda Roles page screenshot

Go to your IAM Roles Page and click on the Role you created above (on the Lambda Configure Function screen).

Amazon IAM Roles configure page

If you used a policy template above (on the Lambda Configure Function page), click on the respective “Detach Policy” link to remove it — but keep the Lambda Basic Execution Role. And then hit the “Attach Policy” button.

Amazon IAM Roles Attach Policy page screenshot

Check the “AWSIoTFullAccess” line item and hit the “Attach Policy” button to move on.

Amazon IAM Role page screenshot

You should end up with something that looks like the screenshot above.
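
AWSIoTFullAccess is the quickest way to get a demo running. If you’d rather grant only the least privilege needed, an inline policy along these lines should cover the single shadow update this function performs (the account ID is a placeholder, and it assumes your Thing is named MyEdison in us-east-1):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "iot:UpdateThingShadow",
      "Resource": "arn:aws:iot:us-east-1:YOUR_ACCOUNT_ID:thing/MyEdison"
    }
  ]
}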

Get the Lambda function ARN.

Before diving into the Alexa skill, you’ll want to grab the ARN (Amazon Resource Name) from your new Lambda function. So jump back to your Lambda Dashboard.

Amazon Lambda home show ARN page screenshot

Select your Lambda function (radio button on the left) and click “Show ARN” from the “Actions” menu.

Amazon Lambda home ARN page screenshot

Copy your ARN for the next step.
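
The ARN will look something like arn:aws:lambda:us-east-1:123456789012:function:edisonRelay (with your own region and account number); you’ll paste the entire string into the Alexa Skill configuration shortly.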

Step 2: Create the Alexa Skill

Go to your Alexa Dashboard.

Amazon Developer Alexa dashboard page screenshot

Click “Get started” for the “Alexa Skills Kit.”

Amazon Developer Alexa add new skill page screenshot

Then click on “Add a New Skill.”

Amazon Developer Alexa create new skill page screenshot

Select the “Custom Interaction Model” radio button, enter a Name for your new skill, and choose an “Invocation Name.”

The Invocation Name is a word or phrase you’ll use to let Alexa know that you want to use your custom skill. For example, with the Invocation Name edison, we would say: “Alexa, open edison.”

Click the “Next” button to move on.

Amazon Developer Alexa interaction model page screenshot

On the next screen, simply copy/paste the Intent Schema, Slot Values and Sample Utterances from the respective blocks below.

For Slot Values: click “Add Slot Type,” then enter LIST_OF_STATUSES for the type and include your values. Hit “Save” when ready.

Note: This is where the magic happens. It is through configuring the settings on this page that you define what Alexa can do — actually creating the skill.

Intent Schema

{
  "intents": [
    {
      "intent": "RelayStatusIsIntent",
      "slots": [
        {
          "name": "Status",
          "type": "LIST_OF_STATUSES"
        }
      ]
    },
    {
      "intent": "WhatsRelayStatusIntent"
    },
    {
      "intent": "AMAZON.HelpIntent"
    }
  ]
}

Slot Type

on
off
dimmed

Sample Utterances

WhatsRelayStatusIntent what's the status of the light
WhatsRelayStatusIntent what is the status of the light
WhatsRelayStatusIntent what's the status of light
WhatsRelayStatusIntent what is light status
WhatsRelayStatusIntent light status
WhatsRelayStatusIntent my light status
WhatsRelayStatusIntent get my light status
WhatsRelayStatusIntent get light status
WhatsRelayStatusIntent give me the light status
WhatsRelayStatusIntent give me light status
WhatsRelayStatusIntent what the light status is
WhatsRelayStatusIntent what light status is
WhatsRelayStatusIntent yes
WhatsRelayStatusIntent yup
WhatsRelayStatusIntent sure
WhatsRelayStatusIntent yes please
RelayStatusIsIntent turn the light {Status}
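
To see how these pieces fit together: when someone says “turn the light on,” Alexa matches the RelayStatusIsIntent utterance and fills the Status slot, so the Lambda function receives a request whose intent section looks roughly like this (abridged):

{
  "type": "IntentRequest",
  "intent": {
    "name": "RelayStatusIsIntent",
    "slots": {
      "Status": {
        "name": "Status",
        "value": "on"
      }
    }
  }
}

That “on” is the value the Lambda code reads via intent.slots.Status.value before updating the device shadow.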

Then click “Next.”

Amazon Developer Alexa configuration page screenshot

Select “AWS Lambda ARN” and “North America,” and then paste in the ARN you copied from the last part of step #1. Hit “No” for “Account Linking.” Then click “Next.”

Amazon Developer Alexa test page screenshot

You don’t need to do anything with the Test page at this point. You’ll want to come back to it after everything is wired up. So click “Next” to move on.

Amazon Developer Alexa publishing information page screenshot

Fill out the Publishing Information blocks as you see fit, but don’t forget to include the logos.

And of course, click “Next” when you’re done.

Amazon Developer Alexa privacy compliance page screenshot

Fill out the privacy information as needed. Because this is just a demo, we can keep things pretty simple.

Hit the “Save” button (not “Submit for Certification”).

Your new Alexa Skill should be good to go, so let’s fire it up.

Step 3: Chat Away!

As I mentioned above, you can use echosim.io as your Alexa device. If you’d like to use it instead of an Echo (or similar), bring up the website and log in with the Amazon Developer account you used above.

echosim.io login page screenshot

Then press — and hold — the button and say “Alexa, open Edison” (or whatever your Invocation Name is from step #2).

If everything is properly configured, Alexa should promptly respond with:

“Welcome to the Alexa powered products demo. Please tell me if you want the light on or off by saying, turn the light on.”

…and walk you through the rest from there.

Congratulations — you’re an Alexa pro!

Play around with the code and settings. Have fun with it.

Troubleshooting

If you hit any snags, be sure to check out the logs by clicking on the “logs” link in your Lambda function.

Amazon Lambda function logs screenshot

Then click on the latest set of entries.

Amazon CloudWatch Lambda logs page screenshot

You can also run tests directly from the Alexa Skill. Just go to the “Test” page (the page you skipped in step #2) and enter in an utterance to see how Lambda responds.

Tip: You can copy/paste a “Test Event” for your Lambda function from the “Lambda Request” box on the Alexa Skill Test page (see screenshot below).

Amazon Developer Alexa test utterance screenshot

It’s a simple pipeline, so you shouldn’t have much trouble tracking down issues.

There isn’t much code with this one, but here’s a GitHub repo of the source code.

Take it to the Next Level

Hopefully this post has convinced you that it’s pretty easy to voice-enable your products using Alexa. But of course, this is just the start. Take what you picked up here and apply it to your own products.

Because Lambda is really the engine behind the system, you can make it do pretty much anything you can code. The only limit is your imagination.

You can dig deeper into Alexa at the Amazon Developer documentation. Or try out one of the other speech recognition providers — Google Cloud, IBM Watson, Microsoft Azure, etc.

Enjoy!
