Edition 45

Automating Voice Commands With Siri


On modern mobile devices, it's very common to rely on virtual "assistants" to get tasks done, whether in a hands-free situation using voice commands or just to save the trouble of tapping out search queries. On iOS, these interactions take place through the Siri interface, for which developers can create custom integrations using SiriKit.

Hey Siri

How on earth would you test this aspect of your app's behavior? Ideally you'd be able to have a recording of the particular voice command or phrase used to trigger your app's Siri integration, which you could then somehow apply to the simulator or device under test. This is not currently possible, outside of rigging up some speakers!

Fortunately, we don't need to go to such lengths, because Appium recently added a command (as of Appium 1.10) that lets you specify the text you want Siri to parse, as if it had been spoken by a person. This is great, since we don't need to test Siri itself, just its connection to our app.

The command itself is accessible via the executeScript "mobile" interface:

HashMap<String, String> args = new HashMap<>();
args.put("text", "Hey Siri, what's happening?");
driver.executeScript("mobile: siriCommand", args);

Essentially, we construct an options hash with our desired text string, and pass it to the siriCommand "mobile" method. We can run this command at any point in our automation, and it will take care of getting to the Siri prompt for us as well (we don't need to long-hold the home button, for example).
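As a rough sketch, the call above could be wrapped in a small helper so tests don't repeat the map-building boilerplate. The class name, method names, and the executor parameter are hypothetical (they are not part of Appium); only the "mobile: siriCommand" script name and the "text" key come from the command described above. The executor is a stand-in for the driver's executeScript method, which lets the helper be exercised without a device.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiFunction;

public class SiriCommandSketch {

    // Script name taken from the article; everything else here is illustrative.
    public static final String SIRI_SCRIPT = "mobile: siriCommand";

    // Build the options map holding the text Siri should parse.
    public static Map<String, String> siriArgs(String text) {
        if (text == null || text.trim().isEmpty()) {
            throw new IllegalArgumentException("Siri text must not be empty");
        }
        Map<String, String> args = new HashMap<>();
        args.put("text", text);
        return args;
    }

    // Send the text through whatever executor is supplied.
    // In a real test the executor would delegate to driver.executeScript.
    public static Object askSiri(BiFunction<String, Object, Object> executor, String text) {
        return executor.apply(SIRI_SCRIPT, siriArgs(text));
    }

    public static void main(String[] args) {
        // Fake executor that only prints what would be sent to the driver.
        askSiri((script, options) -> {
            System.out.println(script + " " + options);
            return null;
        }, "What's two plus two?");
    }
}
```

In an actual test, the executor would presumably be a lambda such as (script, options) -> driver.executeScript(script, options), so the helper adds nothing beyond argument construction and a basic sanity check on the text.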

At this point we can use the typical native automation methods to verify Siri's response on the screen, tap on action items, and so on.

That's basically it! There's not much to it. So let's have a look at a full example that asks Siri a math question ("What's two plus two?") and verifies the result. Notice how the question and result text show up as accessibility IDs. I found these by looking at the page source, though I could also have used Appium Desktop:

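import io.appium.java_client.MobileBy;
import io.appium.java_client.MobileElement;
import io.appium.java_client.ios.IOSDriver;
import java.io.IOException;
import java.net.URL;
import java.util.HashMap;
import org.junit.After;
import org.junit.Before;
import org.junit.Test;
import org.openqa.selenium.By;
import org.openqa.selenium.remote.DesiredCapabilities;
import org.openqa.selenium.support.ui.ExpectedConditions;
import org.openqa.selenium.support.ui.WebDriverWait;

public class Edition045_iOS_Siri {

    // Path or URL of the app under test; fill this in for your own project
    private String APP = "";

    private IOSDriver<MobileElement> driver;
    private WebDriverWait wait;

    // Siri renders the parsed question and its answer as elements with these accessibility IDs
    private By siriCalcQ = MobileBy.AccessibilityId("2 + 2 =");
    private By siriCalcA = MobileBy.AccessibilityId("4");

    @Before
    public void setUp() throws IOException {
        DesiredCapabilities caps = new DesiredCapabilities();
        caps.setCapability("platformName", "iOS");
        caps.setCapability("platformVersion", "12.1");
        caps.setCapability("deviceName", "iPhone 8");
        caps.setCapability("noReset", true);
        caps.setCapability("app", APP);

        driver = new IOSDriver<MobileElement>(new URL("http://localhost:4723/wd/hub"), caps);
        wait = new WebDriverWait(driver, 20);
    }

    @After
    public void tearDown() {
        try {
            // End the session; ignore errors if the session never started
            driver.quit();
        } catch (Exception ign) {}
    }

    @Test
    public void testSiri() {
        // Send the text to Siri as if it had been spoken
        HashMap<String, String> args = new HashMap<>();
        args.put("text", "What's two plus two?");
        driver.executeScript("mobile: siriCommand", args);

        // Verify that Siri shows both the parsed question and the answer
        wait.until(ExpectedConditions.presenceOfElementLocated(siriCalcQ));
        wait.until(ExpectedConditions.presenceOfElementLocated(siriCalcA));
    }
}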
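The accessibility IDs in the example were found by reading the page source. As a minimal sketch of that step, the snippet below lists candidate IDs from a page source string using only the JDK's XML parser. It assumes, as is typical for the XCUITest driver, that accessibility IDs appear in each element's "name" attribute. The class name and the sample XML are made up for illustration; in a real session the XML would come from driver.getPageSource().

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class PageSourceNamesSketch {

    // Collect the non-empty "name" attribute of every element, in document order.
    public static List<String> elementNames(String pageSourceXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(
                            pageSourceXml.getBytes(StandardCharsets.UTF_8)));
            NodeList all = doc.getElementsByTagName("*");
            List<String> names = new ArrayList<>();
            for (int i = 0; i < all.getLength(); i++) {
                String name = ((Element) all.item(i)).getAttribute("name");
                if (!name.isEmpty()) {
                    names.add(name);
                }
            }
            return names;
        } catch (Exception e) {
            // Surface parse failures as an unchecked exception to keep callers simple.
            throw new IllegalArgumentException("Could not parse page source", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample shaped like an iOS page source; not captured from a real device.
        String sample = "<AppiumAUT>"
                + "<XCUIElementTypeStaticText name=\"2 + 2 =\" label=\"2 + 2 =\"/>"
                + "<XCUIElementTypeStaticText name=\"4\" label=\"4\"/>"
                + "</AppiumAUT>";
        System.out.println(elementNames(sample));
    }
}
```

Printing the names this way is only a convenience for spotting locator candidates; Appium Desktop's inspector shows the same attributes interactively.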

(Don't forget to check out the full code sample inside the runnable project on GitHub)
