Edition 20

Making Your Appium Tests Fast and Reliable, Part 2: Finding Elements

This article is the second in a multi-part series on test speed and reliability, inspired by a webinar I gave recently on the same subject (you can watch the webinar here). You might also want to check out Part 1: Test Flakiness.

One of the biggest surface indicators of instability or flakiness is an element not being found. Finding elements is a natural place for problems to arise, because it is when we try to find an element that our assumptions about an app's state and its actual state are brought together (whether in harmony or in conflict). We certainly want to avoid the element-finding problems we can do something about, for example using selectors that are not unique, or trying to find elements by some dynamic attribute which cannot be relied on. This means that knowledge of your app and its design is essential. What is likely to change? What isn't? Which elements have accessibility IDs?

Locator Strategies

Before going further, it's worth saying what we mean by terms like "finding elements" and "accessibility ID". In Appium (as with Selenium), actions can be taken on specific objects in the app UI. These objects (corresponding to elements in a webpage, hence the name of the findElement API command) must be "found" before it is possible to interact with them. There are different ways of finding elements. Take a look at the example call below:

Components of a find element call

In this example, By.className represents a so-called "locator strategy" called "class name", and Button represents a "selector" which the strategy uses to find one or more elements. The result of this call is (if all goes well) an object of type WebElement, which comes with the rich set of interaction APIs you rely on for your testing.

"Class name" is just one of a number of locator strategies available in Appium, and refers to a platform-specific UI object class name, for example XCUIElementTypeButton or android.widget.Button. Already you can see that perhaps this locator strategy isn't always ideal; what if you are testing a cross-platform app? Would you need to have a different set of code to find an iOS button or an Android button? If you rely on the "class name" locator strategy, the answer is yes.

There's another problem with this strategy: it's often the case that there is more than one element of any given type in the hierarchy. Thus you might very well find a button with this locator strategy, but will it be the button you want? So we could say that the "class name" locator strategy is not a good choice because it is platform-specific (leads to branched iOS and Android code), and too general (hard to uniquely identify an element with). What other options are there? Have a look at this table of the full set:

All the locator strategies

As you can see, many of the locator strategies were carried over from Selenium, though not all are supported or even make sense in Appium (at least when automating a native app). Appium has also introduced a number of its own strategies, such as "accessibility id", to reflect the fact (and take advantage of the fact) that we're dealing with mobile app UIs and an entirely different automation stack.
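To make the cross-platform cost of the "class name" strategy concrete, here is a minimal sketch of the branching it forces on test code. The ButtonClassNames helper and its forPlatform method are hypothetical names used only for illustration, not part of Appium; only the two class names come from the discussion above.

```java
import java.util.Locale;

final class ButtonClassNames {
    // Hypothetical helper: maps a platform name to the native class name
    // of a button on that platform.
    static String forPlatform(String platformName) {
        switch (platformName.toLowerCase(Locale.ROOT)) {
            case "ios":
                return "XCUIElementTypeButton";
            case "android":
                return "android.widget.Button";
            default:
                throw new IllegalArgumentException("Unsupported platform: " + platformName);
        }
    }
}
```

The returned string would then be used as the selector, for example driver.findElement(By.className(ButtonClassNames.forPlatform("iOS"))). Every element type you look up this way needs its own branch, which is the maintenance burden described above.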

XPath

In another Appium Pro article, I go into detail about the XPath locator strategy, and why it should be avoided. To summarize here, many people find the XPath strategy attractive, because it guarantees that any element in the UI can be found. The problem is that some elements can only be found using selectors that are "brittle", meaning they are liable to find no element, or a different element, if anything changes in your app's design. XPath can also be slow with Appium, because it sometimes entails multiple recursive renderings of the UI hierarchy.

Accessibility ID

What should we use instead? When possible, I recommend using the "accessibility ID" locator strategy, because it is (a) cross-platform, (b) unique, and (c) fast. Both iOS and Android have the concept of an accessibility label, though on iOS it's called "accessibility ID" and on Android it's called "content description" (or "content-desc"). Since the accessibility label is a string set by developers, it can be a unique identifier. Of course, the point of setting accessibility labels is for users, not for testers, so care must be taken. Setting good user-centric accessibility labels is usually sufficient to provide the kinds of identifiers testers also need, and that is the best practice (i.e., set accessibility labels on everything for users, and then take advantage of this for testing). In the Appium Java client, finding elements by accessibility ID involves using the MobileBy strategy:

WebElement el = driver.findElement(MobileBy.AccessibilityId("foo"));

Since testers don't always have the ability to influence the app's development, sometimes accessibility labels are not available, or are not unique. What else could we use?

iOS-specific Locator Strategies

In the same Appium Pro article I referenced earlier, I went into detail on some iOS-specific locator strategies that could be used as a substitute for XPath, because they are hierarchical query-based strategies. The most robust is the "-ios class chain" strategy, which allows you to use a "lite" version of something like XPath, mixed together with iOS predicate format strings.

The benefit of this locator strategy is that it allows for complex queries while remaining in most cases much speedier than XPath. The drawback, of course, is that it is platform-specific, so requires branching your code (or adding further distinctions to your object models). As an example of what you can do, check out this command:

String selector = "**/XCUIElementTypeCell[`name BEGINSWITH \"C\"`]/XCUIElementTypeButton[10]";
driver.findElement(MobileBy.iOSClassChain(selector));

What we're doing here is finding the 10th button which is a child of a table cell anywhere in the UI hierarchy which has a name beginning with the character "C". That's quite the query! Because of the more rigid form of class chain queries, the performance guarantees are better than those of XPath.
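Because class chain selectors are plain strings, the quoting is easy to get wrong in Java: the double quotes inside the predicate must be escaped in the string literal. One way to keep that in a single place is a small builder like the sketch below. ClassChains and buttonInCellNamed are hypothetical names, and the sketch assumes the prefix contains no double quotes or backticks.

```java
final class ClassChains {
    // Hypothetical helper that builds a selector of the form:
    //   **/XCUIElementTypeCell[`name BEGINSWITH "<prefix>"`]/XCUIElementTypeButton[<index>]
    // Assumes namePrefix contains no double quotes or backticks.
    static String buttonInCellNamed(String namePrefix, int buttonIndex) {
        return "**/XCUIElementTypeCell[`name BEGINSWITH \"" + namePrefix + "\"`]"
                + "/XCUIElementTypeButton[" + buttonIndex + "]";
    }
}
```

Calling ClassChains.buttonInCellNamed("C", 10) yields the same selector string as the example above, ready to pass to MobileBy.iOSClassChain.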

Android-specific Locator Strategies

A similar trick is available for Android, in the guise of a special parser the Appium team implemented which supports most of the UiSelector API. We make this parser available via the "-android uiautomator" locator strategy, and the selectors should be strings which are valid bits of Java code beginning with new UiSelector(). Let's have a look at an example:

String selector = "new UiSelector().className(\"android.widget.ScrollView\").childSelector(new UiSelector().className(\"android.widget.TextView\").text(\"Tabs\"))";
driver.findElement(MobileBy.AndroidUIAutomator(selector));

Once again, we make use of a MobileBy strategy since this strategy is available only for Appium. What's going on here is that we have constructed a string which could be used as valid UiAutomator test code, but in fact will be parsed and interpreted by Appium when the command is sent. According to the semantics of the UiSelector API, we're saying that we want the first TextView element we find with the text "Tabs", which is also a child of the first ScrollView in the hierarchy. It's a bit clunkier than XPath, but it can be used in similar ways, and again with a better performance profile in most cases.

As with the iOS class chain strategy, the main downside here is that selectors are going to be platform-specific. (Additionally, we can't support arbitrary Java, and there are limits to what we can provide from the UiSelector API.)
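Nested UiSelector strings involve a lot of escaped quotes and parentheses, so it may be worth composing them programmatically. The sketch below is one possible approach rather than anything Appium provides: UiSelectors and textViewInScrollView are hypothetical names, the query uses the className, childSelector, and text methods of the UiSelector API, and it assumes the text contains no double quotes.

```java
final class UiSelectors {
    // Wraps a value in double quotes, as it would appear in Java source.
    // Assumes the value itself contains no double quotes.
    private static String quoted(String value) {
        return "\"" + value + "\"";
    }

    // Hypothetical helper: a TextView with the given text that is a
    // descendant of a ScrollView, expressed as a UiSelector string.
    static String textViewInScrollView(String text) {
        return "new UiSelector().className(" + quoted("android.widget.ScrollView") + ")"
                + ".childSelector(new UiSelector().className(" + quoted("android.widget.TextView") + ")"
                + ".text(" + quoted(text) + "))";
    }
}
```

The result of UiSelectors.textViewInScrollView("Tabs") can be passed straight to MobileBy.AndroidUIAutomator.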

Determining Which Selectors to Use

So far we've seen some good recommendations on which strategies to use to find elements reliably. But how do you know which selectors to use in conjunction with those strategies? I said before that knowledge of your app is required in order to do this correctly. How do you get that knowledge of your app? If you're one of the app developers, you can simply have a look at the code, or maybe you remember that you gave a certain element a certain accessibility label. If you don't have access to the code, or if you want a method that will show you exactly what Appium sees in your app, then it's best to use Appium Desktop.

Appium Desktop is a GUI tool for running Appium and inspecting apps. You can use it to launch "inspector sessions" with arbitrary desired capabilities. Inspector sessions show you a screenshot of your app, its UI hierarchy (as XML), and lots of metadata about any element you select. It looks like this:

The Appium Inspector

One of the great things about the Inspector is that, when you click on an element in the hierarchy, it will intelligently suggest locator strategies and selectors for you. In the image above, you can see that the top suggestion for the selected element is the "accessibility id" locator strategy, used in conjunction with the selector "Login Screen".

Things can get a bit more complex, certainly, but the Appium Desktop Inspector is always a great place to start when figuring out what's going on with your app hierarchy. It's especially useful if you run into issues where you think an element should exist on a certain view: just fire up the Inspector and manually look through the XML tree to see if in fact the element exists. If it doesn't, that means Appium (read: the underlying automation frameworks) can't see it, and you'll need to ask your app developer why.
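The same existence check can be scripted against the XML hierarchy, for instance the string returned by driver.getPageSource(). The following is a minimal sketch using only the JDK's XML parser; the SourceCheck class and its method are hypothetical names, and which attribute to look for (such as content-desc on Android) depends on your platform and app.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class SourceCheck {
    // Hypothetical helper: returns true if any element in the XML source
    // has the given attribute set to exactly the given value.
    static boolean hasElementWithAttribute(String xml, String attribute, String value) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            // "*" matches every element in the document.
            NodeList all = doc.getElementsByTagName("*");
            for (int i = 0; i < all.getLength(); i++) {
                Element el = (Element) all.item(i);
                if (value.equals(el.getAttribute(attribute))) {
                    return true;
                }
            }
            return false;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML source", e);
        }
    }
}
```

If a call like SourceCheck.hasElementWithAttribute(source, "content-desc", "Login Screen") returns false for a view where you expect the element, that is the same signal as not finding it in the Inspector's tree.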

And that concludes our discussion of finding elements reliably in Appium, or at least one aspect of it. Just because you can find an element with the correct locator strategy doesn't mean it will always be there when you look. Make sure to also check out the next part, on waiting for app states (including the presence of elements).