Edition 35

Writing XPath Queries That Work

All Languages
All Platforms
All Devices

I've written a couple of editions of Appium Pro on the topic of finding elements reliably, including why you should consider not using XPath at all as part of your element-finding strategy. There are two main reasons for not using XPath:

  1. XPath queries can be strictly hierarchical in nature, with the result that any change in the structure of your app (even accidental, or OS-caused) means a failure to find an element (or worse, finding the wrong element).
  2. With Appium specifically, using the XPath locator strategy can be expensive and slow, because of the extra work required to convert an app's UI hierarchy to XML, and then to match up found XML nodes with actual UI elements.

These are really good reasons to avoid XPath if at all possible. Sometimes, however, there is no alternative! (This goes for both Appium and Selenium, by the way: everything I'm about to say is equally valid for both automation tools.) Other times, you've been able to determine that XPath is, in your particular case, not actually expensive or slow (how did you determine this? You tried it!), and so you might prefer XPath to some of the platform-specific locator strategies (iOS Predicate String, or Android UiSelector) in order to help your test code be more cross-platform.
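
"You tried it" can be as simple as timing the lookup a few times. Here is a minimal sketch of such a timer in Java. The class and method names are my own invention for illustration, and the lookup is passed in as a lambda so the sketch runs without a device; in a real test the lambda would wrap a call on your own driver session.

```java
import java.util.Arrays;
import java.util.function.Supplier;

class LocatorTiming {
    // Runs the lookup `runs` times (runs must be at least 1) and returns the
    // median elapsed time in milliseconds (the upper middle value when runs is even).
    static double medianMillis(Supplier<?> lookup, int runs) {
        long[] nanos = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            lookup.get();
            nanos[i] = System.nanoTime() - start;
        }
        Arrays.sort(nanos);
        return nanos[runs / 2] / 1_000_000.0;
    }

    public static void main(String[] args) {
        // Stand-in workload so the sketch runs on its own. In a real test you
        // would pass something like
        //   () -> driver.findElement(By.xpath("//*[@text='foo']"))
        // where `driver` is your own session (assumed, not defined here).
        Supplier<Object> standIn = () -> {
            try {
                Thread.sleep(5);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return null;
        };
        System.out.printf("median lookup time: %.1f ms%n", medianMillis(standIn, 5));
    }
}
```

Run the same timer against the same element with another locator strategy, and you have actual numbers for your app instead of a guess.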

Either way, what's important is that you write good XPath queries. In essence, these are queries which are anchored by elements uniquely identified by unchanging criteria. Let's first take a look at some examples of queries which fail to meet this description:

  • //* (this is the worst possible query. It selects every single element!)
  • //android.widget.Layout/android.widget.Layout[3]/android.widget.Layout/android.widget.TextView[2] (this query relies too much on hierarchical structure. If an extra layout is added to the hierarchy, or anything is shuffled around, it is likely to break. In addition, the query is not cross-platform).
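
To see how fragile the second kind of query is, here is a small sketch that runs XPath against two hand-written XML snippets using the XPath engine that ships with Java. The snippets are hypothetical and far simpler than a real Appium page source, and the class name is made up; the only difference between the two sources is one extra wrapper layout.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class BrittleQueryDemo {
    // Hypothetical, simplified page source before a layout change.
    static final String BEFORE =
        "<hierarchy><android.widget.FrameLayout><android.widget.LinearLayout>"
        + "<android.widget.TextView text='Title'/>"
        + "<android.widget.TextView text='Login'/>"
        + "</android.widget.LinearLayout></android.widget.FrameLayout></hierarchy>";

    // Same screen after someone wraps the content in a ScrollView.
    static final String AFTER =
        "<hierarchy><android.widget.FrameLayout><android.widget.ScrollView>"
        + "<android.widget.LinearLayout>"
        + "<android.widget.TextView text='Title'/>"
        + "<android.widget.TextView text='Login'/>"
        + "</android.widget.LinearLayout>"
        + "</android.widget.ScrollView></android.widget.FrameLayout></hierarchy>";

    static final String POSITIONAL =
        "//android.widget.FrameLayout/android.widget.LinearLayout/android.widget.TextView[2]";
    static final String BY_ATTRIBUTE = "//*[@text='Login']";

    // Counts the nodes in `xml` that match the XPath `query`.
    static int count(String xml, String query) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(query, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(count(BEFORE, POSITIONAL));   // 1
        System.out.println(count(AFTER, POSITIONAL));    // 0: the extra layout broke it
        System.out.println(count(BEFORE, BY_ATTRIBUTE)); // 1
        System.out.println(count(AFTER, BY_ATTRIBUTE));  // 1: unaffected by the change
    }
}
```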

How can we make better queries? The first step is to remember that XPath offers predicates, which restrict the matched nodes based on special criteria. For example, we can find any element which has a certain attribute value (say, for a text attribute):

  • //*[@text="foo"]

(How did we know that the text attribute was a thing? We looked at the XML source of the app of course! Maybe by using Appium Desktop.) We can also use XPath functions in predicates, for comparisons other than equality:

  • //*[contains(@text, "f")] (find any element whose text attribute contains the letter "f").
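
If you want to play with predicates without a device attached, the XPath engine in the Java standard library is enough. The following is a sketch only: the XML is a made-up, stripped-down stand-in for an Android page source, and the helper name is hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class PredicateDemo {
    // Hypothetical, simplified page source with three elements.
    static final String SOURCE =
        "<hierarchy>"
        + "<android.widget.TextView text='foo'/>"
        + "<android.widget.Button text='buffalo'/>"
        + "<android.widget.TextView text='bar'/>"
        + "</hierarchy>";

    // Returns the text attribute of every element matched by `query`, in document order.
    // An element without a text attribute contributes an empty string.
    static List<String> texts(String xml, String query) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(query, doc, XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(((Element) nodes.item(i)).getAttribute("text"));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(texts(SOURCE, "//*[@text='foo']"));          // [foo]
        System.out.println(texts(SOURCE, "//*[contains(@text, 'f')]")); // [foo, buffalo]
        System.out.println(texts(SOURCE, "//*").size());                // 4, root included
    }
}
```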

In practice, the same attributes are not in use within XML sources produced for iOS and Android, so using predicates is often not a cross-platform approach. It can become cross-platform, however, when you remember that XPath also allows boolean operations!

  • //*[@text="foo" or @label="foo"] (find any element whose text (Android) or label (iOS) attribute is "foo").
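
Here is the boolean trick as a sketch, run against one Android-style and one iOS-style snippet. Both snippets are hypothetical and heavily trimmed (real sources carry many more attributes), and the class and helper names are mine.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class CrossPlatformDemo {
    // Hypothetical, simplified sources: Android exposes text, iOS exposes label.
    static final String ANDROID =
        "<hierarchy><android.widget.TextView text='foo'/></hierarchy>";
    static final String IOS =
        "<AppiumAUT><XCUIElementTypeApplication name='Demo'>"
        + "<XCUIElementTypeStaticText label='foo'/>"
        + "</XCUIElementTypeApplication></AppiumAUT>";

    static final String ANDROID_ONLY = "//*[@text='foo']";
    static final String EITHER = "//*[@text='foo' or @label='foo']";

    // Returns the tag name of every element matched by `query`, in document order.
    static List<String> tags(String xml, String query) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(query, doc, XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getNodeName());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(tags(IOS, ANDROID_ONLY)); // []
        System.out.println(tags(ANDROID, EITHER));   // [android.widget.TextView]
        System.out.println(tags(IOS, EITHER));       // [XCUIElementTypeStaticText]
    }
}
```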

Notice that I haven't really been referring to the specific type of node in these searches, and instead I have been using the wildcard matcher (*). Doesn't this severely hurt performance since the XPath engine has to search so many more nodes? Not really. XPath searches are pretty fast, unless you have a truly gigantic hierarchy, and that doesn't happen too often. (Of course, you can always optimize for a given platform by including the type of element you are expecting.) For Appium, the truly expensive part of using XPath is generating the XML and matching found XPath nodes to native UI elements, not the XPath search itself.

As a last consideration, take an example where there is really no uniquely identifying information on a particular element. We might have a list view with many different elements inside, of an indeterminate ordering, many of which have duplicate text. We're interested in just one of these elements based on external criteria (maybe the text of an ancestor element). Fear not! We can stay in the mostly safe zone of XPath by ensuring that we use an anchor element which does have some unique attribute:

  • //*[@content-desc="foo" or @name="foo"]//*[@content-desc="bar" or @name="bar"] (a cross-platform accessibility label search for elements called bar that descend from elements called foo).
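
A sketch of what scoping to an anchor buys you, again using only the Java standard library. The XML is a hypothetical Android-style source in which two different containers each hold an element labeled bar; the class and helper names are invented for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class AnchorScopeDemo {
    // Hypothetical source: "bar" appears under both the foo and the baz container.
    static final String SOURCE =
        "<hierarchy>"
        + "<android.widget.LinearLayout content-desc='foo'><android.widget.FrameLayout>"
        + "<android.widget.TextView content-desc='bar' text='inside foo'/>"
        + "</android.widget.FrameLayout></android.widget.LinearLayout>"
        + "<android.widget.LinearLayout content-desc='baz'>"
        + "<android.widget.TextView content-desc='bar' text='inside baz'/>"
        + "</android.widget.LinearLayout>"
        + "</hierarchy>";

    static final String UNSCOPED = "//*[@content-desc='bar' or @name='bar']";
    static final String SCOPED =
        "//*[@content-desc='foo' or @name='foo']//*[@content-desc='bar' or @name='bar']";

    // Returns the text attribute of every element matched by `query`, in document order.
    static List<String> texts(String xml, String query) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(query, doc, XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(((Element) nodes.item(i)).getAttribute("text"));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(texts(SOURCE, UNSCOPED)); // [inside foo, inside baz]
        System.out.println(texts(SOURCE, SCOPED));   // [inside foo]
    }
}
```

Note that the scoped query still matches even though a FrameLayout sits between the anchor and the target, because // only requires a descendant relationship, not a direct child.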

In the above case, we considered the foo element our "anchor" element, and essentially scoped our query inside of it, making it much more robust. It can work the other way as well, using a child element (or a sibling element) as an anchor:

  • //*[@text="foo"]/ancestor::*[@text="bar"] (start from a node with text "foo" and select any of its ancestors that has text "bar").
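
When anchoring on a descendant, it matters whether you need the direct parent or an ancestor at any depth. The sketch below contrasts the ancestor axis with a child predicate on a hypothetical Android-style source where the "foo" node is a grandchild of the first row. The resource-id values, class name, and helper are made up for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class AncestorAnchorDemo {
    // Hypothetical source: two rows with the same text; only row1 contains "foo".
    static final String SOURCE =
        "<hierarchy>"
        + "<android.widget.LinearLayout text='bar' resource-id='row1'>"
        + "<android.widget.FrameLayout><android.widget.TextView text='foo'/>"
        + "</android.widget.FrameLayout></android.widget.LinearLayout>"
        + "<android.widget.LinearLayout text='bar' resource-id='row2'>"
        + "<android.widget.TextView text='other'/>"
        + "</android.widget.LinearLayout>"
        + "</hierarchy>";

    // Any ancestor of the "foo" node that has text "bar".
    static final String ANY_ANCESTOR = "//*[@text='foo']/ancestor::*[@text='bar']";
    // Only elements with text "bar" whose direct child has text "foo".
    static final String DIRECT_PARENT = "//*[@text='bar'][*[@text='foo']]";

    // Returns the resource-id attribute of every element matched by `query`.
    static List<String> ids(String xml, String query) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(query, doc, XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(((Element) nodes.item(i)).getAttribute("resource-id"));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(ids(SOURCE, ANY_ANCESTOR));  // [row1]
        System.out.println(ids(SOURCE, DIRECT_PARENT)); // []: "foo" is a grandchild here
    }
}
```

The ancestor axis tolerates extra layers between the two elements, which is usually what you want if the hierarchy might shift.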

These XPath queries may be complex or ugly, but they are not particularly bad from the perspective of maintainability, assuming your app structure guarantees a minimum ancestor/descendant relationship between elements. It simply becomes a process of triangulating an unchanging query for your element based on attributes of other elements! All this to say, if you're not an XPath expert already it's worth finding some good tutorials and making sure you understand the possibilities of XPath for writing better queries, before rejecting the strategy out of hand.

The way XPath is actually used in practice in many functional test suites should absolutely be avoided. But this avoidance should not be cargo-culted any more than the use of XPath should be. In many cases the question of speed and performance with respect to XPath must be determined experimentally, in the context of your particular app. Don't assume without checking that it will be too slow to be useful! Hopefully these reflections and examples help to bring a bit of nuance to the "XPath is bad" conversation that I often find myself in, as I'm sure you do as well.
