Edition 36

Using The 'nativeWebTap' Capability

It's easy to forget that Appium is used not only for native apps but also for testing websites by automating mobile browsers like iOS's Mobile Safari. And Appium's support of the WebDriver API for Safari is so extensive that it's easy to forget it's not the exact same thing as Selenium!

Actually, to facilitate automation of Safari, Appium does make use of some of the Selenium Project's code (the Selenium Atoms) -- thanks Selenium friends! The Atoms are a way to get the Selenium/WebDriver API working via JavaScript, rather than native browser code, which is a great thing for Appium since the only way we can automate Mobile Safari is via JavaScript (over a remote debugging connection).

Usually, this little implementation detail is minor enough to ignore. Sometimes, however, it can show up in certain limitations. At the end of the day, Atoms-driven actions are just JavaScript, meaning that clicks and taps are simply simulated by the firing of JavaScript events (rather than registering with the native iOS touch interfaces, and trickling down into Safari from the "top", as it were).

One clue that this is happening to you is when you encounter the following scenario:

  1. You find an element (no element not found error)
  2. You click the element (no error thrown on click)
  3. Nothing happens in your app

This is odd! The element was found, and Appium thinks the click was successful. So what could have gone wrong? There are many possible causes, and some of them are complex. One common culprit is that app developers often attach app behaviors to JavaScript events other than the ones a simulated click fires. For example, if I decide that (in the mobile version of my web app) a certain action should only take place after the touchend event is fired on a particular element, then I will not be able to automate tapping that element via the Selenium Atoms.

If I load up the website in an iOS simulator or device and tap the element by hand, I'll be able to trigger the action just fine. This is because Safari knows to trigger that particular event when it detects a "real" touch (when the iOS subsystems notify Safari that a tap has happened). But all the Atoms do is fire one or more JavaScript events at the element under consideration, and in this case those events aren't the one the app is waiting for.

To work around this and other related issues, Appium has a special capability: nativeWebTap. When it is set to true, Appium will perform some magic behind the scenes any time you call click() on an element found in the browser. It will essentially try to determine the location of that element on the webpage, translate that location to native screen coordinates, account for any system bars and the like, and then generate a native (XCUITest-driven) tap on that location. If all goes well, the result is that the iOS subsystems detect a touch at a certain place, pass that info on to Safari, which then determines that an element has been tapped.
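To make the coordinate translation a bit more concrete, here is a deliberately simplified sketch of the idea. It is not Appium's actual algorithm: the Rect type, the web_to_screen_point function, and the 94-point bar height are all hypothetical, and the real implementation has to deal with details (scrolling, zoom, device-specific bars) that are glossed over here.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Rect:
    """An element's bounding box in CSS pixels, relative to the viewport."""
    x: float
    y: float
    width: float
    height: float


def web_to_screen_point(rect: Rect, webview_top: float,
                        webview_left: float = 0.0,
                        scale: float = 1.0) -> Tuple[float, float]:
    """Map the center of a web element to native screen coordinates.

    webview_top / webview_left: where the web content starts on screen
    (assumed values standing in for the status bar, address bar, etc.).
    scale: assumed ratio of screen points to CSS pixels (page zoom).
    """
    center_x = rect.x + rect.width / 2
    center_y = rect.y + rect.height / 2
    return (webview_left + center_x * scale, webview_top + center_y * scale)


# A 100x40 link whose top-left corner is at (20, 300) in the viewport,
# with a made-up 94 points of system and browser bars above the web content.
print(web_to_screen_point(Rect(20, 300, 100, 40), webview_top=94))
```

The point that comes out is where a native tap would have to land for Safari to register it on the element, which is why getting the bar offsets right matters so much.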

The net result is that you should be back in business: you'll be able to trigger your app's action and continue testing the user flow. You might also feel a little sense of satisfaction that you generated a more "real" tap! (Or not--what's really real when it comes to testing?)

Because Appium has to do some mathematicky stuff to make nativeWebTap work, it's best not to use the capability unless you have some reason to. An alternative approach might be to simply bug your app developer: ask whether they used some unusual touch handler, and whether they could do things the normal way instead. Another option would be to fight fire with fire (pun very much intended): figure out which event the element is actually listening for, and fire that event on the element yourself using executeScript! It's up to you. Either way, Appium's got you covered.
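The executeScript route could look roughly like the following sketch. The fire_event helper and the FIRE_EVENT_JS constant are hypothetical names, the touchend default simply mirrors the example above, and the driver is assumed to be a session (in a web context) that exposes the usual execute_script method of the Python client.

```python
# JavaScript run in the page. arguments[0] is the element passed in from the
# test and arguments[1] is the event name. A generic Event is an assumption:
# handlers that read touch coordinates off the event may need a richer object.
FIRE_EVENT_JS = """
var evt = new Event(arguments[1], {bubbles: true, cancelable: true});
arguments[0].dispatchEvent(evt);
"""


def fire_event(driver, element, event_name="touchend"):
    """Dispatch a synthetic DOM event on an element via execute_script."""
    driver.execute_script(FIRE_EVENT_JS, element, event_name)
```

In a test this would be called as fire_event(driver, link) in place of link.click(). Keep in mind that this is still a simulated event, so it only helps when the handler cares about the event type and not about where the event came from.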

By way of example, I put together a little test website that demonstrates the situation I've been describing. Clicking the link in a desktop browser will do nothing (as intended), but tapping it in Mobile Safari will work just fine. A plain click() from Appium without nativeWebTap will not trigger the navigation, since the navigation behavior is listening for a different JavaScript event.

If you're interested in seeing this discrepancy yourself (with and without nativeWebTap), have a look at the full code example for this edition. (I don't reproduce it here, since the only item of interest is the use of the capability itself.)
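For readers who just want to see the shape of the capability, here is a rough sketch (not the edition's full example). The safari_caps helper and the "iPhone 8" device name are placeholders of my own, and on newer Appium servers non-standard capabilities such as nativeWebTap generally need an appium: prefix, so adjust the keys for your setup.

```python
def safari_caps(native_web_tap=False, device_name="iPhone 8"):
    """Build a capabilities dict for a Mobile Safari session.

    device_name is a placeholder; use whatever simulator or device you have.
    """
    caps = {
        "platformName": "iOS",
        "browserName": "Safari",
        "automationName": "XCUITest",
        "deviceName": device_name,
    }
    if native_web_tap:
        # Ask Appium to turn click() on web elements into native taps.
        caps["nativeWebTap"] = True
    return caps


# Two sessions' worth of capabilities, differing only in nativeWebTap,
# which is all that's needed to compare the two behaviors.
print(safari_caps(native_web_tap=False))
print(safari_caps(native_web_tap=True))
```

Running the same click() test once with each set of capabilities is the quickest way to see whether nativeWebTap is what your app needs.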