iOS-Specific Touch Action Methods
In a previous edition of Appium Pro, we saw how to use the W3C Actions API to automate complex gestures, including drawing some (amazing) stick figure art. The same API can also perform simple gestures like swiping, pinching, and zooming. However, there's occasionally a downside to using these general methods, which is that they bypass the officially-recognized APIs for standard complex actions provided by the underlying mobile automation tool, for example XCUITest on iOS.
If you ever run into difficulty using the W3C Actions API, Appium provides direct access to these vendor-supported action methods as well. In this article we'll take a look at the ones available for iOS. Because these are not part of the WebDriver spec, Appium provides this access by overloading the executeScript command, as you'll see in the examples below.
The mobile: swipe command ultimately calls the XCUIElement.swipe* family of methods provided by XCUITest, and thus takes two parameters: a direction (whether to swipe up, down, left, or right), and the ID of an element within which the swipe is to take place (Appium defaults to the entire Application element if no element is specified).
Note: for this command and all other mobile: commands which have an element as a parameter, the value which should be supplied is the internal ID of the element, which is not normally needed as part of Selenium/Appium testing. To get it in the Java client, you can call element.getId() (potentially needing to cast to MobileElement first).
// swipe up then down
Map<String, Object> args = new HashMap<>();
args.put("direction", "up");
driver.executeScript("mobile: swipe", args);
args.put("direction", "down");
driver.executeScript("mobile: swipe", args);
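The example above swipes on the whole application. To confine a swipe to one element, the element ID goes into the same argument map. The helper below is a minimal sketch in plain Java: the class name SwipeArgs and its build method are hypothetical (they are not part of the Appium client) and only assemble the map described above, rejecting directions that mobile: swipe does not accept.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the Appium Java client.
final class SwipeArgs {
    private static final List<String> DIRECTIONS =
            Arrays.asList("up", "down", "left", "right");

    /**
     * Builds the argument map for "mobile: swipe".
     * elementId may be null, in which case no "element" key is sent and
     * Appium falls back to the application element.
     */
    static Map<String, Object> build(String direction, String elementId) {
        if (!DIRECTIONS.contains(direction)) {
            throw new IllegalArgumentException("direction must be one of " + DIRECTIONS);
        }
        Map<String, Object> args = new HashMap<>();
        args.put("direction", direction);
        if (elementId != null) {
            args.put("element", elementId);
        }
        return args;
    }
}
```

With a live session, the result would be passed along in the usual way, for instance driver.executeScript("mobile: swipe", SwipeArgs.build("left", ((MobileElement) element).getId())), assuming an element object already exists.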
Unfortunately, XCUITest does not provide any parameters to modify the speed or distance of the swipe. For that, use the more general Actions API.
If you want to make sure that each movement of your gesture moves a view by the height of the scrollable content, or if you want to scroll until a particular element is visible, try mobile: scroll. It works similarly to mobile: swipe but takes more parameters:
element: the ID of the element to scroll within (the application element by default). Call this the "bounding element".
direction: the direction to scroll, which is the opposite of how direction is used in mobile: swipe. A swipe "up" scrolls the view contents down, which is what a scroll "down" does.
name: the accessibility ID of an element to scroll to within the bounding element
predicateString: the NSPredicate of an element to scroll to within the bounding element
toVisible: if true, and if element is set to a custom element, then simply scroll to the first visible child of element
// scroll down then up
Map<String, Object> args = new HashMap<>();
args.put("direction", "down");
driver.executeScript("mobile: scroll", args);
args.put("direction", "up");
driver.executeScript("mobile: scroll", args);

// scroll to the last item in the list by accessibility id
args.put("direction", "down");
args.put("name", "Stratus");
driver.executeScript("mobile: scroll", args);

// scroll back to the first item in the list
MobileElement list = (MobileElement) driver.findElement(By.className("XCUIElementTypeScrollView"));
args.put("direction", "up");
args.remove("name"); // drop the name strategy rather than sending a null value
args.put("element", list.getId());
driver.executeScript("mobile: scroll", args);
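The examples above cover direction and name but not predicateString. As a rough sketch, the helper below builds the arguments for scrolling to an element matched by an NSPredicate, optionally inside a bounding element. ScrollArgs and toPredicate are hypothetical names chosen for illustration, and the predicate used in the usage note is only an example value.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of the Appium Java client.
final class ScrollArgs {
    /**
     * Builds arguments for "mobile: scroll" using the predicateString strategy.
     * boundingElementId may be null to scroll within the application element.
     */
    static Map<String, Object> toPredicate(String predicateString, String boundingElementId) {
        if (predicateString == null || predicateString.trim().isEmpty()) {
            throw new IllegalArgumentException("predicateString must not be empty");
        }
        Map<String, Object> args = new HashMap<>();
        args.put("predicateString", predicateString);
        if (boundingElementId != null) {
            args.put("element", boundingElementId);
        }
        return args;
    }
}
```

For the list in the example above, a call might look like driver.executeScript("mobile: scroll", ScrollArgs.toPredicate("label == 'Stratus'", list.getId())), assuming the target item exposes that label.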
To pinch (described by a two-finger gesture where the fingers start far apart and come together) or to zoom (described by the inverse gesture where fingers start together and expand outward), use mobile: pinch, which calls XCUIElement.pinch under the hood. As with the other methods described so far, you can pass in an element parameter defining the element in which the pinch will take place (the entire application by default).
The only required parameter is scale:
- Values between 0 and 1 refer to a "pinch"
- Values greater than 1 refer to a "zoom"
An additional optional parameter velocity can be sent, which corresponds to "the velocity of the pinch in scale factor per second" according to Apple's docs.
// zoom in on something
Map<String, Object> args = new HashMap<>();
args.put("scale", 5);
driver.executeScript("mobile: pinch", args);
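The zoom example leaves out velocity and element. The sketch below shows one way to assemble all three parameters and to tell a pinch from a zoom based on scale. PinchArgs is a hypothetical helper, and the negative velocity in the usage note rests on an assumption that XCUITest expects the sign of velocity to match the direction of the gesture; check this against the Appium and Xcode versions you use.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of the Appium Java client.
final class PinchArgs {
    /** True when the scale describes a "zoom" (greater than 1). */
    static boolean isZoom(double scale) {
        return scale > 1.0;
    }

    /**
     * Builds arguments for "mobile: pinch".
     * velocity and elementId are optional and may be null.
     */
    static Map<String, Object> build(double scale, Double velocity, String elementId) {
        // The negated comparison also rejects NaN.
        if (!(scale > 0.0)) {
            throw new IllegalArgumentException("scale must be greater than 0");
        }
        Map<String, Object> args = new HashMap<>();
        args.put("scale", scale);
        if (velocity != null) {
            args.put("velocity", velocity);
        }
        if (elementId != null) {
            args.put("element", elementId);
        }
        return args;
    }
}
```

A pinch to half size might then be sent as driver.executeScript("mobile: pinch", PinchArgs.build(0.5, -1.0, null)).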
The best way to tap on an element is using element.click(). So why do we have mobile: tap? This method allows for extra parameters x and y signifying the coordinate at which to click. The nice thing is that this coordinate is either screen-relative (if an element parameter is not included, the default), or element-relative (if an element parameter is included).
This means that if you want to tap at the very top left corner of an element rather than dead center, you can!
// tap an element very near its top left corner
Map<String, Object> args = new HashMap<>();
args.put("element", ((MobileElement) element).getId());
args.put("x", 2);
args.put("y", 2);
driver.executeScript("mobile: tap", args);
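Because the coordinates are element-relative when element is supplied, other corners can be reached by working from the element's size. The following is a small sketch of that arithmetic; TapArgs and nearBottomRight are hypothetical names, and the width and height are assumed to come from something like element.getSize() in the Java client.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of the Appium Java client.
final class TapArgs {
    /**
     * Builds arguments for "mobile: tap" that target a point `inset` units
     * in from the bottom right corner of the element with the given size.
     */
    static Map<String, Object> nearBottomRight(String elementId, int width, int height, int inset) {
        if (elementId == null) {
            throw new IllegalArgumentException("elementId is required for element-relative taps");
        }
        if (inset < 0 || inset > width || inset > height) {
            throw new IllegalArgumentException("inset must fit inside the element");
        }
        Map<String, Object> args = new HashMap<>();
        args.put("element", elementId);
        args.put("x", width - inset);   // measured from the element's left edge
        args.put("y", height - inset);  // measured from the element's top edge
        return args;
    }
}
```

For a 200 by 50 element and an inset of 2, this yields x = 198 and y = 48, which can be passed straight to driver.executeScript("mobile: tap", args).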
There's more to tapping than single-tapping! And while you can certainly build a double-tap option using the Actions API, XCUITest provides an XCUIElement.doubleTap method for this purpose, and it could presumably have greater reliability than synthesizing your own action.
In terms of parameters, you should send in either an element parameter, with the ID of the element you want to tap, or both an x and y value representing the screen coordinate you wish to tap.
// double-tap the screen at a specific point
Map<String, Object> args = new HashMap<>();
args.put("x", 100);
args.put("y", 200);
driver.executeScript("mobile: doubleTap", args);
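Since a double-tap needs either an element or a complete coordinate pair, it can be handy to check the argument map before making the round trip to the device. The check below is only a sketch of that rule as described above; DoubleTapArgs and isValid are hypothetical names, and treating a map that contains both an element and a point as valid is an assumption.

```java
import java.util.Map;

// Hypothetical helper, not part of the Appium Java client.
final class DoubleTapArgs {
    /**
     * Returns true when the map carries an element ID, or both an x and a
     * y coordinate. A lone x or a lone y is not enough.
     */
    static boolean isValid(Map<String, ?> args) {
        boolean hasElement = args.get("element") != null;
        boolean hasPoint = args.get("x") != null && args.get("y") != null;
        return hasElement || hasPoint;
    }
}
```

A test could call DoubleTapArgs.isValid(args) and fail fast with a clear message instead of waiting for an error from the server.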
Not to be confused with a double-tap, a two-finger-tap is a single tap using two fingers! The mobile: twoFingerTap method has only one parameter, which is required: good old element (it only works in the context of an element, not a point on the screen).
// two-finger-tap an element (assume element object already exists)
Map<String, Object> args = new HashMap<>();
args.put("element", ((MobileElement) element).getId());
driver.executeScript("mobile: twoFingerTap", args);
Many iOS apps allow a user to trigger special behavior by tapping and holding the finger down on a certain UI element, which is what mobile: touchAndHold simulates. You can specify all the same parameters as for mobile: doubleTap (element, x, and y) with the same semantics. In addition you must set the duration parameter to specify how many seconds you want the touch to be held.
// touch and hold an element
Map<String, Object> args = new HashMap<>();
args.put("element", ((MobileElement) element).getId());
args.put("duration", 1.5);
driver.executeScript("mobile: touchAndHold", args);
Another commonly-implemented app gesture is "drag-and-drop". As with all of these gestures, it's possible to build a respectable drag-and-drop using the Actions API, but if for some reason this doesn't work, XCUITest has provided a method directly for this purpose. It's a method on the XCUICoordinate class, and in my opinion the name 'dragFromToForDuration' isn't the most accurate representation of it.
Really, what's going on is that we're defining a start and an end coordinate, and also the duration of the hold on the start coordinate. In other words, we have no control over the drag duration itself, only on how long the first coordinate is held before the drag happens. What parameters do we use?
element: an element ID, which if provided will cause Appium to treat the coordinates as relative to this element. Absolute screen coordinates otherwise.
duration: the number of seconds (between 0.5 and 6.0) that the start coordinates should be held
fromX: the x-coordinate of the start position
fromY: the y-coordinate of the start position
toX: the x-coordinate of the end position
toY: the y-coordinate of the end position
// touch, hold, and drag based on coordinates
Map<String, Object> args = new HashMap<>();
args.put("duration", 1.5);
args.put("fromX", 100);
args.put("fromY", 100);
args.put("toX", 300);
args.put("toY", 600);
driver.executeScript("mobile: dragFromToForDuration", args);
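The example above uses absolute screen coordinates. An element-relative drag only differs by the extra element entry, and the hold duration has to stay within the 0.5 to 6.0 second range listed above. The helper below sketches both points; DragArgs and its constants are hypothetical names introduced for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of the Appium Java client.
final class DragArgs {
    static final double MIN_HOLD_SECONDS = 0.5;
    static final double MAX_HOLD_SECONDS = 6.0;

    /**
     * Builds arguments for "mobile: dragFromToForDuration".
     * When elementId is non-null the coordinates are element-relative;
     * otherwise they are absolute screen coordinates.
     */
    static Map<String, Object> build(String elementId, double holdSeconds,
                                     int fromX, int fromY, int toX, int toY) {
        // The negated range check also rejects NaN.
        if (!(holdSeconds >= MIN_HOLD_SECONDS && holdSeconds <= MAX_HOLD_SECONDS)) {
            throw new IllegalArgumentException(
                    "duration must be between " + MIN_HOLD_SECONDS + " and " + MAX_HOLD_SECONDS);
        }
        Map<String, Object> args = new HashMap<>();
        if (elementId != null) {
            args.put("element", elementId);
        }
        args.put("duration", holdSeconds);
        args.put("fromX", fromX);
        args.put("fromY", fromY);
        args.put("toX", toX);
        args.put("toY", toY);
        return args;
    }
}
```

Dragging within an element might then look like driver.executeScript("mobile: dragFromToForDuration", DragArgs.build(((MobileElement) element).getId(), 1.5, 10, 10, 10, 200)), assuming an element object already exists.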
And with that our tour of the special iOS-specific gesture methods is complete! If you want to see a working example of some of the scroll and swipe functionality, check out this article's code on GitHub, which makes use of a new scrolling list view added to The App!