A reader recently asked if we could provide an example of how to zoom in and out using touch gestures. I decided to use Google Maps for the demo because it is installed by default on Android emulators, and I thought this would be a straightforward task. Surprisingly, it ended up being much trickier than I expected, for the same reason that Google Maps often frustrates me while I'm driving!
Google Maps must implement a lot of custom logic for interpreting touch events. I'm sure multiple usability experts charted people's fingers and intentions, recorded video footage of people smearing finger-oil across screens for hours, and came up with a set of functions to describe the pattern for each gesture. The end result is that our sterile touch actions generated by machines don't trigger the UI reactions we'd expect. The same often happens when I take my life in my hands trying to change the view while driving.
(I've enabled touch gesture debugging on the emulator, so we can see the gestures that Appium simulates.)
The simplest gesture, in which two touch inputs (fingers) are placed down on the screen, moved toward or away from each other, and then lifted off the screen, does not budge the UI at all.
What I ended up doing was making the gesture more complex, in order to better simulate the more organic and imperfect actions of a real person. I found that the most important variable that led to a successful zooming action was for the gesture to be fast, around 25 to 50 milliseconds. I added a short segment of moving very quickly, followed by a 100ms pause, and then continuing the rest of the zoom gesture in 25-50 milliseconds.
Even this approach was not very satisfying, as it is not always consistent and the zoom out is more powerful than the zoom in. Plus, I couldn't find a simple way to perform a slow and controlled zoom despite it being so easy to do manually.
We achieved our goal of demonstrating a zoom gesture though, and there is plenty to learn from in the code used for this.
First off, I took advantage of what we learned from our post about Android intents and activities to launch the Google Maps app directly to a view of a chosen set of geo coordinates:
DesiredCapabilities caps = new DesiredCapabilities();
caps.setCapability("platformName", "Android");
caps.setCapability("deviceName", "Android Emulator");
caps.setCapability("automationName", "UiAutomator2");
caps.setCapability("appPackage", "com.google.android.apps.maps");
caps.setCapability("appActivity", "com.google.android.maps.MapsActivity");
caps.setCapability("intentAction", "android.intent.action.VIEW");
caps.setCapability("optionalIntentArguments", "-d geo:46.457398,-119.407305");
driver = new AppiumDriver(new URL("http://localhost:4723/wd/hub"), caps);
Next, I created a method which would build our "zoom interaction". We'll go over the methods used here, but for the history of Appium's gesture API design and more examples, you can visit a previous article on the topic.
The new methods for building touch actions compliant with the W3C WebDriver specification are all located under the org.openqa.selenium.interactions namespace of the Appium and Selenium Java clients. The end goal is to be able to construct a list of Sequence objects, each made up of individual interactions, which we can then pass to driver.perform() in order to send the actions to the Appium server to then run on the device.
So, my method to create a zoom interaction returns a list of sequences, one per finger.
private Collection<Sequence> zoom(Point locus, int startRadius, int endRadius, int pinchAngle, Duration duration) {
    // convert degree angle into radians. 0/360 is top (12 O'clock).
    double angle = Math.PI / 2 - (2 * Math.PI / 360 * pinchAngle);
    // create the gesture for one finger
    Sequence fingerAPath = zoomSinglefinger("fingerA", locus, startRadius, endRadius, angle, duration);
    // flip the angle around to the other side of the locus and get the gesture for the second finger
    angle = angle + Math.PI;
    Sequence fingerBPath = zoomSinglefinger("fingerB", locus, startRadius, endRadius, angle, duration);
    return Arrays.asList(fingerAPath, fingerBPath);
}
Each sequence in the list which we will later pass to driver.perform() represents the movement of one finger on the device's touchscreen. A pinch zoom requires two fingers, though the only difference between their movements is the direction relative to the center of the pinch, which I named the "locus". startRadius and endRadius are the distances from the locus at which each finger starts and ends. duration is the length of time the main movement takes, and pinchAngle is how far the fingers are rotated from directly up/down while pinching (the examples from earlier all had angles of 45 degrees).
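To make the geometry concrete, here is a small standalone sketch of the same polar-to-cartesian conversion, with no Appium or Selenium dependency. The class name PinchGeometry, its method names, and the locus of (540, 960) are assumptions made for illustration only; the arithmetic mirrors what the zoom code in this article does.

```java
final class PinchGeometry {
    // Converts a pinch angle in degrees (0 = 12 o'clock, 90 = 3 o'clock)
    // into a math-style angle in radians (0 = pointing right, counter-clockwise).
    static double pinchAngleToRadians(int pinchAngle) {
        return Math.PI / 2 - Math.toRadians(pinchAngle);
    }

    // Returns {x, y} for the screen point `radius` pixels from the locus along `angle`.
    // The y term is subtracted because screen y coordinates grow downward.
    static int[] pointOnRay(int locusX, int locusY, double radius, double angle) {
        int x = (int) Math.floor(locusX + radius * Math.cos(angle));
        int y = (int) Math.floor(locusY - radius * Math.sin(angle));
        return new int[] {x, y};
    }

    public static void main(String[] args) {
        // Hypothetical locus in the middle of a 1080x1920 screen.
        double angleA = pinchAngleToRadians(45);
        double angleB = angleA + Math.PI; // second finger sits on the opposite side
        int[] a = pointOnRay(540, 960, 200, angleA);
        int[] b = pointOnRay(540, 960, 200, angleB);
        System.out.println("fingerA starts at " + a[0] + "," + a[1]);
        System.out.println("fingerB starts at " + b[0] + "," + b[1]);
    }
}
```

With a 45 degree pinch angle and a radius of 200, fingerA lands up and to the right of the locus at 681,818 and fingerB lands down and to the left at 398,1101.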
Now let's look at the zoomSinglefinger method, which actually uses the Appium client methods to create the actions for each finger. The action I decided to make, and which experimentally yielded acceptable results, was to first move the finger very quickly a small distance from the startRadius towards the endRadius. The finger then pauses for a moment before resuming its path towards the endRadius position.
private Sequence zoomSinglefinger(String fingerName, Point locus, int startRadius, int endRadius, double angle, Duration duration) {
    PointerInput finger = new PointerInput(PointerInput.Kind.TOUCH, fingerName);
    Sequence fingerPath = new Sequence(finger, 0);
    // radius of the intermediate point, 20px from the start in the direction of travel
    double midpointRadius = startRadius + (endRadius > startRadius ? 1 : -1) * 20;
    // find coordinates for starting point of action (converting from polar coordinates to cartesian)
    int fingerStartx = (int)Math.floor(locus.x + startRadius * Math.cos(angle));
    int fingerStarty = (int)Math.floor(locus.y - startRadius * Math.sin(angle));
    // find coordinates for first point that fingers move quickly to
    int fingerMidx = (int)Math.floor(locus.x + (midpointRadius * Math.cos(angle)));
    int fingerMidy = (int)Math.floor(locus.y - (midpointRadius * Math.sin(angle)));
    // find coordinates for ending point of action (converting from polar coordinates to cartesian)
    int fingerEndx = (int)Math.floor(locus.x + endRadius * Math.cos(angle));
    int fingerEndy = (int)Math.floor(locus.y - endRadius * Math.sin(angle));
    // move finger into start position
    fingerPath.addAction(finger.createPointerMove(Duration.ZERO, PointerInput.Origin.viewport(), fingerStartx, fingerStarty));
    // finger comes down into contact with screen
    fingerPath.addAction(finger.createPointerDown(PointerInput.MouseButton.LEFT.asArg()));
    // finger moves a small amount very quickly
    fingerPath.addAction(finger.createPointerMove(Duration.ofMillis(1), PointerInput.Origin.viewport(), fingerMidx, fingerMidy));
    // pause for a little bit
    fingerPath.addAction(new Pause(finger, Duration.ofMillis(100)));
    // finger moves to end position
    fingerPath.addAction(finger.createPointerMove(duration, PointerInput.Origin.viewport(), fingerEndx, fingerEndy));
    // finger lets up, off the screen
    fingerPath.addAction(finger.createPointerUp(PointerInput.MouseButton.LEFT.asArg()));
    return fingerPath;
}
First we construct a PointerInput to represent a finger (as opposed to a mouse pointer) and use it to construct an empty Sequence which will hold individual actions. Actions are like steps in the overall movement of a single finger.
I then calculate the coordinates on the screen which we will move the finger between for each action.
Next, we create the actions, adding them to the sequence as we go. We start with a PointerMove action to move the finger into its starting position. Then we add a PointerDown action, putting our finger in contact with the touchscreen. Now comes the quick initial movement, which we've set to take just one millisecond to complete. We then add a special Pause action which waits for 100 milliseconds in our case, followed by another PointerMove action which completes the gesture the rest of the way and takes as much time as was passed into the function as duration. Lastly, we add a PointerUp action to the sequence to remove our finger from the screen.
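One consequence of this sequence is easy to miss: the duration argument only covers the final move, so the nominal length of the whole gesture is longer. The sketch below simply adds up the step durations described above. The class GestureTimeline and its step labels are made up for this illustration and are not part of the Appium or Selenium clients, and the real time on a device will include some overhead on top of these nominal values.

```java
import java.time.Duration;
import java.util.LinkedHashMap;
import java.util.Map;

final class GestureTimeline {
    // Nominal duration of each step for one finger, in the order the steps are added.
    static Map<String, Duration> steps(Duration finalMove) {
        Map<String, Duration> steps = new LinkedHashMap<>();
        steps.put("move to start", Duration.ZERO);
        steps.put("pointer down", Duration.ZERO);
        steps.put("quick initial move", Duration.ofMillis(1));
        steps.put("pause", Duration.ofMillis(100));
        steps.put("move to end", finalMove);
        steps.put("pointer up", Duration.ZERO);
        return steps;
    }

    // Sums the nominal durations of all steps.
    static Duration total(Map<String, Duration> steps) {
        Duration sum = Duration.ZERO;
        for (Duration d : steps.values()) {
            sum = sum.plus(d);
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<String, Duration> steps = steps(Duration.ofMillis(25));
        steps.forEach((name, d) -> System.out.println(name + ": " + d.toMillis() + " ms"));
        System.out.println("total: " + total(steps).toMillis() + " ms");
    }
}
```

With a 25 ms final move, the nominal total comes to 126 ms per finger. Both fingers follow the same step list, so the two sequences stay in step with each other.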
Because I wanted simple "zoomIn" and "zoomOut" methods and didn't want to specify all these parameters every time I wanted to zoom, I created two more functions which set some defaults.
private Collection<Sequence> zoomIn(Point locus, int distance) {
    return zoom(locus, 200, 200 + distance, 45, Duration.ofMillis(25));
}
private Collection<Sequence> zoomOut(Point locus, int distance) {
    return zoom(locus, 200 + distance, 200, 45, Duration.ofMillis(25));
}
All it takes to now reproduce the zooming from the video at the beginning of the article is to call our methods:
@Test
public void ZoomInAndOut() throws InterruptedException {
    // tap center to dismiss toolbars
    WebElement map = driver.findElementById("com.google.android.apps.maps:id/mainmap_container");
    map.click();
    Rectangle mapCoordinates = map.getRect();
    Point center = getCenter(mapCoordinates);
    driver.perform(zoomOut(center, 450));
    Thread.sleep(1000);
    driver.perform(zoomIn(center, 450));
    Thread.sleep(1000);
    driver.perform(zoomOut(center.moveBy(0, 250), 300));
    Thread.sleep(1000);
    driver.perform(zoomIn(center.moveBy(0, -250), 300));
    Thread.sleep(3000);
}
I originally wrote the methods above because I wanted a generic way to build zoom gestures, but the complexity and trickiness of the Google Maps touch gesture logic resulted in having to build a rather specific set of actions. The timing is very important, but because I used fixed durations, if you specify a longer path (on a larger device, let's say), the action might then be too fast. At least with the methods written this way, it was easy to experiment with many different combinations of values. A better solution would take the total length the finger has to travel and then calculate a duration based on that.
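As a rough sketch of that idea, the helper below derives a duration from the distance each finger travels and a target speed. The class PinchTiming, the method durationForTravel, and the speed of 18 pixels per millisecond are all assumptions chosen for illustration (18 px/ms happens to give 25 ms for a 450px pinch); I have not tuned these values against Google Maps.

```java
import java.time.Duration;

final class PinchTiming {
    // Computes how long a finger should take to travel from startRadius to endRadius
    // at a constant speed, given in pixels per millisecond.
    static Duration durationForTravel(int startRadius, int endRadius, double pixelsPerMilli) {
        if (pixelsPerMilli <= 0) {
            throw new IllegalArgumentException("pixelsPerMilli must be positive");
        }
        // each finger moves along a straight line through the locus,
        // so its travel distance is just the difference between the radii
        int distance = Math.abs(endRadius - startRadius);
        // never return less than 1 ms, so the move always has some duration
        long millis = Math.max(1, Math.round(distance / pixelsPerMilli));
        return Duration.ofMillis(millis);
    }

    public static void main(String[] args) {
        // a 450px pinch and a 900px pinch at the same hypothetical speed
        System.out.println(durationForTravel(200, 650, 18.0).toMillis() + " ms");
        System.out.println(durationForTravel(1100, 200, 18.0).toMillis() + " ms");
    }
}
```

The result could be passed as the duration argument of zoom in place of the fixed Duration.ofMillis(25), so that a longer path on a larger screen keeps roughly the same finger speed.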
Feel free to play with actions and experiment! They can get way more complex from here. I'm curious to see your solutions, if anyone has written a better set of actions for manipulating this UI.
Here's the test code in its entirety, when it's all put together:
import io.appium.java_client.AppiumDriver;
import org.junit.After;
import org.junit.Before;
import org.junit.Test;
import org.openqa.selenium.Point;
import org.openqa.selenium.Rectangle;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.interactions.Pause;
import org.openqa.selenium.interactions.PointerInput;
import org.openqa.selenium.interactions.Sequence;
import org.openqa.selenium.remote.DesiredCapabilities;
import java.io.IOException;
import java.net.URL;
import java.time.Duration;
import java.util.Arrays;
import java.util.Collection;
public class Edition067_Zoom_Touch_Gestures {
    private AppiumDriver driver;
    @Before
    public void setUp() throws IOException {
        DesiredCapabilities caps = new DesiredCapabilities();
        caps.setCapability("platformName", "Android");
        caps.setCapability("deviceName", "Android Emulator");
        caps.setCapability("automationName", "UiAutomator2");
        caps.setCapability("appPackage", "com.google.android.apps.maps");
        caps.setCapability("appActivity", "com.google.android.maps.MapsActivity");
        caps.setCapability("intentAction", "android.intent.action.VIEW");
        caps.setCapability("optionalIntentArguments", "-d geo:46.457398,-119.407305");
        driver = new AppiumDriver(new URL("http://localhost:4723/wd/hub"), caps);
    }
    @After
    public void tearDown() {
        try {
            driver.quit();
        } catch (Exception ign) {}
    }
    /*
    locus: the center of the touch gesture, the point that fingers are pinching away from or towards. They won't actually touch this point though
    startRadius: distance from center that fingers begin at
    endRadius: distance from center that fingers end at
    pinchAngle: at what angle the fingers pinch around the locus, in degrees. 0 for vertical pinch, 90 for horizontal pinch
    duration: the amount of time the main pinch movement will take (the quick initial move and the pause add about 101ms on top)
    */
    private Collection<Sequence> zoom(Point locus, int startRadius, int endRadius, int pinchAngle, Duration duration) {
        // convert degree angle into radians. 0/360 is top (12 O'clock).
        double angle = Math.PI / 2 - (2 * Math.PI / 360 * pinchAngle);
        // create the gesture for one finger
        Sequence fingerAPath = zoomSinglefinger("fingerA", locus, startRadius, endRadius, angle, duration);
        // flip the angle around to the other side of the locus and get the gesture for the second finger
        angle = angle + Math.PI;
        Sequence fingerBPath = zoomSinglefinger("fingerB", locus, startRadius, endRadius, angle, duration);
        return Arrays.asList(fingerAPath, fingerBPath);
    }
    /*
    Used by the `zoom` method, for creating one half of a zooming pinch gesture.
    This will return the touch gesture for a single finger, to be put together with
    another finger action to complete the gesture.
    fingerName: name of this input finger for the gesture. Used by automation system to tell inputs apart
    locus: the center of the touch gesture, the point that fingers are pinching away from or towards. They won't actually touch this point though
    startRadius: distance from center that fingers begin at
    endRadius: distance from center that fingers end at
    angle: at what angle the fingers pinch around the locus, in radians.
    duration: the amount of time the main pinch movement will take (the quick initial move and the pause add about 101ms on top)
    */
    private Sequence zoomSinglefinger(String fingerName, Point locus, int startRadius, int endRadius, double angle, Duration duration) {
        PointerInput finger = new PointerInput(PointerInput.Kind.TOUCH, fingerName);
        Sequence fingerPath = new Sequence(finger, 0);
        // radius of the intermediate point, 20px from the start in the direction of travel
        double midpointRadius = startRadius + (endRadius > startRadius ? 1 : -1) * 20;
        // find coordinates for starting point of action (converting from polar coordinates to cartesian)
        int fingerStartx = (int)Math.floor(locus.x + startRadius * Math.cos(angle));
        int fingerStarty = (int)Math.floor(locus.y - startRadius * Math.sin(angle));
        // find coordinates for first point that fingers move quickly to
        int fingerMidx = (int)Math.floor(locus.x + (midpointRadius * Math.cos(angle)));
        int fingerMidy = (int)Math.floor(locus.y - (midpointRadius * Math.sin(angle)));
        // find coordinates for ending point of action (converting from polar coordinates to cartesian)
        int fingerEndx = (int)Math.floor(locus.x + endRadius * Math.cos(angle));
        int fingerEndy = (int)Math.floor(locus.y - endRadius * Math.sin(angle));
        // move finger into start position
        fingerPath.addAction(finger.createPointerMove(Duration.ZERO, PointerInput.Origin.viewport(), fingerStartx, fingerStarty));
        // finger comes down into contact with screen
        fingerPath.addAction(finger.createPointerDown(PointerInput.MouseButton.LEFT.asArg()));
        // finger moves a small amount very quickly
        fingerPath.addAction(finger.createPointerMove(Duration.ofMillis(1), PointerInput.Origin.viewport(), fingerMidx, fingerMidy));
        // pause for a little bit
        fingerPath.addAction(new Pause(finger, Duration.ofMillis(100)));
        // finger moves to end position
        fingerPath.addAction(finger.createPointerMove(duration, PointerInput.Origin.viewport(), fingerEndx, fingerEndy));
        // finger lets up, off the screen
        fingerPath.addAction(finger.createPointerUp(PointerInput.MouseButton.LEFT.asArg()));
        return fingerPath;
    }
    /*
    Simplified method for zooming in.
    Defaults to a 45 degree angle for the pinch gesture.
    Defaults to a duration of 25 milliseconds for the main movement
    Fingers start 200px from locus
    locus: the center of the pinch action, fingers move away from here
    distance: how far fingers move outwards, starting 200px from the locus
    */
    private Collection<Sequence> zoomIn(Point locus, int distance) {
        return zoom(locus, 200, 200 + distance, 45, Duration.ofMillis(25));
    }
    /*
    Simplified method for zooming out.
    Defaults to a 45 degree angle for the pinch gesture.
    Defaults to a duration of 25 milliseconds for the main movement
    Fingers finish 200px from locus
    locus: the center of the pinch action, fingers move towards here
    distance: how far fingers move inwards, they will end 200px from the locus
    */
    private Collection<Sequence> zoomOut(Point locus, int distance) {
        return zoom(locus, 200 + distance, 200, 45, Duration.ofMillis(25));
    }
    @Test
    public void ZoomInAndOut() throws InterruptedException {
        // tap center to dismiss toolbars
        WebElement map = driver.findElementById("com.google.android.apps.maps:id/mainmap_container");
        map.click();
        Rectangle mapCoordinates = map.getRect();
        Point center = getCenter(mapCoordinates);
        driver.perform(zoomOut(center, 450));
        Thread.sleep(1000);
        driver.perform(zoomIn(center, 450));
        Thread.sleep(1000);
        driver.perform(zoomOut(center.moveBy(0, 250), 300));
        Thread.sleep(1000);
        driver.perform(zoomIn(center.moveBy(0, -250), 300));
        Thread.sleep(3000);
    }
    private Point getCenter(Rectangle rect) {
        return new Point(rect.x + rect.getWidth() / 2, rect.y + rect.getHeight() / 2);
    }
}
That's all for this week. The example can be found with all our example code on GitHub.