State of AI applied to Quality Engineering 2021-22
Section 4.2: Automate & Scale

Chapter 1 by Sauce Labs

A framework for autonomous testing of modern SaaS and digital applications

Business ●○○○○
Technical ●●●●○


Autonomous Testing: Its Importance and Benefits

The emergence of modern web frameworks, microservices, and cloud architectures requires testers to generate and maintain test cases, scripts, and data in short agile sprints, sometimes in as little as a few hours. It is impossible to produce these assets in a comprehensive manner using legacy DevOps approaches that largely depend on the subjective judgement of test engineers.

Autonomous testing is the automated construction of software test cases and test suites with limited to no contribution from a human tester. Computers can write tests much faster than humans, and this increased testing speed drives the benefits of autonomous test generation. The first benefit is the ability to quickly create tests for new application functionality. In rapid development cycles, testing speed determines how quickly a new feature can be deployed into production and impacts the speed of application updates.

Sometimes new features are pushed into use before they are rigorously tested, a situation where rapid test case generation can reduce risk and increase confidence. A second benefit of autonomous testing is a reduction in cost, since large numbers of tests can be created automatically with limited input from an expensive human tester.

The final advantage also derives from this ability to generate large numbers of tests: better coverage. Autonomous test suites can cover a broader range of inputs and application functionality, because they can examine far more test situations and input values than manually created tests can.

Introduction to Autonomous Testing

Virtually every system, process, business and industry is being disrupted by software. As more people come to depend on code, the velocity, volume and variability of how software is developed have changed considerably. However, the frameworks and approaches available to test and manage software have not evolved to meet the demands of this velocity, volume and variability.

One approach to addressing this challenge is to combine automation and artificial intelligence: "autonomous" testing. However, initial experiments with this approach ran into significant obstacles, due to the explosion of computational states and the difficulty of guaranteeing that relevant scenarios are generated and executed against the code being tested. Previous approaches to tackling this challenge include formulating test case generation as a many-objective optimization problem[1] and generating test cases to repair test-suite over-fitting[2]. However, both operate at the source code level, which can be impractical and insufficient for manual testers, automation engineers, and business analysts.


[1] A. Panichella, F. M. Kifetew, and P. Tonella, “Automated test case generation as a many-objective optimisation problem with dynamic selection of the targets,” IEEE Transactions on Software Engineering, vol. 44, no. 2, pp. 122–158, Feb 2018.

[2] Q. Xin and S. P. Reiss, "Identifying test-suite-overfitted patches through test case generation," in Proceedings of the 26th ACM SIGSOFT International Symposium on Software Testing and Analysis, ser. ISSTA 2017. New York, NY, USA: ACM, 2017, pp. 226–236. [Online]. Available: http://doi.acm.org/10.1145/3092703.3092718

Autonomous Testing - Two approaches: model-free and model-based

There are two broad approaches to autonomous testing: model-free and model-based. In the model-free approach, test cases are generated with little to no knowledge of the internal structure or user interface flow of an application. This is pure black-box testing and involves automatic generation of test inputs. In the model-based approach, a model of the application's internals is developed first. A common model is a finite state machine, which can represent a sequence of screens in an application or different internal operational states. This model is then used to drive the creation of test cases, which can exercise application functionality more thoroughly by using the information contained in the model. Test generation often amounts to picking a path through the nodes of a finite state machine that models the application under test.
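To make the model-based idea concrete, the sketch below (a minimal illustration, not any particular tool's API) represents an application as a directed graph of states and enumerates bounded action sequences through it; all state and action names are invented.

```python
from collections import deque

# Illustrative model: states are screens, edges are (action, next_state) pairs.
# Every name here is hypothetical.
MODEL = {
    "login":     [("submit_valid_credentials", "dashboard")],
    "dashboard": [("open_settings", "settings"), ("open_profile", "profile")],
    "settings":  [("save_settings", "dashboard")],
    "profile":   [],
}

def enumerate_paths(model, start, max_depth=5):
    """Breadth-first enumeration of action sequences up to max_depth steps."""
    paths, queue = [], deque([(start, [])])
    while queue:
        state, actions = queue.popleft()
        if actions:
            paths.append(actions)
        if len(actions) >= max_depth:
            continue
        for action, next_state in model[state]:
            queue.append((next_state, actions + [action]))
    return paths

for path in enumerate_paths(MODEL, "login"):
    print(" -> ".join(path))
```

Each printed sequence is a candidate test path; the tools discussed below add state abstraction, input data, and verification steps to make such paths executable.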

Fuzzing is the primary model-free testing approach. While fuzzing techniques are very sophisticated today, at its heart fuzzing tries to break software by sending it random inputs. American Fuzzy Lop (AFL) is a state-of-the-art fuzzing tool which starts with no internal knowledge of the application being tested. AFL generates a set of inputs and then observes their impact on the behavior of the program by examining its in-memory image. A genetic algorithm selects the inputs that exercise new paths in the application, and periodically replaces older inputs with better, newer ones. Over time, AFL is able to test increasingly large portions of the application and can detect crash bugs of various kinds. Since the goal of AFL is to increase the total coverage of internal program paths, it is an example of coverage-guided fuzzing. While AFL works best when it has access to C/C++ source code, it can also operate directly on binaries. It is not well suited to testing web applications, where the application is a combination of HTML, CSS, and JavaScript.
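The coverage-guided loop at the heart of AFL can be sketched in a few lines of Python: keep a corpus of interesting inputs, mutate them, and retain mutants that reach new coverage. The toy target below and its "branch" reporting are stand-ins for AFL's compile-time instrumentation of real programs.

```python
import random

def program_under_test(data: bytes):
    """Toy target: reports which 'branches' it executed and crashes on one input shape."""
    branches = set()
    if data.startswith(b"FUZ"):
        branches.add("magic_header")
        if len(data) > 5:
            branches.add("long_input")
            if 0 in data[3:]:
                raise RuntimeError("simulated crash")   # the bug we hope to find
    return branches

def mutate(data: bytes) -> bytes:
    """Simple mutational fuzzing: random byte overwrites and insertions."""
    data = bytearray(data)
    for _ in range(random.randint(1, 4)):
        if data and random.random() < 0.5:
            data[random.randrange(len(data))] = random.randrange(256)
        else:
            data.insert(random.randrange(len(data) + 1), random.randrange(256))
    return bytes(data)

corpus, seen_coverage = [b"FUZ!"], set()
for _ in range(50_000):
    candidate = mutate(random.choice(corpus))
    try:
        coverage = program_under_test(candidate)
    except RuntimeError:
        print("crash found with input:", candidate)
        break
    if not coverage.issubset(seen_coverage):   # reached a branch not seen before
        seen_coverage |= coverage
        corpus.append(candidate)               # keep inputs that expand coverage
```

Real coverage-guided fuzzers add binary instrumentation, power schedules, corpus minimization, and far richer mutation strategies, but the selection loop has this shape.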

For web application testing, the OWASP ZAP proxy exemplifies fuzzing that occurs via network protocol interfaces and transmitted file formats, an approach known as protocol fuzzing. In this approach, random protocol input strings are sent to a web application to ensure that neither it nor its underlying web server has any exposed vulnerabilities. While the approach is different, the underlying idea is the same: trying to break applications by submitting random strings via external interfaces. For an introduction to fuzzing, see The Fuzzing Book by Andreas Zeller and his colleagues Rahul Gopinath, Marcel Böhme, Gordon Fraser, and Christian Holler.
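The same idea can be sketched against an HTTP interface: send randomly generated strings as parameter values and flag unexpected server errors. The endpoint and parameter name below are placeholders, and the third-party `requests` library stands in for ZAP's proxy-based machinery.

```python
import random
import string

import requests  # third-party HTTP client; ZAP itself drives this through its proxy and scan rules

TARGET = "http://localhost:8080/search"   # placeholder endpoint for illustration

def random_payload(max_len=64) -> str:
    """Random printable strings, biased toward characters that often trip parsers."""
    alphabet = string.printable + "'\"<>%{};"
    return "".join(random.choice(alphabet) for _ in range(random.randint(1, max_len)))

for _ in range(500):
    payload = random_payload()
    try:
        resp = requests.get(TARGET, params={"q": payload}, timeout=5)
    except requests.RequestException as exc:
        print("connection-level failure for payload", repr(payload), exc)
        continue
    # A 5xx response to arbitrary input suggests an unhandled error worth investigating.
    if resp.status_code >= 500:
        print("server error", resp.status_code, "for payload", repr(payload))
```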

Have model, will test: model-based testing

In the model-based approach, the first step is to create a model of the application under test. Information in the model is then used to create the sequence of operations that tests an application. The figure below shows a finite state machine model of a sample eCommerce site, where the boxes represent different pages or modal interfaces (or, more abstractly, different states of the application) and the arcs are actions taken on the page. For example, going from the homepage to a product page involves the action of clicking on the image for a specific product. An examination of the model shows there are many possible paths to take depending on when the user authenticates using the sign-in modal. A model-based testing framework can determine the possible paths through the model and then generate test cases.

One example path and associated actions is: home page →(click on a product image)→ product page →(click add to cart)→shopping cart →(click sign-in button)→ sign-in modal →(enter user credentials)→ shopping cart→(click order button)→ checkout page. The testing framework can then either execute these steps directly or generate source code which can then be saved and used as part of a broader test suite.
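A model-based framework can translate such a path into executable steps. The sketch below maps the path above onto Selenium WebDriver calls; the CSS selectors, URL, and credentials are invented for illustration and would normally be carried in the model itself.

```python
from selenium import webdriver
from selenium.webdriver.common.by import By

# One path through the model, expressed as (action, locator, optional value).
# Selectors and the URL are hypothetical; a real framework derives them from the model.
PATH = [
    ("click", "#product-1 img", None),          # home page -> product page
    ("click", "#add-to-cart", None),            # product page -> shopping cart
    ("click", "#sign-in", None),                # shopping cart -> sign-in modal
    ("type",  "#username", "test_user"),        # enter credentials...
    ("type",  "#password", "secret"),
    ("click", "#submit-credentials", None),     # ...back to shopping cart
    ("click", "#order", None),                  # shopping cart -> checkout page
]

driver = webdriver.Chrome()
try:
    driver.get("https://shop.example.com")      # placeholder application URL
    for action, selector, value in PATH:
        element = driver.find_element(By.CSS_SELECTOR, selector)
        if action == "click":
            element.click()
        elif action == "type":
            element.send_keys(value)
    assert "checkout" in driver.current_url     # crude end-of-path check
finally:
    driver.quit()
```

Equivalently, the framework could emit code like this as source and save it into a broader regression suite rather than executing it directly.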

 

Figure: A finite state machine model of a sample web application

Where do models come from?

Once a model of an application has been created, model-based testing works great: a large number of test cases can be generated autonomously. But, where do these models come from?

Typically, a highly trained software test engineer needs to create these models by hand. Further, as the application changes, the models must be updated, a tedious manual process. The key challenge in model-based autonomous testing is therefore the automatic construction of high-quality application models. Such models can be created via exploration (crawling) of a web or mobile application.

A variety of approaches have been used to explore applications and create useful models for software testing. In the Crawljax approach (described in "Crawling Ajax-Based Web Applications through Dynamic Analysis of User Interface State Changes", ACM Transactions on the Web, 6(1), March 2012), a web application is crawled by examining its DOM and firing events on DOM elements that are capable of changing application state. A graph of the different application states is maintained and used to drive exploration. The resulting state-flow graph can be used to create test cases, thereby supporting autonomous testing.
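In spirit the crawl loop is straightforward, as the drastically simplified Python sketch below shows: discover clickable elements, fire them, abstract each resulting page into a state, and record the transitions. Crawljax itself is a Java tool with much richer state abstraction, event handling, and backtracking; the URL, element selection, and state hashing here are illustrative only.

```python
import hashlib

from selenium import webdriver
from selenium.webdriver.common.by import By

def state_id(driver) -> str:
    """Abstract the current page into a state: here, a naive hash of the DOM."""
    return hashlib.sha256(driver.page_source.encode()).hexdigest()[:12]

def crawl(driver, graph, depth=0, max_depth=3):
    """Depth-first exploration: fire click events and record state transitions."""
    current = state_id(driver)
    if current in graph or depth > max_depth:
        return
    graph[current] = {}
    n_candidates = len(driver.find_elements(By.CSS_SELECTOR, "a, button"))
    for index in range(n_candidates):
        # Re-locate elements on every iteration, since firing an event may rebuild the DOM.
        elements = driver.find_elements(By.CSS_SELECTOR, "a, button")
        if index >= len(elements):
            break
        try:
            elements[index].click()
        except Exception:
            continue                              # hidden or stale element: skip it
        new_state = state_id(driver)
        graph[current][index] = new_state
        if new_state != current:
            crawl(driver, graph, depth + 1, max_depth)
            driver.back()                         # crude backtracking; Crawljax replays event paths

driver = webdriver.Chrome()
driver.get("https://app.example.com")             # placeholder application under test
state_graph = {}
crawl(driver, state_graph)
driver.quit()
print(state_graph)                                # the state flow graph used to derive test cases
```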

Recent efforts have focused on using reinforcement learning to explore an application under test. ARES is an open source project which uses reinforcement learning to explore and test Android applications. Its pluggable architecture supports different reinforcement learning methods, including the deep learning algorithms Deep Deterministic Policy Gradient (DDPG), Soft Actor-Critic (SAC), and Twin Delayed DDPG (TD3), provided by the Python Stable Baselines library. A paper evaluating ARES by Andrea Romdhana, Alessio Merlo, Mariano Ceccato, and Paolo Tonella, "Deep Reinforcement Learning for Black-Box Testing of Android Apps", explores several reinforcement learning approaches for Android application testing and finds that the DDPG and SAC algorithms tend to work best, though the highest-performing algorithm varies by application.
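A rough sense of how a reinforcement learning agent can drive exploration is sketched below using the Stable Baselines3 package (the maintained successor of Stable Baselines) and its SAC implementation. The environment, observation encoding, gesture mapping, and reward shaping are hypothetical placeholders rather than ARES's actual code; they only illustrate the general idea of rewarding transitions into previously unseen screens.

```python
import numpy as np
import gymnasium as gym
from stable_baselines3 import SAC   # DDPG and TD3 are available from the same package

class AppExplorationEnv(gym.Env):
    """Hypothetical environment wrapping an app under test.

    Observations encode the current screen (here: a dummy feature vector),
    actions are continuous vectors that a harness would map onto UI gestures,
    and the reward favours transitions into previously unseen screens.
    """

    def __init__(self):
        super().__init__()
        self.observation_space = gym.spaces.Box(-1.0, 1.0, shape=(32,), dtype=np.float32)
        self.action_space = gym.spaces.Box(-1.0, 1.0, shape=(4,), dtype=np.float32)
        self.visited = set()

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        self.visited = {self._current_screen()}
        return self._observe(), {}

    def step(self, action):
        self._execute_gesture(action)            # e.g. a tap or swipe derived from the action vector
        screen = self._current_screen()
        reward = 1.0 if screen not in self.visited else 0.0
        self.visited.add(screen)
        return self._observe(), reward, False, False, {}

    # The helpers below are placeholders for a real device/emulator harness.
    def _execute_gesture(self, action): ...
    def _current_screen(self): return "home"
    def _observe(self): return np.zeros(32, dtype=np.float32)

model = SAC("MlpPolicy", AppExplorationEnv())
model.learn(total_timesteps=10_000)              # explore; crashes and oracles are logged by the harness
```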

Test paths need test verifications

Once an application model exists and test paths are being generated automatically, there remains the question of how to automatically create test verification conditions: does the test pass or fail? This is a challenging problem, as it requires automated inference of application-specific behavior. One approach is to make use of a visual testing capability such as that provided by Sauce Labs automated visual testing. The first time an automated test is executed, a snapshot of the user interface can be taken at points along the test path. In subsequent executions, any deviation from these snapshots can indicate a possible error, since the visual output has changed.
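As a minimal illustration of the snapshot idea (not Sauce Labs' actual comparison engine), a generated test could record a baseline screenshot on its first run and fail later runs whose screenshots deviate beyond a pixel-difference threshold; the file paths and threshold below are arbitrary.

```python
from PIL import Image, ImageChops   # Pillow imaging library

def visual_regression(baseline_path: str, current_path: str, tolerance: float = 0.01) -> bool:
    """Return True if the current screenshot deviates from the baseline.

    `tolerance` is the fraction of differing pixels allowed; real visual testing
    tools use perceptual, region-aware comparisons rather than raw pixel diffs.
    """
    baseline = Image.open(baseline_path).convert("RGB")
    current = Image.open(current_path).convert("RGB").resize(baseline.size)
    diff = ImageChops.difference(baseline, current)
    changed = sum(1 for pixel in diff.getdata() if pixel != (0, 0, 0))
    return changed / (baseline.width * baseline.height) > tolerance

# Usage inside a generated test; the screenshot paths are illustrative.
if visual_regression("baseline/checkout.png", "run_42/checkout.png"):
    raise AssertionError("visual deviation on the checkout page: possible regression")
```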

A deeper approach would be to automatically infer invariant conditions that hold on a page in an application. For example, inferring that a particular row in a table always contains the sum of all preceding rows would allow a verification condition to be generated which checks this property. Alternatively, it would be possible to verify that a string entered on one page appears in the same form on subsequent pages. To date, there has been limited work on inferring page invariant conditions; the topic was explored by Mesbah, van Deursen and Roest in "Invariant-Based Automatic Testing of Modern Web Applications" (IEEE Transactions on Software Engineering, 38(1), Jan.-Feb. 2012).
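For example, the running-total invariant mentioned above could be turned into a generated assertion along the lines of the sketch below; the table id, cell classes, and URL are invented for illustration.

```python
from selenium import webdriver
from selenium.webdriver.common.by import By

def check_running_total_invariant(driver) -> None:
    """Assert that the last row of the (hypothetical) #order-table holds the
    sum of all preceding rows' amounts, i.e. an inferred page invariant."""
    rows = driver.find_elements(By.CSS_SELECTOR, "#order-table tr")
    amounts = [float(r.find_element(By.CSS_SELECTOR, ".amount").text) for r in rows[:-1]]
    total = float(rows[-1].find_element(By.CSS_SELECTOR, ".amount").text)
    assert abs(sum(amounts) - total) < 0.01, "running-total invariant violated"

driver = webdriver.Chrome()
driver.get("https://shop.example.com/cart")     # placeholder page for illustration
check_running_total_invariant(driver)
driver.quit()
```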

Autonomy Ahead

With so many research and open source projects focused on autonomous testing, the future is bright for increased levels of automatic test generation and execution for web and mobile applications. Multiple projects have demonstrated the ability to perform automated test generation, so the challenge now is to take these technologies out of the lab and scale them up for use on industrial applications. Existing projects like OWASP ZAP highlight the utility of model-free autonomous testing; the time is right to shift focus to model-based testing and its ability to exercise a much broader range of application features.

About the author


Jim Whitehead

Jim Whitehead is Chief Scientist at Sauce Labs and a Professor of Computational Media at the University of California, Santa Cruz. He brings over 25 years of experience as a software engineering and artificial intelligence researcher to his role at Sauce Labs. In software engineering, he performed early research using machine learning techniques to predict whether commits are buggy (just-in-time bug prediction). His computer games research focuses on artificial intelligence techniques for creating computer game content (procedural content generation). The unique synergies between computer games and software engineering research drive many research insights at Sauce Labs.


Ram Shanmugam

Ram Shanmugam currently leads the low-code automation business at Sauce Labs. Previously he was the founder and CEO of AutonomIQ, an AI-based low-code automation platform that was acquired by Sauce Labs. Before founding AutonomIQ, Ram was the co-founder, CEO and President of appOrbit, a venture-backed leader in Kubernetes and cloud orchestration, and has held technology and product leadership roles at companies such as Cisco, HP, and SunGard. In addition to his professional experience, Ram is an active IEEE contributor in AI and machine learning research and has been recognized as a technology pioneer by the World Economic Forum.

About Sauce Labs

Sauce Labs is the company enterprises trust to deliver digital confidence. More than 3 billion tests have been run on the Sauce Labs Continuous Testing Cloud, the most comprehensive and trusted testing platform in the world.

Sauce Labs delivers a 360-degree view of a customer’s application experience, helping businesses improve the quality of their user experience by ensuring that web and mobile applications look, function, and perform exactly as they should on every browser, OS, and device, every single time.

Sauce Labs enables organizations to increase revenue and grow their digital business by creating new routes to market, protecting their brand from the risks of a poor user experience, and delivering better products to market, faster.

Visit us at saucelabs.com

 

 
