AI-Powered Selenium Automation

While working as a manual tester, I ran into a significant challenge in web application testing. Running our smoke and regression test cases by hand was time-consuming and repetitive for the team. At the same time, I struggled to learn enough programming to automate these tests efficiently.

Exploring the Power of GPT-4

Driven by curiosity, I explored how OpenAI’s GPT-4 model could be used to develop web automation test suites. I familiarized myself with the model and, most importantly, with prompt engineering. This held the key to bridging the gap between my limited coding skills and the efficient automation I sought.

Setting Up the Tools

These were my tools: Visual Studio Code as my trusted code editor, Selenium WebDriver to interact with web elements, JavaScript as my programming language of choice, and the Mocha Framework for streamlined test organization and execution.
I also needed a solid understanding of how to locate and identify HTML elements in a web page by hand, since the prompts describe each element by its tag and attributes.
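
To make "locating an element" concrete, here is a minimal sketch of how a plain-language description such as "a textarea whose id contains APjFqb" can be written as a CSS selector or an XPath expression. The helper names are hypothetical and are not part of Selenium; the strings they return are the kind of value that could be passed to Selenium's By.css or By.xpath. The sketch assumes the attribute values contain no double quotes.

```javascript
// Hypothetical helpers (not from Selenium or from GPT-4's output) that turn
// a plain-language element description into a locator string.
// Assumption: attribute values contain no double quotes.

// "tag whose attribute contains value" as a CSS selector
function cssContains(tag, attr, value) {
  return `${tag}[${attr}*="${value}"]`;
}

// "tag whose attribute equals value" as a CSS selector
function cssEquals(tag, attr, value) {
  return `${tag}[${attr}="${value}"]`;
}

// The same two descriptions as XPath expressions
function xpathContains(tag, attr, value) {
  return `//${tag}[contains(@${attr}, "${value}")]`;
}

function xpathEquals(tag, attr, value) {
  return `//${tag}[@${attr}="${value}"]`;
}

// The two elements used in the Google Search example in this post
console.log(cssContains('textarea', 'id', 'APjFqb'));        // textarea[id*="APjFqb"]
console.log(xpathEquals('input', 'value', 'Google Search')); // //input[@value="Google Search"]
```

Being able to read a page's HTML and spot a stable tag and attribute pair like these is what lets the prompt describe each element precisely.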

A Step-by-Step Guide

To illustrate my journey, I chose a simple yet common scenario: writing automation for Google Search. I crafted a prompt for GPT-4, outlining the desired steps.

System Prompt:

You are an AI assistant and an expert at generating the Selenium test cases using Javascript & Selenium-web driver for different user scenarios.

User Prompt:

Write a selenium test using Javascript and Selenium-web driver.

Perform the following actions:

Step 1: Open https://www.google.com/
Step 2: Find the element that contains the tag textarea having attribute id that contains "APjFqb", click on it, and enter the text as "QA automation".
Step 3: Find the element that contains the input tag having the attribute value "Google Search" and click on it.

The code should run using the mocha framework having a describe function to perform above mentioned actions in a single test case. Make sure the code is properly commented.

Make sure you only reply with the javascript code and nothing else.
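
For readers who would rather send the prompt from code than paste it into a chat window, the two parts above map naturally onto a system message and a user message. The sketch below only assembles that request body and does not call any API. It assumes the system/user message format of OpenAI's Chat Completions API; the post does not depend on how the prompt is submitted, and the names buildRequest and SYSTEM_PROMPT are hypothetical.

```javascript
// Hypothetical helper: assembles a chat-style request body from the system
// prompt and a list of test steps.
// Assumption: the request uses the system/user message format of OpenAI's
// Chat Completions API. Nothing is sent anywhere by this code.

const SYSTEM_PROMPT =
  'You are an AI assistant and an expert at generating the Selenium test cases ' +
  'using Javascript & Selenium-web driver for different user scenarios.';

function buildRequest(steps) {
  // Number the steps the same way the prompt does: "Step 1: ..."
  const numbered = steps.map((step, i) => `Step ${i + 1}: ${step}`).join('\n');

  const userPrompt = [
    'Write a selenium test using Javascript and Selenium-web driver.',
    'Perform the following actions:',
    numbered,
    'The code should run using the mocha framework having a describe function ' +
      'to perform above mentioned actions in a single test case. ' +
      'Make sure the code is properly commented.',
    'Make sure you only reply with the javascript code and nothing else.',
  ].join('\n\n');

  return {
    model: 'gpt-4',
    messages: [
      { role: 'system', content: SYSTEM_PROMPT },
      { role: 'user', content: userPrompt },
    ],
  };
}

const request = buildRequest([
  'Open https://www.google.com/',
  'Find the element that contains the tag textarea having attribute id that contains "APjFqb", click on it, and enter the text as "QA automation".',
  'Find the element that contains the input tag having the attribute value "Google Search" and click on it.',
]);

// Print the assembled user prompt to check it before sending it anywhere
console.log(request.messages[1].content);
```

Keeping the steps in an array makes it easy to reuse the same wrapper text for other test cases by swapping in a different list of steps.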

In response, OpenAI’s GPT-4 model returned a completion containing the JavaScript and Selenium WebDriver code.

The completion was impressive: the code was well structured and well commented, and it covered every step of the automation process. It opened Google.com, located the search bar, entered the search text, and finally clicked the search button. GPT-4 had turned my plain-language prompt into a fully functional automation script.
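
The completion itself is not reproduced in this post, so the following is only a sketch of what a script following the three steps in the prompt could look like, not GPT-4's actual output. It assumes the selenium-webdriver and mocha packages are installed, that Chrome is available locally, and that Google's page still matches the tags and attributes named in the prompt; a consent dialog or a changed layout would need extra handling. The function name searchGoogle is hypothetical.

```javascript
// googleSearch.js
// A sketch under stated assumptions, not GPT-4's actual completion.
// Assumes: npm install selenium-webdriver mocha, and a local Chrome install.
// Run with: npx mocha googleSearch.js

// The three steps from the prompt, written against any object that offers
// Selenium's driver methods (get, findElement) and By.css.
async function searchGoogle(driver, By, text) {
  // Step 1: open Google
  await driver.get('https://www.google.com/');

  // Step 2: find the textarea whose id contains "APjFqb", click it, type the text
  const searchBox = await driver.findElement(By.css('textarea[id*="APjFqb"]'));
  await searchBox.click();
  await searchBox.sendKeys(text);

  // Step 3: find the input whose value is "Google Search" and click it
  const searchButton = await driver.findElement(By.css('input[value="Google Search"]'));
  await searchButton.click();
}

// Register the Mocha test only when Mocha is running this file,
// so the file can also be loaded with plain Node without errors.
if (typeof describe === 'function') {
  const { Builder, By } = require('selenium-webdriver');

  describe('Google Search', function () {
    this.timeout(30000); // starting a browser can be slow

    let driver;

    before(async function () {
      // Start a Chrome session before the test runs
      driver = await new Builder().forBrowser('chrome').build();
    });

    after(async function () {
      // Close the browser even if the test failed
      if (driver) await driver.quit();
    });

    it('searches for "QA automation"', async function () {
      await searchGoogle(driver, By, 'QA automation');
    });
  });
}
```

Keeping the steps in their own function is a design choice of this sketch: the describe block stays short, and the same steps can be reused with a different search text.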

Executing the Automation Script

I opened Visual Studio Code, created a folder named QA Automation to keep my files organized, and within that folder created a JavaScript file named googleSearch.js. Using a package manager (npm or yarn), I installed the required libraries, Selenium WebDriver and Mocha. Finally, I copied the completion code into googleSearch.js.
As I ran my automation script in Visual Studio Code, it was pretty awesome to see the flawless execution of each step.

Automating Jam’s Manual Test Cases

Using this technique, I have converted over 100 web smoke and regression test cases into an automated test suite for one of our client’s projects, Jam.so. The suite has significantly reduced the time and manual effort required to run smoke and regression tests, which has increased the team’s productivity.

Conclusion

This post explored how AI can help people with limited coding skills get started with QA automation. By pairing carefully written prompts with GPT-4’s completions, we can bridge the gap between limited coding expertise and test automation, saving time and increasing productivity.