Using the OpenAI platform to analyse automated test failures

Introduction

When it comes to AI and especially the OpenAI platform, there's no shortage of content about how it will affect everything. So at first glance, this article might seem like another one of those overly enthusiastic and optimistic clickbait puff pieces about how you should get on the AI train or risk being left behind.

By the way, the article is not written by AI. I’m just using my favourite text editor app, named after one of the most exhilarating novels in Western literature – Ulysses. There is no interference (or inference) from the outside, other than some basic autocomplete functionality. There is no AI-generated fluff, although I can’t promise you that the article doesn’t contain fluff at all.

But regardless of where you stand on AI-generated content, as software professionals, I think we can all agree that when it comes to automated software testing, debugging and investigating test failures is always tedious. So I thought that this could be a good area where we can incorporate some AI assistance because we’re just extending the work which is already done by a machine. There’s no risk here of impersonating humans or “fooling people into thinking they are interacting with a real person”, which philosopher Daniel Dennett presents as a real civilisational risk in his recent piece in The Atlantic.

What is end-to-end testing?

If you’re new to end-to-end testing, it is a type of automated software testing in which the entire application is tested by simulating the actions a real user would perform.

Nightwatch.js is an open-source library for writing and executing automated and end-to-end tests for websites and web applications. It was published in 2014 and in 2021 it was transferred to the open-source program office at BrowserStack, where it is currently being developed. Written in Node.js, it supports all major web browsers and can also run tests on mobile devices.

This tutorial will look at how to develop a Nightwatch.js plugin which sends test failures and their associated errors to a service which integrates with the OpenAI platform to analyse the errors and return some actionable feedback. Recent versions of Nightwatch already provide pretty good, reasonably actionable feedback for test failures by default, so we’ll attempt to extend that by using the GPT-4 model to add a bit more flair to the output messages, provide slightly better context and learn how to develop services that incorporate the assistance of AI.

Why Nightwatch?

Granted, there are a few other testing tools out there which at the moment might enjoy a bit more hype and popularity, but actually Nightwatch is the project we created here at Pineview back in 2014 and is now being developed at BrowserStack’s open-source program office. I am also part of that team and Nightwatch is still my favourite tool which I use for testing in all my other projects, of course.

In addition to that, as a library Nightwatch has been around for quite a while enjoying various degrees of popularity over the years. There’s been plenty of data available for ML model training, so GPT-4 has a pretty good command of writing tests in Nightwatch and interpreting results, which means that we already have a strong basis for building an AI-assisted service to interpret our test failures. And confront us with them maybe.

Step 1 – create the error analysis service

Our little exercise here consists of two main parts, both of them relatively straightforward:

1) building the backend service which makes the call to the OpenAI service
2) writing the Nightwatch.js plugin which will receive the actual test failure and send it to the backend service for analysis

We’ll start with part 1, building the error analysis service. Building anything related to AI might sound very extravagant and glamorous these days, but it’s actually a very basic task with nothing much special about it.

The analysis service is only a basic Express.js API service which accepts POST requests and makes a specific call to the OpenAI platform using the SDK for Node.js.

You’ll need to get yourself a developer key from OpenAI here and then configure a model to be used. For the purpose of this article, I used gpt-4-1106-preview but that requires a paid plan. If you’d like to try it out on a free plan, you can use gpt-3.5-turbo.
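Since the choice of model depends on your plan, one option is to read it from the environment instead of hard-coding it. The snippet below is only a minimal sketch: OPENAI_MODEL is a hypothetical variable name picked for this example (the OpenAI SDK does not read it on its own), and gpt-3.5-turbo is assumed as the fallback.

```javascript
// Sketch only: OPENAI_MODEL is a hypothetical variable name chosen for this example.
const DEFAULT_MODEL = 'gpt-3.5-turbo';

// Returns the model name from the given environment, or the fallback when unset.
function resolveModel(env = process.env) {
  const model = (env.OPENAI_MODEL || '').trim();
  return model || DEFAULT_MODEL;
}
```

The returned value could then be passed as the model option when the service calls the OpenAI SDK.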

1.1 Project structure

Set up the new project with:

mkdir nightwatch-openai-service
cd nightwatch-openai-service
touch index.js
npm init -y

Next, edit the package.json file and set type=module, e.g.:

{
  "name": "openai-nightwatch-service",
  "type": "module",
  ...
}

Then go ahead and install the required dependencies:

npm i dotenv express openai

1.2 Add the service

In the newly created project, create two new files:

.env – should contain the OpenAI API key, e.g.:

OPENAI_API_KEY=xxxxxx
PORT=4001

index.js – paste the code below

import dotenv from 'dotenv';
import express from 'express';
import { OpenAI } from 'openai';

dotenv.config();
const app = express();
app.use(express.json());
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

app.post('/analyze-error', async (req, res) => {
  try {
    // additionalDetails defaults to an empty object so a missing field does not throw
    const { errorMessage, codeSnippet, additionalDetails = {} } = req.body;

    const details = `Additional details: Nightwatch version: ${additionalDetails.nightwatchVersion}, config file: ${additionalDetails.configFile}, platform: ${additionalDetails.platform}, browser: ${additionalDetails.browser}, headless mode: ${additionalDetails.headless}.`;
    const messages = [
      {
        role: "system",
        content: "You are an expert in web development using Node.js, automated testing with Selenium WebDriver, and the Nightwatch.js framework."
      },
      {
        role: "user",
        content: [
          {
            type: "text",
            text: `Investigate and explain why the tests failed. Error message: ${errorMessage}\nCode snippet from test case where the error occurred: ${codeSnippet}. ${details}`
          }
        ]
      }
    ];

    const response = await openai.chat.completions.create({
      model: "gpt-4-1106-preview",
      messages,
      max_tokens: 600,
    });

    res.json({ analyzedResult: response.choices[0].message.content });
  } catch (error) {
    res.status(500).json({ error: error.message });
  }
});

const PORT = process.env.PORT || 3000;
app.listen(PORT, () => console.log(`Server running on port ${PORT}`));

That’s about it for the service. Simply run it with:

node index.js

As you can see, there’s very little creativity involved. We will be sending in the error message from the Nightwatch plugin, along with a small code snippet which contains the line of code where the error or assertion failure occurred.
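To make the prompt easier to adjust and test in isolation, the message construction could be pulled out of the route handler. The following is a rough sketch rather than the code from the repository: buildMessages is a hypothetical helper name, the field names mirror the request body used above, and the 'unknown' fallback for missing details is an assumption.

```javascript
// Sketch: builds the chat messages for the analysis request from the POST body.
// buildMessages is a hypothetical helper; the 'unknown' fallback is an assumption.
const DETAIL_FIELDS = {
  nightwatchVersion: 'Nightwatch version',
  configFile: 'config file',
  platform: 'platform',
  browser: 'browser',
  headless: 'headless mode'
};

function buildMessages({ errorMessage, codeSnippet = '', additionalDetails } = {}) {
  if (typeof errorMessage !== 'string' || errorMessage.trim() === '') {
    throw new TypeError('errorMessage must be a non-empty string');
  }

  // Tolerate a missing or null additionalDetails object
  const extra = additionalDetails || {};
  const details = Object.entries(DETAIL_FIELDS)
    .map(([key, label]) => `${label}: ${extra[key] ?? 'unknown'}`)
    .join(', ');

  return [
    {
      role: 'system',
      content: 'You are an expert in web development using Node.js, automated testing with Selenium WebDriver, and the Nightwatch.js framework.'
    },
    {
      role: 'user',
      content: [
        'Investigate and explain why the tests failed.',
        `Error message: ${errorMessage}`,
        `Code snippet from test case where the error occurred: ${codeSnippet}`,
        `Additional details: ${details}.`
      ].join('\n')
    }
  ];
}
```

With a helper like this, the handler would only need to call buildMessages(req.body) and could answer with a 400 status when it throws, instead of a generic 500.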

The only thing which remains is to tinker with the prompt. There’s an entire section in the OpenAI docs called prompt engineering – how to write better prompts to improve your results, which is where we are now when it comes to product innovation.

The code is available on GitHub. Go ahead and fork it and run it locally, as we’ll need it for the next part of the tutorial.

Step 2 – write the Nightwatch.js reporting plugin

Besides the built-in test reporters which are included by default (junit-xml, json, html), Nightwatch supports functionality to load custom reporters, which is what we’re going to develop next.

The complete code is available on GitHub and the package is published on NPM as nightwatch-openai-plugin, so you can actually use it directly and skip ahead to Step 3 if you prefer.

The role of the custom reporter plugin is to send the error data to the AI-assisted analysis service which we developed at Step 1. To do that, we need to create a new Node.js project that follows a specific structure which Nightwatch will understand.

2.1 Project structure

First, set up the new project with:

mkdir my-nightwatch-ai-reporter
cd my-nightwatch-ai-reporter
touch index.js
npm init -y
git init

Basically, the plugin needs to be wrapped as an NPM package and export a module that looks like the one below:

// index.js

module.exports = {
  async reporter(results) {
    console.log('do some reporting here...');
  }
}

We also need to add an .env file which we will populate with the URL for our AI Analysis service.

If you're running the service as per Step 1 of this article, the .env file will look like so:

SERVICE_URL=http://localhost:4001/analyze-error
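A small sketch of how the plugin could read and validate this value at startup is shown below. It assumes the .env file has already been loaded into process.env (for example with dotenv, as in Step 1); resolveServiceUrl is a hypothetical helper name and the error wording is illustrative.

```javascript
// Sketch: reads the analysis service URL from the environment and validates it.
// SERVICE_URL matches the .env example above; resolveServiceUrl is a hypothetical helper.
function resolveServiceUrl(env = process.env) {
  const raw = env.SERVICE_URL;
  if (!raw) {
    throw new Error('SERVICE_URL is not set; add it to the .env file of the plugin');
  }

  let url;
  try {
    url = new URL(raw);
  } catch (err) {
    throw new Error(`SERVICE_URL is not a valid URL: ${raw}`);
  }

  // Values such as "localhost:4001" parse as a URL with an unexpected scheme
  if (url.protocol !== 'http:' && url.protocol !== 'https:') {
    throw new Error(`SERVICE_URL must start with http:// or https://, got: ${raw}`);
  }

  return url.href;
}
```

Failing early with a clear message tends to be easier to debug than a connection error that only shows up after the test run has finished.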

2.2 Write the custom reporter

Now we just have to populate the reporter() function inside the index.js file with some logic to send the reports to the Analysis service from Step 1 and display the results. The Analysis service will call the OpenAI platform with the prompt we have defined.

When the test run finishes, Nightwatch calls the reporter with the results argument, which contains the failure results and any other associated errors. Here’s what it will look like:

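// getErrorMessages, makeOutputs and sendErrorAnalysisRequest are helper functions
// defined elsewhere in the plugin; marked is the Markdown parser from the marked package
module.exports = {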
  async reporter(results) {

    const errors = getErrorMessages(results);

    if (!errors) {
      return;
    }

    const outputs = makeOutputs(errors);

    for (const output of outputs) {
      try {
        // await the request so the response is available before it is parsed
        const response = await sendErrorAnalysisRequest(output);
        const terminalOutput = marked.parse(response.data.analyzedResult);
        console.log('Error analysis completed:', terminalOutput);
      } catch (err) {
        console.error('Error analysis failed:', err.response?.data || err.message);
      }
    }
  }
}
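The getErrorMessages and makeOutputs helpers are not listed in this article. The sketch below shows roughly what they could look like, and it rests on assumptions: that results.modules maps each test suite name to an object whose completed property maps test case names to entries with an assertions array, and that a failed assertion carries a truthy failure property. The stack trace is used here as a stand-in for the code snippet. Check the shape of the results object in your Nightwatch version before relying on it.

```javascript
// Sketch only. Assumes results.modules[suite].completed[test].assertions is an
// array and that failed assertions have a truthy `failure` property.
function getErrorMessages(results) {
  const errors = [];

  for (const [suiteName, suite] of Object.entries(results?.modules ?? {})) {
    for (const [testName, testcase] of Object.entries(suite?.completed ?? {})) {
      for (const assertion of testcase?.assertions ?? []) {
        if (assertion.failure) {
          errors.push({
            suiteName,
            testName,
            errorMessage: assertion.fullMsg || assertion.message || String(assertion.failure),
            stackTrace: assertion.stackTrace || ''
          });
        }
      }
    }
  }

  // The reporter checks `if (!errors)`, so return null when nothing failed
  return errors.length > 0 ? errors : null;
}

// Groups the failures by test suite, producing one request payload per suite.
// The second argument is an assumption; the reporter above calls it with one.
function makeOutputs(errors, additionalDetails = {}) {
  const bySuite = new Map();
  for (const error of errors) {
    const group = bySuite.get(error.suiteName) ?? [];
    group.push(error);
    bySuite.set(error.suiteName, group);
  }

  return [...bySuite.values()].map((group) => ({
    errorMessage: group.map((e) => `${e.testName}: ${e.errorMessage}`).join('\n'),
    // Stand-in for the real code snippet, which would need to be read from the test file
    codeSnippet: group.map((e) => e.stackTrace).filter(Boolean).join('\n'),
    additionalDetails
  }));
}
```

Each payload has the errorMessage, codeSnippet and additionalDetails fields which the service from Step 1 reads from the request body.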

The sendErrorAnalysisRequest function simply issues a POST request with the test data.
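As an illustration, here is a minimal sketch of such a function using the global fetch available in Node.js 18 and later. The published plugin may well use a different HTTP client. The options argument (serviceUrl, fetchImpl) is an assumption added to make the function easy to test, and the return value is wrapped in a data property so that response.data.analyzedResult in the reporter keeps working.

```javascript
// Sketch: POSTs one payload to the analysis service using fetch (Node.js 18+).
// The options argument is an assumption of this example, not part of the plugin.
async function sendErrorAnalysisRequest(output, options = {}) {
  const serviceUrl = options.serviceUrl || process.env.SERVICE_URL;
  const fetchImpl = options.fetchImpl || fetch;

  const res = await fetchImpl(serviceUrl, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(output)
  });

  // Fall back to an empty object if the body is not valid JSON
  const data = await res.json().catch(() => ({}));

  if (!res.ok) {
    throw new Error(data.error || `Analysis service responded with status ${res.status}`);
  }

  // Same shape the reporter expects: response.data.analyzedResult
  return { data };
}
```

Note that errors thrown by this sketch have no response property, so the catch block in the reporter would fall back to printing err.message.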

💡 Another important thing to note is that the reporter issues a request for each failed test suite and displays the result. In real-life cases with a large volume of tests this might not scale and further optimisation would be needed, but that is beyond the scope of this tutorial.
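Without attempting a full solution, one small mitigation is to cap how many analysis requests are in flight at the same time. The helper below is a generic sketch and not part of the published plugin; mapWithConcurrency is a hypothetical name, and it records failures per item instead of stopping at the first one.

```javascript
// Sketch: runs `worker` over `items` with at most `limit` calls in flight.
// Each entry in the result is { status: 'fulfilled', value } or { status: 'rejected', reason }.
async function mapWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;

  // Each runner keeps picking up the next unprocessed item until none are left
  async function run() {
    while (next < items.length) {
      const index = next++;
      try {
        results[index] = { status: 'fulfilled', value: await worker(items[index], index) };
      } catch (reason) {
        results[index] = { status: 'rejected', reason };
      }
    }
  }

  const runnerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: runnerCount }, () => run()));
  return results;
}
```

In the reporter, the for loop could then be replaced with something like await mapWithConcurrency(outputs, 3, (output) => sendErrorAnalysisRequest(output)), printing each fulfilled result afterwards. The limit of 3 is an arbitrary value for illustration.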

Step 3 – putting it all together

Now that we have the plugin and the service, it’s time to put them together in a test project. We’re going to build a small end-to-end testing project which contains a few very basic tests for an example website. The project will use Nightwatch to run the tests and our newly created plugin.

3.1 Set up a test project

First, create a test project with the commands below:

mkdir nightwatch-testing
cd nightwatch-testing
npm init -y

3.2 Install nightwatch from NPM

Nightwatch can be installed from NPM by simply running the command below, after which it is ready to go:

npm i nightwatch

You can verify that Nightwatch has been installed with:

npx nightwatch --info

3.3 Add the Nightwatch reporting plugin

Now it’s time to add the AI analysis plugin which we developed in Step 2 to our test project, so Nightwatch can discover it and use it.

You can either install the package directly from NPM or you can use the local version if you have completed Step 2 in full.

To install it from NPM run:

npm i nightwatch-openai-plugin

To install it from your local folder (update the path accordingly, relative locations also work):

npm i /path/to/my-nightwatch-ai-reporter

3.4 Configure Nightwatch to load the plugin

In order for Nightwatch to be able to load the plugin, we need to define it in the nightwatch config file – nightwatch.conf.js.

First, let’s examine the package.json file. It should have the plugin in the list of dependencies. Assuming the plugin was installed from NPM, it will look like so:

{
  "name": "nightwatch-testing",
  ...
  "dependencies": {
    "nightwatch": "^3.3.1",
    "nightwatch-openai-plugin": "^0.1.0"
  }
}

Now open the nightwatch.conf.js file and add nightwatch-openai-plugin to the plugins array, as below:

// nightwatch.conf.js

module.exports = {
  // ... other settings
  plugins: ['nightwatch-openai-plugin'],
  // continued...  
}

You can verify that Nightwatch is installed and working properly by running an example test which is bundled with the library, using Chrome:

npx nightwatch examples/tests/duckDuckGo.js --chrome

The output will look like:

ℹ Connected to ChromeDriver on port 9515 (1001ms).
Using: chrome (119.0.6045.123) on MAC.


  Running Search Nightwatch.js and check results:
───────────────────────────────────────────────────────────────
  ✔ Element <body> was visible after 15 milliseconds.
  ✔ Testing if element <input[name="q"]> is visible (17ms)
  ✔ Testing if element <button[type=submit]> is visible (14ms)
  ✔ Testing if element <.react-results--main> contains text 'Nightwatch.js' (1545ms)

  ✨ PASSED. 4 assertions. (2.534s)

You can also specify --firefox, --safari, or --edge depending on which of these browsers you have installed on your machine.

Step 4 – run the tests and check the analysis report

Now that we have Nightwatch installed and configured with the plugin for the AI-assisted analysis service we developed in Step 1, we can run some more end-to-end tests and see it in action.

If you haven’t completed Step 1 or if you want to get a quick bite before you start, I have prepared an example project for you, including a demo backend service which is up and running so you can see it in action:

Feel free to fork it and run it on your local machine. Please do note that the Analysis service is running in a limited capacity for Demo purposes and should not be used in real-life testing scenarios.

4.1 Adding some end-to-end tests

For those of you who have completed the previous steps and are neck deep in it, we just need to add a few basic tests so we can run everything locally.

Go to the nightwatch-testing folder and create a new test folder:

mkdir test

Then add these 2 tests inside the test folder:

1) homepage.js

describe('Homepage End-to-end Test', () => {

  it('tests if homepage is loaded', browser => {
    browser
      .navigateTo('https://middlemarch.netlify.app/')
      .assert.visible('#app .new-arrivals-panel')
      .expect.elements('#app .new-arrivals-panel .col-md-6').count.toEqual(4)
  });
  
});

2) addtocart.js

describe('add to cart test', () => {

  before(browser => browser.navigateTo('https://middlemarch.netlify.app/'));

  it('adds 2 volumes of "Rhinoceros and Other Plays" to cart', browser => {
    const addToCartEl = browser.element.findByText('Rhinoceros and Other Plays').getParentElement().find('button');
    addToCartEl.click()
    addToCartEl.click()

    browser.assert.textEquals('.shopping-cart .badge', '2');
  });

  after(browser => browser.end());
});

Both tests are written against an example bookstore app, part of an earlier tutorial I wrote here about Vite and Vue 3. The first test simply opens up the website and verifies that the content is there, while the second test adds a book to the cart and performs a basic assertion.

To run the tests, use the command below, optionally passing the --headless argument if you don’t want to see the browser popping up during the tests:

npx nightwatch test --chrome

Or use --firefox, --safari, or --edge depending on which browsers you have available on your machine.

4.2 Intentionally fail the tests

In order to test the AI Analysis service we’ll need to intentionally fail at least one of the tests. Then the plugin reporter will come into effect and send the test failure to the backend service, and then print the result.

Thankfully there are many available options to fail a test. One of the simplest ways is to rename one of the elements and then wait for the test to fail with an “element not found” error.

Simulate “element not found” errors

Edit the homepage.js file from the test folder and simply change the line that starts with .assert.visible to something like:

browser
 .navigateTo('https://middlemarch.netlify.app/')
 .assert.visible('#xapp .new-arrivals-panel')

Then the test will simply fail with an error message that the element with the selector #xapp .new-arrivals-panel could not be found and will print something similar to the following output:

TEST FAILURE (12.844s):  
 - 1 assertions failed; 1 passed

✖ 1) homepage

 – tests if homepage is loaded (7.903s)

 → ✖ NightwatchAssertError
 Testing if element <#xapp .new-arrivals-panel> is visible in 5000ms - expected "is visible" but got: "element could not be located" (5131ms)

    Error location:
    /Users/andrei/pineviewlabs/nightwatch-openai/test/homepage.js:6
    –––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––
     4 |     browser
     5 |       .navigateTo('https://middlemarch.netlify.app/')
     6 |       .assert.visible('#xapp .new-arrivals-panel') 
     7 |       .expect.elements('#app .new-arrivals-panel .col-md-6').count.toEqual(4)
     8 |   });
    –––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––

Then we can actually see the report from the AI Analysis service:

Error analysis completed: The failure indicates that the element with CSS selector #xapp .new-arrivals-panel was not found within the time frame of 5000ms (5 seconds). Here's what you can do to debug this:

    1. Update the test code:

    .debug({selector: "#xapp .new-arrivals-panel"}) // Add this line
    .browser.expect.element('#xapp .new-arrivals-panel').to.be.visible.before(5000);

    2. Run Nightwatch with the flags for debugging:

    nightwatch --debug --devtools

This will open the Chrome Developer Tools where you can inspect the page and the console.

Probable reasons for the error could include:

    * The element does not exist on the page at the time of testing.
    * The element is not visible within the 5 seconds because the page hasn't finished loading, there are network delays, or JavaScript that displays the element runs late.
    * The selector is incorrect or has changed.
    * There's an error in the page's JavaScript that prevents the element from being displayed correctly.

The report might be a bit too long and overly generic, but it is now only a matter of tuning the prompt so it generates the desired results, which is not a task for this article.

4.3 Configure the analysis backend service

The nightwatch-openai-plugin uses a default HTTP API service to interact with the OpenAI API, provided for demo purposes. You can host your own service by cloning the openai-nightwatch-service repository and running it with your own OpenAI API key.

When running your own instance of the openai-nightwatch-service, you need to define the NIGHTWATCH_ANALYSIS_SERVICE_URL environment variable in your Nightwatch project so that it points to the service URL. You can also use .env files.

For example, assuming you have the service running at http://localhost:4001, you can create a .env file with the following content in the root of your Nightwatch project:

NIGHTWATCH_ANALYSIS_SERVICE_URL=http://localhost:4001/analyze-error

Conclusion

So there you have it, we have managed to build (I hope) an AI-assisted analysis plugin for Nightwatch tests and we’ve also seen it in action. You can now try to simulate all kinds of errors and see what the response is.

Remember this is just an experiment and the service is only available for Demo purposes. I haven’t attempted to try out different types of errors and test failures yet and I haven’t tuned the prompt for too long so I can’t guarantee it will scale up to a large collection of tests where you might have different types of failures. So you’ll have to try this at your own risk, but you’re welcome to perform your own experiments and report back your findings. Thanks for reading.

Pineview Labs