Test Data Management Strategies

Hello all,

Today I am going to talk about different approaches to handling your test data when running automated tests, and the trade-offs of each.

 

Database

Injecting the data before running the tests with SQL scripts (MySQL, PostgreSQL) is one of the most common approaches. You can inject the data you will need for the tests and skip all the setup, since creating that data is not the goal of all your scenarios, right?

For the scenarios where you actually need to test the creation of the data, you won't use this kind of script. For example, in JavaScript you would add a setup/data management class with a @BeforeAll hook and then something like this:

var mysql = require('mysql');

// Connection details for the test database
var con = mysql.createConnection({
  host: "localhost",
  user: "root",
  password: "12345",
  database: "javatpoint"
});

con.connect(function (err) {
  if (err) throw err;
  console.log("Connected!");

  // Insert the record the tests depend on
  var sql = "INSERT INTO employees (id, name, age, city) VALUES ('1', 'Ajeet Kumar', '27', 'Allahabad')";
  con.query(sql, function (err, result) {
    if (err) throw err;
    console.log("1 record inserted");
  });
});

Then you can have a @TearDown/@AfterAll function to delete the data that was created for the tests.
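
For example, a minimal sketch of that cleanup, reusing the con connection from the setup above (the DELETE matches the record inserted earlier):

// Cleanup step for a @TearDown/@AfterAll hook
con.query("DELETE FROM employees WHERE id = '1'", function (err, result) {
  if (err) throw err;
  console.log("Test data deleted");
  con.end(); // close the connection once the cleanup is done
});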

Files

If, for example, you are running some API tests, you might want to have static data ready to be injected for each scenario. You can create a JSON file and add all the fields and values that are going to be used during your automation:

[
  {
    "name": "John",
    "age": 31,
    "city": "New York"
  },
  {
    "name": "Rafa",
    "age": 29,
    "city": "London"
  }
]

Then you can load this file and use it during your tests. You can create this data upfront in the environment instead, but then you need to make sure it is always going to be there; otherwise you will have to create it again (during your tests or manually).
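
For example, a minimal sketch of loading the file in JavaScript (the file path is just an assumption for illustration):

var fs = require('fs');

// Load the static test data once so every scenario can reuse it
var users = JSON.parse(fs.readFileSync('./test-data/users.json', 'utf8'));

console.log(users[0].name); // "John"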

 

Objects

You can create objects with the data that you are going to need for the automated tests. For example, you can create a dictionary-like object in JavaScript:

var dict = {
  FirstName: "Rafa",
  Age: 30,
  Country: "UK"
};

Then again, you need to make sure this data is created at runtime, maybe in a @BeforeAll function or a setup class. Or maybe it is something you have already created in the environment, in which case you need to make sure it is going to be there when running the tests; otherwise you need to create it again.
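
As a minimal sketch, assuming a Jest-style runner (beforeAll/afterAll globals) and hypothetical createUser/deleteUser helpers that talk to your application:

beforeAll(async function () {
  // Create the data at runtime so the tests do not depend on pre-existing state
  await createUser(dict); // hypothetical helper that calls your application's API
});

afterAll(async function () {
  // Clean up afterwards so the scenarios stay independent
  await deleteUser(dict.FirstName); // hypothetical cleanup helper
});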

 

Docker

If you can control the database or the deployment of your QA environment, then it means you can also manipulate the database when running the tests.

If you use Docker to create the environment, you can add a volume or even seed the database with docker-compose.

Volume

Volumes are often a better choice than persisting data in a container’s writable layer because a volume does not increase the size of the containers using it, and the volume’s contents exist outside the lifecycle of a given container.

You can mount the database (a JSON file, a .db file) entirely into the Docker container:

 docker run -it --name my-directory-test -v /hostvolume:/containervolume centos /bin/bash

Seed

Write a small script that generates randomized and varying data and writes it to the database. Then you can wrap this script into your own Docker image in order to execute it automatically via docker-compose.
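
A minimal sketch of such a seed script, assuming the official mongodb Node.js driver; the database name matches the compose example below, while the users collection and the generated fields are just assumptions:

var { MongoClient } = require('mongodb');

// Host and port come from the environment (see the compose file below), with local defaults
var host = process.env.MONGODB_HOST || 'localhost';
var port = process.env.MONGODB_PORT || 27017;

async function seed() {
  var client = new MongoClient('mongodb://' + host + ':' + port);
  await client.connect();

  // Generate slightly different data on every run
  var users = Array.from({ length: 10 }, function (_, i) {
    return {
      name: 'user-' + i,
      age: Math.floor(Math.random() * 50) + 18
    };
  });

  await client.db('testautomation').collection('users').insertMany(users);
  await client.close();
}

seed();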

 

In this example I am using a MongoDB database:

docker-compose.yml

version: '3'

services:

  mongodb:
    image: mongo
    container_name: mongo
    ports:
      - 27017:27017

  mongo-seed:
    build: .
    environment:
      - MONGODB_HOST=mongo
      - MONGODB_PORT=27017
    volumes:
      - ./config/db-seed:/data
    depends_on:
      - mongodb
    command: mongoimport --host mongo --port 27017 --db testautomation --mode upsert --type json --file /data/data.json --jsonArray

data.json

[
  {
    "name": "Peter Parker",
    "email": "spiderman@gmail.com",
    "age": 28
  },
  {
    "name": "Bruce Wayne",
    "email": "batman@gmail.com",
    "age": 48
  }
]
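
Once the containers are up, your tests can read the seeded documents back. A minimal sketch, again assuming the mongodb Node.js driver (without a --collection flag, mongoimport derives the collection name from the file name, so data.json ends up in a collection called data):

var { MongoClient } = require('mongodb');

async function getSeededSuperheroes() {
  var client = new MongoClient('mongodb://localhost:27017');
  await client.connect();

  // data.json was imported into the "data" collection of the testautomation database
  var docs = await client.db('testautomation').collection('data').find({}).toArray();

  await client.close();
  return docs;
}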

 

Scenarios

If you are working with Gherkin syntax, you can also add the data in the middle of the scenario and then use it during the automation. Something like:

Scenario: Correct number of movies found by superhero
  Given I have the following movies
    | Batman Begins     | Batman       |
    | Wonder Woman      | Wonder Woman |
    | Wonder Woman 1984 | Wonder Woman |
  When I search for movies by superhero Wonder Woman
  Then I find 2 movies

Then you can get this data from the step definitions and use it during your tests.
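
For example, a minimal sketch of the step definitions consuming that table, assuming cucumber-js (@cucumber/cucumber) and a simple in-memory search to keep the example self-contained:

var { Given, When, Then } = require('@cucumber/cucumber');
var assert = require('assert');

var movies = [];
var results = [];

Given('I have the following movies', function (dataTable) {
  // dataTable.raw() returns the table as an array of [title, superhero] rows
  movies = dataTable.raw().map(function (row) {
    return { title: row[0], superhero: row[1] };
  });
});

When(/^I search for movies by superhero (.+)$/, function (superhero) {
  results = movies.filter(function (movie) { return movie.superhero === superhero; });
});

Then('I find {int} movies', function (count) {
  assert.strictEqual(results.length, count);
});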

You might have other ways to create and manage test data, but whatever approach you decide on, make sure the scenarios are independent and, if you can, clean up the environment data afterwards (unless you have decided to keep static data in the environment for now).

 

Resources:

https://forums.docker.com/t/seeding-data-volume-containers-mongodb/2214

https://stackoverflow.com/questions/31210973/how-do-i-seed-a-mongo-database-using-docker-compose

https://www.baeldung.com/cucumber-data-tables

https://docs.docker.com/storage/volumes/

https://phauer.com/2018/local-development-docker-compose-seeding-stubs/
