JSON Schemas are your True Testing Friend

July 10, 2018

By Gleb Bahmutov

We use JSON schemas to describe the data flowing through our system, document API routes, test server code and even validate fixtures used during end-to-end testing. This long(ish) blog post describes in detail how we did this.

Introduction

At Cypress we have 3 large pieces of software:

  • The Cypress Test Runner, an Electron desktop application.
  • The API which receives test results from the Test Runner when running cypress run --record
  • The Cypress Dashboard Service, a web application that shows test results to users.

When we were planning version 3 of the Cypress Test Runner, we needed to make sure big changes to the way the Test Runner processes test runs did not break anything; these changes were especially important as we were preparing to introduce automatic test load balancing. Not only did the Test Runner have to change, but the API also had to introduce new features while supporting both existing versions of the Test Runner and the new v3. The changes would also affect the Cypress Dashboard.

In short - we had a huge amount of work on our hands; and this work had to be backwards compatible.

The solution we came up with to avoid accidentally breaking our clients (both the Test Runner and the Dashboard could be called our API’s clients) was easy to understand and use. It solved the version problem, made testing super simple and made self-documenting code a reality. This blog post describes our solution by going through an API and showing what we did at each step.

TLDR

In a nutshell: we wrote a collection of versioned JSON schemas that represent the domain objects in our system. We also wrote a library of tools called schema-tools to validate the data against these schemas. We validate everything using schemas: the data going back and forth between any client and the API; plus our test fixtures to ensure they are always a true representation of the real-world data.

Finally, we combined schema validation with automatic object sanitize methods that replace highly dynamic properties (think dates and uuids) with their “default” values, ensuring that during the test we can validate the object, sanitize it and then save the entire object as a test snapshot. This allows our tests to always deal with realistic data, while providing plenty of context to each test. Dealing with full data avoids the problem of some test inputs being partial objects, missing most properties that are present in the real usage. This is a problem nicely explained by Justin Searls @searls in his highly entertaining @assertjs talk Please don’t mock me.

Let’s build it

Let’s set up an example REST server with a single resource - a list of todos. Step by step I will describe how to make JSON schemas that describe requests and responses, and then how to use the schemas to validate items in the API. Then I will show how to validate test data, both in Node tests and in the end-to-end API tests that use the Cypress Test Runner and load test data from fixture files.

You can find the source code in the todo-api-with-json-schema repository on GitHub.

Initial setup

Install json-server and reset middleware.

npm install --save json-server-reset json-server

Let’s use TypeScript right away via ts-node. While TypeScript is completely optional, it helps a lot to avoid silly coding mistakes.

npm i -S ts-node typescript

Install schema tools from Cypress.

npm i -S @cypress/schema-tools

Create a db.json file with our todos.

{
  "todos": []
}

And start the server by defining a script in the package.json.

{
  "scripts": {
    "start": "json-server db.json --middlewares node_modules/json-server-reset"
  }
}

In one terminal window, start the server using npm start; this starts the local server on port 3000. In a second terminal, post a new object and then list all todos; here I am using httpie instead of curl.

$ http POST :3000/todos text="do something" done:=false
HTTP/1.1 201 Created
{
    "done": false,
    "id": 1,
    "text": "do something"
}
$ http :3000/todos
HTTP/1.1 200 OK
[
  {
    "done": false,
    "id": 1,
    "text": "do something"
  }
]

And we can reset the database back to an empty list of items.

$ http POST :3000/reset todos:=[]
HTTP/1.1 200 OK
$ http :3000/todos
HTTP/1.1 200 OK
[]

Schemas

JSON Schema

JSON Schema is a project intended to describe and validate JSON documents. It has been around for a while and aims to eventually become a standard. At its core is a convention to describe data objects using an easy to read and valid format. Here is a typical example:

{
  "title": "Person",
  "type": "object",
  "properties": {
    "firstName": {
      "type": "string"
    },
    "lastName": {
      "type": "string"
    },
    "age": {
      "description": "Age in years",
      "type": "integer",
      "minimum": 0
    }
  },
  "required": ["firstName", "lastName"]
}

The schemas can be nested, contain lists, custom formats, etc. I think they are powerful enough to describe most of the cases used by your typical REST API.
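To make the idea concrete, here is a minimal hand-rolled sketch of what a validator does with the Person schema above. It understands only flat objects, three primitive types, required and minimum. The names SimpleSchema and validateSimple are hypothetical, and this is not how any real JSON Schema library is implemented.

```typescript
// Hypothetical, heavily simplified schema shape for illustration only.
type SimpleProperty = {
  type: 'string' | 'integer' | 'boolean'
  minimum?: number
}

type SimpleSchema = {
  title: string
  properties: Record<string, SimpleProperty>
  required: string[]
}

const personSchema: SimpleSchema = {
  title: 'Person',
  properties: {
    firstName: { type: 'string' },
    lastName: { type: 'string' },
    age: { type: 'integer', minimum: 0 },
  },
  required: ['firstName', 'lastName'],
}

// Returns a list of human-readable errors; an empty list means "valid".
const validateSimple = (
  schema: SimpleSchema,
  data: Record<string, unknown>
): string[] => {
  const errors: string[] = []
  // every required property must be present
  for (const name of schema.required) {
    if (!(name in data)) {
      errors.push(`data.${name} is required`)
    }
  }
  // every present property must have the declared type
  for (const name of Object.keys(schema.properties)) {
    if (!(name in data)) continue
    const prop = schema.properties[name]
    const value = data[name]
    const typeOk =
      prop.type === 'integer'
        ? typeof value === 'number' && Math.floor(value) === value
        : typeof value === prop.type
    if (!typeOk) {
      errors.push(`data.${name} is the wrong type`)
    } else if (
      prop.minimum !== undefined &&
      typeof value === 'number' &&
      value < prop.minimum
    ) {
      errors.push(`data.${name} is less than minimum`)
    }
  }
  return errors
}
```

Calling validateSimple(personSchema, { firstName: 'Joe', age: -1 }) reports two problems: the missing lastName and the negative age.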

Because JSON Schemas are really just JSON objects, there are tools to work with schemas in many languages, see the list. In the JavaScript environment, we have tried two libraries for validating objects: is-my-json-valid and jsen. Both were excellent, and we picked is-my-json-valid because I had used it before.

Schema tools

We wrote additional code to make working with json-schemas easier and published the tools under @cypress/schema-tools on NPM. I will use @cypress/schema-tools to describe the shape of a typical TODO item - it has only two properties. You can find this schema in the example repo. At the moment I am using @cypress/[email protected].

To describe an object, we use type ObjectSchema with 3 required properties:

  • version - an object describing the semantic version of this schema to track changes. We will start with version 1.0.0.
  • schema - the JSON Schema itself describing the properties of a valid object.
  • example - this is extremely useful when showing error messages. Since we are programming in TypeScript the example has its own type, but this is optional. You can use schemas without any TypeScript code.

// schemas/post-todo-request.ts
import { ObjectSchema } from '@cypress/schema-tools'

/**
 * Todo item sent by the client.
 */
type PostTodoRequestExample100 = {
  text: string
  done: boolean
}

const postTodoExample100: PostTodoRequestExample100 = {
  text: 'do something',
  done: false,
}

// ObjectSchema describing a single POST TODO item request
const PostTodoRequest100: ObjectSchema = {
  version: {
    major: 1,
    minor: 0,
    patch: 0,
  },
  schema: {
    title: 'PostTodoRequest',
    type: 'object',
    description: 'Todo item sent by the client',
    properties: {
      text: {
        type: 'string',
        description: 'Todo text, like "clean room"',
      },
      done: {
        type: 'boolean',
        description: 'Is this todo item completed?',
      },
    },
    // require all properties
    required: true,
    // do not allow any extra properties
    additionalProperties: false,
  },
  example: postTodoExample100,
}

This is the only schema for now, but in the future the schema might change. Each schema should have its own version (hopefully following the semantic versioning principles), and we can store all versions in a collection. For now we have just a single version of POST Todo request, so let’s put it into a list PostTodoRequest.

// schemas/post-todo-request.ts
// PostTodoRequest100 as above
import { versionSchemas } from '@cypress/schema-tools'
export const PostTodoRequest = versionSchemas(PostTodoRequest100)

In a “normal” system, to support older clients, the PostTodoRequest list would be composed of multiple versions, all representing different shapes of the request.

// schemas/post-todo-request.ts
// v1.0.0 was our starting version
// v1.1.0 probably has added a property
// v2.0.0 has some breaking change - maybe a property has been removed or renamed
const PostTodoRequest = versionSchemas(
  PostTodoRequest100, PostTodoRequest110, PostTodoRequest200
)
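One way to picture such a versioned list is as a lookup table keyed by the semantic version string. The sketch below is only an illustration: SchemaVersion, VersionedEntry, indexByVersion and the extra "note" field are hypothetical names, and the real versionSchemas may store things differently.

```typescript
// Hypothetical stand-ins for the real types, for illustration only.
type SchemaVersion = { major: number; minor: number; patch: number }
type VersionedEntry = { version: SchemaVersion; example: object }

// Turns { major: 1, minor: 0, patch: 0 } into "1.0.0".
const versionToString = (v: SchemaVersion): string =>
  [v.major, v.minor, v.patch].join('.')

// Stores every version of one schema under its semver string.
const indexByVersion = (
  ...entries: VersionedEntry[]
): Record<string, VersionedEntry> => {
  const byVersion: Record<string, VersionedEntry> = {}
  for (const entry of entries) {
    byVersion[versionToString(entry.version)] = entry
  }
  return byVersion
}

const request100: VersionedEntry = {
  version: { major: 1, minor: 0, patch: 0 },
  example: { text: 'do something', done: false },
}
const request110: VersionedEntry = {
  version: { major: 1, minor: 1, patch: 0 },
  // "note" is an invented property that stands for "something was added"
  example: { text: 'do something', done: false, note: 'hypothetical field' },
}

const postTodoRequestVersions = indexByVersion(request100, request110)
// postTodoRequestVersions['1.1.0'] is the newer shape, while
// postTodoRequestVersions['1.0.0'] is still available for older clients.
```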

Put all individual collections like PostTodoRequest into a single schemas object and export it - the object schemas is what every other function from @cypress/schema-tools will take as an argument.

// schemas/index.ts
import { SchemaCollection, combineSchemas } from '@cypress/schema-tools'
import { PostTodoRequest } from './post-todo-request'
export const schemas: SchemaCollection = combineSchemas(PostTodoRequest)

Later we will add PostTodoResponse to the schemas collection to describe the object returned from the API.

Validating an object

Now that we have the schemas collection, we can check if a given object follows the schema. In code this would look like this (I wrote a Jest test to show validation):

// __tests__/post-todo-request-test.ts
import { assertSchema } from '@cypress/schema-tools'
import { schemas } from '../schemas'

const assertTodoRequest = assertSchema(schemas)('postTodoRequest', '1.0.0')

test('valid TODO request object', () => {
  const todo = {
    text: 'use schemas',
    done: true,
  }
  expect(() => {
    assertTodoRequest(todo)
  }).not.toThrow()
})

We can write another test to see how the schema validates an object that does not follow the expected shape.

// __tests__/post-todo-request-test.ts
test('TODO request object missing text', () => {
  const todo = {
    done: true,
  }
  expect(() => {
    assertTodoRequest(todo)
  }).toThrowErrorMatchingSnapshot()
})

The saved snapshot shows how useful the example object included in the ObjectSchema is:

// __tests__/__snapshots__/post-todo-request-test.ts.snap
exports[`TODO request object missing text 1`] = `
"Schema postTodoRequest@1.0.0 violated

Errors:
data.text is required

Current object:
{
  \\"done\\": true
}

Expected object like this:
{
  \\"done\\": false,
  \\"text\\": \\"do something\\"
}"
`;

Our goal was to describe really well what the actual and the expected objects were, and how they differed. Anyone debugging an error in the system should be able to pinpoint the problem without searching through the tests or documentation for what a “correct” object was supposed to look like; this became a huge time saver for our team.
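The layout of that message is easy to reproduce. Here is a small sketch that assembles a similar report from its parts; formatSchemaError is a hypothetical helper written for this post, not a function exported by schema-tools.

```typescript
// Hypothetical helper: builds a report with the same sections as the
// snapshot above (title, errors, current object, expected example).
const formatSchemaError = (
  schemaName: string,
  version: string,
  errors: string[],
  current: object,
  example: object
): string =>
  [
    `Schema ${schemaName}@${version} violated`,
    '',
    'Errors:',
    ...errors,
    '',
    'Current object:',
    JSON.stringify(current, null, 2),
    '',
    'Expected object like this:',
    JSON.stringify(example, null, 2),
  ].join('\n')

const report = formatSchemaError(
  'postTodoRequest',
  '1.0.0',
  ['data.text is required'],
  { done: true },
  { done: false, text: 'do something' }
)
```

Keeping the example next to the failing object means the reader never has to open the schema file to see what was expected.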

Schema API

In our repository we are always going to work with the same schema collection. Thus writing assertSchema(schemas)... quickly grows tiresome.

import { assertSchema } from '@cypress/schema-tools'
import { schemas } from '../schemas'
const assertTodoRequest = assertSchema(schemas)('postTodoRequest', '1.0.0')
//                        ^^^^^^^^^^^^^^^^^^^^^ again and again 😟

schema-tools provides a method to “bind” the schemas collection to assertSchema and other methods.

// schemas/index.ts
import { bind } from '@cypress/schema-tools'
export const api = bind({ schemas })
/*
  api has methods to validate, sanitize, etc. objects against "schemas"
  {
    assertSchema: [Function],
    schemaNames: [ 'postTodoRequest' ],
    getExample: [Function],
    sanitize: [Function],
    validate: [Function]
  }
*/

The exported object api has methods to validate, sanitize, and do other things against the schemas collection. For example, we can get the example object for a particular schema:

// __tests__/post-todo-request-test.ts
test('bind schemas', () => {
  const api = bind({ schemas })
  const todoRequestExample = api.getExample('postTodoRequest')('1.0.0')
  expect(todoRequestExample).toEqual({
    text: 'do something',
    done: false,
  })
})

And we can check that the included example actually passes the schema:

// __tests__/post-todo-request-test.ts
test('bind schemas and assert an object', () => {
  const api = bind({ schemas })
  const schemaName = 'postTodoRequest'
  const schemaVersion = '1.0.0'
  const example = api.getExample(schemaName)(schemaVersion)
  const assertRequest = api.assertSchema(schemaName, schemaVersion)
  expect(() => {
    assertRequest(example)
  }).not.toThrow()
})
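The binding itself is just partial application: every helper takes the collection as its first argument, and the bound object supplies it once. A miniature sketch of that idea follows; Examples, getExampleFrom and bindCollection are hypothetical names, and the real bind covers more methods.

```typescript
// Hypothetical miniature collection: schema name -> version -> example.
type Examples = Record<string, Record<string, object>>

// A curried helper that needs the collection first.
const getExampleFrom =
  (collection: Examples) =>
  (name: string) =>
  (version: string): object | undefined => {
    const versions = collection[name]
    return versions ? versions[version] : undefined
  }

const schemaNamesOf = (collection: Examples): string[] =>
  Object.keys(collection)

// Supplies the collection once and returns ready-to-use methods.
const bindCollection = (collection: Examples) => ({
  getExample: getExampleFrom(collection),
  schemaNames: schemaNamesOf(collection),
})

const boundApi = bindCollection({
  postTodoRequest: {
    '1.0.0': { text: 'do something', done: false },
  },
})
// boundApi.getExample('postTodoRequest')('1.0.0') no longer mentions
// the collection at the call site.
```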

So now we have our schemas, and we have several utility methods provided by the schema-tools that are convenient to use. Let’s start taking advantage of them!

Validating API request and response

Writing middleware

We can start by writing a placeholder middleware to just print the new object being sent.

// schema-check.ts
const schemaCheck = (req, res, next) => {
  if (req.method === 'POST' && req.path === '/todos') {
    console.log('posting new TODO item')
    console.log(req.body)
  }
  next()
}
export = schemaCheck

Since I am using TypeScript, I need to run json-server via ts-node, which changes my npm start command in the package.json file:

{
  "scripts": {
    "start": "ts-node node_modules/.bin/json-server db.json --middlewares node_modules/json-server-reset --middlewares ./schema-check.ts"
  }
}

Start the server and observe the objects being printed

posting new TODO item
{ text: 'do something', done: false }
POST /todos 201 20.157 ms - 56

We have already described the object we expect the client to post, so the middleware can check each request against that schema.

Validating the request

Now let’s add request validation to our middleware

// schema-check.ts
import { api } from './schemas'

const assertTodoRequest = api.assertSchema('PostTodoRequest', '1.0.0')

const schemaCheck = (req, res, next) => {
  if (req.method === 'POST' && req.path === '/todos') {
    console.log('posting new TODO item')
    console.log(req.body)
    try {
      assertTodoRequest(req.body)
    } catch (e) {
      console.error('new Todo request did not pass schema')
      console.error(e.message)
      return next(e)
    }
  }
  next()
}

export = schemaCheck

If we send a “good” object, everything works.

$ http POST :3000/todos text="do this" done:=false
HTTP/1.1 201 Created

But sending an invalid object returns an error (I am just using the default 500 HTTP status code; something like 400 Bad Request or 422 Unprocessable Entity would be more appropriate in the real world).

$ http POST :3000/todos foo="bar"
HTTP/1.1 500 Internal Server Error

The server shows the error message - the schema-tools focuses on showing all relevant information to allow one to understand the exact reason for schema violation right away.

posting new TODO item
{ foo: 'bar' }
new Todo request did not pass schema
Schema PostTodoRequest@1.0.0 violated

Errors:
data.text is required
data.done is required
data has additional properties: foo

Current object:
{
  "foo": "bar"
}

Expected object like this:
{
  "done": false,
  "text": "do something"
}

By default the error message contains the schema name and version, the list of failed properties, the object being validated and the example object for this schema. You can also destructure these properties from the thrown error to form your own, shorter error message. This is very useful for larger schemas, where the full message is too verbose.

// schema-check.ts
try {
  assertTodoRequest(req.body)
} catch (e) {
  const {errors, example} = e
  console.error(errors)
  console.error('example object %j', example)
}

Then you can, for example, send just the errors to the client while sending the complete error information to the crash reporting service.
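A sketch of that split is shown here. The SchemaViolation type lists only the properties mentioned in this post, and the two helper names plus the 400 status code are assumptions made for the illustration.

```typescript
// Hypothetical shape of the thrown error; only "message", "errors" and
// "example" are taken from the text above.
type SchemaViolation = {
  message: string
  errors: string[]
  example: object
}

// The client gets a short answer (400 is an assumed status code).
const toClientResponse = (e: SchemaViolation) => ({
  status: 400,
  body: { errors: e.errors },
})

// The crash reporting service gets everything we know.
const toCrashReport = (e: SchemaViolation) => ({
  message: e.message,
  errors: e.errors,
  example: e.example,
})

const violation: SchemaViolation = {
  message: 'new Todo request did not pass schema',
  errors: ['data.text is required', 'data.done is required'],
  example: { done: false, text: 'do something' },
}
const clientResponse = toClientResponse(violation)
const crashReport = toCrashReport(violation)
```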

Validating the response

json-server returns an object that looks like this:

{
  "done": false,
  "id": 2,
  "text": "do this"
}

Our middleware should validate returned objects to make sure the server is working correctly and the client receives what it expects to receive. At this point we are going to add the PostTodoResponse 1.0.0 schema. To avoid confusion I advise you to put the schemas into a separate folder. Name the schema files post-todo-request.ts and post-todo-response.ts to follow the VERB-name-{request|response}.ts pattern. The response schema looks very much like the request, but has an additional id property.

// schemas/post-todo-response.ts
import { ObjectSchema, versionSchemas } from '@cypress/schema-tools'

/**
 * Todo item saved by the server and returned to the client.
 */
type PostTodoResponseExample100 = {
  text: string
  done: boolean
  id: number
}

const postTodoResponseExample100: PostTodoResponseExample100 = {
  text: 'do something',
  done: false,
  id: 2,
}

const PostTodoResponse100: ObjectSchema = {
  version: {
    major: 1,
    minor: 0,
    patch: 0,
  },
  schema: {
    title: 'PostTodoResponse',
    type: 'object',
    description: 'Todo item saved by the server and returned to the client',
    properties: {
      text: {
        type: 'string',
        description: 'Todo text, like "clean room"',
      },
      done: {
        type: 'boolean',
        description: 'Is this todo item completed?',
      },
      id: {
        type: 'integer',
        minimum: 1,
        description: 'Item server id',
      },
    },
    // require all properties
    required: true,
    // do not allow any extra properties
    additionalProperties: false,
  },
  example: postTodoResponseExample100,
}

export const PostTodoResponse = versionSchemas(PostTodoResponse100)

Our schemas/index.ts just collects all individual schemas and puts them into a single object.

import { SchemaCollection, bind, combineSchemas } from '@cypress/schema-tools'
import { PostTodoRequest } from './post-todo-request'
import { PostTodoResponse } from './post-todo-response'

export const schemas: SchemaCollection = combineSchemas(
  PostTodoRequest,
  PostTodoResponse
)

export const api = bind({ schemas })

Now we can use the response schema to validate successful results from the server.

import { api } from './schemas'

const assertTodoRequest = api.assertSchema('PostTodoRequest', '1.0.0')
const assertTodoResponse = api.assertSchema('PostTodoResponse', '1.0.0')

// POST /todos answers with 201, so treat any 2xx status as successful
const isSuccessful = res => res.statusCode >= 200 && res.statusCode < 300

const validateJsonResponse = res => {
  const resJson = res.jsonp.bind(res)
  res.jsonp = data => {
    // TODO: only check successful responses
    // otherwise we could be checking JSON error objects
    assertTodoResponse(data)
    return resJson(data)
  }
}

const schemaCheck = (req, res, next) => {
  if (req.method === 'POST' && req.path === '/todos') {
    console.log('posting new TODO item')
    console.log(req.body)
    try {
      assertTodoRequest(req.body)
    } catch (e) {
      console.error('new Todo request did not pass schema')
      console.error(e.message)
      return next(e)
    }

    validateJsonResponse(res)
  }
  next()
}

export = schemaCheck

We are validating both the request and the response, and can even set the response schema name and version as headers.

// schema-check.ts
const schemaNameHeader = 'x-schema-name'
const schemaVersionHeader = 'x-schema-version'
const validateJsonResponse = res => {
  const resJson = res.jsonp.bind(res)
  res.jsonp = data => {
    // TODO: only check successful responses
    // otherwise we could be checking JSON error objects
    assertTodoResponse(data)
    res.set(schemaNameHeader, 'PostTodoResponse')
    res.set(schemaVersionHeader, '1.0.0')
    return resJson(data)
  }
}

We can see the headers in the response:

$ http POST :3000/todos text="do this" done:=false
HTTP/1.1 201 Created
x-schema-name: PostTodoResponse
x-schema-version: 1.0.0

{
    "done": false,
    "id": 12,
    "text": "do this"
}

Seeing the schema name and version in the terminal or in the browser’s DevTools is very convenient.

Versioned routes

To better support the different versions of the Cypress Test Runner already in use, the Cypress API versions its routes. For example, with each request, the client can set a custom header to pick a different code path. In the API’s router we can validate each route version using its own schema version. It looks something like this:

r.post('/todos', schemaCheck({
  '1': {
    req: 'PostTodoRequest@1.0.0',
    res: 'PostTodoResponse@1.0.0',
  },
  '2': {
    req: 'PostTodoRequest@1.1.0',
    res: 'PostTodoResponse@1.1.0',
  },
  '3': {
    req: 'PostTodoRequest@2.0.0',
    res: 'PostTodoResponse@2.0.0',
  }
}), postTodo)

The above router path says:

  • Clients sending a custom header of version ‘1’ to POST /todos will get the request validated using schema PostTodoRequest v1.0.0 and the server will respond with an object matching PostTodoResponse v1.0.0.
  • Clients using version ‘2’ of the same end point POST /todos will get their request body validated against schema PostTodoRequest v1.1.0; the response from the server will be matching schema PostTodoResponse v1.1.0
  • Finally, newest clients that use route version ‘3’ will have their data validated using versions 2.0.0 of the schemas.
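Picking the schemas for a request then comes down to a table lookup on the header value. The sketch below shows one way to do it; todoRouteVersions, parseSchemaRef and pickSchemas are hypothetical names, and falling back to route version "1" when the header is missing is an assumption of this sketch.

```typescript
// Hypothetical route table keyed by the x-route-version header value.
type RouteSchemas = { req: string; res: string }

const todoRouteVersions: Record<string, RouteSchemas> = {
  '1': { req: 'PostTodoRequest@1.0.0', res: 'PostTodoResponse@1.0.0' },
  '2': { req: 'PostTodoRequest@1.1.0', res: 'PostTodoResponse@1.1.0' },
  '3': { req: 'PostTodoRequest@2.0.0', res: 'PostTodoResponse@2.0.0' },
}

// Splits "PostTodoRequest@1.1.0" into a schema name and a version.
const parseSchemaRef = (ref: string) => {
  const at = ref.lastIndexOf('@')
  return { name: ref.slice(0, at), version: ref.slice(at + 1) }
}

// Returns the schemas for this request, or undefined for an unknown
// route version. A missing header is treated as version "1" (assumption).
const pickSchemas = (
  headers: Record<string, string | undefined>
): RouteSchemas | undefined => {
  const routeVersion = headers['x-route-version'] || '1'
  return todoRouteVersions[routeVersion]
}
```

For a request carrying x-route-version: 2, pickSchemas returns the 1.1.0 pair, and parseSchemaRef turns each reference into the name and version that assertSchema expects.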

Having route versions and schema versions ensures that we can modify the server’s behavior (route version) and the inputs and outputs it expects (schema versions) in a flexible and independent way. As a rule:

  • If the request and response schemas stay the same, but the server code changes, we increment route version and support both code paths. The client picks the server behavior using x-route-version header.
  • If the response changes, we create a new response schema in a backwards compatible way. Thus clients who expect a certain shape can use schema-tools.trim to remove all extra data if necessary.
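The trimming step mentioned in the last rule can be sketched in a few lines. This is a hand-written illustration and not the library's trim implementation; trimToKnownProperties and the extra "note" property are invented for the example.

```typescript
// Hypothetical helper: keeps only the properties that an older schema
// version knows about and drops everything else.
const trimToKnownProperties = (
  knownProperties: string[],
  data: Record<string, unknown>
): Record<string, unknown> => {
  const trimmed: Record<string, unknown> = {}
  for (const name of knownProperties) {
    if (name in data) {
      trimmed[name] = data[name]
    }
  }
  return trimmed
}

// A client written against v1.0.0 receives a newer response that has an
// additional (invented) "note" property.
const newerResponse = { text: 'do this', done: false, id: 2, note: 'extra' }
const asSeenByOldClient = trimToKnownProperties(
  ['text', 'done', 'id'],
  newerResponse
)
```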

To avoid breaking existing clients, we have all schema validations enabled during testing and local development. We also validate requests and responses in the staging environment before going to production. In production we skip the request validation to avoid accidentally breaking older clients; but we still validate the responses to make sure the server is sending the data in the format the client is capable of receiving. If the schema is violated, a very detailed error is sent to our crash reporting service.
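That policy fits in a tiny function. The environment names and the ValidationPolicy shape below are assumptions for the sake of the sketch; only the rule itself comes from the paragraph above.

```typescript
type Environment = 'test' | 'development' | 'staging' | 'production'

type ValidationPolicy = {
  validateRequests: boolean
  validateResponses: boolean
}

// Everything is validated everywhere, except that production skips
// request validation so that older clients keep working.
const validationPolicyFor = (env: Environment): ValidationPolicy => ({
  validateRequests: env !== 'production',
  validateResponses: true,
})
```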

Schema documentation

schema-tools includes a utility method to generate Markdown documentation from a schema collection.

// document.ts
// generates schema documentation
import { documentSchemas } from '@cypress/schema-tools'
import { schemas } from './schemas'
console.log(documentSchemas(schemas))

There is a document script in our file package.json that redirects the output to a file.

{
  "scripts": {
    "document": "ts-node document > schemas.md"
  }
}

The created schemas.md has become a “goto” place to look up object information in our system.

Using schemas for testing

Let us start the server and test adding todos using our schemas. To start the server and run the tests I recommend using the start-server-and-test utility. Switch npm test to point at the start-server-and-test command and move Jest tests to npm run unit command.

$ npm install --save-dev start-server-and-test

{
  "scripts": {
    "test": "start-server-and-test :3000 unit",
    "start": "ts-node ...",
    "unit": "jest"
  }
}

Notice that because we use the default npm start command to start the server, we only specify the port and the unit script name in the "test": "start-server-and-test :3000 unit" command.

Here is a typical API test that uses the got utility to perform API calls. Before each test we reset the JSON database with an empty list of todos.

// __tests__/server-test.ts
import * as got from 'got'
import { api } from '../schemas'

const baseUrl = 'http://localhost:3000'
const todosUrl = `${baseUrl}/todos`

beforeEach(function resetState() {
  const resetUrl = `${baseUrl}/reset`
  return got(resetUrl, {
    method: 'POST',
    json: true,
    body: {
      todos: [],
    },
  })
})

test('returns empty list of todos', async () => {
  const response = await got(todosUrl, { json: true })
  expect(response.body).toEqual([])
})

test('adds example todo', async () => {
  const example = api.getExample('postTodoRequest')('1.0.0')
  const response = await got(todosUrl, {
    method: 'POST',
    json: true,
    body: example,
  })
  expect(response.body).toEqual({
    ...example,
    id: 1,
  })
  expect(response.headers['x-schema-name']).toBe('PostTodoResponse')
  expect(response.headers['x-schema-version']).toBe('1.0.0')
})

Notice how we can use the example provided by the schema PostTodoRequest v1.0.0 - we submit it in the call to POST /todos. We don’t have to guess or construct the data to be submitted, which comes in handy for more complex cases.

In the test “adds example todo”, we receive a response from the server and validate both the data and the response headers.

Schemas with dynamic data

Our Todo item is pretty simple. In the real world objects are a lot more complex. For example, objects often have highly random data, like GUIDs. Let us see how we can support GUIDs in our schemas without making our testing code any more complex.

Middleware that adds a GUID

First, I will add a middleware to our server that adds a uuid property, generated using uuid/v4, to each new item posted.

// ./add-guid.ts
import * as uuid from 'uuid/v4'

const addGuid = (req, res, next) => {
  if (req.method === 'POST') {
    if (!req.body.uuid) {
      req.body.uuid = uuid()
    }
  }
  next()
}

export = addGuid

We set the UUID on the new object only if one is not set already.

Our start script puts this addGuid middleware before the schema check middleware in the package.json “start” script.

{
  "scripts": {
    "start": "ts-node node_modules/.bin/json-server db.json --middlewares node_modules/json-server-reset --middlewares ./add-guid.ts --middlewares ./schema-check.ts"
  }
}

When we start the server and try to post a new todo item, we get a schema validation error.

posting new TODO item
{ text: 'do this',
  done: false,
  uuid: '5022ffb5-b3cc-4b1a-a21c-749a0cfd110f' }
new Todo request did not pass schema
Schema PostTodoRequest@1.0.0 violated

Errors:
data has additional properties: uuid

Current object:
{
  "done": false,
  "text": "do this",
  "uuid": "5022ffb5-b3cc-4b1a-a21c-749a0cfd110f"
}

Expected object like this:
{
  "done": false,
  "text": "do something"
}

How do we validate the uuid property?

Custom formats

First, create a folder called formats with an index.ts file - this is the place for all custom string formats used by our schemas. For now, there will be only one custom format named uuid.

// formats/index.ts
import { CustomFormat, CustomFormats } from '@cypress/schema-tools'

const uuid: CustomFormat = {
  name: 'uuid',
  description: 'GUID used through the system',
  detect: /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/,
  defaultValue: 'ffffffff-ffff-ffff-ffff-ffffffffffff',
}

export const formats: CustomFormats = {
  uuid,
}
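Before wiring the format into a schema, it is worth checking what the detect expression accepts. The short sketch below copies the regular expression and the default value from the format above; looksLikeUuid and the sample values are only for illustration. Note that the expression accepts lowercase hexadecimal digits only, so an uppercase GUID is rejected.

```typescript
// Same regular expression and default value as in formats/index.ts
const uuidDetect = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/
const uuidDefault = 'ffffffff-ffff-ffff-ffff-ffffffffffff'

// Hypothetical helper used only in this sketch.
const looksLikeUuid = (value: string): boolean => uuidDetect.test(value)

const uuidChecks = {
  random: looksLikeUuid('5022ffb5-b3cc-4b1a-a21c-749a0cfd110f'), // true
  defaultValue: looksLikeUuid(uuidDefault), // true
  upperCase: looksLikeUuid('5022FFB5-B3CC-4B1A-A21C-749A0CFD110F'), // false
  tooShort: looksLikeUuid('5022ffb5'), // false
}
```

The default value has to pass its own format; otherwise a sanitized object would stop matching the schema.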

Add a property to the POST Todo Request schema - both the example and the schema property. For simplicity I will only show the added lines.

// schemas/post-todo-request.ts
import { formats } from '../formats'
// just for clarity
type uuid = string
type PostTodoRequestExample100 = {
  // other properties ...
  uuid: uuid
}

const postTodoExample100: PostTodoRequestExample100 = {
  // insert some random GUID
  uuid: '20514af9-2a2a-4712-9c1e-0510c288c9ec',
}

const PostTodoRequest100: ObjectSchema = {
  // same version
  // example
  schema: {
    properties: {
      // existing properties ...
      uuid: {
        type: 'string',
        format: formats.uuid.name, // "uuid"
        description: 'a random GUID',
      },
    },
  },
}

We also need to add the same property to the post-todo-response schema. Now our schemas require the uuid property that matches the uuid.detect regular expression. All our tests are broken! Luckily we can easily update the tests by adding a random uuid and updating our snapshots.

Testing dynamic data

There is a catch: our tests now have to do all sorts of hacks to compare dynamic fields (like uuid). For example, when comparing the returned object to the example, we either delete this property or overwrite it with the value from the example object before comparing objects.

it('adds example todo', async () => {
  const example = api.getExample('postTodoRequest')('1.0.0')
  const response = await got(todosUrl, {
    method: 'POST',
    json: true,
    body: example,
  })
  // HACK to match dynamic uuid
  response.body.uuid = example.uuid
  expect(response.body).toEqual({
    ...example,
    id: 1,
  })
  expect(response.headers['x-schema-name']).toBe('PostTodoResponse')
  expect(response.headers['x-schema-version']).toBe('1.0.0')
})

Instead of modifying the response object in each test, we can use the schema to sanitize the data consistently. Since we have provided a default value in the uuid format, we have marked it as “dynamic”. Thus we can use the sanitize method provided by schema-tools to walk the given object and replace any actual uuid value with uuid.defaultValue.

it('sanitizes example object', () => {
  const todo = {
    text: 'my text',
    done: false,
    uuid: '13d46b9e-932f-4265-a4aa-1ee80e2c88d6',
  }
  const sanitized = api.sanitize('postTodoRequest', '1.0.0')(todo)
  expect(sanitized).toEqual({
    text: 'my text',
    done: false,
    // default value should be "ffffffff-ffff-ffff-ffff-ffffffffffff"
    uuid: formats.uuid.defaultValue,
  })
})

We can use the sanitize method to greatly simplify our API testing - just assert that the received object follows the schema, then sanitize it, and then save it as a snapshot.

it('adds todo', async () => {
  const response = await got(todosUrl, {
    method: 'POST',
    json: true,
    body: {
      text: 'sanitize using schema',
      done: false,
    },
  })
  const schemaName = 'PostTodoResponse'
  const schemaVersion = '1.0.0'
  api.assertSchema(schemaName, schemaVersion)(response.body)
  expect(
    api.sanitize(schemaName, schemaVersion)(response.body)
  ).toMatchSnapshot()
})

See the pattern? We receive a response object, then we check it against the schema. If there is no exception, we sanitize the response using the schema, which replaces dynamic properties like uuid with the default values from their custom formats. The result should be deterministic - it is the data we have sent during the test, plus the default value for the uuid property. Now we can snapshot it and save.

// __tests__/__snapshots__/server-test.ts.snap
exports[`server api adds todo 1`] = `
Object {
  "done": false,
  "id": 1,
  "text": "sanitize using schema",
  "uuid": "ffffffff-ffff-ffff-ffff-ffffffffffff",
}
`;

Nice, even complex schemas can be recursively sanitized using the schema, producing data that accurately shows even the largest server responses; and this data can be saved as a snapshot, giving you the complete view of the data traveling through the test. An entire test in 3 lines of code!
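To show what "recursively sanitized" means, here is a miniature sketch of a schema-driven walk. SchemaNode, FormatDefaults and sanitizeWith are hypothetical and much simpler than the real sanitize method; the sketch handles nested objects only.

```typescript
// Hypothetical types: format name -> default value, and a tiny schema tree.
type FormatDefaults = Record<string, string>

type SchemaNode = {
  format?: string
  properties?: Record<string, SchemaNode>
}

const sanitizeWith = (
  defaults: FormatDefaults,
  schema: SchemaNode,
  value: unknown
): unknown => {
  // A value with a known dynamic format is replaced by the default value.
  if (schema.format !== undefined && schema.format in defaults) {
    return defaults[schema.format]
  }
  // Objects are walked recursively, property by property.
  const props = schema.properties
  if (props && typeof value === 'object' && value !== null) {
    const source = value as Record<string, unknown>
    const result: Record<string, unknown> = {}
    for (const key of Object.keys(source)) {
      const child = props[key]
      result[key] = child
        ? sanitizeWith(defaults, child, source[key])
        : source[key]
    }
    return result
  }
  // Everything else is returned unchanged.
  return value
}

const sanitizedTodo = sanitizeWith(
  { uuid: 'ffffffff-ffff-ffff-ffff-ffffffffffff' },
  { properties: { text: {}, done: {}, uuid: { format: 'uuid' } } },
  { text: 'my text', done: false, uuid: '13d46b9e-932f-4265-a4aa-1ee80e2c88d6' }
)
```

Because the replacement is driven by the schema, a uuid buried several levels deep is handled by the same code path as a top-level one.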

Note: Jest snapshots can take property matchers for dynamic data. For example, we could check that the uuid property is a string like this:

expect(response.body).toMatchSnapshot({
  uuid: expect.any(String)
})

I think our approach using json-schemas is preferable for two reasons:

  1. There is no duplication; all objects are sanitized inside api.sanitize using the original format rather than at every call of toMatchSnapshot.
  2. The dynamic formats such as UUID do not depend on the testing framework and do not require extra matchers like expect.any to be written in each Jest assertion.

Moving on. While we are adding formats, we should also document them - just pass them to the documentSchemas function.

// document.ts
// generates schema documentation
import { documentSchemas } from '@cypress/schema-tools'
import { formats } from './formats'
import { schemas } from './schemas'
console.log(documentSchemas(schemas, formats))

In the generated Markdown file schemas.md, as rendered on GitHub, the custom format uuid links to the formats table at the bottom of the file.

Schemas for Cypress tests

Using got is ok for simple tests, but for more complex scenarios, or for combining API and web application tests, I strongly recommend Cypress. Let us install Cypress and test our API and observe what is being sent.

Install Cypress as a development dependency:

$ npm i -D cypress

Add a command to open Cypress in the interactive mode (cypress open) and non-interactive mode for CI (cypress run). Also add the “start-server-and-test” command to start the server before running the tests in the non-interactive mode.

{
  "scripts": {
    "cy:open": "cypress open",
    "cy:run": "cypress run",
    "cy:test": "start-server-and-test :3000 cy:run"
  }
}

Our first test sends a TODO item (with a UUID), and it looks almost exactly like the test we have written before. We reset the state with POST /reset before each test, send an object using cy.request(), and expect the result to equal a hardcoded value.

// cypress/integration/api-spec.js
/// <reference types="cypress" />
import uuid from 'uuid/v4'

describe('Todo API', () => {
  const baseUrl = 'http://localhost:3000'
  const todosUrl = `${baseUrl}/todos`

  beforeEach(function resetState () {
    const resetUrl = `${baseUrl}/reset`
    cy.request({
      method: 'POST',
      url: resetUrl,
      body: {
        todos: []
      }
    })
  })

  it('adds TODO', () => {
    const todo = {
      text: 'use Cypress',
      done: true,
      uuid: uuid()
    }

    cy
      .request({
        method: 'POST',
        url: todosUrl,
        body: todo
      })
      .its('body')
      .should(
        'deep.equal',
        // object spread (...) is an ES2018 feature that Cypress
        // does not transpile without extra configuration
        // so just use "Object.assign" to merge extra property
        Object.assign({}, todo, {
          id: 1
        })
      )
  })
})

If you start Cypress with npm run cy:open and run this spec file, you will see each cy.request() command in the command log. You can click on each command and see additional information in the DevTools console window. For example, the .should('deep.equal', ...) command dumps the received response.body and expected objects into the console. We can see that the assertion really passes because the objects have equal property values.
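
The deep.equal assertion compares property values, not object identity. Here is a rough stand-in in plain JavaScript; deepEqual below is a simplified helper written for this sketch (plain JSON-like values only), not Chai's actual implementation.

```javascript
// Simplified structural comparison for plain JSON-like values
function deepEqual (a, b) {
  if (a === b) return true
  if (a === null || b === null) return false
  if (typeof a !== 'object' || typeof b !== 'object') return false
  if (Array.isArray(a) !== Array.isArray(b)) return false
  const keysA = Object.keys(a)
  const keysB = Object.keys(b)
  if (keysA.length !== keysB.length) return false
  return keysA.every(
    (key) =>
      Object.prototype.hasOwnProperty.call(b, key) && deepEqual(a[key], b[key])
  )
}

const todo = {
  text: 'use Cypress',
  done: true,
  uuid: '4a3a8af7-27ff-437b-97e4-1d8738e44191'
}
const expected = Object.assign({}, todo, { id: 1 })
// simulate the response body: a different object with the same values
const received = JSON.parse(JSON.stringify(expected))

console.log(received === expected) // false
console.log(deepEqual(received, expected)) // true
console.log(deepEqual(received, todo)) // false, "id" is missing from todo
```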

Tip: move baseUrl to the cypress.json file to avoid hardcoding long URLs in the test files.

{
  "baseUrl": "http://localhost:3000"
}

Now every cy.request() can just specify the endpoint:

beforeEach(function resetState () {
  const resetUrl = '/reset'
  cy.request({
    method: 'POST',
    url: resetUrl,
    body: {
      todos: []
    }
  })
})

Loading data from a fixture

Hardcoding test data in the test soon becomes too verbose. Cypress supports fixtures that we can use to store items to be sent. Make a new file cypress/fixtures/todo.json and add a typical item there.

{
  "text": "use fixtures",
  "done": true,
  "uuid": "4a3a8af7-27ff-437b-97e4-1d8738e44191"
}

Now load this object and send it to the server during the test. In this test we also confirm that the server correctly sets the custom schema response headers.

const todosUrl = '/todos'

it('adds TODO from fixture', () => {
  cy.fixture('todo').then(todo => {
    cy
      .request({
        method: 'POST',
        url: todosUrl,
        body: todo
      })
      .its('headers')
      .should('include', {
        'x-schema-name': 'PostTodoResponse',
        'x-schema-version': '1.0.0'
      })
  })
})
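
For context, here is one way a server could attach such headers to a response. This is only a sketch: sendWithSchema and fakeResponse are hypothetical helpers invented for this example, and the only real API assumed is the setHeader() and end() pair found on Node's http.ServerResponse.

```javascript
// Hypothetical helper: tags a JSON response with the schema it follows.
// `res` is any object with Node-style setHeader() and end() methods.
function sendWithSchema (res, schemaName, schemaVersion, body) {
  res.setHeader('Content-Type', 'application/json')
  res.setHeader('x-schema-name', schemaName)
  res.setHeader('x-schema-version', schemaVersion)
  res.end(JSON.stringify(body))
}

// Minimal fake response object so the sketch runs without a server
function fakeResponse () {
  const headers = {}
  return {
    headers,
    body: undefined,
    setHeader (name, value) {
      headers[name.toLowerCase()] = value
    },
    end (text) {
      this.body = text
    }
  }
}

const res = fakeResponse()
sendWithSchema(res, 'PostTodoResponse', '1.0.0', {
  id: 1,
  text: 'use fixtures',
  done: true
})
console.log(res.headers['x-schema-name'], res.headers['x-schema-version'])
```

With headers like these, a client or a test can tell which schema and version to validate the body against without inspecting the body itself.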

Fixtures should match schemas

But after a while our fixture data might get out of sync with the request schema. Luckily, we can add a test to validate the fixture against the PostTodoRequest v1.0.0 schema using the methods on the api object exported from our schemas module.

In our Cypress spec file, load schemas and the fixture object and assert the schema (this is why the assertSchema function is curried, by the way).

// cypress/integration/api-spec.js
import { api } from '../../schemas'

it('has todo fixture matching schema', () => {
  cy.fixture('todo').then(api.assertSchema('PostTodoRequest', '1.0.0'))
})

But Cypress cannot run this test - it shows an error trying to load the schemas.ts file.

We have written our schemas using TypeScript (good), but Cypress does not transpile .ts files by default (because there are many ways this can be configured). We can transpile our schemas code into JavaScript before running Cypress tests. Just add options to the tsconfig.json to output code…

{
  "compilerOptions": {
    "target": "es5",
    "module": "commonjs",
    "moduleResolution": "node",
    "lib": ["es2015", "es2016", "dom"],
    "outDir": "./dist",
    "skipLibCheck": true,
    "pretty": true
  },
  "include": ["./schemas/**/*"]
}

…and a “build” command to package.json.

{
  "scripts": {
    "build": "tsc"
  }
}

Now our test will run, and if we accidentally modify the fixture, or if the schema changes, Cypress will raise an exception that will be very simple to debug.

Not only can we validate the fixtures easily, we can confirm that the returned object really matches the expected schema. A test looks like this:

// cypress/integration/api-spec.js
it('returns new TODO item matching schema', () => {
  cy.fixture('todo').then(todo => {
    cy
      .request({
        method: 'POST',
        url: todosUrl,
        body: todo
      })
      .its('body')
      .then(api.assertSchema('PostTodoResponse', '1.0.0'))
  })
})
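
Passing api.assertSchema('PostTodoResponse', '1.0.0') straight to .then() works because the function is curried: the first call selects the schema, and the returned function receives the object. The following sketch shows the idea in plain JavaScript. It is not the schema-tools code; the tiny schemas registry, its shape and the error wording are assumptions, and returning the object on success is a choice made for this sketch.

```javascript
// Hypothetical, heavily simplified schema registry keyed by name and version
const schemas = {
  PostTodoRequest: {
    '1.0.0': {
      required: ['text', 'done', 'uuid'],
      properties: { text: 'string', done: 'boolean', uuid: 'string' }
    }
  }
}

// Curried: first call picks the schema, second call checks the object.
// Returns the object when valid so it can keep flowing down a chain.
const assertSchema = (name, version) => (object) => {
  const schema = schemas[name] && schemas[name][version]
  if (!schema) {
    throw new Error(`Unknown schema ${name}@${version}`)
  }
  const errors = []
  schema.required.forEach((key) => {
    if (!(key in object)) errors.push(`missing required property "${key}"`)
  })
  Object.keys(object).forEach((key) => {
    const expectedType = schema.properties[key]
    if (expectedType && typeof object[key] !== expectedType) {
      errors.push(`property "${key}" should be ${expectedType}`)
    }
  })
  if (errors.length) {
    throw new Error(`Schema ${name}@${version} violated: ${errors.join(', ')}`)
  }
  return object
}

const checkTodo = assertSchema('PostTodoRequest', '1.0.0')
const fixture = {
  text: 'use fixtures',
  done: true,
  uuid: '4a3a8af7-27ff-437b-97e4-1d8738e44191'
}
console.log(checkTodo(fixture) === fixture) // true

let message = null
try {
  // a fixture that drifted away from the schema
  checkTodo({ text: 'use fixtures', done: 'yes' })
} catch (err) {
  message = err.message
}
console.log(message)
```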

Our schemas act like convenient “checkpoints” for data going to the server and coming back to the client. We can even validate fixture data, send it to the server and validate the response all using a chain of point-free callbacks. I have added comments for clarity.

import { merge } from 'ramda'
import { api } from '../../dist/schemas'

it('can pass asserted todo', () => {
  const nameIt = name => value => ({ [name]: value })

  cy
    .fixture('todo')
    .then(api.assertSchema('PostTodoRequest', '1.0.0')) // validates fixture data
    .then(nameIt('body')) // transforms fixture 'data' into {body: data}
    .then(
      merge({
        // forms cy.request options object
        // these fields + {body: data}
        method: 'POST',
        url: todosUrl
      })
    )
    .then(cy.request) // calls cy.request(options)
    .its('body') // grabs response.body
    .then(api.assertSchema('PostTodoResponse', '1.0.0')) // validates response schema
})
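
The same point-free style can be tried outside Cypress. The sketch below replaces the .then() chain with a small pipe helper and uses local stand-ins: merge mimics a curried two-object merge, and fakeRequest pretends to be the server. All of these names are assumptions made for this illustration.

```javascript
// Composes functions left to right, like a chain of .then() callbacks
const pipe = (...fns) => (input) => fns.reduce((value, fn) => fn(value), input)

const nameIt = (name) => (value) => ({ [name]: value })
// Stand-in for a curried merge: properties on the right win
const merge = (left) => (right) => Object.assign({}, left, right)
const prop = (name) => (object) => object[name]
// Stand-in for the HTTP call: echoes the item back with an id
const fakeRequest = (options) => ({
  body: Object.assign({}, options.body, { id: 1 })
})

// data -> { body: data } -> { method, url, body: data }
const toRequestOptions = pipe(
  nameIt('body'),
  merge({ method: 'POST', url: '/todos' })
)

// request options -> response -> response.body
const postTodo = pipe(toRequestOptions, fakeRequest, prop('body'))

const options = toRequestOptions({ text: 'use fixtures', done: true })
const created = postTodo({ text: 'use fixtures', done: true })
console.log(options)
console.log(created)
```

Each step takes one value and returns one value, which is what lets a schema check slot in between any two steps without other changes.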

Our experience

We first applied JSON schemas when doing our Cypress v2 to v3 refactoring. This was a very large change - the Test Runner switched how it executes individual tests, the API was refactored to prepare for parallel test running, and the Dashboard switched to displaying test runs differently. There were new routes, new versions of old routes, and new shapes for the data objects sent across the wire from the Test Runner to the API, and from the API to the Dashboard. Three different developers worked on this major release; each was in charge of their own component: the Test Runner, the API or the Dashboard.

With all this said, the development and release went as smoothly as one could dream. Because the tests used the same JSON schemas that were enforced by the API middleware, the deployed software “just worked”. In fact, not only did we not have to issue any patches or deploy hot fixes, we even discovered a bug in our database model that did not enforce a property on new instances, leaving it blank. The bug was discovered in the staging environment, when incomplete data sent from the Dashboard triggered JSON schema validation errors. When we investigated, we looked at our production database and found that some records did not have the column value set; so we patched the column with the default value and closed the hole in the code.

One piece of practical advice from our work: we have published the JSON schemas in our system as an NPM package so we could use the same schemas in the 3 different components.

Conclusions

  • JSON schemas are an incredibly useful tool for documenting the domain objects flowing through your system.
  • They are not limited to REST API calls; they can validate any object at any communication point.
  • We have used schemas to document, validate and test an Electron application, API server and web application with great success.

FAQ

Why not use GraphQL?

GraphQL is a data model description and query language. While we are exploring GraphQL to replace or supplement our REST APIs, it is not (yet) as powerful as json-schemas in describing our domain objects.

How is this different from Swagger / RAML?

Swagger is an excellent specification format for describing an HTTP server API, but it is not the generic domain description format that we wanted.

So how do the server and the client stay in sync?

We publish all schemas as a private NPM package. The server and the clients can each install a specific version of the package, and if these versions are the same or compatible, the server and the client will be able to communicate without throwing errors.
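
What counts as “compatible” depends on your versioning policy. As a sketch only, the check below assumes a semver-style rule: the major versions must match, and the server's minor version must be at least the client's. The isCompatible helper and this rule are assumptions for illustration, not how our packages decide compatibility.

```javascript
// Splits "1.2.3" into numeric parts
const parseVersion = (version) => {
  const [major, minor, patch] = version.split('.').map(Number)
  return { major, minor, patch }
}

// Assumed rule: same major version, and the server knows at least
// every minor revision the client was built against
const isCompatible = (serverVersion, clientVersion) => {
  const server = parseVersion(serverVersion)
  const client = parseVersion(clientVersion)
  return server.major === client.major && server.minor >= client.minor
}

console.log(isCompatible('1.2.0', '1.2.0')) // true
console.log(isCompatible('1.3.1', '1.2.0')) // true
console.log(isCompatible('1.1.0', '1.2.0')) // false
console.log(isCompatible('2.0.0', '1.2.0')) // false
```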