@btabram/graphql-deduplicator

v2.1.1

A GraphQL response deduplicator. Removes duplicate entities from the GraphQL response.

graphql-deduplicator

A GraphQL response deduplicator.

Removes duplicate entities from the GraphQL response.

Client support

graphql-deduplicator works with any GraphQL client that appends __typename and id fields to every resource. If your client does not request the __typename and id fields automatically, you can specify them in your GraphQL query.

graphql-deduplicator has been tested with apollo-client.

How does it work?

The __typename and id values are used to construct a resource identifier. The resource identifier is used to normalize data. As a result, when a GraphQL API response contains a resource with a repeating identifier, apollo-client reads only the first instance of the resource and ignores the duplicate entities. graphql-deduplicator strips the body (all fields other than __typename and id) from every duplicate entity.
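The idea can be sketched in a few lines of plain JavaScript. This is an illustration only, not the library's actual implementation: the function name deflateSketch and the "<typename>:<id>" identifier format are assumptions made for the example, and the sketch ignores the case where the same entity is requested with different field selections in different parts of the query.

```javascript
// Illustrative sketch only; not the library's actual implementation.
// Walks a response and, for every object that carries both __typename and id,
// keeps the full body the first time its identifier is seen and only
// { __typename, id } on every later occurrence.
function deflateSketch(node, seen = new Set()) {
  if (Array.isArray(node)) {
    return node.map((child) => deflateSketch(child, seen));
  }
  if (node === null || typeof node !== 'object') {
    return node; // scalars pass through unchanged
  }

  if (node.__typename !== undefined && node.id !== undefined) {
    // Hypothetical identifier format: "<typename>:<id>".
    const identifier = `${node.__typename}:${node.id}`;
    if (seen.has(identifier)) {
      return { __typename: node.__typename, id: node.id };
    }
    seen.add(identifier);
  }

  const result = {};
  for (const [key, value] of Object.entries(node)) {
    result[key] = deflateSketch(value, seen);
  }
  return result;
}

const response = {
  data: {
    events: [
      { __typename: 'Event', id: '1', movie: { __typename: 'Movie', id: '9', name: 'King Arthur' } },
      { __typename: 'Event', id: '2', movie: { __typename: 'Movie', id: '9', name: 'King Arthur' } },
    ],
  },
};

const deflated = deflateSketch(response);
console.log(JSON.stringify(deflated.data.events[1].movie));
// prints {"__typename":"Movie","id":"9"}
```

The first Movie keeps its name, while the second occurrence is reduced to the two fields a normalizing client needs to recognize it.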

Motivation

graphql-deduplicator is designed to reduce the GraphQL response size by removing the body of duplicate entities. This lets you make queries that return large datasets of repeated data without worrying about the size of the response body, the time it takes to parse the response, or the memory the reconstructed object will consume.

Real-life example

Consider the following schema:

interface Node {
  id: ID!
}

type Movie implements Node {
  id: ID!
  name: String!
  synopsis: String!
}

type Event implements Node {
  id: ID!
  movie: Movie!
  date: String!
  time: String!
}

type Query {
  events (
    date: String
  ): [Event!]!
}

Using this schema, you can query events for a particular date, e.g.

{
  events (date: "2017-05-19") {
    __typename
    id
    date
    time
    movie {
      __typename
      id
      name
      synopsis
    }
  }
}

Note: If you are using apollo-client, then you do not need to include __typename when constructing the query.

The result of the above query will contain a lot of duplicate information.

{
  "data": {
    "events": [
      {
        "__typename": "Event",
        "id": "1669971",
        "date": "2017-05-19",
        "time": "17:25",
        "movie": {
          "__typename": "Movie",
          "id": "1198359",
          "name": "King Arthur: Legend of the Sword",
          "synopsis": "When the child Arthur’s father is murdered, Vortigern, Arthur’s uncle, seizes the crown. Robbed of his birthright and with no idea who he truly is, Arthur comes up the hard way in the back alleys of the city. But once he pulls the sword Excalibur from the stone, his life is turned upside down and he is forced to acknowledge his true legacy... whether he likes it or not."
        }
      },
      {
        "__typename": "Event",
        "id": "1669972",
        "date": "2017-05-19",
        "time": "20:30",
        "movie": {
          "__typename": "Movie",
          "id": "1198359",
          "name": "King Arthur: Legend of the Sword",
          "synopsis": "When the child Arthur’s father is murdered, Vortigern, Arthur’s uncle, seizes the crown. Robbed of his birthright and with no idea who he truly is, Arthur comes up the hard way in the back alleys of the city. But once he pulls the sword Excalibur from the stone, his life is turned upside down and he is forced to acknowledge his true legacy... whether he likes it or not."
        }
      },
      // ...
    ]
  }
}

I ran into this situation when building https://applaudience.co.uk. A query retrieving 300 events produced a response of 1.5MB. When gzipped, that number dropped to 100KB. However, the problem is that upon receiving the response, the browser needs to parse the entire JSON document. Parsing a 1.5MB JSON string is (a) time consuming and (b) memory expensive.

The good news is that we do not need to return the body of duplicate records (see How does it work?). For all duplicate records we only need to return __typename and id. This information is enough for apollo-client to identify the resource as a duplicate and skip it. When a response includes large and often repeated fragments, this can reduce the response size by 10x, 100x or more.

For the earlier example, the response becomes:

{
  "data": {
    "events": [
      {
        "__typename": "Event",
        "id": "1669971",
        "date": "2017-05-19",
        "time": "17:25",
        "movie": {
          "__typename": "Movie",
          "id": "1198359",
          "name": "King Arthur: Legend of the Sword",
          "synopsis": "When the child Arthur’s father is murdered, Vortigern, Arthur’s uncle, seizes the crown. Robbed of his birthright and with no idea who he truly is, Arthur comes up the hard way in the back alleys of the city. But once he pulls the sword Excalibur from the stone, his life is turned upside down and he is forced to acknowledge his true legacy... whether he likes it or not."
        }
      },
      {
        "__typename": "Event",
        "id": "1669972",
        "date": "2017-05-19",
        "time": "20:30",
        "movie": {
          "__typename": "Movie",
          "id": "1198359"
        }
      },
      // ...
    ]
  }
}

The synopsis and name fields have been removed from the duplicate Movie entity.
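To get a feel for the size difference on your own data, you can compare the serialized byte counts of a full and a deduplicated payload. The sketch below uses synthetic data (a 400-character placeholder synopsis and 300 generated events, both assumptions for illustration) and runs under Node.js, where Buffer is available globally. The numbers it prints say nothing about any real application.

```javascript
// Synthetic comparison of payload sizes; all data below is made up.
const synopsis = 'x'.repeat(400); // stand-in for a long synopsis
const movie = {
  __typename: 'Movie',
  id: '1198359',
  name: 'King Arthur: Legend of the Sword',
  synopsis,
};

const full = { data: { events: [] } };
const deduplicated = { data: { events: [] } };

for (let i = 0; i < 300; i++) {
  const event = {
    __typename: 'Event',
    id: String(1669971 + i),
    date: '2017-05-19',
    time: '17:25',
  };
  // Full payload: every event repeats the whole movie.
  full.data.events.push({ ...event, movie });
  // Deduplicated payload: only the first event carries the movie body.
  deduplicated.data.events.push({
    ...event,
    movie: i === 0 ? movie : { __typename: movie.__typename, id: movie.id },
  });
}

const fullBytes = Buffer.byteLength(JSON.stringify(full));
const deduplicatedBytes = Buffer.byteLength(JSON.stringify(deduplicated));

console.log({
  fullBytes,
  deduplicatedBytes,
  ratio: (fullBytes / deduplicatedBytes).toFixed(1),
});
```

The ratio depends almost entirely on how large the repeated body is relative to the unique fields of each record, so measure with a response that resembles your real traffic.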

Usage

Server-side

You need to format the final result of the query. If you are using graphql-server, configure formatResponse, e.g.

import express from 'express';
import {
  graphqlExpress
} from 'graphql-server-express';
import {
  deflate
} from 'graphql-deduplicator';

const app = express();

app.use('/graphql', graphqlExpress(() => {
  return {
    formatResponse: (response) => {
      if (response.data && !response.data.__schema) {
        return deflate(response);
      }

      return response;
    }
  };
}));

app.listen(3000);

Client-side

Example usage with apollo-client

You need to modify the server response before it is processed by the GraphQL client. If you are using apollo-client, use the link configuration to set up an afterware, e.g.

// @flow

import {
  ApolloClient
} from 'apollo-client';
import {
  ApolloLink,
  concat
} from 'apollo-link';
import {
  InMemoryCache
} from 'apollo-cache-inmemory';
import {
  HttpLink
} from 'apollo-link-http';
import {
  inflate
} from 'graphql-deduplicator';

const httpLink = new HttpLink({
  credentials: 'include',
  uri: '/api'
});

const inflateLink = new ApolloLink((operation, forward) => {
  return forward(operation)
    .map((response) => {
      return inflate(response);
    });
});

const apolloClient = new ApolloClient({
  cache: new InMemoryCache(),
  link: concat(inflateLink, httpLink)
});

export default apolloClient;
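For intuition about what the inflate call in the link above has to do, here is a minimal sketch of the reverse operation: when an entity appears again with only __typename and id, the body recorded at its first occurrence is substituted. This is not the library's implementation. The function name inflateSketch and the "<typename>:<id>" identifier format are assumptions for the example, and edge cases such as self-referencing entities are not handled.

```javascript
// Illustrative sketch only; not the library's actual implementation.
// Remembers the first body seen for each identifier and reuses it whenever
// the same identifier shows up again.
function inflateSketch(node, bodies = new Map()) {
  if (Array.isArray(node)) {
    return node.map((child) => inflateSketch(child, bodies));
  }
  if (node === null || typeof node !== 'object') {
    return node; // scalars pass through unchanged
  }

  const isEntity = node.__typename !== undefined && node.id !== undefined;
  // Hypothetical identifier format: "<typename>:<id>".
  const identifier = isEntity ? `${node.__typename}:${node.id}` : null;

  if (isEntity && bodies.has(identifier)) {
    // Later occurrence: return the body recorded earlier (shared reference).
    return bodies.get(identifier);
  }

  const result = {};
  for (const [key, value] of Object.entries(node)) {
    result[key] = inflateSketch(value, bodies);
  }
  if (isEntity) {
    bodies.set(identifier, result);
  }
  return result;
}

const deflatedResponse = {
  data: {
    events: [
      { __typename: 'Event', id: '1', movie: { __typename: 'Movie', id: '9', name: 'King Arthur' } },
      { __typename: 'Event', id: '2', movie: { __typename: 'Movie', id: '9' } },
    ],
  },
};

const inflated = inflateSketch(deflatedResponse);
console.log(inflated.data.events[1].movie.name); // prints "King Arthur"
```

Because the sketch relies on seeing the full body before any stripped copy, it only works on input that was deduplicated in the same traversal order.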

Example usage with apollo-boost

It is not possible to configure link with apollo-boost. Therefore, it is not possible to use graphql-deduplicator with apollo-boost. Use the apollo-client setup instead.

Note: apollo-boost will be discontinued starting Apollo Client v3.

Best practices

Enable compression conditionally

Do not break integrations with standard GraphQL clients that are unaware of graphql-deduplicator.

Use deflate only when the client requests graphql-deduplicator, e.g.

// Server-side

app.use('/graphql', graphqlExpress((request) => {
  return {
    formatResponse: (response) => {
      if (request.query.deduplicate && response.data && !response.data.__schema) {
        return deflate(response);
      }

      return response;
    }
  };
}));

// Client-side

const httpLink = new HttpLink({
  credentials: 'include',
  uri: '/api?deduplicate=1'
});
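The condition in the server-side snippet above can be pulled into a small pure function, which makes it easy to unit test. The helper below is hypothetical: its name and signature are not part of graphql-deduplicator. It assumes that the __schema check is there to leave introspection responses untouched, which the original example does not state explicitly.

```javascript
// Hypothetical helper; not part of graphql-deduplicator.
// `query` is the parsed query string of the HTTP request and `response` is
// the GraphQL response about to be sent.
function shouldDeflate(query, response) {
  const optedIn = Boolean(query && query.deduplicate);
  const hasData = Boolean(response && response.data);
  // Assumption: responses containing __schema are introspection results.
  const isIntrospection = hasData && Boolean(response.data.__schema);
  return optedIn && hasData && !isIntrospection;
}

console.log(shouldDeflate({ deduplicate: '1' }, { data: { events: [] } })); // true
console.log(shouldDeflate({}, { data: { events: [] } })); // false
console.log(shouldDeflate({ deduplicate: '1' }, { data: { __schema: {} } })); // false
console.log(shouldDeflate({ deduplicate: '1' }, { errors: [] })); // false
```

Note that, like the truthiness check in the example above, this treats any non-empty value as opting in, so a request with ?deduplicate=0 would also be deduplicated when the query string is parsed into string values.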

Example usage with Apollo Server

import { GraphQLExtension, GraphQLResponse } from 'apollo-server-core'
import { deflate } from 'graphql-deduplicator'
// [..]

const createContext = ({ req }) => {
  return {
    req,
    // [..]
  }
}
class DeduplicateResponseExtension extends GraphQLExtension {
  public willSendResponse(o) {
    const { context, graphqlResponse } = o
    // Ensures `?deduplicate=1` is used in the request
    if (context.req.query.deduplicate && graphqlResponse.data && !graphqlResponse.data.__schema) {
      const data = deflate(graphqlResponse.data)
      return {
        ...o,
        graphqlResponse: {
          ...graphqlResponse,
          data,
        },
      }
    }

    return o
  }
}

const apolloServer = new ApolloServer({
  // [..]
  context: createContext,
  extensions: [() => new DeduplicateResponseExtension()],
})