jscpd

v4.0.5

detector of copy/paste in files

Downloads

306,561

jscpd

Copy/paste detector for programming source code, supports 150+ formats.

Copy/paste is a common source of technical debt in many projects. jscpd finds duplicated blocks in more than 150 programming languages and document formats. It uses the Rabin-Karp algorithm to search for duplications.
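
To make the Rabin-Karp idea concrete, here is a minimal sketch of rolling-hash duplicate detection over a token stream. It is not jscpd's implementation: the function names, the hash base and modulus, and the fixed window size are assumptions chosen for illustration.

```typescript
// Sketch of Rabin-Karp duplicate detection over token windows.
// Not jscpd's code: BASE, MOD and all names here are illustrative.
const BASE = 257;
const MOD = 1000003; // small prime so products stay well below 2^53

// Map a token string to a small integer code.
function tokenCode(token: string): number {
  let h = 0;
  for (let i = 0; i < token.length; i++) {
    h = (h * 31 + token.charCodeAt(i)) % MOD;
  }
  return h;
}

// Compare two windows token by token (needed because hashes can collide).
function sameWindow(tokens: string[], a: number, b: number, size: number): boolean {
  for (let i = 0; i < size; i++) {
    if (tokens[a + i] !== tokens[b + i]) return false;
  }
  return true;
}

interface DuplicateWindow {
  first: number;  // start index of the earlier window
  second: number; // start index of the repeated window
}

function findDuplicateWindows(tokens: string[], windowSize: number): DuplicateWindow[] {
  const result: DuplicateWindow[] = [];
  if (windowSize <= 0 || tokens.length < windowSize) return result;

  const codes = tokens.map(tokenCode);

  // BASE^(windowSize - 1) mod MOD, used to remove the outgoing token.
  let highPow = 1;
  for (let i = 1; i < windowSize; i++) highPow = (highPow * BASE) % MOD;

  // Hash of the first window.
  let hash = 0;
  for (let i = 0; i < windowSize; i++) hash = (hash * BASE + codes[i]) % MOD;

  const seen = new Map<number, number[]>(); // hash -> window start indices
  for (let start = 0; ; start++) {
    const candidates = seen.get(hash) ?? [];
    const match = candidates.find((c) => sameWindow(tokens, c, start, windowSize));
    if (match !== undefined) {
      result.push({ first: match, second: start });
    } else {
      candidates.push(start);
      seen.set(hash, candidates);
    }

    const next = start + windowSize;
    if (next >= tokens.length) break;

    // Roll the hash: drop tokens[start], append tokens[next].
    hash = (hash - ((codes[start] * highPow) % MOD) + MOD) % MOD;
    hash = (hash * BASE + codes[next]) % MOD;
  }
  return result;
}
```

Calling `findDuplicateWindows("a b c x a b c y".split(" "), 3)` reports the window starting at index 4 as a repeat of the window starting at index 0. The window size plays a role similar to a minimum block size in tokens; a real detector would also merge adjacent matching windows into longer clones.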

Features

  • Detect duplications in source code using the semantics of each programming language; comments, empty lines, etc. can be skipped.
  • Detect duplications in embedded blocks of code, such as <script> or <style> sections in HTML.
  • Blame authors of duplications.
  • Generate an XML report in pmd-cpd format, a JSON report, or an HTML report.
  • Integrate with CI systems and use thresholds for the level of duplications.

Getting started

Installation

$ npm install -g jscpd

Usage

$ npx jscpd /path/to/source

or

$ jscpd /path/to/code

or

$ jscpd --pattern "src/**/*.js"

Options

Pattern

Glob pattern used to find the files to analyze.

  • Cli options: --pattern, -p
  • Type: string
  • Default: "**/*"

Example:

$ jscpd --pattern "**/*.js"

Min Tokens

Minimum size of a code block in tokens. Blocks with fewer tokens than min-tokens are skipped.

  • Cli options: --min-tokens, -k
  • Type: number
  • Default: 50

This option is called minTokens in the config file.

Min Lines

Minimum size of a code block in lines. Blocks with fewer lines than min-lines are skipped.

  • Cli options: --min-lines, -l
  • Type: number
  • Default: 5

Max Lines

Maximum file size in lines. Files with more lines than max-lines are skipped.

  • Cli options: --max-lines, -x
  • Type: number
  • Default: 1000

Max Size

Maximum file size in bytes. Files larger than max-size are skipped.

  • Cli options: --max-size, -z
  • Type: string
  • Default: 100kb

Threshold

The threshold for the duplication level. If the current level of duplications is higher than the threshold, jscpd exits with an error.

  • Cli options: --threshold, -t
  • Type: number
  • Default: null
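
As a rough illustration of the threshold check, the sketch below assumes the duplication level is the percentage of duplicated lines and that a null threshold disables the check. The helper name is hypothetical and this is not jscpd's internal logic.

```typescript
// Hypothetical helper, not part of jscpd's API.
// Assumption: the duplication level is duplicatedLines / totalLines * 100.
function exceedsThreshold(
  duplicatedLines: number,
  totalLines: number,
  threshold: number | null
): boolean {
  // A null threshold (the default) means the check is disabled.
  if (threshold === null || totalLines === 0) return false;
  const percentage = (duplicatedLines / totalLines) * 100;
  return percentage > threshold;
}
```

For example, 26 duplicated lines out of 297 is about 8.75%, so `exceedsThreshold(26, 297, 5)` is true while `exceedsThreshold(26, 297, 10)` is false.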

Config

The path to the configuration file. The config must be in JSON format. The config file supports the same options as the CLI.

  • Cli options: --config, -c
  • Type: path
  • Default: null

Ignore

Glob patterns for files to exclude from analysis. Separate multiple globs with commas. Example:

$ jscpd --ignore "**/*.min.js,**/*.map" /path/to/files

  • Cli options: --ignore, -i
  • Type: string
  • Default: null

Reporters

The list of reporters. Reporters output information about clones and the duplication process.

Available reporters:

  • console - report about clones to console;
  • consoleFull - report about clones to console with blocks of code;
  • json - output jscpd-report.json file with clones report in json format;
  • xml - output jscpd-report.xml file with clones report in xml format;
  • csv - output jscpd-report.csv file with clones report in csv format;
  • markdown - output jscpd-report.md file with clones report in markdown format;
  • html - generate html report to html/ folder;
  • sarif - generate a report in SARIF format (https://github.com/oasis-tcs/sarif-spec), save it to jscpd-sarif.json file;
  • verbose - output a lot of debug information to console;

Note: You can develop a custom reporter; see the @jscpd/finder package.

  • Cli options: --reporters, -r
  • Type: string
  • Default: console

Output

The path to the directory for reports. JSON and XML reports are saved there.

  • Cli options: --output, -o
  • Type: path
  • Default: ./report/

Mode

The mode of detection quality.

  • strict - use all types of symbols as token, skip only blocks marked as ignored.
  • mild - skip blocks marked as ignored and new lines and empty symbols.
  • weak - skip blocks marked as ignored and new lines and empty symbols and comments.

Note: You can develop a custom mode; see the API section.

  • Cli options: --mode, -m
  • Type: string
  • Default: mild
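
One way to picture the three modes is as token filters that build on each other. The sketch below follows the descriptions above; the `Token` shape and kind names are hypothetical and may differ from the real interfaces in @jscpd/core.

```typescript
// Hypothetical token shape for illustration only.
type TokenKind = "code" | "comment" | "newline" | "empty" | "ignore";

interface Token {
  kind: TokenKind;
  value: string;
}

// A mode decides whether a token takes part in detection (true = keep).
type Mode = (token: Token) => boolean;

// strict: skip only tokens from blocks marked as ignored.
const strict: Mode = (t) => t.kind !== "ignore";

// mild: additionally skip new lines and empty symbols.
const mild: Mode = (t) => strict(t) && t.kind !== "newline" && t.kind !== "empty";

// weak: additionally skip comments.
const weak: Mode = (t) => mild(t) && t.kind !== "comment";

// Apply a mode to a token stream.
function applyMode(tokens: Token[], mode: Mode): Token[] {
  return tokens.filter(mode);
}
```

A custom mode under these assumptions would be just another predicate of the same type, for example one that also drops string literals.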

Format

The list of formats to check for duplications. Over 150 formats are available.

Example:

$ jscpd --format "php,javascript,markup,css" /path/to/files

  • Cli options: --format, -f
  • Type: string
  • Default: {all formats}

Blame

Get information about authors and dates of duplicated blocks from git.

  • Cli options: --blame, -b
  • Type: boolean
  • Default: false

Silent

Suppress most of the console output and print only a short summary.

Example:

$ jscpd /path/to/source --silent
Duplications detection: Found 60 exact clones with 3414(46.81%) duplicated lines in 100 (31 formats) files.
Execution Time: 1381.759ms

  • Cli options: --silent, -s
  • Type: boolean
  • Default: false

Absolute

Use the absolute path in reports.

  • Cli options: --absolute, -a
  • Type: boolean
  • Default: false

Ignore Case

Ignore case of symbols in code (experimental).

  • Cli options: --ignoreCase
  • Type: boolean
  • Default: false

No Symlinks

Do not follow symlinks.

  • Cli options: --noSymlinks, -n
  • Type: boolean
  • Default: false

Skip Local

Detect duplications across different folders only. For --skipLocal to work correctly, provide a list of paths with more than one item.

Example:

$ jscpd --skipLocal /path/to/folder1/ /path/to/folder2/

This detects clones across separate folders only; clones within the same folder are skipped.

  • Cli options: --skipLocal
  • Type: boolean
  • Default: false
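
The effect of --skipLocal can be sketched as a filter over clone pairs. This is only an illustration of the behaviour described above: the `ClonePair` shape and both helper names are hypothetical, and jscpd may apply the rule at a different stage of detection.

```typescript
// Hypothetical clone pair: only the two file paths matter here.
interface ClonePair {
  firstFile: string;
  secondFile: string;
}

// Find which of the given root folders contains the file, if any.
function rootOf(file: string, roots: string[]): string | undefined {
  return roots.find((root) => file.startsWith(root));
}

// Keep only clones whose two files live under different root folders.
function keepCrossFolderClones(clones: ClonePair[], roots: string[]): ClonePair[] {
  return clones.filter((clone) => {
    const a = rootOf(clone.firstFile, roots);
    const b = rootOf(clone.secondFile, roots);
    return a !== undefined && b !== undefined && a !== b;
  });
}
```

With roots `/path/to/folder1/` and `/path/to/folder2/`, a clone between two files in folder1 is dropped, while a clone between a file in folder1 and a file in folder2 is kept. Roots are written with a trailing slash so that a prefix such as `folder1` does not also match `folder10`.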

Formats Extensions

Define the list of formats with file extensions. Over 150 formats are available.

In the following example, jscpd analyzes *.es and *.es6 files as javascript and *.dt files as dart (the value is quoted so that the shell does not treat ; as a command separator):

$ jscpd --formats-exts "javascript:es,es6;dart:dt" /path/to/code

Note: formats defined with this option replace the default configuration. Define all the formats you need manually, or create two configurations for running jscpd.

  • Cli options: --formats-exts
  • Type: string
  • Default: null
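
The value syntax for --formats-exts (formats separated by `;`, extensions separated by `,`) can be read as a map from format to extensions. The parser below is a sketch of that reading, not jscpd's own parsing code; the function name is hypothetical.

```typescript
// Illustrative parser for the "format:ext,ext;format:ext" syntax.
function parseFormatsExts(value: string): Record<string, string[]> {
  const result: Record<string, string[]> = {};
  for (const group of value.split(";")) {
    const [format, exts] = group.split(":");
    if (!format || !exts) continue; // skip malformed groups
    result[format.trim()] = exts
      .split(",")
      .map((ext) => ext.trim())
      .filter((ext) => ext.length > 0);
  }
  return result;
}
```

`parseFormatsExts("javascript:es,es6;dart:dt")` yields `{ javascript: ["es", "es6"], dart: ["dt"] }`.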

Store

Stores are used to collect information about code. By default, all information is kept in memory.

Available stores:

  • leveldb - stores all data in files. Recommended for big repositories. Install @jscpd/leveldb-store first;

Note: You can develop a custom store; see the @jscpd/finder package, and @jscpd/leveldb-store as an example.

  • Cli options: --store
  • Type: string
  • Default: null

Ignore Pattern

Ignore code blocks matching the regexp patterns.

  • Cli options: --ignore-pattern
  • Type: string
  • Default: null

Example:

$ jscpd /path/to/source --ignore-pattern "import.*from\s*'.*'"

Excludes import statements from the calculation.
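
To see what the example pattern does and does not match, the sketch below applies the same regular expression line by line. How jscpd applies the pattern internally is not described here, so the line-based filtering and the helper name are assumptions.

```typescript
// The same pattern as in the CLI example, written as a JavaScript RegExp.
const importPattern = new RegExp("import.*from\\s*'.*'");

// Hypothetical helper: drop every line that matches the ignore pattern.
function excludeMatchingLines(lines: string[], pattern: RegExp): string[] {
  return lines.filter((line) => !pattern.test(line));
}

const exampleLines = [
  "import React from 'react';",
  'import lodash from "lodash";', // double quotes: not matched by this pattern
  "const answer = 42;",
];

const remainingLines = excludeMatchingLines(exampleLines, importPattern);
```

Here `remainingLines` keeps the double-quoted import and the `const` line, because the pattern only covers single-quoted module paths. A project that mixes quote styles would need a broader pattern.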

Config File

Put a .jscpd.json file in the root of the project:

{
  "threshold": 0,
  "reporters": ["html", "console", "badge"],
  "ignore": ["**/__snapshots__/**"],
  "absolute": true
}

You can also use a jscpd section in package.json:

{
  ...
  "jscpd": {
    "threshold": 0.1,
    "reporters": ["html", "console", "badge"],
    "ignore": ["**/__snapshots__/**"],
    "absolute": true,
    "gitignore": true
  }
  ...
}

Exit code

By default, the tool exits with code 0 even if code duplications were detected. This behaviour can be changed by specifying a custom exit code for error states.

Example:

$ jscpd --exitCode 1 .

  • Cli options: --exitCode
  • Type: number
  • Default: 0

Ignored Blocks

Mark blocks in code as ignored:

/* jscpd:ignore-start */
import lodash from 'lodash';
import React from 'react';
import {User} from './models';
import {UserService} from './services';
/* jscpd:ignore-end */

<!--
// jscpd:ignore-start
-->
<meta data-react-helmet="true" name="theme-color" content="#cb3837"/>
<link data-react-helmet="true" rel="stylesheet" href="https://static.npmjs.com/103af5b8a2b3c971cba419755f3a67bc.css"/>
<link data-react-helmet="true" rel="stylesheet" href="https://static.npmjs.com/cms/flatpages.css"/>
<link data-react-helmet="true" rel="apple-touch-icon" sizes="120x120" href="https://static.npmjs.com/58a19602036db1daee0d7863c94673a4.png"/>
<link data-react-helmet="true" rel="apple-touch-icon" sizes="144x144" href="https://static.npmjs.com/7a7ffabbd910fc60161bc04f2cee4160.png"/>
<link data-react-helmet="true" rel="apple-touch-icon" sizes="152x152" href="https://static.npmjs.com/34110fd7686e2c90a487ca98e7336e99.png"/>
<link data-react-helmet="true" rel="apple-touch-icon" sizes="180x180" href="https://static.npmjs.com/3dc95981de4241b35cd55fe126ab6b2c.png"/>
<link data-react-helmet="true" rel="icon" type="image/png" href="https://static.npmjs.com/b0f1a8318363185cc2ea6a40ac23eeb2.png" sizes="32x32"/>
<!--
// jscpd:ignore-end
-->
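
The marker comments can be thought of as switching detection off and on again. The sketch below shows that idea with a simple line-based filter; it is an illustration only, since jscpd works on tokens rather than raw lines, and the function name is hypothetical.

```typescript
// Line-based illustration of the ignore markers (jscpd itself works on tokens).
function stripIgnoredBlocks(source: string): string {
  const kept: string[] = [];
  let ignoring = false;
  for (const line of source.split("\n")) {
    if (line.includes("jscpd:ignore-start")) {
      ignoring = true; // drop the marker line and everything after it
      continue;
    }
    if (line.includes("jscpd:ignore-end")) {
      ignoring = false; // drop the marker line, resume keeping lines
      continue;
    }
    if (!ignoring) kept.push(line);
  }
  return kept.join("\n");
}

const annotated = [
  "/* jscpd:ignore-start */",
  "import lodash from 'lodash';",
  "import React from 'react';",
  "/* jscpd:ignore-end */",
  "export const answer = 42;",
].join("\n");

const visibleToDetector = stripIgnoredBlocks(annotated);
```

Under these assumptions `visibleToDetector` contains only the `export const answer = 42;` line. For the HTML form, where the markers sit inside a multi-line comment, this simple filter would leave the surrounding `<!--` and `-->` lines in place.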

Reporters

HTML

Demo report

Badge

jscpd

More info jscpd-badge-reporter

PMD CPD XML

<?xml version="1.0" encoding="utf-8"?>
<pmd-cpd>
  <duplication lines="10">
      <file path="/path/to/file" line="1">
        <codefragment><![CDATA[ ...first code fragment... ]]></codefragment>
      </file>
      <file path="/path/to/file" line="5">
        <codefragment><![CDATA[ ...second code fragment... ]]></codefragment>
      </file>
      <codefragment><![CDATA[ ...duplicated fragment... ]]></codefragment>
  </duplication>
</pmd-cpd>

JSON reporters

{
  "duplicates": [{
      "format": "javascript",
      "lines": 27,
      "fragment": "...code fragment... ",
      "tokens": 0,
      "firstFile": {
        "name": "tests/fixtures/javascript/file2.js",
        "start": 1,
        "end": 27,
        "startLoc": {
          "line": 1,
          "column": 1
        },
        "endLoc": {
          "line": 27,
          "column": 2
        }
      },
      "secondFile": {
        "name": "tests/fixtures/javascript/file1.js",
        "start": 1,
        "end": 24,
        "startLoc": {
          "line": 1,
          "column": 1
        },
        "endLoc": {
          "line": 24,
          "column": 2
        }
      }
  }],
  "statistic": {
    "detectionDate": "2018-11-09T15:32:02.397Z",
    "formats": {
      "javascript": {
        "sources": {
          "/path/to/file": {
            "lines": 24,
            "sources": 1,
            "clones": 1,
            "duplicatedLines": 26,
            "percentage": 45.33,
            "newDuplicatedLines": 0,
            "newClones": 0
          }
        },
        "total": {
          "lines": 297,
          "sources": 1,
          "clones": 1,
          "duplicatedLines": 26,
          "percentage": 45.33,
          "newDuplicatedLines": 0,
          "newClones": 0
        }
      }
    },
    "total": {
      "lines": 297,
      "sources": 6,
      "clones": 5,
      "duplicatedLines": 26,
      "percentage": 45.33,
      "newDuplicatedLines": 0,
      "newClones": 0
    }
  }
}
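
A script that consumes the JSON report might summarize it as sketched below. The interfaces cover only the fields used here and are inferred from the sample above rather than taken from jscpd's type definitions; the helper name is hypothetical.

```typescript
// Only the fields used below, inferred from the sample report.
interface ReportTotals {
  lines: number;
  clones: number;
  duplicatedLines: number;
  percentage: number;
}

interface JscpdReport {
  duplicates: unknown[];
  statistic: {
    formats: Record<string, { total: ReportTotals }>;
    total: ReportTotals;
  };
}

function describeTotals(label: string, t: ReportTotals): string {
  return `${label}: ${t.clones} clone(s), ${t.duplicatedLines} duplicated lines (${t.percentage}%)`;
}

// Hypothetical helper: one summary line per format, then the overall total.
function summarizeReport(report: JscpdReport): string[] {
  const formats = report.statistic.formats;
  const lines = Object.keys(formats).map((format) =>
    describeTotals(format, formats[format].total)
  );
  lines.push(describeTotals("total", report.statistic.total));
  return lines;
}

// In practice this object would come from JSON.parse on jscpd-report.json.
const sampleReport: JscpdReport = {
  duplicates: [],
  statistic: {
    formats: {
      javascript: {
        total: { lines: 297, clones: 1, duplicatedLines: 26, percentage: 45.33 },
      },
    },
    total: { lines: 297, clones: 5, duplicatedLines: 26, percentage: 45.33 },
  },
};

const summaryLines = summarizeReport(sampleReport);
```

The numbers in `sampleReport` are copied from the sample report; `summaryLines` then holds one line for `javascript` and one for `total`.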

API

To integrate copy/paste detection into your application, you can use the programmatic API:

jscpd Promise API

import {IClone} from '@jscpd/core';
import {jscpd} from 'jscpd';

const clones: Promise<IClone[]> = jscpd(process.argv);

jscpd async/await API

import {IClone} from '@jscpd/core';
import {jscpd} from 'jscpd';
(async () => {
  const clones: IClone[] = await jscpd(['', '', __dirname + '/../fixtures', '-m', 'weak', '--silent']);
  console.log(clones);
})();

detectClones API

import {detectClones} from "jscpd";

(async () => {
  const clones = await detectClones({
    path: [
      __dirname + '/../fixtures'
    ],
    silent: true
  });
  console.log(clones);
})()

detectClones with a persistent store

import {detectClones} from "jscpd";
import {IMapFrame, MemoryStore} from "@jscpd/core";

(async () => {
  const store = new MemoryStore<IMapFrame>();

  await detectClones({
    path: [
      __dirname + '/../fixtures'
    ],
  }, store);

  await detectClones({
    path: [
      __dirname + '/../fixtures'
    ],
    silent: true
  }, store);
})()

For deep customisation of the detection process you can build your own tool. To detect clones in the file system, you can use @jscpd/finder to build a powerful detector. To detect clones in the browser or another non-Node.js environment, you can build your own solution based on @jscpd/core.

Changelog

Changelog

Who uses jscpd

  • Code-Inspector is a code analysis and technical debt management service.
  • Mega-Linter is a 100% open-source linters aggregator for CI (GitHub Action & other CI tools) or to run locally
  • vscode-jscpd is a VSCode copy/paste detector plugin.

Contributors

This project exists thanks to all the people who contribute.

Backers

Thank you to all our backers! 🙏 [Become a backer]

Sponsors

Support this project by becoming a sponsor. Your logo will show up here with a link to your website. [Become a sponsor]

License

MIT © Andrey Kucherenko