@adguard/tsurlfilter

v3.0.6

TSUrlFilter

This is a TypeScript library that implements AdGuard's content blocking rules.

Idea

The idea is to have a single library that we can reuse for the following tasks:

Installation

You can install the package via:

  • Yarn: yarn add @adguard/tsurlfilter
  • NPM: npm install @adguard/tsurlfilter
  • PNPM: pnpm install @adguard/tsurlfilter

API description

Public properties

TSURLFILTER_VERSION

type: string

Version of the library.

Public classes

Engine

Engine is the main class of this library. It provides the filtering functionality for the loaded rules.

Constructor
/**
 * Creates an instance of Engine
 * Parses filtering rules and creates a filtering engine of them
 *
 * @param ruleStorage storage
 * @param configuration optional configuration
 *
 * @throws
 */
constructor(ruleStorage: RuleStorage, configuration?: IConfiguration | undefined);
matchRequest
/**
 * Matches the specified request against the filtering engine and returns the matching result.
 * If the frameRule parameter is not specified, the frame rule is selected by matching request.sourceUrl.
 *
 * @param request - request to check
 * @param frameRule - source frame rule or null
 * @return matching result
 */
matchRequest(request: Request, frameRule: NetworkRule | null = null): MatchingResult;
matchFrame
/**
  * Matches current frame and returns document-level allowlist rule if found.
  *
  * @param frameUrl
  */
matchFrame(frameUrl: string): NetworkRule | null;
Starting engine
import {
    BufferRuleList,
    Engine,
    FilterListPreprocessor,
    RuleStorage,
    setConfiguration,
} from '@adguard/tsurlfilter';

const rawFilter = [
  '[AdGuard]',
  '! Title: Example filter',
  '! Description: This is just an example filter.',
  'example.com##h1',
].join('\n');

// Practically, you can read this data from the storage
const processedFilter = FilterListPreprocessor.preprocess(rawFilter);
const list = new BufferRuleList(0, processedFilter.filterList, false, false, false, processedFilter.sourceMap);
const ruleStorage = new RuleStorage([list]);

const config = {
    engine: 'extension',
    version: '1.0.0',
    verbose: true,
};

setConfiguration(config);

const engine = new Engine(ruleStorage);

console.log(`Engine loaded with ${engine.getRulesCount()} rule(s)`);
Matching requests
const request = new Request(url, sourceUrl, RequestType.Document);
const result = engine.matchRequest(request);
Retrieving cosmetic data
const cosmeticResult = engine.getCosmeticResult(request, CosmeticOption.CosmeticOptionAll);

MatchingResult

MatchingResult contains all the rules matching a web request and provides methods that define how the web request should be processed.

getBasicResult
/**
 * GetBasicResult returns a rule that should be applied to the web request.
 * Possible outcomes are:
 * returns null -- bypass the request.
 * returns an allowlist rule -- bypass the request.
 * returns a blocking rule -- block the request.
 *
 * @return basic result rule
 */
getBasicResult(): NetworkRule | null;
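
The three outcomes listed above can be collapsed into a single decision. The following is a minimal sketch, not part of the library: `BasicRuleLike` and `shouldBlockRequest` are hypothetical names, and the only assumption about the rule is that it exposes `isAllowlist()`, as used in the DnsEngine example in this README.

```typescript
// Hypothetical structural stand-in for NetworkRule: only the method used here.
interface BasicRuleLike {
    isAllowlist(): boolean;
}

/**
 * Returns true only when the basic result is a blocking rule.
 * Both null and an allowlist rule mean the request is bypassed.
 */
function shouldBlockRequest(basicRule: BasicRuleLike | null): boolean {
    if (basicRule === null) {
        return false; // no matching rule: bypass the request
    }
    // allowlist rule: bypass the request, any other rule: block it
    return !basicRule.isAllowlist();
}

// Stub rules standing in for the value returned by getBasicResult().
const allowlistStub: BasicRuleLike = { isAllowlist: () => true };
const blockingStub: BasicRuleLike = { isAllowlist: () => false };
```

In real code the argument would come from `engine.matchRequest(request).getBasicResult()`.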
getCosmeticOption

Pass the returned flag as the option argument of getCosmeticResult(request: Request, option: CosmeticOption).

/**
  * Returns a bit-flag with the list of cosmetic options
  *
  * @return {CosmeticOption} mask
  */
getCosmeticOption(): CosmeticOption;
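
Because the option is a bit-flag, individual options are combined with bitwise OR and tested with bitwise AND. The sketch below illustrates the idea with a hypothetical `ExampleOption` enum. Its member names and numeric values are assumptions made for illustration and are not the library's actual `CosmeticOption` values.

```typescript
// Hypothetical flags; NOT the real CosmeticOption members or values.
enum ExampleOption {
    None = 0,
    ElementHiding = 1 << 0,
    Css = 1 << 1,
    Js = 1 << 2,
}

/** Checks whether every bit of `flag` is set in `mask`. */
function hasOption(mask: number, flag: ExampleOption): boolean {
    return (mask & flag) === flag;
}

// Combine two options into a single mask.
const exampleMask: number = ExampleOption.ElementHiding | ExampleOption.Js;
```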
Other rules
/**
  * Returns an array of replace rules
  */
getReplaceRules(): NetworkRule[]

/**
  * Returns an array of csp rules
  */
getCspRules(): NetworkRule[]

/**
  * Returns an array of cookie rules
  */
getCookieRules(): NetworkRule[]

CosmeticResult

Cosmetic result is the representation of matching cosmetic rules. It contains the following properties:

/**
 * Storage of element hiding rules
 */
public elementHiding: CosmeticStylesResult;

/**
 * Storage of CSS rules
 */
public CSS: CosmeticStylesResult;

/**
 * Storage of JS rules
 */
public JS: CosmeticScriptsResult;

/**
 * Storage of Html filtering rules
 */
public Html: CosmeticHtmlResult;

/**
 * Script rules
 */
public getScriptRules(): CosmeticRule[];
Applying cosmetic result - CSS
const css = [...cosmeticResult.elementHiding.generic, ...cosmeticResult.elementHiding.specific]
        .map((rule) => `${rule.getContent()} { display: none!important; }`);

const styleText = css.join('\n');
const injectDetails = {
    code: styleText,
    runAt: 'document_start',
};

chrome.tabs.insertCSS(tabId, injectDetails);
Applying cosmetic result - scripts
const cosmeticRules = cosmeticResult.getScriptRules();
const scriptsCode = cosmeticRules.map((x) => x.getScript()).join('\r\n');
const toExecute = buildScriptText(scriptsCode);

chrome.tabs.executeScript(tabId, {
    code: toExecute,
});

DnsEngine

DnsEngine combines host rules and network rules and is designed to quickly find the rules matching a hostname.

Constructor
/**
  * Builds an instance of dns engine
  *
  * @param storage
  */
constructor(storage: RuleStorage);
match
/**
 * Match searches over all filtering and host rules loaded to the engine
 *
 * @param hostname to check
 * @return dns result object
 */
public match(hostname: string): DnsResult;
Matching hostname
const dnsResult = dnsEngine.match(hostname);
if (dnsResult.basicRule && !dnsResult.basicRule.isAllowlist()) {
    // blocking rule found
    ...
}

if (dnsResult.hostRules.length > 0) {
    // hosts rules found
    ...
}
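
The two checks above can be folded into one helper that returns a verdict. This is only a sketch under stated assumptions: `DnsRuleLike`, `DnsResultLike`, `DnsVerdict` and `summarizeDnsResult` are hypothetical names, and checking the blocking rule before the hosts rules is a choice made for this example, not documented library behavior.

```typescript
// Hypothetical structural stand-ins for the fields used in the example above.
interface DnsRuleLike {
    isAllowlist(): boolean;
}

interface DnsResultLike {
    basicRule: DnsRuleLike | null;
    hostRules: unknown[];
}

type DnsVerdict = 'blocked' | 'hosts-rule-found' | 'allowed';

/** Summarizes a DNS result; the order of the checks is an assumption. */
function summarizeDnsResult(result: DnsResultLike): DnsVerdict {
    if (result.basicRule && !result.basicRule.isAllowlist()) {
        return 'blocked'; // blocking rule found
    }
    if (result.hostRules.length > 0) {
        return 'hosts-rule-found'; // hosts rules found
    }
    return 'allowed';
}
```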

RuleSyntaxUtils

This module is not used by the engine directly, but other libraries can use it.

Public methods
/**
 * Checks if rule can be matched by domain
 *
 * @param node Rule node
 * @param domain Domain to check
 */
public static isRuleForDomain(node: AnyRule, domain: string): boolean;
/**
 * Checks if rule can be matched by URL
 *
 * @param node Rule node
 * @param url URL to check
 */
public static isRuleForUrl(node: AnyRule, url: string): boolean;

FilterListPreprocessor

Provides a utility to process filter lists before using them in the engine.

Understanding the data structure
Requirements

The processed data structure must meet two requirements:

  1. Provide the data necessary for the operation of the filtering engine:
    • Binary serialized filter list without invalid rules and comments
  2. Make it possible to display the applied rules (and their original equivalent) in the filtering log (only for debugging purposes):
    • Source map (maps the start index in the serialized buffer to the line start index in the raw filter list)
    • Raw filter list
    • Conversion map (maps the start index in the raw filter list to the original rule, if the rule was converted)
PreprocessedFilterList interface
/**
 * Represents a preprocessed filter list.
 */
interface PreprocessedFilterList {
    /**
     * Raw filter list, but rules are converted to the AdGuard format.
     */
    rawFilterList: string;

    /**
     * Serialized version of the rawFilterList.
     */
    filterList: Uint8Array[];

    /**
     * Mapping between the original and converted rules.
     * Key is the line start index in the rawFilterList, value is the original rule.
     */
    conversionMap: Record<string, string>;

    /**
     * Source map for the rawFilterList. Key is the start index in the serialized buffer, value is the line start index in the rawFilterList.
     */
    sourceMap: Record<string, number>;
}
Example to illustrate the requirements

Suppose we want to use the following filter list:

! Title: Test filter list
example.com##+js(set, foo, 1)
example.com##h1

As you can see, the filter list contains two rules:

  • the first is a uBO rule, and
  • the second is a common rule (which is compatible with AdGuard).

Before loading this list into the engine, we have to convert its rules to AdGuard format. So we need such a rawFilterList:

! Title: Test filter list
example.com#%#//scriptlet('ubo-set', 'foo', '1')
example.com##h1

and its binary serialized form, which will be used by the engine: filterList.

When we initialize the engine, it scans the byte buffer filterList and builds the filtering engine, lookup tables, etc. and assigns the start indexes from the byte buffer filterList to the TSUrlFilter rule instances.

We don't know anything about the original rules at the engine level and we don't need to know it.

However, at the extension level, we know all the properties of the preprocessed filter list. When a rule is applied, the engine reports the applied TSUrlFilter rule instance and its start index in the byte buffer filterList.

So, when we want to display the applied rule in the filtering log, we can do the following:

  1. We will receive the byte buffer start index for the applied rule from the engine.

  2. To get the applied rule text, we will use the sourceMap property to map the byte buffer start index to the line start index in the rawFilterList:

    • We can do this with the getRuleSourceIndex helper method:

      const lineStartIndex = getRuleSourceIndex(byteBufferStartIndex, sourceMap);
    • We can get rule text from the rawFilterList by the line start index with the getRuleSourceText helper function:

      const ruleText = getRuleSourceText(lineStartIndex, rawFilterList);
  3. To get the original rule text, we will look up the line start index in the conversionMap property. If the conversionMap has a key for that index, its value is the original rule text.

For example, for the first rule, the applied rule text will be example.com#%#//scriptlet('ubo-set', 'foo', '1') and the original rule text will be example.com##+js(set, foo, 1). For the second rule, the applied rule text will be example.com##h1 and the original rule text will be undefined, which means that the rule was not converted.
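
The lookup steps above can be sketched end to end with simplified stand-ins. Everything below is illustrative: `lookupLineStart`, `readLineAt` and `describeAppliedRule` are hypothetical helpers (not the library's `getRuleSourceIndex` and `getRuleSourceText`), the byte offsets 4 and 40 are made up, and the sketch assumes exact-key lookups and `\n` line endings.

```typescript
// Simplified stand-in: exact-key lookup of a byte buffer index in the source map.
function lookupLineStart(byteIndex: number, sourceMap: Record<string, number>): number | undefined {
    return sourceMap[String(byteIndex)];
}

// Simplified stand-in: reads one line starting at lineStart, assuming '\n' endings.
function readLineAt(lineStart: number, rawFilterList: string): string {
    const lineEnd = rawFilterList.indexOf('\n', lineStart);
    return rawFilterList.slice(lineStart, lineEnd === -1 ? undefined : lineEnd);
}

// The converted filter list from the example above.
const rawList = [
    '! Title: Test filter list',
    "example.com#%#//scriptlet('ubo-set', 'foo', '1')",
    'example.com##h1',
].join('\n');

const scriptletStart = rawList.indexOf('example.com#%#');
const hidingStart = rawList.indexOf('example.com##h1');

// Made-up byte offsets (4 and 40) mapped to the real line start indexes.
const exampleSourceMap: Record<string, number> = { 4: scriptletStart, 40: hidingStart };

// Only the converted rule has an entry; the value is the original rule text.
const exampleConversionMap: Record<string, string> = {
    [scriptletStart]: 'example.com##+js(set, foo, 1)',
};

function describeAppliedRule(byteIndex: number): { applied: string | undefined; original: string | undefined } {
    const lineStart = lookupLineStart(byteIndex, exampleSourceMap);
    if (lineStart === undefined) {
        return { applied: undefined, original: undefined };
    }
    return {
        applied: readLineAt(lineStart, rawList),
        original: exampleConversionMap[String(lineStart)],
    };
}
```

Calling `describeAppliedRule(4)` yields the scriptlet rule together with its original uBO form, while `describeAppliedRule(40)` yields `example.com##h1` with an undefined original.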

Public methods
/**
 * Processes the raw filter list and converts it to the AdGuard format.
 *
 * @param filterList Raw filter list to convert.
 * @param parseHosts If true, the preprocessor will parse host rules.
 * @returns A {@link PreprocessedFilterList} object which contains the converted filter list,
 * the mapping between the original and converted rules, and the source map.
 */
public static preprocess(filterList: string, parseHosts = false): PreprocessedFilterList;
/**
 * Gets the original filter list text from the preprocessed filter list.
 *
 * @param preprocessedFilterList Preprocessed filter list.
 * @returns Original filter list text.
 */
public static getOriginalFilterListText(preprocessedFilterList: RawListWithConversionMap): string;
/**
 * Gets the original rules from the preprocessed filter list.
 *
 * @param preprocessedFilterList Preprocessed filter list.
 * @returns Array of original rules.
 */
public static getOriginalRules(preprocessedFilterList: RawListWithConversionMap): string[];

where

type RawListWithConversionMap = Pick<PreprocessedFilterList, 'rawFilterList' | 'conversionMap'>;

DeclarativeFilterConverter

Converts a list of IFilters to a single rule set or to a list of rule sets. See examples/manifest-v3/ for an example usage.

Public methods
/**
 * Extracts content from the provided static filter and converts to a set
 * of declarative rules with error-catching non-convertible rules and
 * checks that converted ruleset matches the constraints (reduce if not).
 *
 * @param filterList Filter ({@link IFilter}) to convert.
 * @param options Options from {@link DeclarativeConverterOptions}.
 *
 * @throws Error {@link UnavailableFilterSourceError} if filter content
 * is not available OR some of {@link ResourcesPathError},
 * {@link EmptyOrNegativeNumberOfRulesError},
 * {@link NegativeNumberOfRegexpRulesError}.
 * @see {@link DeclarativeFilterConverter#checkConverterOptions}
 * for details.
 *
 * @returns Item of {@link ConversionResult}.
 */
convertStaticRuleSet(
    filterList: IFilter,
    options?: DeclarativeConverterOptions,
): Promise<ConversionResult>;
/**
 * Extracts content from the provided list of dynamic filters and converts
 * all together into one set of rules with declarative rules.
 * During the conversion, it catches unconvertible rules and checks if
 * the converted ruleset matches the constraints (reduce if not).
 *
 * @param filterList List of {@link IFilter} to convert.
 * @param staticRuleSets List of already converted static rulesets. It is
 * needed to apply $badfilter rules from dynamic rules to these rules from
 * converted filters.
 * @param options Options from {@link DeclarativeConverterOptions}.
 *
 * @throws Error {@link UnavailableFilterSourceError} if filter content
 * is not available OR some of {@link ResourcesPathError},
 * {@link EmptyOrNegativeNumberOfRulesError},
 * {@link NegativeNumberOfRegexpRulesError}.
 * @see {@link DeclarativeFilterConverter#checkConverterOptions}
 * for details.
 *
 * @returns Item of {@link ConversionResult}.
 */
convertDynamicRuleSets(
    filterList: IFilter[],
    staticRuleSets: IRuleSet[],
    options?: DeclarativeConverterOptions,
): Promise<ConversionResult>;
Example of use
import { CompatibilityTypes, FilterListPreprocessor, setConfiguration } from '@adguard/tsurlfilter';
import { DeclarativeFilterConverter, Filter } from '@adguard/tsurlfilter/es/declarative-converter';

const rawFilter1 = [
    '||example.com^$document',
    '/ads.js^$script,third-party,domain=example.com|example.net',
].join('\n');

const rawFilter2 = [
    '||example.com^$document',
    '-ad-350-',
    // flags second rule from rawFilter1 as badfilter
    '/ads.js^$script,third-party,domain=example.com|example.net,badfilter',
].join('\n');

setConfiguration({
    engine: 'extension',
    version: '3',
    verbose: true,
    compatibility: CompatibilityTypes.Extension,
});

const converter = new DeclarativeFilterConverter();

let filterId = 0;

;(async () => {
    const { ruleSet: staticRuleSet } = await converter.convertStaticRuleSet(
        new Filter(filterId++, {
            getContent: () => Promise.resolve(FilterListPreprocessor.preprocess(rawFilter1)),
        }),
    );

    // get declarative rules from static rule set
    console.log(await staticRuleSet.getDeclarativeRules());

    const { declarativeRulesToCancel, ruleSet: dynamicRuleSet } = await converter.convertDynamicRuleSets(
        [
            new Filter(filterId++, {
                getContent: () => Promise.resolve(FilterListPreprocessor.preprocess(rawFilter2)),
            }),
        ],
        [staticRuleSet],
    );

    // prints the rule from rawFilter1 that was flagged as badfilter
    console.log(declarativeRulesToCancel);

    // get declarative rules from dynamic rule set
    console.log(await dynamicRuleSet.getDeclarativeRules());
})();
Declarative converter documentation

For more information about the declarative converter, see its documentation.

Problems

QueryTransform

  • Regexp is not supported in remove params
  • We cannot implement inversion in remove params
  • We cannot filter by request methods
  • Only one rule applies for a redirect. For this reason, different rules with the same url may not work. Example below:
! Works
||testcases.adguard.com$removeparam=p1case6|p2case6

! Failed
||testcases.adguard.com$removeparam=p1case6

! Works
||testcases.adguard.com$removeparam=p2case6
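
The working variant lists both parameters in one rule, so a single redirect removes them together. As a rough illustration, a Chrome declarativeNetRequest rule of that shape could look like the object built below. `buildRemoveParamsRule` and `RemoveParamsRule` are hypothetical names, the `id`, `urlFilter` and `resourceTypes` values are assumptions, and the converter's actual output is not shown in this README.

```typescript
// Minimal local typing of the declarativeNetRequest fields used in this sketch.
interface RemoveParamsRule {
    id: number;
    action: {
        type: 'redirect';
        redirect: { transform: { queryTransform: { removeParams: string[] } } };
    };
    condition: { urlFilter: string; resourceTypes: string[] };
}

/** Builds one redirect rule that removes all given query parameters at once. */
function buildRemoveParamsRule(id: number, urlFilter: string, params: string[]): RemoveParamsRule {
    return {
        id,
        action: {
            type: 'redirect',
            redirect: { transform: { queryTransform: { removeParams: [...params] } } },
        },
        // resourceTypes is an assumed value for illustration.
        condition: { urlFilter, resourceTypes: ['main_frame'] },
    };
}

// Both parameters from the working example, merged into a single rule.
const mergedRule = buildRemoveParamsRule(1, '||testcases.adguard.com', ['p1case6', 'p2case6']);
```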

Development

This project is part of the @adguard/extensions monorepo. It is highly recommended to use lerna for commands as it will execute scripts in the correct order and can cache dependencies.

npx lerna run --scope=@adguard/tsurlfilter:<script>

NPM scripts

  • start: Run build in watch mode
  • test:watch: Run test suite in interactive watch mode
  • test:prod: Run linting and generate coverage
  • build: Generate bundles and typings, create docs
  • lint: Lints code

Excluding peerDependencies

When developing a library, you may want to declare some peer dependencies and exclude them from the final bundle. The Rollup docs describe how to do that.

The setup is already in place: you only need to add the dependency name to the external property in rollup.config.js. For example, to exclude lodash, write external: ['lodash'].

Git Hooks

A pre-commit hook is already set up to format your code with ESLint.