@eyeo/mlaf

v0.5.1

Published

5 months ago

Utility library for machine learning-based ad filtering

mariokoe

adblocker ABP eyeo

Overview

Utility library for machine learning-based ad filtering in webextensions.

Installation

npm i @eyeo/mlaf

Package Contents

service: service worker/background script component for model management and inference
hideIfClassifies: content script for data preprocessing and ad filtering
tfjsPartial: a reduced version of TFJS' WebGL backend (pre-bundled with service), based on TFJS 3.19
graphMLUtils: general-purpose machine learning library for data preprocessing, model handling and TFJS handling

Minified Exports

import { service, hideIfClassifies, tfjsPartial, graphMLUtils } from "@eyeo/mlaf";

Human-Readable Code

To import or review human-readable code, see /src/**/*.src.js. These files are exported via @eyeo/mlaf/src:

import { service, hideIfClassifies, tfjsPartial, graphMLUtils } from "@eyeo/mlaf/src";

Note that even in their human-readable format, service and tfjsPartial come pre-bundled with a heavily reduced and minified version of TFJS' WebGL backend in order to reduce the library's size. For a human-readable version of TFJS' WebGL backend, see tfjs-backend-webgl.

Usage

Usage in Ad Blockers / Webextensions

While the library is platform-agnostic, it was initially designed to be integrated into ABP, forks of ABP and webextensions in general. This section outlines how to integrate the library into extensions. For other kinds of projects, skip to Usage in Other Projects.

Integration in Service Workers or Background Scripts

Establish a message listener for the content script to send inference requests to:

import { service } from "@eyeo/mlaf";

browser.runtime.onMessage.addListener(service.messageListener);

Alternatively, if you need more control over messages, you can forward only specific messages:

import { service } from "@eyeo/mlaf";

browser.runtime.onMessage.addListener((request, sender, sendResponse) => {
  if (request && typeof request.type === "string" && request.type.startsWith(service.MESSAGE_PREFIX)) {
    service.digestMessage(request)
      .then(sendResponse)
      .catch(sendResponse);
    return true;
  }
  return false;
});
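
The forwarding pattern above can also be packaged as a small factory, which makes it easy to exercise with a stub instead of the real library. This is a minimal sketch, not part of the library: createForwardingListener is a hypothetical helper, service is passed in as a parameter, and turning errors into a plain { error } object is an assumption made here so the response stays serializable.

```javascript
// Hypothetical helper: builds an onMessage listener that forwards only
// library messages. `service` must expose MESSAGE_PREFIX and digestMessage,
// as used in the snippet above.
function createForwardingListener(service) {
  return function listener(request, sender, sendResponse) {
    const isLibraryMessage =
      request != null &&
      typeof request.type === "string" &&
      request.type.startsWith(service.MESSAGE_PREFIX);

    if (!isLibraryMessage) {
      // Returning false signals that this listener will not respond.
      return false;
    }

    service.digestMessage(request)
      .then(sendResponse)
      // Assumption: errors are sent back as a plain object.
      .catch(error => sendResponse({ error: String(error) }));

    // Returning true keeps the message channel open for the async response.
    return true;
  };
}
```

In an extension this would be registered with browser.runtime.onMessage.addListener(createForwardingListener(service)).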

Configuration

The service component of this library provides a simple interface for configuring certain functionality. These options can be set during initialization or at runtime. See Telemetry for telemetry-related details.

import { service } from "@eyeo/mlaf";

service.setOptions({
  // Turn on/off telemetry.
  // Default: true
  telemetryOptOut: true,

  // Turn on/off telemetry in private browsing/incognito mode.
  // Default: false
  privateBrowsingTelemetry: true,

  // Change telemetry report probability (float between 0.0 - 1.0)
  // This value overrides the config-dictated probabilities embedded in the model of how often telemetry reports are sent.
  // A value of 0 indicates a 0% chance for reports to be sent every time inference is run.
  // A value of 0.5 indicates a 50% chance for reports to be sent every time inference is run.
  reportProbability: 0.5,

  // Change probability of telemetry reports to contain inference-related inputs (float between 0.0 - 1.0)
  // This value overrides the config-dictated probabilities embedded in the model of how often telemetry reports contain feature matrices.
  // A value of 0 indicates a 0% chance for telemetry reports to contain feature matrices.
  // A value of 0.5 indicates a 50% chance for telemetry reports to contain feature matrices.
  // Feature matrices aren't small in size (~50kb - 200kb) which is why they're limited to their own probability.
  featureReportProbability: 0.1,

  // Turn on/off allow listing/acceptable ads support when using ML to filter ads.
  // For more on acceptable ads see https://acceptableads.com/
  // If the option is set, allow listing will be performed in accordance with the option's return value.
  // If the option is not set, allow listing/acceptable ads is not supported.
  // The value is expected to be a function in accordance with webext-ad-filtering-solution's getAllowingFilters API
  // https://eyeo.gitlab.io/adblockplus/abc/webext-ad-filtering-solution/#filters
  // Default: undefined (allow listing turned off)
  exceptionRules: ewe.filters.getAllowingFilters || ((tabId) => (myAllowlistingFunction(tabId) ? [allowingFilter] : []))
});

browser.runtime.onMessage.addListener(service.messageListener);

Options can be set during initialization and/or at runtime:

import { service } from "@eyeo/mlaf";

// Set initial state
service.setOptions({ telemetryOptOut: someUserOptOutConfig });
browser.runtime.onMessage.addListener(service.messageListener);

// Update at runtime
function onOptOutChanged(val) {
  service.setOptions({ telemetryOptOut: val });
}
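
When options come from user settings or remote configuration, it may help to sanitize them before handing them to service.setOptions. The sketch below is an illustration under assumptions, not library behavior: buildMlafOptions and clampProbability are hypothetical helpers, and whether the library itself validates or clamps these values is not stated here. Only the option names documented above are used.

```javascript
// Hypothetical helper: clamp a value into the documented 0.0 - 1.0 range.
function clampProbability(value) {
  return Math.min(1, Math.max(0, value));
}

// Hypothetical helper: keep only well-typed, documented options.
function buildMlafOptions(prefs) {
  const options = {};

  for (const flag of ["telemetryOptOut", "privateBrowsingTelemetry"]) {
    if (typeof prefs[flag] === "boolean") {
      options[flag] = prefs[flag];
    }
  }

  for (const name of ["reportProbability", "featureReportProbability"]) {
    // Number.isFinite rejects NaN, Infinity and non-numbers.
    if (Number.isFinite(prefs[name])) {
      options[name] = clampProbability(prefs[name]);
    }
  }

  if (typeof prefs.exceptionRules === "function") {
    options.exceptionRules = prefs.exceptionRules;
  }

  return options;
}
```

The result could then be passed on as service.setOptions(buildMlafOptions(userPrefs)), where userPrefs stands for whatever settings object the host extension keeps.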

Integration in Content Scripts

General Usage in Content Scripts

Load content script component and inject into websites:

Content Script

import { hideIfClassifies } from "@eyeo/mlaf";

let modelName = "model-1.0.0";
let selector = ".example-selector";

hideIfClassifies(modelName, selector);

Set modelName to the desired model (automatically downloaded from https://easylist-downloads.adblockplus.org/models) and set selector to a CSS3 selector matching all DOM elements you want to run inference on.

manifest.json

"content_scripts": [
  {
    "matches": ["<all_urls>"],
    "js": ["content_script.js"],
    "all_frames": true
  }
],

Example Integration in ABP Snippets:

Load and export content script in the desired bundle:

import { hideIfClassifies } from "@eyeo/mlaf";

export const snippets = {
...
  "hideIfClassifies": hideIfClassifies,
...
};

Usage in Other Projects

Usage of the library in projects not based on webextensions:

import { tfjsPartial, graphMLUtils } from "@eyeo/mlaf";

// Model bundle consisting of model weights, model topology and preprocessing configuration.
let modelBundle = JSON.parse(modelFile);

// Dom element to run inference on
let domElement = document.querySelector("#example");

// Instantiate model from a model bundle and run inference.
let modelInstance = await graphMLUtils.loadBundledModel(tfjsPartial, modelBundle);
graphMLUtils.inference(tfjsPartial, modelInstance, modelBundle, domElement)
  .then(prediction => console.log("Is ad:", prediction));

// The "inference" function is shorthand for data preprocessing, inference and prediction.
// Each step can also be run manually in order to change or inspect the data in between.
// This reuses modelInstance from above; `domain` is assumed here to be the page's hostname.
let domain = window.location.hostname;
graphMLUtils.domToGraph(modelBundle, domElement)
  .then(graph => graphMLUtils.preprocessGraph(modelBundle, graph, domain))
  .then(preprocessedGraph => graphMLUtils.predict(tfjsPartial, modelInstance, modelBundle, preprocessedGraph))
  .then(predictions => graphMLUtils.digestPrediction(predictions))
  .then(prediction => console.log("Is ad:", prediction));
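
For inspecting intermediate data, the manual pipeline above can be written with async/await and a callback that receives each stage's output. This is a sketch under assumptions: classifyWithInspection and onStep are hypothetical names, the library objects are passed in as parameters, and the four graphMLUtils calls are used only with the argument order shown above.

```javascript
// Hypothetical wrapper around the manual pipeline. `onStep(name, value)` is
// called after every stage so intermediate data can be logged or checked.
async function classifyWithInspection(deps, input, onStep = () => {}) {
  const { graphMLUtils, tfjsPartial } = deps;
  const { modelInstance, modelBundle, domElement, domain } = input;

  const graph = await graphMLUtils.domToGraph(modelBundle, domElement);
  onStep("graph", graph);

  const preprocessed = await graphMLUtils.preprocessGraph(modelBundle, graph, domain);
  onStep("preprocessed", preprocessed);

  const predictions = await graphMLUtils.predict(
    tfjsPartial, modelInstance, modelBundle, preprocessed
  );
  onStep("predictions", predictions);

  // `await` also works if this step returns a plain value instead of a promise.
  const prediction = await graphMLUtils.digestPrediction(predictions);
  onStep("prediction", prediction);

  return prediction;
}
```

A call could look like classifyWithInspection({ graphMLUtils, tfjsPartial }, { modelInstance, modelBundle, domElement, domain }, console.log).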

Additional Information

Telemetry

As outlined under Configuration, the library has the ability to send telemetry reports in order to monitor model performance. Telemetry is only active if the library is used with models provided by eyeo, which come with a telemetry configuration. Find technical details and data disclosure here.

If you're using the library with your own models, no telemetry will be triggered.

Code Dependencies

The library has no dependencies. Note, however, that service and tfjsPartial come pre-bundled with a custom build of TFJS based on TFJS v3.19. This build contains only a WebGL backend, reduced to the minimum set of functions needed to run graph neural networks, in order to keep the library's size as small as possible.

Webextension Requirements

Required Extension Permissions

No specific extension permissions are required for the library to function.

If the content script component is injected into websites, the extension may require the following permissions:

scripting (MV3)
tabs (MV2)

Usage of Shared Resources

The following browser/webextension APIs are used:

IndexedDB
browser.runtime.sendMessage
browser.runtime.onMessage.addListener

IndexedDB is used within the background component in service. It is not accessed/exposed via the content script. The name of the database is ml.

Messaging via browser.runtime is used for communication between the content script and background component. All messages contain a field type with the message prefix ML:. Example: { type: "ML:prepare", model: "modelName" }.
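
Because all library messages share the ML: prefix, a host extension can tell them apart from its own messages with a simple string check. The following is a minimal sketch: parseMlMessageType is a hypothetical helper, and the prefix is hard-coded from the example above rather than read from the library.

```javascript
// Prefix taken from the message example above.
const ML_PREFIX = "ML:";

// Hypothetical helper: returns the part of `type` after the prefix,
// or null if the message does not belong to the library.
function parseMlMessageType(message) {
  if (
    message == null ||
    typeof message.type !== "string" ||
    !message.type.startsWith(ML_PREFIX)
  ) {
    return null;
  }
  return message.type.slice(ML_PREFIX.length);
}
```

The same check can be used the other way around, to make sure the host extension's own message types never start with the reserved prefix.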

Compatibility Information

Minimum browser versions

Google Chrome v77
Mozilla Firefox v63
Opera v64
Microsoft Edge v79
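
The minimum versions listed above can be encoded as a lookup table, for example to decide whether to enable ML-based filtering at all. This is a sketch only: isSupportedBrowser and the lowercase browser keys are hypothetical, and how the browser name and major version are detected is left to the host extension.

```javascript
// Minimum major versions, copied from the list above.
const MINIMUM_VERSIONS = new Map([
  ["chrome", 77],
  ["firefox", 63],
  ["opera", 64],
  ["edge", 79]
]);

// Hypothetical helper: true if the given browser meets the minimum version.
function isSupportedBrowser(name, majorVersion) {
  const minimum = MINIMUM_VERSIONS.get(String(name).toLowerCase());
  if (minimum === undefined) {
    // Unknown browsers are treated as unsupported rather than guessed at.
    return false;
  }
  return Number.isInteger(majorVersion) && majorVersion >= minimum;
}
```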

Webextension manifest versions

When used with ABP Snippets:

Snippets v0.9.0

Development environment

Node v16 or higher
NPM v7 or higher
Compatible with webextension-polyfill v0.8 or higher

Other

Release notes