robot-directives

v0.4.0

Published

2 years ago

Parse robot directives within HTML meta and/or HTTP headers.

Downloads

56,232

stevenvachon

crawlers header html http meta metadata nofollow noindex robots robots.txt seo spiders

<meta name="robots" content="noindex,nofollow">
X-Robots-Tag: noindex,nofollow
etc

Note: this library is not responsible for parsing any HTML.
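
Since HTML parsing is left to the caller, something upstream has to supply the name and content values. The following is a minimal sketch, assuming a regex scan is acceptable for simple, well-formed markup (a real HTML parser is more robust). `extractMetaTags` is a hypothetical helper, not part of robot-directives:

```javascript
// Hypothetical helper: pull name/content pairs out of <meta> tags.
// A regex is only adequate for simple markup; prefer a real HTML parser otherwise.
function extractMetaTags(html) {
  const tags = [];
  const tagPattern = /<meta\b[^>]*>/gi;
  const attrPattern = /([a-z-]+)\s*=\s*(?:"([^"]*)"|'([^']*)')/gi;

  for (const [tag] of html.matchAll(tagPattern)) {
    const attrs = {};
    for (const [, key, doubleQuoted, singleQuoted] of tag.matchAll(attrPattern)) {
      attrs[key.toLowerCase()] = doubleQuoted !== undefined ? doubleQuoted : singleQuoted;
    }
    // Only tags carrying both attributes are useful to the caller.
    if (attrs.name !== undefined && attrs.content !== undefined) {
      tags.push({ name: attrs.name, content: attrs.content });
    }
  }
  return tags;
}

const found = extractMetaTags('<meta name="robots" content="noindex,nofollow">');
console.log(found);
```

Each resulting pair could then be passed to meta(name, content).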

Installation

Node.js >= 6 is required. To install, type this at the command line:

npm install robot-directives

Usage

const RobotDirectives = require('robot-directives');

const robots = new RobotDirectives(options)
  .header('googlebot: noindex')
  .meta('bingbot', 'unavailable_after: 1-Jan-3000 00:00:00 EST')
  .meta('robots', 'noarchive,nocache,nofollow');

robots.is(RobotDirectives.NOFOLLOW);
//-> true

robots.is([ RobotDirectives.NOFOLLOW, RobotDirectives.FOLLOW ]);
//-> false

robots.isNot([ RobotDirectives.ARCHIVE, RobotDirectives.FOLLOW ]);
//-> true

robots.is(RobotDirectives.NOINDEX, {
  currentTime: () => new Date('jan 1 3001').getTime(),
  userAgent: 'Bingbot/2.0'
});
//-> true

RobotDirectives.isBot('googlebot');
//-> true

Constants

For use in comparison, the following directives are available as static properties on the constructor:

ALL
ARCHIVE
CACHE
FOLLOW
IMAGEINDEX
INDEX
NOARCHIVE
NOCACHE
NOFOLLOW
NOIMAGEINDEX
NOINDEX
NONE
NOODP
NOSNIPPET
NOTRANSLATE
ODP
SNIPPET
TRANSLATE

Methods

`header(value)`

Parses, stores and cascades the value of an X-Robots-Tag HTTP header.
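
As the usage example shows, a header value may carry an optional user-agent prefix ('googlebot: noindex') ahead of a comma-separated directive list. The sketch below illustrates one way such a value could be split. It is an assumption-laden illustration, not the library's parser: `splitRobotsTag` and the 'robots' label for "all crawlers" are hypothetical, and dates that contain commas are not handled.

```javascript
// Hypothetical sketch of splitting an X-Robots-Tag value; not the library's parser.
function splitRobotsTag(value) {
  const trimmed = value.trim();
  // A leading "token:" is treated as a user-agent prefix...
  const match = /^([a-z0-9_-]+)\s*:\s*(.*)$/i.exec(trimmed);

  let agent = 'robots'; // assumed label meaning "applies to every crawler"
  let rest = trimmed;
  // ...unless the token is the unavailable_after directive, which also uses a colon.
  if (match && match[1].toLowerCase() !== 'unavailable_after') {
    agent = match[1].toLowerCase();
    rest = match[2];
  }

  const directives = rest
    .split(',')
    .map((directive) => directive.trim())
    .filter((directive) => directive !== '');

  return { agent, directives };
}

console.log(splitRobotsTag('googlebot: noindex'));
console.log(splitRobotsTag('noarchive, nofollow'));
```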

`is(directive[, options])`

Validates a directive or a list of directives against the parsed instructions. The directive argument can be a String or an Array. The options argument, if defined, overrides any options passed to the constructor. Returns true if all directives are valid.

`isNot(directive[, options])`

Inversion of is(). Returns true if none of the directives are valid.

`meta(name, content)`

Parses, stores and cascades the data within a <meta> HTML element.

`oneIs(directives[, options])`

A variation of is(). Returns true if at least one directive is valid.

`oneIsNot(directives[, options])`

Inversion of oneIs(). A value of true is returned if at least one directive is not valid.
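
The four comparison methods differ only in quantifier (all versus at least one) and polarity (valid versus not valid). The following is a minimal model of those semantics, assuming the parsed state can be reduced to a set of directives in effect. The `active` set and the standalone functions are illustrative stand-ins, not the library's internals:

```javascript
// Hypothetical stand-in for parsed state: the directives currently in effect.
const active = new Set(['noarchive', 'nocache', 'nofollow']);

const holds = (directive) => active.has(directive);
const toList = (input) => (Array.isArray(input) ? input : [input]);

// all valid / all not valid / at least one valid / at least one not valid
const is = (input) => toList(input).every(holds);
const isNot = (input) => toList(input).every((d) => !holds(d));
const oneIs = (input) => toList(input).some(holds);
const oneIsNot = (input) => toList(input).some((d) => !holds(d));

console.log(is(['nofollow', 'follow']));       // both must hold
console.log(oneIs(['nofollow', 'follow']));    // one is enough
console.log(isNot(['archive', 'follow']));     // neither may hold
console.log(oneIsNot(['nofollow', 'follow'])); // one failing is enough
```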

Functions

`isBot(botname)`

Returns true if botname is a valid bot/crawler/spider name or user-agent.

Options

`allIsReadonly`

Type: Boolean
Default value: true
When true, declaring the 'all' directive will not affect other directives. This is how most search crawlers behave.

`currentTime`

Type: Function
Default value: function(){ return Date.now() }
A function that returns the current time as a millisecond timestamp, used when checking whether unavailable_after has expired.
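
Supplying the clock as a function makes expiry checks deterministic, which is mainly useful in tests. The following is a minimal sketch of the idea, not the library's code. `hasExpired` is a hypothetical helper, and ISO 8601 dates are used here because they parse consistently across JavaScript engines:

```javascript
// Hypothetical check: has an unavailable_after date already passed?
// currentTime mirrors the option's shape: a function returning a millisecond timestamp.
function hasExpired(unavailableAfter, currentTime = () => Date.now()) {
  return currentTime() > unavailableAfter.getTime();
}

const cutoff = new Date('3000-01-01T00:00:00Z');
const frozenClock = () => new Date('3001-01-01T00:00:00Z').getTime();

console.log(hasExpired(cutoff));              // evaluated against the real clock
console.log(hasExpired(cutoff, frozenClock)); // evaluated against the injected clock
```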

`restrictive`

Type: Boolean
Default value: true
Directive conflicts will be resolved by selecting the most restrictive value. Example: 'noindex,index' will resolve to 'noindex' because it is more restrictive. This is how Googlebot behaves, but others may differ.
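
The 'noindex,index' example can be modelled as follows. This sketch covers only the restrictive: true case described above and only three directive pairs. `PAIRS` and `resolveRestrictive` are hypothetical names, and the library's actual resolution logic may differ:

```javascript
// Hypothetical pairs of [permissive, restrictive] directives (a subset of the constants).
const PAIRS = [
  ['index', 'noindex'],
  ['follow', 'nofollow'],
  ['archive', 'noarchive'],
];

// Resolve each pair, preferring the restrictive member when both were declared.
function resolveRestrictive(directives) {
  const seen = new Set(directives.map((d) => d.trim().toLowerCase()));
  const resolved = [];
  for (const [permissive, restrictive] of PAIRS) {
    if (seen.has(restrictive)) {
      resolved.push(restrictive); // wins whether or not the permissive form is present
    } else if (seen.has(permissive)) {
      resolved.push(permissive);
    }
  }
  return resolved;
}

console.log(resolveRestrictive('noindex,index'.split(',')));
```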

`userAgent`

Type: String
Default value: ''
The HTTP user-agent to use when retrieving instructions via is() and isNot().
