dead-or-alive
v1.0.4
Published
check if urls are dead or alive
Downloads
6,658
Maintainers
Readme
dead-or-alive
Check if urls are dead or alive.
Contents
What is this?
The main goal of this project is to make sure URLs in docs are okay. As in, “if I link to it and a user goes there, will it work, and will they be there or somewhere else?” It uses modern JavaScript that works everywhere, and it uses the DOM when available, so you can use it in browsers, Node.js, workers.
It is made to be used by other tools.
Such as a markdown tool,
remark-lint-no-dead-urls
.
- [x] works in browser, node, worker, etc
- [x] checks if anchors (such as
#readme
) point to elements - [x] follows HTTP and HTML redirects
- [x] supports DOM clobber prefix (
user-content-
) - [x] emits warnings
When should I use this?
You can use this when you want to build a link checker.
If you want to support SPAs or other client-side JS, this doesn’t do that.
Install
This package is ESM only. In Node.js (version 18+), install with npm:
npm install dead-or-alive
In Deno with esm.sh
:
import {deadOrAlive} from 'https://esm.sh/dead-or-alive@1'
In browsers with esm.sh
:
<script type="module">
import {deadOrAlive} from 'https://esm.sh/dead-or-alive@1?bundle'
</script>
Use
import {deadOrAlive} from 'dead-or-alive'
console.log(await deadOrAlive('https://something-that-is-dead.com'))
console.log(await deadOrAlive('https://github.com'))
…yields:
{
messages: [
[1:1: Unexpected error fetching `https://something-that-is-dead.com/`] { … }
],
status: 'dead',
url: undefined,
urls: undefined
}
{
messages: [],
status: 'alive',
url: 'https://github.com/',
urls: Set(203) {
'https://github.githubassets.com/',
'https://avatars.githubusercontent.com/',
…
}
}
API
This package exports the identifiers
deadOrAlive
,
defaultAnchorAllowlist
,
and defaultSleep
.
It exports the TypeScript types
AnchorAllow
,
Options
,
Result
,
and Sleep
.
There is no default export.
deadOrAlive(href[, options])
Check if a url is dead or alive.
Parameters
href
(URL
orstring
) — URLoptions
(Options
, optional) — configuration
Returns
Result (Promise<Result>
).
Notes
To improve performance,
decrease maxRetries
and/or decrease the value used
for sleep
.
The normal behavior is to assume connections might be flakey and to sleep a
while and retry a couple times.
If you do not care about HTML redirects,
whether anchors work,
and what further URLs are used on,
you can pass checkAnchor: false
,
findUrls: false
,
and followMetaHttpEquiv: false
,
which enables a fast path without parsing HTML.
defaultAnchorAllowlist
Allow certain anchors (Array<AnchorAllow>
).
This currently allows text fragments everywhere.
defaultSleep(retries)
Calculate milliseconds to sleep between tries.
The function is defined as x ** 3 * 1000
,
so the first sleep is 1 ** 3 * 1000
is 1s,
2nd is 8s,
3rd is 27s,
etc.
Parameters
retries
(number
) — try
Returns
Milliseconds to sleep (number
).
AnchorAllow
Allow extra anchors (TypeScript type).
The first item is a regular expression to match URLs (origin and path,
so without search or hash),
and the second item is a regular expression to match hashes (without #
).
When both match,
the hash is allowed.
Type
export type AnchorAllow = [url: RegExp, anchor: RegExp]
Options
Configuration (TypeScript type).
Fields
anchorAllowlist
(Array<AnchorAllow>
, default:true
) — allow anchors; each tuple is checked to match URLs (origin and path, so without search or hash), and then to match hashes (without#
); when both match, the hash is allowed, and nomissing-anchor
error is usedcheckAnchor
(boolean
, default:true
) — check whether URL hashes point to elementsfollowMetaHttpEquiv
(boolean
, default:true
) — follow HTML redirects; a<meta content=0;to http-equiv=refresh>
can be useful for static sites such as those on GH pagesfindUrls
(boolean
, default:true
) — find URLs in the final resource; currently applies to HTMLmaxRedirects
(number
, default:5
) — inclusive maximum redirects to followmaxRetries
(number
, default:1
) — inclusive maximum number to try again on failuresresolveClobberPrefix
(boolean
, default:true
) — acceptuser-content-
prefix inid
on elementssleep
(Sleep
, default:defaultSleep
) — calculate milliseconds to sleep between triestimeout
(number
, default:3000
) — timeout for HTTP request in millisecondsuserAgent
(string
, default:'Mozilla/5.0 … Safari/537.36'
, a modern Chrome on macOS user agent) — user agent
Result
Result (TypeScript type).
Fields
messages
(Array<VFileMessage>
) — messages where the first is a fatal error when deadpermanent
(boolean
orundefined
) — whether all redirects were permanentstatus
('alive'
or'dead'
) — statusurl
(string
orundefined
) — final URL if aliveurls
(Set<string>
orundefined
) — further URLs iffindUrls: true
and the resource was HTML
Sleep
Calculate milliseconds to sleep between tries (TypeScript type).
Parameters
retries
(number
) — try
Returns
Milliseconds to sleep (number
).
Errors
dead
Unexpected not ok response `$status` (`$statusText`) on `$url`
This error is used when a URL looks temporarily or permenantly dead.
fetch
Unexpected error fetching `$url`
This error is used when for some unknown reason the URL is dead. It might be that you are offline, or the URL is temporarily dead, or something else is wrong. But it looks dead.
lost-hash-with-meta-http-equiv
Unexpected hash in URL `$url` that redirects with `meta[http-equiv=refresh]` to `$url` losing the hash, remove the hash from the original URL
This warning is used when a URL with an anchor gets an HTML redirect, which looses that anchor. You can remove the hash from the original URL. Or find a similar section on the final URL.
lost-hash-with-non-html
Unexpected hash in URL `$url` to non-html ($contentType) losing the hash, remove the hash from the original URL
This warning is used when a URL with an anchor resolves to some non-HTML thing, which might be an error. Perhaps you can remove the hash from the original URL?
lost-hash-with-redirect
Unexpected hash in URL `$url` that redirects to `$url` losing the hash, remove the hash from the original URL
This warning is used when a URL with an anchor gets an HTTP redirect, which looses that anchor. You can remove the hash from the original URL. Or find a similar section on the final URL.
max-redirect
Unexpected redirect to `$url`, too many redirects
This error is used when there are more redirects than
options.maxRedirects
.
Increase this number or use your browser to see if the URL ends up redirecting
to something that works.
missing-anchor
Unexpected missing anchor element on `$url` for fragment `$fragment`, remove if unneeded or refer to an existing element
This error is used when a hash is used in a URL but no such element can be
found.
Elements can match hashes by their id
or by their name
(when <a>
).
The prefix user-content-
is ignored on elements as that’s what GitHub and
rehype-sanitize
use to solve DOM clobbering.
Remove the hash if unneeded or refer to a different, existing element.
shared-declarative-refresh
Unexpected invalid URL `$url` in `content` on `meta[http-equiv=refresh] relative to `$url`
This error is used when an HTML redirect such as
<meta content=0;xxx http-equiv=refresh>
is used,
but xxx
does not parse as a URL.
Use your browser to see if the URL ends up redirecting to something that works.
Compatibility
This projects is compatible with maintained versions of Node.js.
When we cut a new major release,
we drop support for unmaintained versions of Node.
This means we try to keep the current release line,
dead-or-alive@1
,
compatible with Node.js 18.
Security
This package is safe.
Contribute
Yes please! See How to Contribute to Open Source.