seekr
v1.4.3
A package for detecting keywords on the Internet Computer
seekr
A tool for finding pages that contain words or phrases in the DOM.
Overview
seekr is a command-line tool for finding pages that contain given words or phrases in the DOM. It reads a list of words or phrases from a file and traverses pages, searching the DOM for those terms. A feature expands each word in the list into its additions, subtractions, substitutions, and transpositions, and searches for those as well. The tool outputs a list of the pages that contain the terms.
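The expansion step can be illustrated with a minimal sketch. It assumes that each variant differs from the original word by a single character edit over the lowercase Latin alphabet. The function name expandWord and the alphabet are hypothetical and are not taken from seekr's source.

```typescript
// Hypothetical sketch: seekr's real expansion code is not shown in this README.
// Assumes single-character edits over a lowercase Latin alphabet.
const ALPHABET = "abcdefghijklmnopqrstuvwxyz";

function expandWord(word: string): Set<string> {
  const variants = new Set<string>();

  for (let i = 0; i <= word.length; i++) {
    const head = word.slice(0, i);
    const tail = word.slice(i);

    // Subtraction: drop the character at position i.
    if (tail.length > 0) {
      variants.add(head + tail.slice(1));
    }

    // Transposition: swap the characters at positions i and i + 1.
    if (tail.length > 1) {
      variants.add(head + tail.charAt(1) + tail.charAt(0) + tail.slice(2));
    }

    for (const c of ALPHABET) {
      // Substitution: replace the character at position i.
      if (tail.length > 0) {
        variants.add(head + c + tail.slice(1));
      }
      // Addition: insert a character before position i.
      variants.add(head + c + tail);
    }
  }

  // Some edits reproduce the input (e.g. substituting a letter with itself).
  variants.delete(word);
  return variants;
}
```

Under these assumptions, expandWord("lettuce") would contain variants such as "letuce" (subtraction) and "lettcue" (transposition), each of which could then be searched for alongside the original term.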
seekr is currently wired to the Internet Computer as its source of pages to search. It can be easily modified to search other sources.
Requirements
seekr requires:
Installation
From the root of the project, run:
yarn
Usage
Create a file with a list of words or phrases to search for. For example, dictionary.txt:
cabbage
lettuce
Create a file with a list of domains that are crawlable. For example, interesting_domains.txt:
google.com
wikipedia.org
If you want to exclude certain words from triggering a result, create an excluded_words.txt file and add the words to it:
pizza
salt
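How the dictionary and the exclusion list might interact is sketched below. The README does not specify the matching rules, so this sketch assumes case-insensitive substring matching and assumes that an excluded word is simply dropped from the set of search terms (for instance, when expansion produces a variant that is an ordinary word). The function name findTerms is hypothetical.

```typescript
// Hypothetical sketch: the name and the matching rules are assumptions,
// not seekr's actual API.
function findTerms(text: string, terms: string[], excluded: string[]): string[] {
  // Normalize the exclusion list once for case-insensitive lookups.
  const skip = new Set(excluded.map((w) => w.toLowerCase()));
  const haystack = text.toLowerCase();

  return terms
    .map((t) => t.toLowerCase())
    // Keep a term only if it is not excluded and occurs in the text.
    .filter((t) => !skip.has(t) && haystack.includes(t));
}
```

With the example files above, a page whose text mentions both cabbage and salt would be reported for "cabbage" only, even if "salt" had ended up among the search terms.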
Entries in the interesting_domains.txt file should be in the format domain.com or subdomain.domain.com.
Any links found during the crawl whose domain is listed in the interesting_domains.txt file will be searched as well.
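A minimal sketch of the domain check follows. It assumes one entry per line and an exact hostname match, so a subdomain is crawled only if it is listed itself. The names parseDomainList and isCrawlable are hypothetical and are not part of seekr.

```typescript
// Hypothetical sketch: exact hostname matching is an assumption.
function parseDomainList(contents: string): Set<string> {
  const domains = new Set<string>();
  for (const line of contents.split(/\r?\n/)) {
    const entry = line.trim().toLowerCase();
    // Skip blank lines, such as a trailing newline at the end of the file.
    if (entry.length > 0) {
      domains.add(entry);
    }
  }
  return domains;
}

function isCrawlable(link: string, domains: Set<string>): boolean {
  try {
    // The URL parser lowercases the hostname and drops the port.
    return domains.has(new URL(link).hostname);
  } catch {
    // Relative or malformed links cannot be matched against a domain.
    return false;
  }
}
```

Under this reading, a link to https://google.com/search would be followed with the example list above, while a link to https://en.wikipedia.org/ would not be unless en.wikipedia.org were added as its own entry.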
Then start the search with:
npm run cli seek