@datadasher/mlb-scrape

v1.2.0

Downloads

17

Readme

MLB Transactions Scraper

Description

Using this module, you can get information on a player's transactions (when they were injured, in rehab, or changed teams). Note that this module does not adapt to changes in the page structure, meaning it will break if the HTML structure changes on MLB.com.

Usage

The module exports two classes: Scraper and Parser.

Scraper

The Scraper class provides methods to scrape transaction history for baseball players. Here’s an example of how to use it:

const { Scraper } = require('@datadasher/mlb-scrape');
const scraper = new Scraper();

scraper
    .getTransactionsFromPlayerNames(["José Soriano", "Jhonathan Diaz", "Carlos Estévez"])
    .then(transactions => console.log(JSON.stringify(transactions, null, 2)));
/* Transaction history for the specified players:
[
  {
    "playerName": "José Soriano",
    "transactions": [
      {
        "date": "June 3, 2023",
        "event": "Los Angeles Angels recalled RHP José Soriano from Rocket City Trash Pandas."
      },
      {
        "date": "March 30, 2023",
        "event": "RHP José Soriano and  assigned to Rocket City Trash Pandas from Salt Lake Bees."
      }
    ]
  },
  {
    "playerName": "Jhonathan Diaz",
    "transactions": [
      {
        "date": "February 7, 2024",
        "event": "LHP Jhonathan Diaz assigned to Tacoma Rainiers."
      },
      {
        "date": "February 7, 2024",
        "event": "Seattle Mariners signed free agent LHP Jhonathan Diaz to a minor league contract and invited him to spring training."
      },
      {
        "date": "November 6, 2023",
        "event": "LHP Jhonathan Diaz elected free agency."
      },
      {
        "date": "October 16, 2023",
        "event": "Los Angeles Angels sent LHP Jhonathan Diaz outright to Salt Lake Bees."
      },
      {
        "date": "October 2, 2023",
        "event": "Los Angeles Angels recalled LHP Jhonathan Diaz from Salt Lake Bees."
      }
    ]
  },
  {
    "playerName": "Carlos Estévez",
    "transactions": [
      {
        "date": "July 10, 2023",
        "event": "RHP Carlos Estévez assigned to American League All-Stars."
      },
      {
        "date": "February 9, 2023",
        "event": "Dominican Republic activated RHP Carlos Estévez."
      },
      {
        "date": "December 5, 2022",
        "event": "Los Angeles Angels activated RHP Carlos Estévez."
      },
      {
        "date": "December 5, 2022",
        "event": "Los Angeles Angels activated RHP Carlos Estévez."
      }
    ]
  }
]
*/
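
Once you have that array, ordinary JavaScript is enough to post-process it. The sketch below is not part of the package: latestTransactions is a hypothetical helper that assumes the output shape shown above and assumes the date strings parse with the built-in Date constructor, which Node handles for this "Month D, YYYY" format.

```javascript
// Hypothetical helper, not part of @datadasher/mlb-scrape.
// Expects the array shape from the sample output above.
function latestTransactions(players) {
  return players.map(({ playerName, transactions }) => {
    // Sort a copy (newest first) so the caller's array is left untouched.
    const sorted = [...transactions].sort(
      (a, b) => new Date(b.date) - new Date(a.date)
    );
    return { playerName, latest: sorted[0] ?? null };
  });
}

// Sample data copied from the output above, deliberately out of order.
const sampleTransactions = [
  {
    playerName: "Jhonathan Diaz",
    transactions: [
      { date: "October 2, 2023", event: "Los Angeles Angels recalled LHP Jhonathan Diaz from Salt Lake Bees." },
      { date: "November 6, 2023", event: "LHP Jhonathan Diaz elected free agency." },
    ],
  },
];

console.log(latestTransactions(sampleTransactions));
```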

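Another example, which uses the Parser class to get a player's team history:
const { Parser } = require('@datadasher/mlb-scrape');
const parseBot = new Parser();
parseBot.getTeamHistory(["Brandon Drury"], 2023)
    .then(result => console.log(result));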
/*
{
  "Brandon Drury": [
    {
      "start_date": "June 20, 2010",
      "end_date": "January 24, 2013",
      "team": "Atlanta Braves",
      "league": "MLB"
    },
    {
      "start_date": "January 24, 2013",
      "end_date": "February 20, 2018",
      "team": "Arizona Diamondbacks",
      "league": "MLB"
    },
    {
      "start_date": "February 20, 2018",
      "end_date": "July 26, 2018",
      "team": "New York Yankees",
      "league": "MLB"
    },
    {
      "start_date": "July 26, 2018",
      "end_date": "October 6, 2020",
      "team": "Toronto Blue Jays",
      "league": "MLB"
    },
    {
      "start_date": "January 5, 2021",
      "end_date": "October 14, 2021",
      "team": "New York Mets",
      "league": "MLB"
    },
    {
      "start_date": "March 21, 2022",
      "end_date": "August 2, 2022",
      "team": "Cincinnati Reds",
      "league": "MLB"
    },
    {
      "start_date": "August 2, 2022",
      "end_date": "November 6, 2022",
      "team": "San Diego Padres",
      "league": "MLB"
    },
    {
      "start_date": "December 22, 2022",
      "end_date": "Knowledge cut-off date",
      "team": "Los Angeles Angels",
      "league": "MLB"
    }
  ]
}
*/

// Look up a player's MLB.com URL with a Scraper instance
scraper.getMlbUrl('Jordyn Adams').then(url => console.log(url));
// https://www.mlb.com/player/jordyn-adams-677941
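
The team history output mixes real dates with placeholder text (the last stint above ends with "Knowledge cut-off date"), so any date arithmetic on it needs a guard. The following is a minimal sketch under that assumption; daysBetween and stintLengths are hypothetical helper names, not functions exported by the package.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Whole days between two date strings, or null if either one fails to parse.
function daysBetween(start, end) {
  const s = new Date(start).getTime();
  const e = new Date(end).getTime();
  if (Number.isNaN(s) || Number.isNaN(e)) return null;
  // Round to absorb the one-hour drift from daylight saving changes.
  return Math.round((e - s) / MS_PER_DAY);
}

// Maps one player's stints (shape as in the getTeamHistory output) to lengths.
function stintLengths(history) {
  return history.map(({ team, start_date, end_date }) => ({
    team,
    days: daysBetween(start_date, end_date),
  }));
}

// Two stints copied from the sample output above.
const sampleHistory = [
  { start_date: "August 2, 2022", end_date: "November 6, 2022", team: "San Diego Padres", league: "MLB" },
  { start_date: "December 22, 2022", end_date: "Knowledge cut-off date", team: "Los Angeles Angels", league: "MLB" },
];

console.log(stintLengths(sampleHistory));
```

Returning null for an open-ended stint keeps the caller in charge of how to treat it, for example by substituting today's date.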

Parser

The Parser class provides methods to analyze the injury history of baseball players. Here’s an example of how to use it:

const { Parser } = require('@datadasher/mlb-scrape');
const injuryAnalyzer = new Parser();

injuryAnalyzer.analyzeInjuries(["José Soriano", "Jhonathan Diaz", "Carlos Estévez"])
    .then(result => console.log(result));
/* Injury and rehab history:
{
  "Jose_Soriano": [
    {
      "start_injury": "April 5, 2022",
      "start_rehab": "July 28, 2022",
      "end_injury": "August 25, 2022",
      "reason": "Unknown"
    },
    {
      "start_injury": "February 17, 2021",
      "start_rehab": "May 20, 2021",
      "end_injury": "November 6, 2021",
      "reason": "Unknown"
    },
    {
      "start_injury": "July 14, 2019",
      "start_rehab": "August 6, 2019",
      "end_injury": "August 22, 2019",
      "reason": "Unknown"
    }
  ],
  "Jhonathan_Diaz": [
    {
      "start_injury": "August 3, 2022",
      "start_rehab": "Unknown",
      "end_injury": "November 10, 2022",
      "reason": "Unknown"
    },
    {
      "start_injury": "June 30, 2021",
      "start_rehab": "Unknown",
      "end_injury": "July 16, 2021",
      "reason": "Unknown"
    }
  ],
  "Carlos_Estevez": [
    {
      "start_injury": "September 27, 2022",
      "start_rehab": "Unknown",
      "end_injury": "October 6, 2022",
      "reason": "Unknown"
    },
    {
      "start_injury": "May 3, 2021",
      "start_rehab": "May 20, 2021",
      "end_injury": "May 22, 2021",
      "reason": "Right middle finger strain"
    },
    {
      "start_injury": "March 29, 2018",
      "start_rehab": "April 5, 2018",
      "end_injury": "June 25, 2018",
      "reason": "Left oblique strain, then right elbow strain"
    },
    {
      "start_injury": "May 19, 2014",
      "start_rehab": "Unknown",
      "end_injury": "June 23, 2014",
      "reason": "Unknown"
    }
  ]
}
*/
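
Because some fields come back as "Unknown", a summary over this output has to skip entries whose dates do not parse. The sketch below shows one way to do that; summarizeInjuries is a hypothetical helper that assumes the object shape shown above and is not part of the package.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Counts injuries per player and sums the days for entries with usable dates.
function summarizeInjuries(report) {
  const summary = {};
  for (const [player, injuries] of Object.entries(report)) {
    let knownDays = 0;
    for (const { start_injury, end_injury } of injuries) {
      const start = new Date(start_injury).getTime();
      const end = new Date(end_injury).getTime();
      // Values such as "Unknown" do not parse as dates, so skip them.
      if (Number.isNaN(start) || Number.isNaN(end)) continue;
      knownDays += Math.round((end - start) / MS_PER_DAY);
    }
    summary[player] = { injuries: injuries.length, knownDays };
  }
  return summary;
}

// Two entries copied from the sample output above.
const sampleReport = {
  Carlos_Estevez: [
    { start_injury: "September 27, 2022", start_rehab: "Unknown", end_injury: "October 6, 2022", reason: "Unknown" },
    { start_injury: "May 3, 2021", start_rehab: "May 20, 2021", end_injury: "May 22, 2021", reason: "Right middle finger strain" },
  ],
};

console.log(summarizeInjuries(sampleReport));
// { Carlos_Estevez: { injuries: 2, knownDays: 28 } }
```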

How to Install

  1. If you already have npm set up, run npm install @datadasher/mlb-scrape
  2. Install Node
  3. In your terminal, create package.json using npm init -y
  4. Install Puppeteer, a web scraper, using npm install puppeteer
  5. If you are missing any libraries, try sudo apt-get install libnss3 libxss1 libasound2 libatk-bridge2.0-0 libgtk-3-0 libgbm-dev
  6. You also need to set up Wikidata and OpenAI (see below). We call the Wikidata API to get the player ID from a player name. This player ID is then used to get the player's unique URL on MLB.com for scraping information on their transactions. Then we use an LLM to parse out the information from their transactions.
  7. Run npm install dotenv, then create a .env file and fill in the following:
CLIENT_APP_KEY=''
CLIENT_APP_SECRET=''
ACCESS_TOKEN=''
OPENAI_API_KEY=''
  8. Start the application using node index
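
An empty or missing key in .env tends to surface later as a confusing authentication error, so it can help to check the four variables at startup. This is a minimal sketch, not part of the package: missingEnvKeys is a hypothetical helper, and it assumes you have already loaded the file with require('dotenv').config() in your entry script.

```javascript
// The four keys listed in the .env template above.
const REQUIRED_KEYS = [
  "CLIENT_APP_KEY",
  "CLIENT_APP_SECRET",
  "ACCESS_TOKEN",
  "OPENAI_API_KEY",
];

// Returns the names of required keys that are absent or blank.
// In a real script, call require('dotenv').config() before this check.
function missingEnvKeys(env = process.env) {
  return REQUIRED_KEYS.filter((key) => !env[key] || env[key].trim() === "");
}

const missing = missingEnvKeys();
if (missing.length > 0) {
  console.warn(`Missing values in .env: ${missing.join(", ")}`);
}
```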

Setting Up Wikidata

https://www.wikidata.org/wiki/Wikidata:REST_API/Authentication

Setting Up OpenAI

https://platform.openai.com/docs/quickstart?context=node

Setting up OAuth 2.0 for Wikidata

  • To make authenticated requests against the Wikibase REST API for Wikidata, you must first set up an OAuth 2.0 client (formerly known as a "consumer").
  • Log into meta.wikimedia.org using your unified login.
  • Create an OAuth 2.0 client. To get there, click Special pages, then OAuth consumer registration, then Request a token for a new consumer.
  • Supply the following information to the form:
      • Application name: Name it something informative. Example: "Wikibase REST API for Wikidata"
      • Application description: Again, use some informative text that explains how you intend to use the API. Example: "Wikibase REST API access for maintaining my dataset about animal cookies"
      • This consumer is for use only by (your name) (checkbox): Check this box under normal circumstances. See the Wikidata authentication page linked above for situations when you would leave this box unchecked.
      • Applicable grants (checkboxes): Check each box that describes a kind of access you need for your task.
      • By submitting this application... (checkbox): Read the user agreement and, if you agree to the terms, check the box.
  • Submit the form by clicking the "Propose consumer" button.
  • Save the three tokens provided on the next screen:
      • Client application key: used to obtain bearer tokens
      • Client application secret: used to obtain bearer tokens
      • Access token: provides access to the API when included in the API request (length: ~1800 characters)
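
To illustrate how the access token is "included in the API request", here is a hedged sketch that builds a request with a bearer Authorization header. The base path is an assumption on my part (check the Wikibase REST API documentation for the current version), and buildItemRequest is a hypothetical helper, not something the package exports.

```javascript
// Assumed base path for the Wikibase REST API on Wikidata; verify before use.
const WIKIDATA_REST_BASE = "https://www.wikidata.org/w/rest.php/wikibase/v1";

// Builds the URL and fetch options for reading one item by its ID (e.g. "Q42").
function buildItemRequest(itemId, accessToken) {
  return {
    url: `${WIKIDATA_REST_BASE}/entities/items/${encodeURIComponent(itemId)}`,
    options: {
      method: "GET",
      headers: {
        Authorization: `Bearer ${accessToken}`,
        Accept: "application/json",
      },
    },
  };
}

// Usage sketch (Node 18+ provides a global fetch); not executed here:
// const { url, options } = buildItemRequest("Q42", process.env.ACCESS_TOKEN);
// const item = await fetch(url, options).then((res) => res.json());
```

Keeping request construction separate from the network call makes it easy to log or inspect the headers without sending anything.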

Tips: HTML Structure

If the div structure changes on MLB.com, you can amend the getTransactionsFromUrl function: open your browser's developer tools, use Elements > Copy > Copy selector on the transactions markup to get its current selector, and adjust the code to match that structure.
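
One way to make that adjustment cheaper is to keep the selector in a single constant. The sketch below is not the package's getTransactionsFromUrl; the selector value and the column layout (date in the first cell, event text in the last) are placeholders to replace with whatever the page actually uses. It relies only on Puppeteer's page.$$eval, which runs a callback in the browser over every element matching a selector.

```javascript
// Hypothetical selector: paste in the value from "Copy selector" in dev tools.
const TRANSACTION_ROW_SELECTOR = "#transactions table tbody tr";

// `page` is a Puppeteer Page. The column positions below are assumptions.
async function readTransactionRows(page, selector = TRANSACTION_ROW_SELECTOR) {
  return page.$$eval(selector, (rows) =>
    rows.map((row) => {
      const cells = Array.from(row.querySelectorAll("td"), (td) =>
        td.textContent.trim()
      );
      return { date: cells[0] ?? "", event: cells[cells.length - 1] ?? "" };
    })
  );
}
```

When the markup changes, only the constant (and possibly the cell indexes) needs updating.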

Tips: Transaction Prompting

The goal is to determine which days a player was injured. Transaction phrases that help with this include:

  • “placed on the 10-day injured list”: This indicates that the player has been injured and is expected to be out for at least 10 days. The injury could be longer depending on the player’s recovery.
  • “placed on the 60-day injured list”: This indicates a more serious injury that will keep the player out for at least 60 days.
  • “sent on a rehab assignment”: This indicates that the player is recovering from an injury and is starting to play in minor league games as part of their rehabilitation.
  • “activated from the injured list”: This indicates that the player has recovered from their injury and is ready to return to the major league roster.
  • “transferred to the 60-day injured list”: This usually indicates that the player’s injury is more serious than initially thought, or that the player has suffered a setback in their recovery.
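
The phrases above can also be matched with plain regular expressions, which is useful as a cheap cross-check on the LLM output. This is a rule-based sketch rather than the package's prompting approach; classifyEvent and its label names are hypothetical, and the patterns assume a player name may sit between the verb and the rest of the phrase.

```javascript
// Each rule pairs a hypothetical label with a pattern for one phrase above.
const INJURY_RULES = [
  { label: "injury_start", pattern: /placed\b.*\bon the \d+-day injured list/i },
  { label: "injury_extended", pattern: /transferred\b.*\bto the \d+-day injured list/i },
  { label: "rehab_start", pattern: /sent\b.*\bon a rehab assignment/i },
  { label: "injury_end", pattern: /activated\b.*\bfrom the (?:\d+-day )?injured list/i },
];

// Returns the first matching label, or null for non-injury transactions.
function classifyEvent(event) {
  const rule = INJURY_RULES.find(({ pattern }) => pattern.test(event));
  return rule ? rule.label : null;
}

// Illustrative strings built from the phrases above, with a placeholder name.
console.log(classifyEvent("Team placed RHP Example Player on the 10-day injured list."));
console.log(classifyEvent("LHP Jhonathan Diaz elected free agency."));
```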