identify-media

v0.3.19

Published

2 years ago

Analyse file path and content to make search criteria for media APIs

tanyaran

media opensubtitles tmdb omdb imdb

Analyse Media File

This library is written to help streamline getting information about, and subtitles for, backups of DVDs and Blu-rays. It uses only string methods, so that it stays compatible with both browser and desktop applications, to build search query data for various media API backends.

The library is written in TypeScript and includes types that are easily converted to TMDB, OMDB and IMDB search queries, as well as media hashing compatible with the OpenSubtitles API.

Analysing Functions

makeHash returns a promise of a 16-character hex string that can be used in various media APIs to search for content about a media file. Most commonly it is used to find subtitles via https://opensubtitles.org when your DVD backup did not include subtitles in your preferred language.
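For orientation, the hash scheme documented by OpenSubtitles is the file size plus a 64-bit checksum of the first and last 64K of the file, written as 16 hex digits. The following is a minimal sketch of that scheme only. The name sketchHash and its Uint8Array inputs are assumptions made for illustration; this is not the library's makeHash, which may differ in signature and detail.

```typescript
// Hypothetical sketch of an OpenSubtitles-style hash; not the library's makeHash.
// 64-bit arithmetic is carried out on two unsigned 32-bit halves (hi, lo),
// wrapping around on overflow.
function sketchHash(size: number, head: Uint8Array, tail: Uint8Array): string {
  let lo = size >>> 0;
  let hi = Math.floor(size / 0x100000000) >>> 0;

  for (const chunk of [head, tail]) {
    // Add each complete little-endian 64-bit word; trailing bytes are ignored here.
    for (let i = 0; i + 8 <= chunk.length; i += 8) {
      const wordLo =
        (chunk[i] | (chunk[i + 1] << 8) | (chunk[i + 2] << 16) | (chunk[i + 3] << 24)) >>> 0;
      const wordHi =
        (chunk[i + 4] | (chunk[i + 5] << 8) | (chunk[i + 6] << 16) | (chunk[i + 7] << 24)) >>> 0;
      const sumLo = lo + wordLo;
      lo = sumLo >>> 0;
      // Carry into the high half when the low half overflowed 32 bits.
      hi = (hi + wordHi + (sumLo > 0xffffffff ? 1 : 0)) >>> 0;
    }
  }

  const hex = (n: number): string => n.toString(16).padStart(8, '0');
  return hex(hi) + hex(lo);
}
```

The result is always 16 hex characters, matching the length described above.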

analyseFilePath returns a structured output of the included type AnalysedMedia, which is a union of AnalysedMovie, AnalysedTVShow or a string. If the name and path formatting used by your backup program could not be identified, the result is a string containing just the guessed-at title.

AnalysedMovie is structured as follows:

export interface AnalysedMovie {
  type: 'movie';
  year?: number;
  name: string;
}

name is the guessed-at name, based on common patterns from backup programs, and year is the guessed-at release year.

AnalysedTVShow is structured as follows:

export interface AnalysedTVShow {
  type: 'tv';
  name: string;
  season?: number;
  episodes?: number[];
  year?: number;
}

name is the guessed-at name, based on common patterns from backup programs. season and episodes are the guessed-at season and episodes; multiple episodes encoded in the same file appear as multiple entries in the episodes array. year is the year of the first air date.
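Because AnalysedMedia is a union, callers need to narrow it before reading fields. The sketch below repeats the two interfaces from above so that it is self-contained; describeMedia is a hypothetical helper written for this example and is not part of the library.

```typescript
interface AnalysedMovie {
  type: 'movie';
  year?: number;
  name: string;
}

interface AnalysedTVShow {
  type: 'tv';
  name: string;
  season?: number;
  episodes?: number[];
  year?: number;
}

type AnalysedMedia = AnalysedMovie | AnalysedTVShow | string;

// Hypothetical helper: builds a display label by narrowing the union.
function describeMedia(media: AnalysedMedia): string {
  // The string case carries only the guessed-at title.
  if (typeof media === 'string') {
    return `Unrecognised: ${media}`;
  }
  const year = media.year !== undefined ? ` (${media.year})` : '';
  // The `type` field discriminates between movie and TV show.
  if (media.type === 'movie') {
    return `${media.name}${year}`;
  }
  const pad = (n: number): string => String(n).padStart(2, '0');
  const season = media.season !== undefined ? ` S${pad(media.season)}` : '';
  const episodes = (media.episodes ?? []).map((e) => `E${pad(e)}`).join('');
  return `${media.name}${year}${season}${episodes}`;
}
```

For example, a TV result with season 1 and episodes [1, 2] is labelled with both episode numbers, reflecting a file that contains two episodes.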

Usage

This library works in the browser or in Node and depends on boilerplate code to supply the file content and file path.

Browser

Using FileWithPath (e.g. from react-dropzone), getting search data is done as:

const [analysed, setAnalysed] = useState<AnalysedMedia[]>([]);
const onDrop = useCallback((files) => {
  setAnalysed(files.map((file) => analyseFilePath(file.path)));
}, []);

const {...} = useDropzone({onDrop, accept: 'video/*'});

...

Getting the media hash takes a promise-based FileReader wrapper, such as:

const HASH_CHUNK_SIZE = 65536; //64 * 1024 - MediaHash defined
const [hashes, setHashes] = useState<string[]>([]);

// Simple promise wrapper for FileReader.
// A positive block reads the first `block` bytes; a negative block reads the last `-block` bytes.
const readBlock = useCallback((file: File, block: number): Promise<string> => {
  return new Promise<string>((resolve, reject): void => {
    const reader = new FileReader();
    reader.onload = (event) => {
      if (event.target !== null) {
        resolve(event.target.result as string);
      } else {
        reject(event);
      }
    };
    reader.onerror = (error) => {
      reject(error);
    };

    if (block < 0) {
      reader.readAsBinaryString(file.slice(block));
    } else {
      reader.readAsBinaryString(file.slice(0, block));
    }
  });
}, []);

const onDrop = useCallback((files) => {
  // makeHash uses the file size and the first and last 64K chunks of the file.
  Promise.all(files.map((file) => makeHash(file.size, readBlock(file, HASH_CHUNK_SIZE), readBlock(file, -HASH_CHUNK_SIZE))))
    .then(setHashes);
}, [readBlock]);

const {...} = useDropzone({onDrop, accept: 'video/*'});

...
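Under Node, the equivalent boilerplate can read the same three inputs (file size, first 64K, last 64K) from disk. The sketch below is one way to do that, assuming binary strings are wanted to mirror what readAsBinaryString yields in the browser. readHashInputs is a hypothetical helper and not part of the library, and how its output is passed on to makeHash is left to the caller.

```typescript
import { open } from 'node:fs/promises';

const HASH_CHUNK_SIZE = 65536; // 64 * 1024

// Hypothetical Node helper: returns the file size plus the first and last
// 64K of the file as binary (latin1) strings.
async function readHashInputs(
  path: string,
): Promise<{ size: number; head: string; tail: string }> {
  const handle = await open(path, 'r');
  try {
    const { size } = await handle.stat();
    // Files shorter than 64K yield head and tail chunks covering the whole file.
    const length = Math.min(HASH_CHUNK_SIZE, size);
    const head = Buffer.alloc(length);
    const tail = Buffer.alloc(length);
    // Assumes a regular file, where each read fills the requested length.
    await handle.read(head, 0, length, 0);
    await handle.read(tail, 0, length, size - length);
    return {
      size,
      head: head.toString('latin1'),
      tail: tail.toString('latin1'),
    };
  } finally {
    await handle.close();
  }
}
```

The latin1 encoding maps each byte to one character, so the string lengths equal the number of bytes read.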

Aside from these functions, a few helper functions are included:

isAnalysedMovie and isAnalysedTVShow tell which kind of AnalysedMedia a value is. isExtras and isSample tell whether the file is likely an extra or a sample. isSameRelease tells whether two instances of AnalysedMedia are likely the same movie or TV show (season and episode are excluded from the comparison). These methods help cut down the number of API requests, for example by grouping requests for a TV show and fetching individual episodes only as needed, or by not requesting subtitles for sample files.
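The grouping idea can be sketched as follows. To keep the example self-contained it uses sameRelease, a local stand-in that compares type, name and year; this is an assumption about what such a comparison might involve, not the logic of the library's isSameRelease. groupByRelease is likewise hypothetical.

```typescript
interface AnalysedMovie {
  type: 'movie';
  year?: number;
  name: string;
}

interface AnalysedTVShow {
  type: 'tv';
  name: string;
  season?: number;
  episodes?: number[];
  year?: number;
}

type AnalysedMedia = AnalysedMovie | AnalysedTVShow | string;

// Stand-in comparison (assumption): ignores season and episodes.
function sameRelease(a: AnalysedMedia, b: AnalysedMedia): boolean {
  if (typeof a === 'string' || typeof b === 'string') {
    return a === b;
  }
  return (
    a.type === b.type &&
    a.name.toLowerCase() === b.name.toLowerCase() &&
    a.year === b.year
  );
}

// Groups files so that one search request can be issued per release.
function groupByRelease(items: AnalysedMedia[]): AnalysedMedia[][] {
  const groups: AnalysedMedia[][] = [];
  for (const item of items) {
    const group = groups.find((g) => sameRelease(g[0], item));
    if (group) {
      group.push(item);
    } else {
      groups.push([item]);
    }
  }
  return groups;
}
```

With this grouping, two episode files from the same show end up in one group, so a single show-level search can serve both.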

API Functions

API functions are provided to search TMDB, OMDB and OpenSubtitles. They are separated into mapper functions, which map AnalysedMedia into queries for these APIs, and search, get and find methods, which return Axios-compatible request config objects.

There are also mapper functions that map TMDB and OMDB results into a subset of the data called Media, a union of the Movie and TVShow types:

export interface MediaInfo {
  plot?: string;
  images: Record<string, string>;
}

export interface Movie {
  type: 'movie';
  imdbId?: string;
  tmdbId?: number;
  title: string;
  release?: string;
  mediaInfo: MediaInfo;
}

export interface TVShow {
  type: 'tv';
  imdbId?: string;
  tmdbId?: number;
  name: string;
  firstAirDate?: string;
  mediaInfo: MediaInfo;
}

To use these, AnalysedMedia objects can be mapped to search queries and passed to the search functions to create an Axios-compatible request config object, like this:

Axios.request(searchTmdb(mapTmdbQuery(media), this.apiKey))
    .then((response) => response.data)

This will return a TmdbSearchResponse object of the following kind:

export interface TmdbSearchResponse {
    page: number;
    total_results: number;
    total_pages: number;
    results: Array<TmdbMovieResult|TmdbTVShowResult>;
}
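To make "Axios-compatible request config" concrete, the sketch below builds one by hand against the public TMDB v3 search endpoints (/search/movie and /search/tv). sketchTmdbSearch and SketchRequestConfig are hypothetical names for this example; the config actually produced by searchTmdb and mapTmdbQuery may be shaped differently.

```typescript
interface AnalysedMovie {
  type: 'movie';
  year?: number;
  name: string;
}

interface AnalysedTVShow {
  type: 'tv';
  name: string;
  season?: number;
  episodes?: number[];
  year?: number;
}

// Minimal subset of an Axios request config (method, url, params).
interface SketchRequestConfig {
  method: 'get';
  url: string;
  params: Record<string, string | number>;
}

// Hypothetical: not the library's searchTmdb.
function sketchTmdbSearch(
  media: AnalysedMovie | AnalysedTVShow,
  apiKey: string,
): SketchRequestConfig {
  const params: Record<string, string | number> = {
    api_key: apiKey,
    query: media.name,
  };
  if (media.year !== undefined) {
    // TMDB names the year filter differently for movies and TV shows.
    params[media.type === 'movie' ? 'year' : 'first_air_date_year'] = media.year;
  }
  return {
    method: 'get',
    url: `https://api.themoviedb.org/3/search/${media.type}`,
    params,
  };
}
```

An object of this shape can be passed directly to Axios.request, which serialises params into the query string.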

The release and firstAirDate fields of Movie and TVShow are formatted as yyyy-mm-dd. There is a specialised mergeMedia function that takes two Media objects and returns the combined result. This can be used to fetch from both TMDB and OMDB and merge the results into a single object.
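A merge of this kind could look like the sketch below for the Movie case. The policy shown (prefer the first object's fields, fill gaps from the second, combine the image maps) is an assumption chosen for illustration, and mergeMovieSketch is a hypothetical name; the library's mergeMedia may resolve conflicts differently.

```typescript
interface MediaInfo {
  plot?: string;
  images: Record<string, string>;
}

interface Movie {
  type: 'movie';
  imdbId?: string;
  tmdbId?: number;
  title: string;
  release?: string;
  mediaInfo: MediaInfo;
}

// Hypothetical merge: fields from `primary` win, `secondary` fills the gaps.
function mergeMovieSketch(primary: Movie, secondary: Movie): Movie {
  return {
    type: 'movie',
    imdbId: primary.imdbId ?? secondary.imdbId,
    tmdbId: primary.tmdbId ?? secondary.tmdbId,
    title: primary.title,
    release: primary.release ?? secondary.release,
    mediaInfo: {
      plot: primary.mediaInfo.plot ?? secondary.mediaInfo.plot,
      // Later spread wins, so primary images override secondary ones per key.
      images: { ...secondary.mediaInfo.images, ...primary.mediaInfo.images },
    },
  };
}
```

Merging a TMDB-derived object with an OMDB-derived one in this way gives a single record carrying both tmdbId and imdbId when each source supplies one of them.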

Future development

Further methods would include looking for metadata inside files, as some backup programs put useful information there, and checking for already existing subtitles before querying https://opensubtitles.org
