@mangos/filepath
v0.0.8
filepath parses UNC paths (long, short, and cmd aliases), local filesystem paths, and Unix paths, and checks for invalid path characters.
filepath
This is a filepath parsing (LL(1) parser) and manipulation tool. It returns a parse tree describing the file path.
- joins lexed path names
- infers the most likely OS path type(s) (plural) based on the file name only
- validates path strings (checks for forbidden characters, etc.) for the various OS path types
The filepath tool complements the Node.js path module.
Part of the monorepo mangos
Support the work by starring this repo on github.
It handles the following path types:
| path type | description |
| ------------ | -------------------------------------------------------------------------------------------- |
| unc | Microsoft UNC file path |
| dos | traditional DOS path (TDP) |
| devicePath | DOS device path (DDP), also allowing a DOS device path that describes a UNC path, e.g. `//./UNC/Server/Share` |
| posix | POSIX path |
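To make the four path types concrete, here is a rough sketch that guesses candidate types from the leading characters of a string. `guessPathTypes` is a hypothetical helper, not part of this package and not how its LL(1) parser works; it only looks at absolute paths and does no validation.

```javascript
// Hypothetical helper (an assumption for illustration, not the package's code):
// guess which path types an absolute path string could belong to.
function guessPathTypes(path) {
  const p = path.replace(/\\/g, '/'); // treat '\' and '/' alike
  const types = [];
  if (/^\/\/[?.]\//.test(p)) {
    types.push('devicePath'); // starts with //?/ or //./
  } else if (/^\/\/[^/]+\/[^/]+/.test(p)) {
    types.push('unc'); // starts with //server/share
  }
  if (/^[a-zA-Z]:/.test(p)) {
    types.push('dos'); // starts with a drive letter
  }
  if (p.startsWith('/')) {
    types.push('posix'); // any leading '/' is also a valid POSIX root
  }
  return types;
}

console.log(guessPathTypes('//Server1/share/file.txt')); // [ 'unc', 'posix' ]
console.log(guessPathTypes('c:/hello/world')); // [ 'dos' ]
```

The first call shows why more than one interpretation can be valid for the same string.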
Works in the browser and in Node.js.
npm install @mangos/filepath
The filepath module has 3 named exports.
const { inferPathType, lexPath, resolve } = require('@mangos/filepath');
// rest of your code
| function | description |
| --------------- | ------------------------------------------------------------------------------------- |
| inferPathType | guesses the OS path type purely from the path string; multiple matches are possible |
| lexPath | lexer for a path string; returns a token array representing the path value |
| resolve | akin to Node.js `path.resolve`, respecting `unc`, `unc_long` and `device path` roots |
inferPathType(path[, options])
- path string File path.
- options Object
  - unc: boolean Defaults to the value of platform === 'win32'. If true, interpret the path as a UNC pathname; if this is not possible, lexPath returns undefined.
  - dos: boolean Defaults to the value of platform === 'win32'. If true, interpret the path as a TDP (traditional DOS path); if this is not possible, lexPath returns undefined.
  - devicePath: boolean Defaults to the value of platform === 'win32'. If true, interpret the path as a DDP (DOS device path).
  - posix: boolean Defaults to the value of platform !== 'win32'. If true, interpret the path as a UNIX path.
- Returns: Iterator<PathObject>, an iterator yielding the valid interpretations (plural) of path, the most likely path types first.
const { inferPathType } = require('@mangos/filepath');
const iterator = inferPathType('\\\\?\\unc\\c:/Users'); // Note: in JS you need to escape backslashes \\
let value, done;
({ value, done } = iterator.next()); // most likely path type (parentheses are required for destructuring assignment)
//-> done = undefined.
//-> value =
/*
{
type: "devicePath",
path: [
{
token: '\u0005', // token for the root element of a "devicePath"
value: '\\\\?\\UNC\\c:\\Users', //-> normalized path
start: 0,
end: 15
}
]
}
*/
({ value, done } = iterator.next()); // less likely path type
// -> next possible interpretation for the string
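When every interpretation is of interest, the iterator can be drained with repeated `next()` calls. The sketch below is only an illustration: `collectTypes` is a hypothetical helper, and the stand-in iterator is hand-made data shaped like PathObject so the snippet runs without the package. In real use, the iterator returned by inferPathType would be passed instead.

```javascript
// Hypothetical helper: drain an iterator with next() and collect the
// `type` of each interpretation, most likely first.
function collectTypes(iterator) {
  const types = [];
  // `done` may be undefined or false while values remain, so test for truthiness
  for (let step = iterator.next(); !step.done; step = iterator.next()) {
    types.push(step.value.type);
  }
  return types;
}

// Stand-in data (an assumption), shaped like the PathObject values above
const standIn = [
  { type: 'devicePath', path: [] },
  { type: 'posix', path: [] }
][Symbol.iterator]();

console.log(collectTypes(standIn)); // [ 'devicePath', 'posix' ]
```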
lexPath([path[,options]])
lexPath chooses the most likely path type interpretation, even if the path argument has more than one possible interpretation.
- path string File path.
- options Object
  - unc: boolean Defaults to the value of platform === 'win32'. If true, interpret the path as a UNC pathname; if this is not possible, lexPath returns undefined.
  - dos: boolean Defaults to the value of platform === 'win32'. If true, interpret the path as a TDP (traditional DOS path); if this is not possible, lexPath returns undefined.
  - devicePath: boolean Defaults to the value of platform === 'win32'. If true, interpret the path as a DDP (DOS device path).
  - posix: boolean Defaults to the value of platform !== 'win32'. If true, interpret the path as a UNIX path.
- Returns: a single Object of type PathObject.
Example 1:
const { lexPath } = require('@mangos/filepath');
const result = lexPath('c:/hello/world'); // the function is agnostic to '\' or '/' tokens
// ->
/*
{ path:
[ { token: '\u0003', value: 'c:', start: 0, end: 1 },
{ token: '\u0001', start: 2, end: 2, value: '\\' },
{ token: '\u0006', start: 3, end: 7, value: 'hello' },
{ token: '\u0001', start: 8, end: 8, value: '\\' },
{ token: '\u0006', start: 9, end: 13, value: 'world' } ],
type: 'dos' }
*/
Example 2:
const { lexPath } = require('@mangos/filepath');
const result = lexPath('//Server1/share/file.txt'); // the function is agnostic to '\' or '/' tokens
// ->
/*
{
type: 'unc',
path: [
{ token: '\u0004', value: '\\\\Server1\\share', start: 0, end: 14 },
{ token: '\u0001', start: 15, end: 15, value: '\\' },
{ token: '\u0006', start: 16, end: 23, value: 'file.txt' }
]
}
*/
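The token array from Example 2 can be turned back into strings in two ways: by concatenating the sanitized `value` fields, or by slicing the original input with `start` and `end` (where `end` is inclusive). The helpers `tokensToString` and `sliceToken` below are hypothetical names used for illustration; the token data is copied from Example 2.

```javascript
// Input string and token array copied from Example 2 above
const original = '//Server1/share/file.txt';
const uncTokens = [
  { token: '\u0004', value: '\\\\Server1\\share', start: 0, end: 14 },
  { token: '\u0001', start: 15, end: 15, value: '\\' },
  { token: '\u0006', start: 16, end: 23, value: 'file.txt' }
];

// Hypothetical helper: join the sanitized token values into one path string
function tokensToString(tokens) {
  return tokens.map((t) => t.value).join('');
}

// Hypothetical helper: recover the raw text of a token from the input;
// `end` is inclusive, hence the + 1
function sliceToken(input, token) {
  return input.slice(token.start, token.end + 1);
}

console.log(tokensToString(uncTokens)); // \\Server1\share\file.txt
console.log(sliceToken(original, uncTokens[0])); // //Server1/share
```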
resolve([...paths])
resolve works like path.resolve, with one difference: it respects devicePath roots, including the Server and share parts. For example, //./unc/server/share is treated as a root in its entirety.
- ...paths string A sequence of file paths or path segments. The sequence can be empty (returns the current working directory). Examples 2 and 3 below also pass PathObject values returned by lexPath.
- Returns: a PathObject.
Example 1:
const { resolve } = require('@mangos/filepath');
const result = resolve('//./unc/Server1/share1/dir1/file.txt','../../../../hello/world');
//->
/*
{ path:
[ { token: '\u0005',
value: '\\\\?\\UNC\\Server1\\share1',
start: 0,
end: 21 },
{ token: '\u0001', start: 22, end: 22, value: '\\' },
{ token: '\u0006', start: 23, end: 39, value: 'hello' },
{ token: '\u0001', start: 40, end: 40, value: '\\' },
{ token: '\u0006', start: 41, end: 63, value: 'world' } ],
type: 'devicePath' }
*/
Example 2:
const { resolve, lexPath } = require('@mangos/filepath');
const posixPath = lexPath('/home/user1', {posix: true});
const result = resolve('//./Server1/share1/', posixPath); // the last absolute path defines the resulting path type
//->
/*
{ path:
[ { token: '\u0002', start: 0, end: 0, value: '/' },
{ token: '\u0006', start: 1, end: 4, value: 'home' },
{ token: '\u0001', start: 5, end: 5, value: '/' },
{ token: '\u0006', start: 6, end: 10, value: 'user1' } ],
type: 'posix' }
*/
Example 3:
const { resolve, lexPath } = require('@mangos/filepath');
const posix = lexPath('/home/user1', {posix: true});
const dos = lexPath('c:/Program Files/app');
const result = resolve(dos, posix); // the last absolute path defines the resulting path type
//->
/*
{ path:
[ { token: '\u0002', start: 0, end: 0, value: '/' },
{ token: '\u0006', start: 1, end: 4, value: 'home' },
{ token: '\u0001', start: 5, end: 5, value: '/' },
{ token: '\u0006', start: 6, end: 10, value: 'user1' } ],
type: 'posix' }
*/
Example 4:
const { resolve } = require('@mangos/filepath');
// current working directory is "/home/user1" (on a posix filesystem)
const result = resolve('h1','h2');
//->
/*
{ path:
[{ token: '\u0002', start: 0, end: 0, value: '/' },
{ token: '\u0006', start: 1, end: 4, value: 'home' },
{ token: '\u0001', start: 5, end: 5, value: '/' },
{ token: '\u0006', start: 6, end: 10, value: 'user1' },
{ token: '\u0001', start: 11, end: 11, value: '/' },
{ token: '\u0006', start: 12, end: 13, value: 'h1' },
{ token: '\u0001', start: 14, end: 14, value: '/' },
{ token: '\u0006', start: 15, end: 16, value: 'h2' }
],
type: 'posix' }
*/
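The behavior in Example 1, where `..` never climbs above the root, can be sketched on a plain token list. This is a simplified illustration and not the package's resolve: `collapse` is a hypothetical helper, the tokens carry only `token` and `value` fields, and leading `..` segments without a root are simply dropped.

```javascript
// Token ids as listed in the Token-ID table
const ROOT_IDS = new Set(['\u0002', '\u0003', '\u0004', '\u0005']);
const SEP = '\u0001';
const PARENT = '\u0007';
const CURRENT = '\u0008';

// Hypothetical helper: collapse '.' and '..' while never removing a root token
function collapse(tokens) {
  const out = [];
  for (const t of tokens) {
    if (t.token === SEP || t.token === CURRENT) continue;
    if (t.token === PARENT) {
      const last = out[out.length - 1];
      // step up one level unless we are already at the root (or have nothing)
      if (last !== undefined && !ROOT_IDS.has(last.token)) out.pop();
      continue;
    }
    out.push(t);
  }
  return out.map((t) => t.value);
}

// Simplified tokens for the two arguments of Example 1 (positions omitted)
const parent = { token: PARENT, value: '..' };
const tokens = [
  { token: '\u0005', value: '\\\\?\\UNC\\Server1\\share1' },
  { token: '\u0006', value: 'dir1' },
  { token: '\u0006', value: 'file.txt' },
  parent, parent, parent, parent,
  { token: '\u0006', value: 'hello' },
  { token: '\u0006', value: 'world' }
];

console.log(collapse(tokens).join('\\')); // \\?\UNC\Server1\share1\hello\world
```

Four `..` segments remove only `file.txt` and `dir1`; the remaining two have no effect because the device path root, including server and share, is a single token.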
Types
Token-ID
The path lexer splits the file path string into tokens. This is the list of all lexer tokens:
| token | value (token id) | description | example |
| ---------- | ---------------- | --------------------------------------------- | -------------------------------------------------- |
| SEP | `\u0001` | file path separator | `/` or `\` |
| POSIX_ROOT | `\u0002` | a POSIX root `/` at the beginning of a path | |
| TDP_ROOT | `\u0003` | root of a traditional DOS path | `c:` |
| UNC_ROOT | `\u0004` | root token for a UNC root | `//server1/share1` or `\\server1\share1` |
| DDP_ROOT | `\u0005` | DOS device root | `\\?\unc\server1\share1` or `\\.\\c:` or `\\.\COM` |
| PATHELT | `\u0006` | directory/file name between two `SEP` | |
| PARENT | `\u0007` | a PATHELT representing a parent directory | `..` |
| CURRENT | `\u0008` | a PATHELT representing the current directory | `.` |
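Since the token ids are control characters, they are hard to read in logs. A small lookup table built from the list above can map them back to names. `TOKEN_NAMES` and `tokenNames` are hypothetical names for this sketch, and the sample tokens are copied from lexPath Example 1.

```javascript
// Lookup table built from the Token-ID list above
const TOKEN_NAMES = {
  '\u0001': 'SEP',
  '\u0002': 'POSIX_ROOT',
  '\u0003': 'TDP_ROOT',
  '\u0004': 'UNC_ROOT',
  '\u0005': 'DDP_ROOT',
  '\u0006': 'PATHELT',
  '\u0007': 'PARENT',
  '\u0008': 'CURRENT'
};

// Hypothetical helper: translate token ids into readable names
function tokenNames(tokens) {
  return tokens.map((t) => TOKEN_NAMES[t.token] || 'UNKNOWN');
}

// Token array copied from lexPath Example 1 ('c:/hello/world')
const dosTokens = [
  { token: '\u0003', value: 'c:', start: 0, end: 1 },
  { token: '\u0001', start: 2, end: 2, value: '\\' },
  { token: '\u0006', start: 3, end: 7, value: 'hello' },
  { token: '\u0001', start: 8, end: 8, value: '\\' },
  { token: '\u0006', start: 9, end: 13, value: 'world' }
];

console.log(tokenNames(dosTokens).join(' ')); // TDP_ROOT SEP PATHELT SEP PATHELT
```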
Token
This Token is the result of the lexer slicing and dicing a string representing a path.
interface Token {
value: string; // the sanitized value of this token
start: number; // index in the original string where the token starts
end: number; // index in the original string where the token ends (inclusive)
error: string; // set if this token contains errors (like forbidden characters in DOS paths)
token: string; // single character in the range `\u0001` to `\u0008`
}
For the token values, see the Token-ID list above.
PathObject
- The function inferPathType returns an iterator of PathObject.
- The function lexPath returns a single instance of PathObject.
interface PathObject {
type: 'posix'|'unc'|'dos'|'devicePath';
path: Token[];
firstError?: string; //-> first error encountered in the token array (from left to right)
}
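A caller might use `firstError` as a quick validity check and the per-token `error` field for details. The sketch below assumes that both fields are absent or empty when a path is valid, which the examples in this README suggest but do not state. The helpers `isValid` and `listErrors`, the sample object, and the error text are all made up for illustration.

```javascript
// Hypothetical helper: a path is treated as valid when firstError is not set
function isValid(pathObject) {
  return !pathObject.firstError;
}

// Hypothetical helper: list every token error with its (inclusive) position
function listErrors(pathObject) {
  return pathObject.path
    .filter((t) => Boolean(t.error))
    .map((t) => `${t.start}-${t.end}: ${t.error}`);
}

// Hand-made PathObject (an assumption); the error message is invented
const sample = {
  type: 'dos',
  path: [
    { token: '\u0003', value: 'c:', start: 0, end: 1 },
    { token: '\u0001', value: '\\', start: 2, end: 2 },
    { token: '\u0006', value: 'a<b', start: 3, end: 5, error: 'forbidden character "<"' }
  ],
  firstError: 'forbidden character "<"'
};

console.log(isValid(sample)); // false
console.log(listErrors(sample).join('; ')); // 3-5: forbidden character "<"
```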
Feedback
We appreciate any feedback and new ideas to enhance this tool suite. File an issue here.
Before contributing, please read our contributing guidelines and code of conduct.
License
Copyright (c) 2019-2020 Jacob Bogers.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.