urllib-f
v1.0.0
Published
Help in opening URLs (mostly HTTP) in a complex world — basic and digest authentication, redirections, cookies and more.
Downloads
33
Maintainers
Readme
urllib
Request HTTP URLs in a complex world — basic and digest authentication, redirections, cookies, timeout and more.
Install
$ npm install urllib --save
Usage
callback
var urllib = require('urllib');
urllib.request('http://cnodejs.org/', function (err, data, res) {
if (err) {
throw err; // you need to handle error
}
console.log(res.statusCode);
console.log(res.headers);
// data is Buffer instance
console.log(data.toString());
});
Promise
If you've installed bluebird,
bluebird will be used.
urllib
does not install bluebird for you.
Otherwise, if you're using a node that has native v8 Promises (v0.11.13+), then that will be used.
Otherwise, this library will crash the process and exit, so you might as well install bluebird as a dependency!
var urllib = require('urllib');
urllib.request('http://nodejs.org').then(function (result) {
// result: {data: buffer, res: response object}
console.log('status: %s, body size: %d, headers: %j', result.res.statusCode, result.data.length, result.res.headers);
}).catch(function (err) {
console.error(err);
});
co & generator
var co = require('co');
var urllib = require('urllib');
co(function* () {
var result = yield urllib.requestThunk('http://nodejs.org');
console.log('status: %s, body size: %d, headers: %j',
result.status, result.data.length, result.headers);
})();
Global response
event
You should create a urllib instance first.
var httpclient = require('urllib').create();
httpclient.on('response', function (info) {
error: err,
ctx: args.ctx,
req: {
url: url,
options: options,
size: requestSize,
},
res: res
});
httpclient.request('http://nodejs.org', function (err, body) {
console.log('body size: %d', body.length);
});
API Doc
Method: http.request(url[, options][, callback])
Arguments
- url String | Object - The URL to request, either a String or a Object that return by url.parse.
- options Object - Optional
- method String - Request method, defaults to
GET
. Could beGET
,POST
,DELETE
orPUT
. Alias 'type'. - data Object - Data to be sent. Will be stringify automatically.
- dataAsQueryString Boolean - Force convert
data
to query string. - content String | Buffer - Manually set the content of payload. If set,
data
will be ignored. - stream stream.Readable - Stream to be pipe to the remote. If set,
data
andcontent
will be ignored. - writeStream stream.Writable - A writable stream to be piped by the response stream. Responding data will be write to this stream and
callback
will be called withdata
setnull
after finished writing. - consumeWriteStream [true] - consume the writeStream, invoke the callback after writeStream close.
- contentType String - Type of request data. Could be
json
. If it'sjson
, will auto setContent-Type: application/json
header. - nestedQuerystring Boolean - urllib default use querystring to stringify form data which don't support nested object, will use qs instead of querystring to support nested object by set this option to true.
- dataType String - Type of response data. Could be
text
orjson
. If it'stext
, thecallback
eddata
would be a String. If it'sjson
, thedata
of callback would be a parsed JSON Object and will auto setAccept: application/json
header. Defaultcallback
eddata
would be aBuffer
. - fixJSONCtlChars Boolean - Fix the control characters (U+0000 through U+001F) before JSON parse response. Default is
false
. - headers Object - Request headers.
- timeout Number | Array - Request timeout in milliseconds for connecting phase and response receiving phase. Defaults to
exports.TIMEOUT
, both are 5s. You can usetimeout: 5000
to tell urllib use same timeout on two phase or set them seperately such astimeout: [3000, 5000]
, which will set connecting timeout to 3s and response 5s. - auth String -
username:password
used in HTTP Basic Authorization. - digestAuth String -
username:password
used in HTTP Digest Authorization. - agent http.Agent - HTTP Agent object.
Set
false
if you does not use agent. - httpsAgent https.Agent - HTTPS Agent object.
Set
false
if you does not use agent. - ca String | Buffer | Array - An array of strings or Buffers of trusted certificates. If this is omitted several well known "root" CAs will be used, like VeriSign. These are used to authorize connections. Notes: This is necessary only if the server uses the self-signed certificate
- rejectUnauthorized Boolean - If true, the server certificate is verified against the list of supplied CAs. An 'error' event is emitted if verification fails. Default: true.
- pfx String | Buffer - A string or Buffer containing the private key, certificate and CA certs of the server in PFX or PKCS12 format.
- key String | Buffer - A string or Buffer containing the private key of the client in PEM format. Notes: This is necessary only if using the client certificate authentication
- cert String | Buffer - A string or Buffer containing the certificate key of the client in PEM format. Notes: This is necessary only if using the client certificate authentication
- passphrase String - A string of passphrase for the private key or pfx.
- ciphers String - A string describing the ciphers to use or exclude.
- secureProtocol String - The SSL method to use, e.g. SSLv3_method to force SSL version 3.
- followRedirect Boolean - follow HTTP 3xx responses as redirects. defaults to false.
- maxRedirects Number - The maximum number of redirects to follow, defaults to 10.
- formatRedirectUrl Function - Format the redirect url by your self. Default is
url.resolve(from, to)
. - beforeRequest Function - Before request hook, you can change every thing here.
- streaming Boolean - let you get the
res
object when request connected, defaultfalse
. aliascustomResponse
- gzip Boolean - Accept gzip response content and auto decode it, default is
false
. - timing Boolean - Enable timing or not, default is
false
. - enableProxy Boolean - Enable proxy request, default is
false
. - proxy String | Object - proxy agent uri or options, default is
null
. - lookup Function - Custom DNS lookup function, default is
dns.lookup
. Require node >= 4.0.0(for http protocol) and node >=8(for https protocol) - checkAddress Function: optional, check request address to protect from SSRF and similar attacks. It receive tow arguments(
ip
andfamily
) and should return true or false to identified the address is legal or not. It rely onlookup
and have the same version requirement. - trace Boolean - Enable capture stack include call site of library entrance, default is
false
.
- method String - Request method, defaults to
- callback(err, data, res) Function - Optional callback.
- err Error - Would be
null
if no error accured. - data Buffer | Object - The data responsed. Would be a Buffer if
dataType
is set totext
or an JSON parsed into Object if it's set tojson
. - res http.IncomingMessage - The response.
- err Error - Would be
Returns
http.ClientRequest - The request.
Calling .abort()
method of the request stream can cancel the request.
Options: options.data
When making a request:
urllib.request('http://example.com', {
method: 'GET',
data: {
'a': 'hello',
'b': 'world'
}
});
For GET
request, data
will be stringify to query string, e.g. http://example.com/?a=hello&b=world
.
For others like POST
, PATCH
or PUT
request,
in defaults, the data
will be stringify into application/x-www-form-urlencoded
format
if Content-Type
header is not set.
If Content-type
is application/json
, the data
will be JSON.stringify
to JSON data format.
Options: options.content
options.content
is useful when you wish to construct the request body by yourself,
for example making a Content-Type: application/json
request.
Notes that if you want to send a JSON body, you should stringify it yourself:
urllib.request('http://example.com', {
method: 'POST',
headers: {
'Content-Type': 'application/json'
},
content: JSON.stringify({
a: 'hello',
b: 'world'
})
});
It would make a HTTP request like:
POST / HTTP/1.1
Host: example.com
Content-Type: application/json
{
"a": "hello",
"b": "world"
}
This exmaple can use options.data
with application/json
content type:
urllib.request('http://example.com', {
method: 'POST',
headers: {
'Content-Type': 'application/json'
},
data: {
a: 'hello',
b: 'world'
}
});
Options: options.stream
Uploads a file with formstream:
var urllib = require('urllib');
var formstream = require('formstream');
var form = formstream();
form.file('file', __filename);
form.field('hello', '你好urllib');
var req = urllib.request('http://my.server.com/upload', {
method: 'POST',
headers: form.headers(),
stream: form
}, function (err, data, res) {
// upload finished
});
Response Object
Response is normal object, it contains:
status
orstatusCode
: response status code.-1
meaning some network error likeENOTFOUND
-2
meaning ConnectionTimeoutError
statusMessage
: response status message.headers
: response http headers, default is{}
size
: response sizeaborted
: response was aborted or notrt
: total request and response time in ms.timing
: timing object if timing enable.remoteAddress
: http server ip addressremotePort
: http server ip portsocketHandledRequests
: socket already handled request countsocketHandledResponses
: socket already handled response count
Response: res.aborted
If the underlaying connection was terminated before response.end()
was called,
res.aborted
should be true
.
require('http').createServer(function (req, res) {
req.resume();
req.on('end', function () {
res.write('foo haha\n');
setTimeout(function () {
res.write('foo haha 2');
setTimeout(function () {
res.socket.end();
}, 300);
}, 200);
return;
});
}).listen(1984);
urllib.request('http://127.0.0.1:1984/socket.end', function (err, data, res) {
data.toString().should.equal('foo haha\nfoo haha 2');
should.ok(res.aborted);
done();
});
HttpClient2
HttpClient2 is a new instance for future. request method only return a promise, compatible with async/await
and generator in co.
Options
options extends from urllib, besides below
- retry Number - a retry count, when get an error, it will request again until reach the retry count.
- retryDelay Number - wait a delay(ms) between retries.
- isRetry Function - determine whether retry, a response object as the first argument. it will retry when status >= 500 by default. Request error is not included.
Proxy
Support both http
and https
protocol.
Notice: Only support on Node.js >= 4.0.0
Programming
urllib.request('https://twitter.com/', {
enableProxy: true,
proxy: 'http://localhost:8008',
}, (err, data, res) => {
console.log(res.status, res.headers);
});
System environment variable
- http
HTTP_PROXY=http://localhost:8008
http_proxy=http://localhost:8008
- https
HTTP_PROXY=http://localhost:8008
http_proxy=http://localhost:8008
HTTPS_PROXY=https://localhost:8008
https_proxy=https://localhost:8008
$ http_proxy=http://localhost:8008 node index.js
Trace
If set trace true, error stack will contains full call stack, like
Error: connect ECONNREFUSED 127.0.0.1:11
at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1113:14)
--------------------
at ~/workspace/urllib/lib/urllib.js:150:13
at new Promise (<anonymous>)
at Object.request (~/workspace/urllib/lib/urllib.js:149:10)
at Context.<anonymous> (~/workspace/urllib/test/urllib_promise.test.js:49:19)
....
When open the trace, urllib may have poor perfomance, please consider carefully.
TODO
- [ ] Support component
- [ ] Browser env use Ajax
- [√] Support Proxy
- [√] Upload file like form upload
- [√] Auto redirect handle
- [√] https & self-signed certificate
- [√] Connection timeout & Response timeout
- [√] Support
Accept-Encoding=gzip
byoptions.gzip = true
- [√] Support Digest access authentication