npm package discovery and stats viewer.

Discover Tips

  • General search

    [free text search, go nuts!]

  • Package details

    pkg:[package-name]

  • User packages

    @[username]

Sponsor

Optimize Toolset

I’ve always been into building performant and accessible sites, but lately I’ve been taking it extremely seriously. So much so that I’ve been building a tool to help me optimize and monitor the sites that I build to make sure that I’m making an attempt to offer the best experience to those who visit them. If you’re into performant, accessible and SEO friendly sites, you might like it too! You can check it out at Optimize Toolset.

About

Hi, 👋, I’m Ryan Hefner  and I built this site for me, and you! The goal of this site was to provide an easy way for me to check the stats on my npm packages, both for prioritizing issues and updates, and to give me a little kick in the pants to keep up on stuff.

As I was building it, I realized that I was actually using the tool to build the tool, and figured I might as well put this out there and hopefully others will find it to be a fast and useful way to search and browse npm packages as I have.

If you’re interested in other things I’m working on, follow me on Twitter or check out the open source projects I’ve been publishing on GitHub.

I am also working on a Twitter bot for this site to tweet the most popular, newest, random packages from npm. Please follow that account now and it will start sending out packages soon–ish.

Open Software & Tools

This site wouldn’t be possible without the immense generosity and tireless efforts from the people who make contributions to the world and share their work via open source initiatives. Thank you 🙏

© 2024 – Pkg Stats / Ryan Hefner

csdn-crawler

v1.0.8

Published

一个专门用于爬取csdn文章的爬虫/A JS library for Crawl CSDN Article.

Downloads

5

Readme

A Nodejs Crawler for crawling user's article from csdn.com.

Only for Node.js Application, not work on browser.

  • Offer options.username will return you the user's article list(default length is 5);
  • Upload the Article's image to your own Qiniu Cloud Server when you offer the config: options.qiniu<object>
  • Offer options.page, options.size can limit the page and size config for api

为什么写这个? / Why would I code this?

  • I want some data to fill my database for big-data's test, but it seems hard to me to write it myself(because I am so lazy).

  • May be so many coder face the same things like me. So, let me make this job become easier.

  • WARN: This repo is only for test and study, do not use this to run Pressure-Test on csdn.com.

使用指南 / Usages

1、Fill you own config

// Example:
const options = {
    username: 'weixin_45534242', // target username
    page: 1, // the page index you are crawling
    size: 5, // page size
    link: '', // the user center article list api, you can find it on csdn.com using: F12
    businessType: 'blog', // crawl article type. only support 'blog' now.
    sleepTime: null, // Unit is: ms. sleep time when you crawling the data, it may save your ip from blocking.
    supportImageType: ['jpg', 'png', 'jpeg', 'webp', 'gif', 'mp4', 'bmp', 'svg'], // support uplaod image
    imagePrefixName: 'crawl-', // upload image name prefix
    contentNodeIdentify: '#article_content', // the html id name in article node
    qiniu: {
        zone: '', // Your qiniu cloud zone
        scope: '', // Your qiniu scope name. Storage name.
        useHttpsDomain: true, // like what you see. this is https setting
        useCdnDomain: true, // config your cdn domain, it use on Article List Image
        baseQiNiuCdnApi: '', // you CDN domain name
        remoteFilePath: '/openStatic', // the folder path where you want to save img
        isNeedWaterMark: false, // if `true`, you will need to offer qiniu image style name, write it below:
        imageStyleSplitQuote: '&', // the quote you use in image src link like: https://qiniu.com/asd.png&scale-my-img
        imageStyleName: '', // your qiniu style name
        accessKey: '', // Qiniu cloud accessKey
        secretKey: '', // Qiniu secretKey
        imageBaseAlt: '' // image base alt message prefix
    }
}

2、开始使用csdnCrawler / Fly your code now.

// You can find this code on `./demo.js`
const csdnCrawler = require('./index')
const exampleOptions = {
    username: 'weixin_45534242',
    page: 1,
    size: 5,
    link: '',
    businessType: 'blog',
    sleepTime: null, // Unit is: ms
    supportImageType: ['jpg', 'png', 'jpeg', 'webp', 'gif', 'mp4', 'bmp', 'svg'],
    imagePrefixName: 'crawl-',
    contentNodeIdentify: '#article_content',
    qiniu: {}
}

csdnCrawler(exampleOptions, data => {
    console.log(data)
    console.log(`==============================`)
    console.log(`===  Demo Crawl Succeed !!!===`)
    console.log(`==============================`)
    console.log(`Total Data length : ${data.length}`)
})

再次警告 / FBI WARN AGAIN( to save me from trouble)

  • Don't use this for bad purpose.
  • It may cause something bad result in CN(Maybe break the law...) and will drive you crazy.
  • Plz only use this for testing and study purpose.