
pixiv-crawler

v0.7.2

pixiv-crawler

pixiv-crawler is a crawler for the website pixiv.

Features

  • You must be logged in to pixiv before fetching data. Use --set-cookie to save the PHPSESSID; look up its value in your browser.
crawlP --set-cookie 'vfiy123_18237qde'
# or
crawlU --set-cookie 'vfiy123_18237qde'

If the PHPSESSID changes, remember to update the saved value.

  • Pass an illust_id to crawl a single source image
crawlP -i 67844926
  • Pass a URL to crawl a single source image
crawlP -u 67844926
  • Specify the output path
crawlP -i 67844926 -o '~/pixiv-imgs'
  • When no output folder is specified:

    • A folder is created in the directory where the command was run; its name includes the date, and when crawling an author's page the author's name is appended at the end.
    • Date format: 2018-04-08
    • Folder name: date + pixiv (i.e. "2018-04-08 pixiv")
  • Specify the output file name; {fn} stands for the image's original file name

crawlP -i 67844926 -n 'sometext{fn}sometext'
  • Crawl a user's works by user id

    • Folder name format: (date pixiv author-name)
    • When crawling a user's works or bookmarks, the following options are available:
    • 1: set a start page and an end page
    • 2: limit the number of images to crawl
    • 3: crawl only one specific page
    • Priority: 3 > 2 > 1
  • Fetch all of a user's works by user id

crawlU -i 3869665
  • Fetch all of a user's works by URL
crawlU -u 'https://www.pixiv.net/member.php?id=3869665'
  • Fetch a user's works by user id, limited to 12 images
crawlU -i 3869665 -c 12
  • Fetch all of a user's public bookmarks by user id (fetch all bookmarks of user 3869665)
crawlU -i 3869665 -t 'bookmark'
  • Fetch one specific page of images (works or bookmarks) by user id (fetch page 2 of user 3869665's works)
crawlU -i 3869665 -p 2
  • Fetch all images starting from a given page by user id (fetch user 3869665's works from page 2 onward)
crawlU -i 3869665 -s 2
  • Fetch all images up to a given page by user id (fetch pages 1 through 5 of user 3869665's works)
crawlU -i 3869665 -f 5
  • Specify the output path
crawlU -i 3869665 -o '~/pixiv-imgs'
  • When no output folder is specified:

    • A folder is created in the directory where the command was run; its name includes the date, and when crawling an author's page the author's name is appended at the end.
    • Date format: 2018-04-08
    • Folder name: date + pixiv + author (i.e. "2018-04-08 pixiv xxx")
  • Specify the output file name; {fn} stands for the image's original file name

crawlU -i 3869665 -n 'sometext{fn}sometext'

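The default folder-naming convention described above can be sketched in shell. This is an illustration only, not code from pixiv-crawler itself; the author value here is a placeholder for the name the tool appends when crawling a user's page.

```shell
# Compose the default output folder names per the documented convention.
date_str=$(date +%Y-%m-%d)                 # date format, e.g. 2018-04-08
folder="${date_str} pixiv"                 # single-image crawl: "<date> pixiv"
author="xxx"                               # placeholder; the tool uses the crawled author's name
user_folder="${date_str} pixiv ${author}"  # user-page crawl: "<date> pixiv <author>"
echo "$folder"
echo "$user_folder"
```

With the placeholder author, this prints names like "2018-04-08 pixiv" and "2018-04-08 pixiv xxx", using the current date.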
Todos

  • Parse image data from pixiv Spotlight (特辑) articles
  • Crawl the recommended-image data shown on an image's page