Universal Reddit Scraper - A comprehensive Reddit scraping/archival command-line tool.

 __  __  _ __   ____
/\ \/\ \/\`'__\/',__\
\ \ \_\ \ \ \//\__, `\
 \ \____/\ \_\\/\____/
  \/___/  \/_/ \/___/

Universal Reddit Scraper - A comprehensive Reddit scraping command-line tool written in Python.

[-h]
[-e]
[-v]

[-t [<optional_date>]]
[--check]

[-r <subreddit> <(h|n|c|t|r|s)> <n_results_or_keywords> [<optional_time_filter>]]
    [-y]
    [--csv]
    [--rules]
[-u <redditor> <n_results>]
[-c <submission_url> <n_results>]
    [--raw]
[-b]
    [--csv]

[-lr <subreddit>]
[-lu <redditor>]

    [--nosave]
    [--stream-submissions]

[-f <file_path>]
    [--csv]
[-wc <file_path> [<optional_export_format>]]
    [--nosave]
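
As a rough illustration of how the `-r` line of the synopsis above fits together, the sketch below assembles an argument list for a Subreddit scrape. It is not part of URS: `build_subreddit_args` is a hypothetical helper, the subreddit name is a placeholder, and the only flags and category letters it knows about are the ones copied from the synopsis. It builds the arguments without running anything.

```python
from typing import Iterable, List, Optional

# Category letters and sub-flags copied verbatim from the synopsis above.
CATEGORIES = ("h", "n", "c", "t", "r", "s")
SUBREDDIT_SUBFLAGS = ("-y", "--csv", "--rules")


def build_subreddit_args(
    subreddit: str,
    category: str,
    n_results_or_keywords: object,
    time_filter: Optional[str] = None,
    subflags: Iterable[str] = (),
) -> List[str]:
    """Hypothetical helper: assemble the arguments for one -r scrape."""
    if category not in CATEGORIES:
        raise ValueError(f"category must be one of {CATEGORIES}, got {category!r}")

    # Positional order follows the synopsis:
    # -r <subreddit> <category> <n_results_or_keywords> [<optional_time_filter>]
    args = ["-r", subreddit, category, str(n_results_or_keywords)]
    if time_filter is not None:
        args.append(time_filter)

    # Only the sub-flags listed under -r in the synopsis are accepted.
    for flag in subflags:
        if flag not in SUBREDDIT_SUBFLAGS:
            raise ValueError(f"unknown sub-flag for -r: {flag!r}")
        args.append(flag)
    return args


if __name__ == "__main__":
    # "askreddit" is only a placeholder subreddit name.
    print(build_subreddit_args("askreddit", "h", 10, subflags=["--csv"]))
    # prints ['-r', 'askreddit', 'h', '10', '--csv']
```

The valid values for the optional time filter and the exact meaning of each flag are documented in the URS manual, so the helper deliberately does not validate them.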

Contact

Whether you are using URS for enterprise or personal use, I am very interested in hearing about your use case and how it has helped you achieve a goal. Additionally, please send me an email if you would like to contribute, have questions, or want to share something you have built on top of it.

You can send me an email by clicking on the badge. I look forward to hearing from you!

ProtonMail

Introduction

This is a comprehensive Reddit scraping tool that integrates multiple features:

  • Scrape Reddit via PRAW (the Python Reddit API Wrapper)
    • Scrape Subreddits
    • Scrape Redditors
    • Scrape submission comments
  • Livestream Reddit via PRAW
    • Livestream comments posted in Subreddits or by Redditors
    • Livestream submissions posted in Subreddits or by Redditors
  • Analytical tools for scraped data
    • Generate word frequencies from submission titles, bodies, and/or comments
    • Generate a wordcloud from scrape results

“Where’s the Manual?”

URS Manual

This README has become too long to comfortably contain all usage information for this tool. Consequently, the information that used to be in this file has been moved to a separate manual created with mdBook, a Rust command-line tool for creating books from Markdown files.

Note: You can also find the link in the About sidebar in this repository.

Demo GIFs

Here are all the demo GIFs recorded for URS.

Note: The nd command is nomad, a modern tree alternative I wrote in Rust.

Subreddit Scraping

subreddit demo

Redditor Scraping

redditor demo

Submission Comments Scraping

submission comments demo

Livestreaming Reddit

livestream subreddit demo

Generating Word Frequencies

frequencies demo

Generating Wordclouds

wordcloud demo

Checking PRAW Rate Limits

check praw rate limits demo

Displaying Directory Tree

display directory tree demo

Sponsors

This is a shout-out section for my patrons - thank you so much for sponsoring this project!