A Reddit research tool built around citation, not scraping.
If you’ve ever cited a Reddit thread in a piece of research and gone back six months later to find the URL works but the comment thread feels different, you know the problem. Public threads change: comments get edited, deleted, removed by mods, archived by Reddit. A research tool has to capture the snapshot.
Reddit is increasingly the place where the actual conversation about a product, a policy, or a cultural moment happens — and increasingly cited as a primary source. But the affordances of academic citation and the affordances of a live web platform don’t line up. Threads mutate. This tool is built for people who need them to stop.
The problem with citing Reddit threads
A Reddit thread is not a stable artifact. Edits happen silently. Deletions remove the body but leave the structural shell. Moderators remove comments. Reddit typically archives threads after six months, closing them to new comments and votes, but only after most of the change has already happened. If your research depends on a thread being what it was on the day you read it, you need a snapshot that captures everything: text, score, timestamp, author, position in the tree.
The standard “export to CSV” flow most tools offer captures the text. The serious ones capture the metadata. Almost none capture provenance — the linkage back to a specific comment id and permalink that lets you, or a reviewer, verify what you cited.
What provenance means in practice
Every row in our export carries:
- id: Reddit’s permanent comment identifier. Unique forever.
- permalink: full URL to the specific comment, not just the thread. reddit.com/r/X/comments/Y/.../Z resolves to that exact comment.
- parentId + depth: the comment’s position in the reply tree.
- createdAt: ISO-8601 timestamp from Reddit, not your extraction time.
- score: captured at the moment of extraction. Reddit’s score will continue to change after you leave; your snapshot does not.
- extractionSource + extractedAt: whether the row came from the JSON API or the DOM, and the wall-clock time on your machine when it was extracted.
That last pair is the part most tools omit. For a methods section in a paper, or an editor’s note in a journalism piece, you need to be able to say: “Comments retrieved 2026-04-12 14:33 UTC; permalinks verified.” The export carries the timestamp explicitly so you don’t have to reconstruct it later.
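As a rough sketch of how those fields could be turned into a retrieval statement, the snippet below models one export row and derives a one-line methods note from it. The field names come from the list above; the Python types, the `ExportRow` class, and the `methods_note` helper are illustrative assumptions, not part of the export itself.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class ExportRow:
    """One exported comment. Names follow the export; types are assumed."""
    id: str
    permalink: str
    parentId: Optional[str]   # assumed to be empty/None for top-level comments
    depth: int
    createdAt: str            # ISO-8601, as reported by Reddit
    score: int
    extractionSource: str     # assumed values such as "json" or "dom"
    extractedAt: str          # ISO-8601 with a UTC offset (assumed)


def methods_note(rows: list[ExportRow]) -> str:
    """Build a retrieval statement from the earliest extractedAt in the set."""
    stamps = [
        datetime.fromisoformat(r.extractedAt).astimezone(timezone.utc)
        for r in rows
    ]
    first = min(stamps)
    return f"Comments retrieved {first:%Y-%m-%d %H:%M} UTC; n = {len(rows)}."
```

The note deliberately says nothing about permalinks being verified; that step still has to be done by a person or a separate check.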
Reproducibility on a public, mutable source
You cannot make a public Reddit thread immutable, but you can do three things that approximate reproducibility:
- Capture the snapshot the day you read it. The score from extraction time, frozen in the export, is what you cite, not the live score.
- Carry the permalink for every comment. A reader can resolve any single citation back to its specific position. If the comment has been deleted in the meantime, the permalink resolves to [deleted], which is itself useful information.
- Preserve the tree, not just the text. A comment quoted out of context loses meaning. Hierarchy preservation lets you cite a chain (“in reply to X, who replied to Y”) accurately.
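To make the third point concrete, here is a minimal sketch of recovering a reply chain from flat rows. It assumes each row is a dict with `id` and `parentId` keys, and that `parentId` uses the same form as `id` (no type prefix); `reply_chain` is a hypothetical helper name.

```python
def reply_chain(rows: list[dict], comment_id: str) -> list[dict]:
    """Return the rows from the top-level ancestor down to comment_id.

    Walks parentId links upward. A parent that is not in the export
    (for example the post itself) ends the walk.
    """
    by_id = {row["id"]: row for row in rows}
    chain: list[dict] = []
    seen: set[str] = set()          # guards against malformed, cyclic data
    current = by_id.get(comment_id)
    while current is not None and current["id"] not in seen:
        seen.add(current["id"])
        chain.append(current)
        current = by_id.get(current.get("parentId"))
    chain.reverse()                 # oldest ancestor first
    return chain
```

Citing the permalink of each row in the returned chain gives a reader the full “in reply to” context rather than a lone quote.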
Research ethics on a public-but-personal platform
Reddit is public, but it is pseudonymous rather than anonymous: a single username can be linked to a person’s posting history across years. The standard practice for ethical research on Reddit threads:
- Don’t turn usernames into bylines. Quote the username only when it is relevant; otherwise paraphrase or anonymize.
- Treat throwaway accounts as deliberately anonymous; respect that.
- For published work, consider whether quoting verbatim could expose someone — even on a public thread — to consequences they didn’t consent to.
- Subreddit norms vary. Quoting from r/AskHistorians is different from quoting from a small support subreddit.
The export gives you the data you need to make these decisions yourself. It doesn’t make them for you.
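If you decide to anonymize, one common approach is to replace each username with a stable pseudonym before the data leaves your machine. The sketch below uses a keyed hash from Python’s standard library; the `pseudonym` helper and the idea of an `author` column are assumptions for illustration, and the secret must be kept out of any published material.

```python
import hashlib
import hmac


def pseudonym(username: str, secret: bytes) -> str:
    """Map a username to a stable, non-reversible label.

    The same username and secret always give the same label, so reply
    chains stay analyzable; without the secret the label cannot be
    recomputed from a guessed username.
    """
    digest = hmac.new(
        secret, username.lower().encode("utf-8"), hashlib.sha256
    ).hexdigest()
    return f"user_{digest[:10]}"
```

Pseudonyms do not protect against re-identification through verbatim quotes, which remain searchable, so this complements the judgment calls above rather than replacing them.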
Three research workflows we’ve seen
Discourse analysis
Pull every comment in a thread (or several related threads) into CSV. Open in NVivo, MAXQDA, or just Sheets. Code the responses. Export the codebook back out. The depth column lets you weight by reply position; the score column lets you weight by community endorsement.
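One way to combine those two columns with your codes is sketched below. It assumes the CSV has `id`, `depth`, and `score` columns and that you keep your coding as a mapping from comment id to code label. The weighting formula is an arbitrary example, not a recommended method.

```python
import csv
import io
from collections import defaultdict


def weighted_code_totals(csv_text: str, codes: dict[str, str]) -> dict[str, float]:
    """Sum an example weight per code label.

    weight = (1 + score) / (1 + depth), with negative scores floored at 0,
    so highly endorsed, top-level comments count the most.
    """
    totals: dict[str, float] = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        label = codes.get(row["id"])
        if label is None:
            continue                      # comment not coded yet
        score = max(int(row["score"]), 0)
        depth = int(row["depth"])
        totals[label] += (1 + score) / (1 + depth)
    return dict(totals)
```

Whatever weighting you choose, state it in your methods section; the unweighted counts are easy to report alongside.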
Longitudinal sampling
Pull the same subreddit’s “top of week” thread every week for six months. The History tab (Pro) saves these locally so you can compare over time without re-extracting. Combined timestamps give you a clean panel.
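As an illustration of what such a panel might look like, the sketch below reduces each weekly snapshot to one summary row. It assumes each snapshot is a list of row dicts carrying `extractedAt` and `score`; the `weekly_panel` helper and its output keys are hypothetical.

```python
from datetime import datetime
from statistics import median


def weekly_panel(snapshots: list[list[dict]]) -> list[dict]:
    """Summarize each snapshot as one panel row, ordered by ISO week."""
    panel = []
    for rows in snapshots:
        if not rows:
            continue                      # skip empty extractions
        # Use the first row's extraction time to label the whole snapshot.
        when = datetime.fromisoformat(rows[0]["extractedAt"])
        iso = when.isocalendar()          # (ISO year, ISO week, weekday)
        panel.append({
            "week": f"{iso[0]}-W{iso[1]:02d}",
            "comments": len(rows),
            "medianScore": median(int(r["score"]) for r in rows),
        })
    return sorted(panel, key=lambda p: p["week"])
```

Labeling by extraction week rather than by thread creation date keeps the panel aligned with when you actually sampled.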
Thematic content analysis
Use the Insights tab inside the extension to surface top words, common phrases, and frequently asked questions across the thread. It’s a starting point — not a substitute for coding — but it cuts the cold-reading time in half on threads with hundreds of comments.
Formats that play well with research tools
CSV opens in everything. JSON is what you want for structured analysis or programmatic comparison across threads. Both formats include the same columns; JSON additionally preserves the nested reply structure as an array.
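If you start from the JSON export and want flat rows for a spreadsheet or dataframe, a small recursive walk is enough. This sketch assumes the nested array lives under a key named `replies`, which is a guess at the layout; adjust the key to whatever your export actually uses.

```python
def flatten(comments: list[dict], parent_id=None, depth: int = 0) -> list[dict]:
    """Flatten nested comments into rows, parents before their replies.

    parentId and depth already present on a comment are kept; they are
    filled in from the nesting only when missing.
    """
    rows = []
    for comment in comments:
        row = {k: v for k, v in comment.items() if k != "replies"}
        row.setdefault("parentId", parent_id)
        row.setdefault("depth", depth)
        rows.append(row)
        rows.extend(flatten(comment.get("replies", []), comment["id"], depth + 1))
    return rows
```

The output order matches a depth-first reading of the thread, which is usually the order you want when reading rows side by side with the live page.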
For qualitative tools that import from Word or Docs (ATLAS.ti, NVivo classic), the Plus tier exports to a Google Doc with formatting preserved, which is easier to import than CSV.
How to cite a Reddit thread
The minimal acceptable citation includes the permalink, retrieval date, and (where possible) the comment id of any specifically quoted comment. APA-flavored example:
u/founderdiary. (2026, April 12). [Comment on the post "How did
you get your first 100 users"]. r/SaaS.
https://reddit.com/r/SaaS/comments/1h2x.../how_did_you_get_your_first_100_users/k8s2j1m
Retrieved 2026-04-12.
Every export contains these fields per row, so building a bibliography is mechanical.
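A sketch of that mechanical step, following the APA-flavored pattern above: it assumes a row dict with `author`, `createdAt`, `extractedAt`, and `permalink` keys, and takes the post title and subreddit as separate arguments because they may not be stored per row. The `cite` helper is hypothetical.

```python
from datetime import datetime


def cite(row: dict, post_title: str, subreddit: str) -> str:
    """Format one comment as an APA-flavored reference string."""
    created = datetime.fromisoformat(row["createdAt"])
    retrieved = datetime.fromisoformat(row["extractedAt"])
    # Month name comes from the current locale ("April" in the default C locale).
    date = f"{created.year}, {created:%B} {created.day}"
    return (
        f'u/{row["author"]}. ({date}). '
        f'[Comment on the post "{post_title}"]. r/{subreddit}. '
        f'{row["permalink"]} Retrieved {retrieved:%Y-%m-%d}.'
    )
```

Check the result against your style guide before publishing; citation formats for social media differ between guides and editions.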