
You can also use the Pushshift real-time feed in BigQuery to search submissions for keywords as they come in. Unfortunately, the comments feed broke last month.

Example query which searches for 'f5bot' in the past day and correctly finds the corresponding posts on Reddit:

   #standardSQL
   SELECT title, subreddit, permalink
   FROM `pushshift.rt_reddit.submissions`
   WHERE created_utc > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
   AND REGEXP_CONTAINS(LOWER(title), r'f5bot')
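
If you want to run that same search for an arbitrary keyword from a script, something like the following could work. It is only a sketch: it assumes the `google-cloud-bigquery` package is installed with working credentials and that the table from the query above is still reachable. The function names and the `keyword` parameter are made up for illustration, and it swaps `REGEXP_CONTAINS` for a plain `STRPOS` substring match so the keyword never needs regex escaping.

```python
# Table and column names come from the query above; the function names and
# the @keyword parameter are hypothetical, chosen for this sketch.
QUERY = """
SELECT title, subreddit, permalink
FROM `pushshift.rt_reddit.submissions`
WHERE created_utc > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
  AND STRPOS(LOWER(title), @keyword) > 0
"""


def normalize_keyword(keyword: str) -> str:
    """Trim and lowercase the keyword so it lines up with LOWER(title)."""
    cleaned = keyword.strip().lower()
    if not cleaned:
        raise ValueError("keyword must not be empty")
    return cleaned


def search_submissions(keyword: str):
    """Run the query and return a list of (title, subreddit, permalink)."""
    # Imported lazily so the helpers above work without the client installed.
    from google.cloud import bigquery

    client = bigquery.Client()  # picks up application default credentials
    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            # Passing the keyword as a query parameter avoids pasting user
            # input into the SQL text.
            bigquery.ScalarQueryParameter(
                "keyword", "STRING", normalize_keyword(keyword)
            )
        ]
    )
    rows = client.query(QUERY, job_config=job_config).result()
    return [(row["title"], row["subreddit"], row["permalink"]) for row in rows]
```

Calling `search_submissions("f5bot")` should then give roughly the same rows as the query above, minus any regex behavior.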


There has been a lot of interest in getting this working and dependable, and it's part of my plan for the new API release. There is a lot of internal code managing everything. I've got terabytes of indexes alone just to handle the 5 million requests the Pushshift API currently gets each month (I have around 20 terabytes of SSD / NVMe space and around 512 GB of RAM behind this project).



