Building Out the SeatGeek Data Pipeline

dpods13 · on Jan 23, 2015

We've been using Redshift + Looker a lot lately at my company and it's been great. We've also added DynamoDB to our pipeline so that we can move data from DynamoDB -> Redshift -> Looker for running reports and analytics on NoSQL data.
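
The DynamoDB -> Redshift hop can be sketched roughly as follows. This is only a minimal illustration, assuming the load uses Redshift's COPY command with a `dynamodb://` source; the helper name, table names, and role ARN are hypothetical placeholders, and nothing here says how the load is actually done in this pipeline.

```python
def build_dynamodb_copy(redshift_table: str,
                        dynamodb_table: str,
                        iam_role_arn: str,
                        read_ratio: int = 50) -> str:
    """Build a Redshift COPY statement that loads from a DynamoDB table.

    read_ratio is the percentage of the DynamoDB table's provisioned
    read throughput the load may consume (Redshift accepts 1 to 200).
    All argument values are supplied by the caller; none are real.
    """
    if not 1 <= read_ratio <= 200:
        raise ValueError("read_ratio must be between 1 and 200")
    return (
        f"COPY {redshift_table}\n"
        f"FROM 'dynamodb://{dynamodb_table}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        f"READRATIO {read_ratio};"
    )


# Hypothetical names, for illustration only.
statement = build_dynamodb_copy(
    redshift_table="events",
    dynamodb_table="Events",
    iam_role_arn="arn:aws:iam::123456789012:role/ExampleRedshiftRole",
    read_ratio=25,
)
print(statement)
```

The resulting statement would be run against the Redshift cluster with whatever SQL client you already use; Looker then just queries the loaded table.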

willcodeforfoo · on Jan 23, 2015

I've been interested in these database query frontend services/apps for a while (e.g., Looker, Periscope, Chartio), but every single one of them has only a free trial and no mention of pricing anywhere on their sites, which makes me think it's a multi-thousand dollar/month investment.

Is this true? If so, is there anything a little more cost-effective for running, visualizing, and saving database queries?

dpods13 · on Jan 23, 2015

Yes, I believe Looker requires an annual contract at several thousand dollars a month. I'm not sure about the other products you listed, and I don't know much about cheaper alternatives.

hglaser · on Jan 23, 2015

Periscope co-founder here. We have plans as low as $350/mo. for startups with small datasets. I'm harry at periscope.io if that sounds appealing.

maslam · on Jan 23, 2015

This is similar to our[1] pipeline. We love Luigi for what it brings to the table for building ETL pipelines.

[1] Appuri (www.appuri.com)

pythondan · on Jan 23, 2015

Do most people use Redshift for this sort of thing? Is that the best option out there?

ajones · on Jan 23, 2015

By "this sort of thing," do you mean using it as a data warehouse? That's what Redshift is branded as, and the performance gains are definitely similar to what the blog post outlines.

In my opinion, Redshift is the best data warehouse solution for a team building a small to medium-sized warehouse. This covers most use cases. For those building a data warehouse above a petabyte in size, you're going to have to look at different solutions. Redshift is powerful and relatively cheap compared to its competitors.