We LOVE REDASH at our company. Lack of adequate developers (the neverending startup saga) meant not everyone got what they wanted, fast enough. Redash has changed that for us. We use it extensively across all our teams (operations, sales, marketing etc). Thanks!
Quick question: what decimal precision can the visualizations display, and are second/sub-second time series possible in the dashboards? The issue our team has had with other platforms such as QuickSight is the lack of support for extremely granular data visualization. Thanks in advance!
In the tables we show only 2 digits after the dot, but that's easy to change (in code; and later on will be configurable). Never tried to plot sub-second time series, so don't know what will happen - but unlike most platforms, you can modify Redash to support this :)
It's true that it's extra work, but it's a one time investment which you can reuse for all your visualization needs later on.
I saw a mention on the front page about using custom scripts as data sources, but when I dig down into the docs about how to set those up, there's no mention of it.
Is there a blurb somewhere about how that data must be formatted and used?
I tried Redash some time ago while I was looking for a very simple "BI" like tool that our non-technical or low-technical colleagues could use (people in Client Satisfaction team).
Redash was a close second (one advantage at that time was the ability to have users that could only read queries), but after fighting with it for a while in order to install it, I tested Metabase and haven't looked back since.
Having a single executable is definitely the best, and I wish we had one for Redash... trying to compensate for this with the Docker and cloud (AWS / GCE) images for easy setup.
Kind of... I mean, we use it very heavily in my current company. And we are always looking forward to a new release.
What you cannot do right now (v0.22.0) is let some people see cards but not create "new questions". I know the current release being developed is tackling something akin to "published dashboards", so I am very happy.
Btw, great work! You cannot imagine how happy Metabase made both our Devs (we didn't have the bandwidth to program dashboards or set up something more demanding) and our non-techs who were super excited to learn SQL and now have some impressive dashboards in Metabase.
> What you cannot do right now (v0.22.0) is let some people see cards but not create "new questions".
You definitely can do this with "Collections" (added in v0.20.0 I believe). You need to make sure to revoke all the "Data Access" and "SQL Queries" permissions for all groups the user belongs to (including the "All Users" group), then give their group "View" permission to the collection you want to allow them to view cards in.
(Now that I think about it, I'm not sure why we don't have a similar permission for cards that aren't in a collection.)
Looks great! What about open-sourced Superset[1] from AirBnB? Both look great, but Superset seems to have a superset of Redash features (bad pun intended).
We used both, and ended up going with Superset for internal dashboards and data exploration. Redash is more polished to look at, but Superset offers more functionality. However, some things are still rough around the edges; I get occasional 500's during normal navigation, and the filtering options are immature, using only "IN" or "NOT IN" (you have to say things like "field not in FALSE", etc.)
Still, though, quite powerful, and we hope to push some improvements upstream in relation to my comments here.
Metabase is my new favourite tool: I probably use it every day to answer some question instead of dropping into the Rails console. Things like: what percentage of orders come from Stripe vs Paypal? Or export a list of users that match these criteria. Or display the top 10 users by some metric.
It makes creating BI charts from a database easy enough that a non-developer can do it. But when the going gets tough, I can still drop down to hand-coded SQL. It was also an easy installation on Heroku.
It's not entirely perfect (combining multiple series on one chart is harder than it needs to be; it can only pull data from databases, not APIs) but it's been a success so far.
Hey I work on Metabase, thanks for the kind words and feedback.
Regarding pulling data from APIs, there's nothing inherently stopping us (or you) from pulling data from APIs (in fact that's what the Google Analytics driver does). Are you looking for integrations with specific 3rd party APIs, or an easier way to integrate with your own APIs?
Feel free to respond here, email me, or file a GitHub issue. Thanks.
Redash is fantastic. Things like scheduled queries just work, and the AMI is a great way to get up and running quickly. The docs are decent and the upgrade script works well. I also really appreciated the (optional) Login with Google feature and the ability to limit it to certain email domains (we use Google Apps, so it worked really well). We've been trialling it casually in our engineering and data science teams.
I was really hoping to address this in the upcoming v1.0.0 release, but eventually didn't want to delay it as it was in the oven for too long anyway. It will probably be addressed in the release after.
Besides the ability to delete a user, have you experienced other rough edges in the recent releases? We strive to improve the stability and reduce the rough edges with every release, so we really want to hear about your experience.
We noticed a couple of UI irregularities (for example when editing charts the contents or axis labels would disappear). Unfortunately I don't have anything more specific for you right now. I did notice when deleting users it seems that they (and their dependent db objects) are actually deleted from the database (rather than flagged as deleted) - for traceability it would be great if they weren't removed altogether.
Though to balance things, let me tell you what we loved about Redash. The user/group/source model is really nice - I appreciated being able to give only our admin group users access to a data source configured with our database master user credentials. The Google OAuth functionality (and really clear documentation around this) and pre-rolled AMI were a huge plus, allowing me to get our team up and running really quickly. The same goes for the Let's Encrypt instructions (we all use certbot regularly but no-one had to dig through Redash's config files blind - the docs were spot on). The ability to fork queries - superb. Data export was also fantastic.
We use Redash via (https://redash.io/) for a truly hideous amount of stuff at our startup. We use it for almost every operations dashboard, and we've also prototyped out a task allocation and field agent management tool using them and Zapier.
They're really great, I'd highly recommend them to anyone.
Thanks <3. Always enjoy seeing what Eli does with Redash & Zapier. Really looking forward to when the Zapier integration becomes official and goes into their directory.
We've been using it on a project to aggregate clinical trial data from many different sources (https://opentrials.net) and it has been great!
It allows researchers (not necessarily devs, usually medical doctors) to peek at our raw data, and it's a great excuse for them to learn at least the basics of SQL. The response has been great.
We also use it to do some small data checks on the data quality, with alerts sent to our Slack.
We have been using Redash at our company for almost a year now. Every single release just proves how promising the project is. You can make useful dashboards in minutes. Support for multiple databases is amazing. We are using it with multiple PostgreSQLs, Redshift, MongoDB and InfluxDB.
The most valuable feature is alerts though. I work at an ecommerce and operations heavy company where we have tons of connected components. Alerts come in really handy for us because anyone in the organization can add quick alerts for proactive monitoring of events recorded in on-field ops and act when things go bad. This is almost like building a feature into one of the internal tools, except you do it yourself without any engineering support.
Kudos to the team! Looking forward to some more amazing stuff in Redash!
I'm a marketing consultant and Redash has worked its way into my preferred stack for analytics: Segment + Redshift + Redash.
It replaces event- and user-tracking tools like Mixpanel, Heap, Woopra, and others. Those are just UI layers on top of SQL queries. If you or your marketers know (or can learn) even basic SQL then Redash is what you want.
I really like Redash. It’s one of the early tools that introduced the concept of turning SQL into charts to developers, and it also teaches developers to learn and write better SQL, all without any cost. I evaluated Redash at my previous company back in 2013 (we were also using Tableau), but because Redash lacked some features at the time (no support for filters, lack of permission control, sporadic performance), we went and built something in-house with a similar approach (turning SQL into charts).
And inspired by the same path Redash founder took, that internal project turned into a startup by itself.
We’re relatively new but getting good momentum. Some of our customers went with us after evaluating both. While we don’t have a self-hosted open-source version, our pricing starts at only $49/mo for up to 5 users (pretty affordable for startups IMHO).
Hello! Redash is great if you are looking to host it in-house and have the engineering resources to set-up and maintain it.
If you are looking for an affordable SaaS alternative, you may want to look into Holistics (www.holistics.io). (Note I'm the other co-founder here with huy)
Besides supporting native SQL, Holistics is designed to address the gaps of SQL for common business reporting use-cases (flexible way of passing user inputs as parameters into report query; supporting if-else capabilities in SQL, reusable query templates and records/columns based access control for users/user groups).
This makes reports easier to manage and reduces duplicated SQL across multiple reports/dashboards (especially multiple UNION/CASE-IF statements). We also have our own DSL that lets you configure in more detail how certain charts should look (beyond the normal coloring).
A common problem we also see is that most data-related work is not just visualization, mainly for the reason that most data are not structured/formatted in the right table structure as most companies start off without a data warehouse.
While most query tools require customers to work with a separate ETL/warehousing tool, we’ve built an integrated approach to data reporting and data preparation. Insights from our data reporting module (reports with expensive joins, long query times, non-optimal table structures) give your data analysts the inputs they need to easily move data, map records, and transform data without technical engineering knowledge. Data in Google Sheets or CSVs can be synced automatically (incrementally or in full) to your database.
And all this is done with the data not leaving our customers’ database (we don’t warehouse their data). Our data reporting module works directly with your database, and our data preparation module provides just the utilities (not infrastructure) to automate your data pipeline process.
Do take some time to check us out! Quote HN and we give you an additional one day of free trial! :P
> Hello! Redash is great if you are looking to host it in-house and have the engineering resources to set-up and maintain it.
> If you are looking for an affordable SaaS alternative...
Well, actually there is a hosted SaaS version of Redash too (https://redash.io). This is what sponsors the work on Redash.
For those of you looking for another fully hosted solution, check out https://www.cluvio.com (full disclosure: I am one of the founders).
Like redash, Cluvio allows you to run SQL queries against your database and quickly visualize results as beautiful, interactive dashboards, which can easily be shared within your company or externally.
We developed our own custom grammar on top of SQL which makes writing time-range related queries a lot easier and allows you to parametrize queries, which powers the dashboard interactivity.
We also allow you to run custom R scripts on top of the SQL results, have SQL Alerts that run at specified schedules, allow you to create SQL Snippets and offer a free entry plan.
Currently supported datasources are Postgres, Redshift, MySQL, MariaDB and Amazon Aurora.
Hey Maarius, thanks for chiming in! I've had my eye on Cluvio for a while so I was gonna drop it in here to ask how the two compare.
For me, an open source core is a huge gesture of trust though. It's the ultimate "no lock-in strategy" guarantee. This business model has been proven sustainable by several SaaS companies, such as Automattic (WordPress.com), Discourse (which I work for), Sentry, Piwik and hopefully Redash as well!
Any chance Cluvio will consider going down this path too?
At the moment (and I hope to keep it this way), it's not only the core of Redash that is open source -- it's the whole product. There are a few features that are introduced on the hosted version first, for ease of development. But most of it goes to the open source first.
As we (Redash) use Discourse as well, it would be only fitting to have Discourse use Redash :-)
Good question!
Our aim is to build the best cloud BI platform for SMEs and we currently don't plan to open source Cluvio. However, we do offer a "free forever" plan that allows you to use our platform for free, indefinitely.
Hello, it seems that there are no step-by-step instructions for installing Redash on your own server. The only thing I could find was info on how to install it on AWS or using docker (https://redash.io/help-onpremise/setup/setting-up-redash-ins...). However I want to install it normally, using my server's nginx / postgresql / python / supervisorctl etc. Can it be done? Can you provide some step-by-step instructions?
Step by step instructions would be great, but there is only so much time in the day, so I figured that anyone who wants to customize the deployment will know how to do it on his own (or use the bootstrap script for reference).
I'll be happy to answer questions on how to set it up in your environment, you're welcome to the forum[1], Gitter[2] or Slack[3].
Thanks! Yes, this file could be used to see the components needed to install Redash.
However I'd really prefer some complete step-by-step instructions to actually understand better what's going on. For example, similar to the instructions GitLab provides for installing on your own server (https://gitlab.com/gitlab-org/gitlab-ce/blob/master/doc/inst...)
We're hosting on our own EC2 instance using the docker image. Setup was mostly smooth. We love the integration with Google Apps so our entire organization automatically has access.
We primarily use it to connect to Amazon's Redshift and pull some data out for quick visual analysis. It's a very good combo. We're at the early stages of using it but it seems like a very solid product.
Not exactly. Say I have a table with one row per user and want users for a given country. I could say
SELECT country_code AS "country_code::multi-filter",
count(1) AS users FROM mytable GROUP BY country_code;
And I will indeed get a dropdown with country codes. But, the query is still running for all countries (and if I'm not mistaken all the filtering is on the client). For something more complex, like number of users from a country with spend over a certain amount, I want something like:
SELECT country_code AS "cc::filter",
COUNT(1) as num_users
FROM mytable
WHERE lifetime_spend > {Value from a superset dropdown}
GROUP BY country_code
I could do something like
SELECT country_code AS "cc::filter",
COUNT(1) as num_users,
lifetime_spend from .....
and then filter on the Redash side, I suppose, but that will be slower and have to deal with a lot of data I don't care about.
Has something like this been added in the last few months?
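To make the concern above concrete, here is a minimal sketch of what filtering after the fact looks like. It uses Python's built-in sqlite3 as a stand-in database; the table contents and the `apply_dropdown_filter` helper are made up for illustration and are not Redash code.

```python
import sqlite3

# Toy stand-in for the "mytable" from the comment above (all data here is made up).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (user_id INTEGER, country_code TEXT, lifetime_spend REAL)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [(1, "US", 120.0), (2, "US", 15.0), (3, "DE", 80.0), (4, "FR", 5.0)],
)

# The database aggregates every country, whatever the viewer later selects.
all_rows = conn.execute(
    'SELECT country_code AS "cc::filter", COUNT(1) AS num_users '
    "FROM mytable GROUP BY country_code ORDER BY country_code"
).fetchall()


def apply_dropdown_filter(rows, selected_codes):
    """Hypothetical client-side filter: keep rows whose first column is selected."""
    return [row for row in rows if row[0] in selected_codes]


visible_rows = apply_dropdown_filter(all_rows, {"US"})
print(len(all_rows), "rows fetched;", len(visible_rows), "row shown")
```

The query cost is the same no matter which country is picked; only the display changes.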
Also, Redash generally seemed to be based around the idea that you would write SQL with a few parameters in the SELECT statement and users would be content with that (which of course is true for many use cases!) We wanted to be able to add a table to Redash/Superset/Metabase/whatever and have someone who doesn't know SQL gain some insights quickly. For instance, distinct users in the last week on a given platform from a set of access logs.
Still though, I meant what I said - I had Redash up and hooked to our DB with useful charts in about 5 minutes, which was spectacular. As time went on though we realized that a lot of people could answer their questions in Superset without asking someone else to write SQL, while Redash required someone to write a query.
Superset is definitely the less mature and polished product, though, perhaps due to its ambition!
The "::filter" convention is for filters. We also have parameters (for some time now, although we only added a UI for them in v0.11). With parameters you can do:
SELECT country_code AS "cc::filter", COUNT(1) as num_users FROM mytable WHERE lifetime_spend > {{minimum_lifetime_spend}} GROUP BY country_code
And minimum_lifetime_spend will render as an input box. Currently we only support input boxes (of different types - number/string/date), but there is a plan to add support for dropdowns there as well.
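For anyone wondering how this differs from filters in practice, here is a rough sketch of the idea: the placeholder is replaced before the SQL is sent, so the database only does the work for the requested threshold. The `render_query` helper and the sample data are assumptions for illustration, and a simple regex substitution stands in for whatever templating Redash actually uses.

```python
import re
import sqlite3


def render_query(template, params):
    """Hypothetical helper: replace each {{name}} placeholder with a numeric value."""

    def substitute(match):
        name = match.group(1)
        # Coerce to float so only numbers can ever be spliced into the SQL text.
        return repr(float(params[name]))

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


# Made-up sample data standing in for "mytable".
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (user_id INTEGER, country_code TEXT, lifetime_spend REAL)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [(1, "US", 120.0), (2, "US", 15.0), (3, "DE", 80.0), (4, "FR", 5.0)],
)

TEMPLATE = """
SELECT country_code AS "cc::filter", COUNT(1) AS num_users
FROM mytable
WHERE lifetime_spend > {{minimum_lifetime_spend}}
GROUP BY country_code
ORDER BY country_code
"""

# The value a user would type into the input box.
sql = render_query(TEMPLATE, {"minimum_lifetime_spend": 50})
rows = conn.execute(sql).fetchall()
print(rows)
```

Because the WHERE clause runs on the server, rows below the threshold never leave the database.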
Allowing self serving without knowing SQL is the goal, but I believe it will take time to do it right. We focus on delivering a great product for people who know SQL, and allow them to give more interactive result sets to other users using things like parameters.
So I'm not sure our goal is less ambitious; we just take a different path. And that's good I guess, it wouldn't have been interesting if everyone built the same thing, the same way :-)
"A web-based notebook that enables interactive data analytics.
You can make beautiful data-driven, interactive and collaborative documents with SQL, Scala and more."
Are some public datasets available directly as SQL endpoints?
I know the Semantic Web provides some public datasets as SPARQL endpoints.
But I have never heard of an equivalent for SQL.
I was working for a company that relied heavily on Wagon - and then it got bought by Box and they shutdown the product. We looked for alternatives but nothing seemed to offer what Wagon had - this looks like it might be a good fit though. Excited to try it out.
You should also check out DataMill (http://www.datamillapp.com), it's quite similar to what Wagon offered. Installed as an app, context-aware SQL autocomplete and built-in charting. It's in public beta - give it a try :-)
I think part of the reason this happens is Java has wider support out of the box for most data sources via JDBC than Windows has with ODBC/ADO.NET. I mean, it's a close race, but most new open source databases have Java/JDBC drivers first and then ODBC second, for example. I think this may change a little with .NET being opened and cross platform, but it might take a while.
Of course you can make Java/JDBC connections from Windows to databases but many users somehow find it more difficult and I've noticed an annoying trend among large enterprises running Windows on the desktop to not want to support Java/JDBC tools on the desktop, unless they are invested in it.
I'm not understanding why you mentioned Java/JDBC; I'm not aware of any of the mentioned tools (redash, superset, blazer; turns out metabase uses Clojure) using Java. Only redash doesn't list SQL Server support. What am I missing?
Well, I meant in the general context of database connectivity. The underlying database connectivity plumbing is almost always a JDBC or ODBC interface when you're talking a regular DBMS [1].
Historically, ODBC was not always as popular on Linux (being a Microsoft standard originally), although that has changed a lot by today. JDBC was usually more associated with open source/Linux BI/database tools (Java being open source in origin itself), which is why for a while, you could connect to more data sources with JDBC drivers/interfaces than ODBC. Writing a JDBC driver is also many times less complicated than writing an ODBC driver. The ODBC spec is old and complicated. JDBC is still difficult but much less complicated.
Most programming languages can get to either a JDBC or ODBC driver via some mechanism, on pretty much any platform. So today I guess it's hard to say that it matters (ideally it shouldn't), I was just giving my personal take on the historical context that led to the comment you made - "why doesn't this run in the Microsoft stack?".
All of the tools mentioned (redash, superset, blazer) are relying on the underlying interfaces in ODBC / JDBC for regular databases. If the system is running on Unix and accessing ODBC, it is almost guaranteed to be using unixODBC to do so. Any tool like this (I don't know the internal specifics of these tools) would probably have built in a layer to abstract away the low-level interface into some mechanism that hopefully makes the use of JDBC or ODBC (or any interface) irrelevant. That's a lot harder than it seems on the face of it. I used to work for a company that made a product that connected / federated data from "any" platform - which is partly what influences my opinion of how difficult it is to wrangle all these interfaces at lower levels.
EDIT: I think the crux of what I'm saying is, the reason you don't have a tool that pops up when you ask that question is, because of OS platform differences, with Linux/open source usually being the first priority over Windows historically.
[1] Some DBMS provide native web services connectivity which obviates the need for JDBC/ODBC completely.. which is kind of nice.
For Superset we use SQLAlchemy and much of the connectivity goes through the DBAPI Python abstraction, and I believe most of the drivers are using native implementations.
And there you have it, "turtles all the way down". SQLAlchemy sitting on DBAPI, where you can get to databases primarily through: 1) ODBC/JDBC, 2) ADO or 3) native Python database drivers contributed by the community that speak the wire protocol of the database (usually wire protocols that are open source or openly documented).
These abstractions most programming languages have make the problem mostly go away for developers.
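As a small illustration of that abstraction, here is a sketch of the Python DB-API 2.0 (PEP 249) cursor pattern using the built-in sqlite3 driver. The `fetch_all` helper and the sample table are made up for this example; other drivers expose the same connection/cursor calls, though the placeholder style for parameters (`?` here) varies by driver.

```python
import sqlite3


def fetch_all(connection, sql, params=()):
    """Run a query via the DB-API cursor interface; return (column names, rows)."""
    cursor = connection.cursor()
    try:
        cursor.execute(sql, params)
        # cursor.description holds one tuple per result column; item 0 is the name.
        columns = [column[0] for column in cursor.description]
        return columns, cursor.fetchall()
    finally:
        cursor.close()


# Made-up sample data; sqlite3 is used only because it ships with Python.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, gateway TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "stripe"), (2, "paypal"), (3, "stripe")],
)

columns, rows = fetch_all(
    conn,
    "SELECT gateway, COUNT(*) AS n FROM orders GROUP BY gateway ORDER BY gateway",
)
print(columns, rows)
```

The helper never mentions SQLite itself, which is the point: swapping the `connect` call for another DB-API driver should leave the rest mostly unchanged.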
However, sometimes you'll find (as is sometimes the case with SQL Server, for example) that the fastest or most complete/stable database driver is a specific one that is not quite 100% officially supported (though it could be 99.9995% of the way there). For example, Python gets to SQL Server via ADO (only on Windows) or some driver (usually ODBC) that talks the TDS protocol - often FreeTDS (an open source implementation of the wire protocol, compatible with SQL Server).
We are definitely in a better state today with regards to programming language/framework support for interfaces to databases, but I think the historical context of the original "generic database interface standards" (JDBC/ODBC) is important to understand to know how we got to where we are, and why sometimes people struggle to ask "what BI tool is the first to come to mind on Microsoft/Windows stack"? Don't get me wrong, plenty of Microsoft/Windows supported BI tools exist, and cross platform abstractions like those in Python make it so OS matters much less (but still matters).
I know it's late but I suppose for open source traditional BI tools one might answer something like Pentaho, JasperReports, Actuate/BIRT. Note those are all Java for reasons of being cross platform, so again not native to Windows.
There are a slew of other open source BI related tools people run on Windows, such as Talend for ETL or things like R or Octave for data science.
Windows desktop proprietary BI tools - Tableau is super popular as one example of a handful.
Redash is great. We've tried several alternatives, including Metabase, and ended up with extensive Redash use. We use the self-hosted version for sharing data between the tech and business sides, for day-to-day monitoring and for ad-hoc visualizations, and are generally very happy with it.
It's not perfect - there are some areas, like fine-control over chart customization, query auto-updates (especially queries with parameters) and the ability to queue parallel queries to the same data source which would be nice improvements, but these are nitpicks - highly recommended.
That's definitely possible. You might need to tune the number of workers you have to run queries (and maybe adjust your database settings, if it has limits on concurrency).
If you need help with this, you're welcome to the forum[1], Gitter[2] or Slack[3].
The main issue for us was support of JOINs (note that this was a while ago, not sure what's the status now; I did find this relatively recent discussion, though [1])
Yes, essentially. An export to a file on a given time period would be fine, can deal with the format conversion etc myself. I'm thinking of sending arbitrary reports to management without the need for Jasper etc.
Also, publish data set as a web service dataset, and now you're into the features offered by "data virtualization" software vendors / tools (things like Teiid or Denodo, for example).
We are currently working on solving this exact same problem, developer friendly report automation sans visualization. We are entirely focused on solving the reporting problem before tackling visualisation etc. If you are interested in trying out our product, please send me an email at abhyrama at google email service and I will send you the instructions to try it out. The product is out and is being piloted in two large startups which have seen tremendous efficiency improvements in their whole reporting process. We are yet to bring the public product website up.
We started using redash about 2 years back. We use it for ad-hoc SQL to Redshift and MySQL and most of our business level KPI dashboards are there. One thing I would want to do is add some metabase-like segmentation there (preferably one that works with joins), so that marketing/sales people can also actively query from it.
My qualms with Tableau are that they've moved excruciatingly slowly on MacOS support and that getting started from scratch seemed like a major development effort. We evaluated Tableau against Looker and have now been very happy customers of the latter for a year+.
You will need to allow access from our servers to your database. It's obviously a security concern, but we take great measures to ensure the safety of your data.
If that's not an option, you can always run Redash self hosted.
I'll be happy to answer any questions about Redash, open source and about my journey to make Redash a self sustainable project.
For any questions that don't fit on HN, feel free to reach me at arik at redash.io.