
Ask HN: Is there any open source BI tool that is always first on the list when the conversation finally gets to

"Oh, you're on the Windows/Microsoft stack? Then you're probably going to wind up using _______"




I think part of the reason this happens is Java has wider support out of the box for most data sources via JDBC than Windows has with ODBC/ADO.NET. I mean, it's a close race, but most new open source databases have Java/JDBC drivers first and then ODBC second, for example. I think this may change a little with .NET being opened and cross platform, but it might take a while.

Of course you can make Java/JDBC connections from Windows to databases, but many users somehow find it more difficult. I've also noticed an annoying trend among large enterprises running Windows on the desktop: they don't want to support Java/JDBC tools there unless they are already invested in Java.


Thanks for taking the time to respond!

I don't understand why you mentioned Java/JDBC. I'm not aware of any of the mentioned tools (redash, superset, blazer; turns out metabase uses Clojure) using Java. Only redash doesn't list SQL Server support. What am I missing?


Well, I meant in the general context of database connectivity. The underlying database connectivity plumbing is almost always a JDBC or ODBC interface when you're talking about a regular DBMS [1].

Historically, ODBC was not always as popular on Linux (being a Microsoft standard originally), although that has changed a lot by now. JDBC was usually more associated with open source/Linux BI/database tools (Java being open source in origin itself), which is why, for a while, you could connect to more data sources with JDBC drivers/interfaces than with ODBC. Writing a JDBC driver is also many times less complicated than writing an ODBC driver. The ODBC spec is old and complicated. JDBC is still difficult, but much less so.

Most programming languages can get to either a JDBC or ODBC driver via some mechanism, on pretty much any platform. So today I guess it's hard to say that it matters (ideally it shouldn't). I was just giving my personal take on the historical context that led to the comment you made - "why doesn't this run in the Microsoft stack?".

All of the tools mentioned (redash, superset, blazer) rely on the underlying ODBC / JDBC interfaces for regular databases. If the system is running on Unix and accessing ODBC, it is almost guaranteed to be using unixODBC to do so. Any tool like this (I don't know the internals of these tools) would probably have built in a layer that abstracts away the low-level interface, so that the choice of JDBC or ODBC (or any other interface) hopefully becomes irrelevant. That's a lot harder than it seems on the face of it. I used to work for a company that made a product that connected / federated data from "any" platform - which is partly what shapes my opinion of how difficult it is to wrangle all these interfaces at the lower levels.

EDIT: I think the crux of what I'm saying is that the reason no tool pops up when you ask that question is OS platform differences, with Linux/open source historically being the first priority over Windows.

[1] Some DBMSs provide native web services connectivity, which obviates the need for JDBC/ODBC completely. That is kind of nice.


For Superset we use SQLAlchemy and much of the connectivity goes through the DBAPI Python abstraction, and I believe most of the drivers are using native implementations.


And there you have it, "turtles all the way down". SQLAlchemy sitting on DBAPI, where you can get to databases primarily through: 1) ODBC/JDBC, 2) ADO or 3) native Python database drivers contributed by the community that speak the wire protocol of the database (usually wire protocols that are open source or openly documented).

These abstractions most programming languages have make the problem mostly go away for developers.
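To make the "DBAPI" layer concrete, here is a minimal sketch of code written against the Python DB-API 2.0 interface. It uses the standard library's sqlite3 driver purely as a stand-in; `fetch_all` and `demo` are hypothetical names of my own, not anything from Superset or SQLAlchemy. The helper only touches the generic connection/cursor calls, so in principle it works with any compliant driver.

```python
import sqlite3


def fetch_all(conn, sql, params=()):
    """Run a query using only generic DB-API 2.0 calls (cursor/execute/fetchall)."""
    cur = conn.cursor()
    try:
        cur.execute(sql, params)
        return cur.fetchall()
    finally:
        cur.close()


def demo():
    # Only this connect() call is driver-specific. sqlite3 is a stand-in here;
    # another DB-API driver would be swapped in at this point.
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    cur.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    # Caveat: the "?" placeholder is sqlite3's paramstyle ("qmark").
    # Other drivers may use "%s" or named styles, which is one of the
    # differences that higher layers like SQLAlchemy smooth over.
    cur.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("east", 10.0), ("west", 25.5)],
    )
    conn.commit()
    cur.close()
    rows = fetch_all(
        conn,
        "SELECT region, amount FROM sales WHERE amount > ? ORDER BY region",
        (5,),
    )
    conn.close()
    return rows


if __name__ == "__main__":
    print(demo())
```

The placeholder caveat in the comments is exactly the kind of small per-driver difference that leaks through the abstraction.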

However, sometimes you'll find (as is sometimes the case with SQL Server, for example) that the fastest or most complete/stable database driver is a specific one that is not quite 100% supported (though it could be 99.9995% supported). For example, Python gets to SQL Server via ADO (only on Windows) or via some driver (usually ODBC) that talks the TDS protocol - often FreeTDS (an open source implementation of the TDS wire protocol, compatible with SQL Server).
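As a rough illustration of the ODBC + FreeTDS route, the sketch below only builds an ODBC connection string; it does not connect to anything. The function name is hypothetical, the host and credentials are placeholders, and it assumes the FreeTDS ODBC driver is registered under the name "FreeTDS" in odbcinst.ini, which varies by system.

```python
def freetds_odbc_connection_string(server, database, user, password,
                                   port=1433, tds_version="7.4"):
    """Build an ODBC connection string for SQL Server via the FreeTDS driver.

    Assumption: the driver is registered as "FreeTDS" with unixODBC.
    Values containing ";" or "}" would need extra quoting, which this
    sketch does not handle.
    """
    parts = {
        "DRIVER": "{FreeTDS}",
        "SERVER": server,
        "PORT": str(port),
        "DATABASE": database,
        "UID": user,
        "PWD": password,
        "TDS_Version": tds_version,
    }
    # Dicts preserve insertion order in Python 3.7+, so the output is stable.
    return ";".join(f"{key}={value}" for key, value in parts.items())


if __name__ == "__main__":
    conn_str = freetds_odbc_connection_string(
        "db.example.com", "reports", "bi_user", "secret"
    )
    print(conn_str)
```

With pyodbc installed, a string like this is what you would hand to pyodbc.connect(). Whether that beats Microsoft's own ODBC driver on a given box is the "not quite 100%" question.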

We are definitely in a better state today with regard to programming language/framework support for database interfaces. Still, I think the historical context of the original "generic database interface standards" (JDBC/ODBC) is important for understanding how we got to where we are, and why people sometimes struggle to ask "what BI tool is the first to come to mind on the Microsoft/Windows stack?" Don't get me wrong, plenty of BI tools that support Microsoft/Windows exist, and cross-platform abstractions like those in Python mean the OS matters much less (but it still matters).


Have you tried using the growing support for Docker and Linux running under Windows? I haven't, personally. Just curious if others have.


I know it's late but I suppose for open source traditional BI tools one might answer something like Pentaho, JasperReports, Actuate/BIRT. Note those are all Java for reasons of being cross platform, so again not native to Windows.

There are a slew of other open source BI related tools people run on Windows, such as Talend for ETL or things like R or Octave for data science.

Windows desktop proprietary BI tools - Tableau is super popular as one example of a handful.



