SchemaCrawler: Free database schema discovery and comprehension tool (schemacrawler.com)
143 points by based2 on Dec 30, 2018 | 20 comments



For those interested in this space, AWS Glue Crawler does schema discovery on databases as well as data stored in S3: https://aws.amazon.com/glue/

Disclosure: I work on AWS Glue


Curious: does AWS Glue or SchemaCrawler do any type inference beyond basic data types, such as string analysis to automatically mark fields as a shipping tracking number, IP address, ISO country/city, or a specific date format?
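
To make the question concrete, here is a minimal sketch of the kind of string analysis I mean. It is not how AWS Glue or SchemaCrawler work; the label names, the tiny country list, and the candidate date formats are all assumptions made up for illustration.

```python
import ipaddress
from datetime import datetime

# Hypothetical rules for illustration only.
ISO_COUNTRIES = {"US", "FR", "DE", "GB", "JP"}  # small sample, not the full ISO 3166 list
DATE_FORMATS = ["%Y-%m-%d", "%d/%m/%Y", "%m/%d/%Y"]


def _is_ip(value):
    """True if the string parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


def _date_format(values):
    """Return the first candidate format that parses every value, else None."""
    for fmt in DATE_FORMATS:
        try:
            for v in values:
                datetime.strptime(v, fmt)
            return fmt
        except ValueError:
            continue
    return None


def infer_semantic_type(values):
    """Guess a semantic label for a column from a sample of its string values."""
    values = [v.strip() for v in values if v and v.strip()]
    if not values:
        return "unknown"
    if all(_is_ip(v) for v in values):
        return "ip_address"
    fmt = _date_format(values)
    if fmt:
        return "date(" + fmt + ")"
    if all(v.upper() in ISO_COUNTRIES for v in values):
        return "iso_country"
    return "string"
```

A real implementation would need many more rules (tracking-number checksums, city gazetteers) and some tolerance for dirty values, since this version gives up on a label as soon as one sample fails to match.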


We use this to impute the schema. Has anyone thought about taking this and creating something like an open-source TAMR? Would love to hear ideas around it if someone has.


TAMR is a neat tool, but what is the price? I’m wary of tools that require a demo to even understand if this is a $1M tool or $100M tool.

I’m interested in sustainable ways to map out data across the enterprise. But the vendor space is hard for me to analyze at a greenfield level, because it’s full of pretty heavy tools that require implementation and consulting just to set up. I was unaware of TAMR until your post, though I had tried to go through Gartner’s analysis of data management platforms.

Are there even any open source tools or communities in this space? For example, I started evaluating Talend’s open source metadata management stack to see whether it would help me, but gave up after their demos wouldn’t run in the few environments I tried.


I do not know the price point of TAMR, only that it is expensive. A couple of tools come to mind: Metacat (as someone pointed out), WhereHows (LinkedIn), and Apache Atlas (not sure who contributed). The other challenge is covering not just RDBMSs but also RDF and graph stores, and then abstracting them through a semantic layer.


I've never used TAMR, but how about https://github.com/Netflix/metacat?


I have not used it, but I have heard of it. Have you used it? I would love to get your thoughts on it.


What is TAMR?


https://www.tamr.com/ - Metadata management for the enterprise as I understand it.


Their sales team must be good to have landed a French bank with this name: tamr would be read as "ta mère" (your mom), which on its own is usually only used as an insult.


I would love to contribute


I'm currently using: http://schemaspy.org/ and I also used its predecessor.

There are certainly warts, and they seem to be Oracle first, but on the whole I get a reasonable documentation experience out of it.


I also use SchemaSpy. Two things I like about it: 1) it recognises Markdown in your database comment fields, and 2) you can merge documentation from a text file with the metadata from your database. It's a neat way of keeping database documentation.


This is a great tool. We use it to generate an Entity Relationship Diagram from our canonical DDL file checked into our repo.

Here's the basic recipe:

  1. Spin up a fresh Postgres instance on Docker using -P to claim an available ephemeral TCP port
  2. Use `docker inspect` to read the Postgres port
  3. Run DDL script on the fresh instance
  4. Run SchemaCrawler Docker container using --network host option so it can connect to Postgres
     and using -v so it can save a schema image to the host filesystem

This entire process is a `/bin` script checked into our repo, so we can update `/doc/db-schema.png` any time. It takes about 15s total since we have to pause for the Postgres instance to come online.
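
As an illustration only, steps 1 to 3 could be sketched in Python like this. It is not the script from the repo: the function names, container name, password, and DDL path are placeholders, and `psql` is assumed to be installed on the host (with the password supplied through the `PGPASSWORD` environment variable). The SchemaCrawler invocation in step 4 is left out because its flags should be checked against the version in use.

```python
import json


def postgres_run_command(name, password):
    """Step 1: argument list for a throwaway Postgres container.
    -P publishes the container's exposed port on an ephemeral host port."""
    return ["docker", "run", "-d", "-P", "--name", name,
            "-e", "POSTGRES_PASSWORD=" + password, "postgres"]


def host_port_from_inspect(inspect_json, container_port="5432/tcp"):
    """Step 2: read the published host port from `docker inspect <name>` output,
    which is a JSON array with one object per inspected container."""
    container = json.loads(inspect_json)[0]
    bindings = container["NetworkSettings"]["Ports"][container_port]
    return int(bindings[0]["HostPort"])


def psql_command(port, ddl_path, user="postgres"):
    """Step 3: argument list that runs the DDL script on the fresh instance."""
    return ["psql", "-h", "localhost", "-p", str(port), "-U", user, "-f", ddl_path]
```

Each argument list can be handed to `subprocess.run(..., check=True)`; the pause mentioned above would go between steps 1 and 3, for example by retrying step 3 until the server accepts connections.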


I'd also used wwwsqldesigner[0] (possibly a different fork) with some custom hacks to infer relationships by naming where foreign keys were not present. It produced a quick ERD for getting started on a project. Always wanted a more complete (non-PHP) version of this tool and perhaps there is one in these comments.
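
For anyone curious what inferring relationships by naming looks like, here is a minimal sketch. It is not the code from my hacks or from wwwsqldesigner; the `<name>_id` convention and the naive pluralisation rules are assumptions for illustration.

```python
def infer_foreign_keys(tables):
    """Guess foreign keys from column names.

    tables maps each table name to its list of column names.
    Returns (table, column, referenced_table) tuples for columns named
    <x>_id where a table called x, xs, or xes exists.
    """
    guesses = []
    for table, columns in tables.items():
        for column in columns:
            if not column.endswith("_id"):
                continue
            stem = column[:-3]  # strip the "_id" suffix
            for candidate in (stem, stem + "s", stem + "es"):
                # Skip the table itself so its own key is not reported.
                if candidate in tables and candidate != table:
                    guesses.append((table, column, candidate))
                    break
    return guesses
```

The guesses are only hints: a column such as `external_id` with no matching table is silently ignored, and irregular plurals need an explicit mapping.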

[0] https://github.com/ondras/wwwsqldesigner


This tool is fantastic. In a previous life, I used it to dynamically analyze and extract users from a multi-tenant database and determine the proper sort order for reinsertion in a different database on a potentially different (JDBC-compatible) platform.
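
The sort-order part boils down to a topological sort over foreign-key dependencies. A minimal sketch, assuming the dependencies have already been extracted into a plain dict (the function name and input shape are hypothetical, and this uses Python's standard `graphlib` rather than anything from SchemaCrawler):

```python
from graphlib import TopologicalSorter


def insertion_order(foreign_keys):
    """Order tables so that referenced tables come before referencing ones.

    foreign_keys maps each table to the set of tables it references.
    """
    # Self-references cannot be ordered between tables, so drop them here.
    graph = {
        table: {parent for parent in parents if parent != table}
        for table, parents in foreign_keys.items()
    }
    # static_order() yields each node after all of its predecessors.
    return list(TopologicalSorter(graph).static_order())
```

Circular references between different tables raise `graphlib.CycleError`; those need to be handled separately, for example by deferring constraints or inserting in two passes.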


What can this tool do? Download it and run -h to find out.

One can only wonder why any JavaScript library of the week has better docs.


There is quite a bit of detail on command-line options, etc., under the Features menu. It could be better organized, but I don't think it's as bad as you suggest.


How is this different from a web crawler?


The only real similarity is the name. Web crawlers scrape web pages and follow links to find additional pages to scrape. A tool like this inspects your database and determines your schema, relationships, etc.
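
To give a feel for what inspecting a database means, here is a small sketch that reads tables, columns, and foreign keys from the catalog. It uses SQLite from Python's standard library purely as a stand-in and is not how SchemaCrawler itself is implemented; the function name and the shape of the result are assumptions.

```python
import sqlite3


def describe_schema(conn):
    """Return {table: {"columns": [...], "references": [...]}} from the catalog."""
    schema = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # table_info rows: cid, name, type, notnull, dflt_value, pk
        columns = [row[1] for row in conn.execute(f"PRAGMA table_info('{table}')")]
        # foreign_key_list rows: id, seq, table, from, to, on_update, on_delete, match
        references = [row[2] for row in conn.execute(f"PRAGMA foreign_key_list('{table}')")]
        schema[table] = {"columns": columns, "references": references}
    return schema
```

Nothing here follows links or fetches pages: it only asks the database to describe itself, which is the whole difference from a web crawler.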



