If you use this, also take a look at GraphQL: AWS has a hosted GraphQL service called AppSync, and if you'd rather self-host you can also use Prisma.
It's highly customisable, works directly with PostgreSQL row level security, and the performance is quite good. It has a custom GraphiQL GUI to work on queries/mutations.
To really see how it all works together, check out the starter project (https://github.com/graphile/starter), which has migrations, a job queue, graphql-codegen etc.
I tried GraphQL; not sure I'm sold on its viability. Sure, it is very uniform, but for nested queries it is very slow. We already have SQL, we don't need another query language in JS. If I really wanted to get fancy, Prolog would be much preferred. On top of that, FB backing the project makes me very uneasy, the same exact reason I wouldn't touch React with a 10-foot pole.
(disclaimer: not feeling totally authoritative on this, have not used graphql in production).
i feel that the choice between postgrest & postgraphile / another graphql solution revolves around whether you're a front-end dev who doesn't get to arbitrarily change/expose the data schema. if this is true, and you are collaborating with other devs on the code, then the added flexibility in querying probably outweighs the complexity this extra data abstraction brings.
if, however, you control both back and front-end, graphql isn't really needed, as you can expose whatever views you fancy by means of SQL (via views, table/row permissions, rpc functions exposed by postgrest, etc)
PostgREST v9.0.0 was released last night. A few neat features include better support for Partitioned Tables, better handling for Postgres functions with unnamed parameters, and a (much requested) addition for LEFT JOIN support.
A year ago the Russian-based Lingualeo (20 mln users) published an article describing their new architecture, which follows a similar concept. They have a simple proxy server in Go and all the logic happens in Postgres, i.e. the frontend directly manipulates the DB. Their claimed advantages were superior performance and less boilerplate/faster iteration times.
In a matter of hours, commenters were able to dump sensitive data (passwords and all) by constructing specially crafted queries. Another problem they had was that many core devs left because they perceived the new architecture as a step backwards. Many also noted that they no longer had adequate tools for version control. I don't know how PostgREST compares; maybe it doesn't have such issues.
> In a matter of hours, commenters were able to dump sensitive data (passwords and all) by constructing specially crafted queries.
To be fair, SQL injection attacks have been a common problem for a long time, and most websites did not expose their database directly like this.
It certainly feels like PostgREST and systems like it are dancing close to the fire here, since there's a lot less in between you and an attacker. But as far as I can tell, for me, right now, it's just that: a feeling.
I'd love to have a robust analysis of the access controls Postgres provides, and more importantly, how many mistakes (relative and absolute) need to be made for a breach. I'd like to know whether the security story here is fundamentally brittle, or whether it's practically viable and just needs better support for doing the "right" thing.
Personally, I like the architectural philosophy a lot. We saw a project on HN yesterday [0] that takes a similar approach. I'm oversimplifying, but I like that instead of trying to figure out what series of API calls will put the system in the state you want, you can simply put it in the state you want. It's easier for me to look at a schema/type and understand what it's supposed to mean than to look at a piece of code and figure out what it's supposed to do.
> It certainly feels like PostgREST and systems like it are dancing close to the fire here, since there's a lot less in between you and an attacker.
There are definitely fewer buffered copies of your responses, yes.
This is one advantage few people know about PostgREST: once the database produces even a single row of data, it starts streaming JSON to the frontend. Users don't have to wait for the response to fully buffer in memory in some Python or Go process before being relayed.
As for security, Postgres supports Row Level Security. You can literally protect every single row of your data with a solid security policy using a security mechanism older and tested longer than the average age of your typical HN reader.
This security mechanism doesn't just apply to your web frontend either, but to all SQL clients regardless of the frontend language. No other framework has such a deep integration with Postgres' native security mechanism.
> As for security, Postgres supports Row Level Security. You can literally protect every single row of your data with a solid security policy using a security mechanism older and tested longer than the average age of your typical HN reader.
Maybe you're referring to the underlying concept rather than the implementation, but wasn't row-level security added to Postgres in 2016?
> This is one advantage few people know about PostgREST: once the database produces even a single row of data, it starts streaming JSON to the frontend. Users don't have to wait for the response to fully buffer in memory in some Python or Go process before being relayed.
This is one big reason why I like this kind of architecture, though I'd generalize it a bit. More than streaming, you don't have to reimplement access control mechanisms (as noted), subscription mechanisms (LISTEN/NOTIFY), or batching/transactions in your API layer. So much of what you do in an API server is just proxying for functionality that already exists in the DB, and that alone is a reason to seriously consider this approach.
> Maybe you're referring to the underlying concept rather than the implementation, but wasn't row-level security added to Postgres in 2016?
Yes, you are correct, my statement was too vague; I was referring to Postgres' role-based security in general at that point. RLS specifically was added in 9.5. Before that, permission checking was table/column based, so you could still protect data on a per-role basis, just not at as fine a grain.
Another subtle difference is that RLS hides rows you can't see, whereas a REVOKE on a table or column causes an error if you try to access it. In either case unauthorized users cannot see protected data.
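To make the difference concrete, here's a minimal sketch; the "documents" table and "webuser" role are made up for illustration:

    -- hypothetical table and role
    CREATE ROLE webuser NOLOGIN;

    CREATE TABLE documents (
      id    serial PRIMARY KEY,
      owner text NOT NULL DEFAULT current_user,
      body  text
    );

    -- Column-level grant: webuser may read only id and owner;
    -- "SELECT body FROM documents" fails with a permission error.
    GRANT SELECT (id, owner) ON documents TO webuser;

    -- Row Level Security: rows that don't satisfy the policy are
    -- silently hidden instead of raising an error.
    ALTER TABLE documents ENABLE ROW LEVEL SECURITY;
    CREATE POLICY documents_owner ON documents
      USING (owner = current_user);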
If the Twitter thread is accurate, their API received a list of DB fields (and the JSON names to use for them) from the client. It was hard to guess the names of fields, so someone decided to fuzz it a little bit, and the API endpoint returned all existing fields in the database.
Of course, it was a bug in their code, which they promptly fixed. But it's the only case of a company with a large userbase using this kind of architecture that I know of, and their experience was far from smooth, so I thought I'd share.
My point was your anecdote has nothing to do with PostgREST. It's a foggy hand wave that could be laid over just about anything with an HTTP socket on one end and a database on the other.
I agree it's just an anecdote and I have nothing against PostgREST. I think it's an interesting approach which has its uses. Currently at my job all I've been doing lately has been replacing slow ORM-based access with pure SQL queries for performance/readability and I can see the advantages of ditching the layer between the frontend and the DB entirely, at least for queries.
But there are several things I'd like to see an analysis/comparison of, between the traditional approach and PostgREST's approach:
1) what is the cost of a mistake (what if I misconfigure row-based policy vs. misconfigure in the traditional backend)
2) the size of the attack surface (if we expose DB directly, as opposed to having an additional layer)
3) how many developers are proficient in Postgres enough to avoid #1 and #2 as opposed to traditional code-based security policies
> 1) what is the cost of a mistake (what if I misconfigure row-based policy vs. misconfigure in the traditional backend)
Good question. Ok let's do a thought experiment between PostgREST and Django. Every single Django production system I've seen logs into Postgres as a superuser (so it can do migrations).
PostgREST logs into the database as a user with NO privileges (because migration is a problem for a migration tool, not a web framework!) and then switches roles based on the (signed) JWT token. The PostgREST user is NEVER a superuser.
So if you flub a Django permission, users can now run queries as a superuser and potentially leak everything. If you flub a PostgREST configuration, you leak data from maybe one or two tables.
> 2) the size of the attack surface (if we expose DB directly, as opposed to having an additional layer)
This is going to differ depending on who you ask. People talk about attack "surfaces" but I personally think of them as "attack paths". Does more code mean more paths? I think yes. Others think no.
> 3) how many developers are proficient in Postgres enough to avoid #1 and #2 as opposed to traditional code-based security policies
Postgres' security mechanism comes down to "who can see this", which is the same question those developers would be asking in Django. If they're not proficient enough to understand how to GRANT or REVOKE, are they going to be proficient enough to understand the implications of their application logging into the database as a superuser?
> Every single Django production system I've seen logs into Postgres as a superuser (so it can do migrations).
then all those production systems do it wrong. all the django DB user needs (even for migrations) is to be the OWNER of the schema you give it (the NAME setting in DATABASES). how you create this database is up to you and it's not django's job.
there are 2 cases i can think of where django needs extra permissions:
1. to run the test suite django will want to create/destroy a `test_projectname` database. for this `CREATE USER ... CREATEDB ...` is sufficient, no superuser needed. this should not be in your production database anyway but a throwaway DB where you run your CI/CD.
2. when some extension needs to be installed (like pg_trgm, etc) you might need superuser, but again, it's debatable if this is django's job. some projects try to do this in their migrations, but i prefer to do this during provisioning. the postgres docs say:
> Loading an extension ordinarily requires the same privileges that would be required to create its component objects. For many extensions this means superuser privileges are needed. However, if the extension is marked trusted in its control file, then it can be installed by any user who has CREATE privilege on the current database.
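for illustration, with a trusted extension like pg_trgm (marked trusted as of postgres 13) something like this is enough, no superuser involved (database/role names are made up):

    -- grant CREATE on the database to the app's owner role
    GRANT CREATE ON DATABASE myapp TO myapp_owner;

    -- then, connected as myapp_owner:
    CREATE EXTENSION IF NOT EXISTS pg_trgm;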
Let's put aside that being the OWNER of a database object means being granted all privileges on it by default, and that owners aren't (typically) filtered by Row Level Security.
Google "Django Postgres Tutorial" and the first link creates a superuser:
To a security conscious DBA, this is insanity. Yes, you can in fact lock down a Django app just like PostgREST, and add some code to do SET ROLE and all that just like PostgREST, but no one does, because the Django docs on database setup don't even mention superusers:
let's not put it aside. i find it perfectly reasonable that the DB user django uses to connect to its own database has all privileges to that database.
there are ways to limit these privileges depending on the requirements both on django and postgres level, for example to have some read only users, etc.
don't google "Django Postgres Tutorial", the only django tutorial you need is the official django tutorial.
the djangocentral article discusses django superusers, not postgres ones. a django superuser has no postgres superuser privileges, they are 2 different things.
> the Django docs on database setup don't even mention superusers
of course they don't mention superusers: django does not recommend creating db superusers for security reasons, and mentions them only in connection with installing postgres extensions, which is a limitation of postgres, not django.
in my professional opinion a run of the mill web application does not need row level security; setting up schema-wide privileges correctly is hard enough. but if one needs it, a quick google search shows that people have tried to do it with django as well. it's a postgres concept and as such can be used from any framework connecting to it.
There's something to be said for layers of security providing redundancy to catch errors; that's part of what goes into highly reliable operations used in aerospace and health care. If you get the postgres layer wrong and that's all you're relying on, then that's all you've got. If you've got the same postgres wrapped in a more rigid validation layer, then arguably more bugs have to align for a mishap. Of course, the flip side is that this is even more complex and more difficult to implement. The reality is people probably tend to skimp on the postgres layer's security if they're burying it deep under other things. Whether you're focusing on the postgres layer or some higher layer (python or whatever), that's still just one layer of protection.
We're currently developing a Supabase app that's backed by PostgREST.
For our MVP, we built out everything that we could in SQL procedures (they're honestly not that terrible, especially in Postgres, just a bit verbose); we could also use `plv8` and write them in JavaScript. If we need to trigger some special logic, we push data through webhooks to another service that also has access to the database. It works, but it's kind of messy, and I'm somewhat worried about the performance of the HTTP client bundled with Postgres. This mechanism could also be implemented with the `LISTEN`/`NOTIFY` interface.
It feels like the better way to do it is to have an API gateway / a separate API accepting the same JWTs as PostgREST. We'll try that for our next project, and maybe we'll migrate this product too if we run into any issues with HTTP requests.
It's been great for CRUD-heavy apps. Optimizing the RLS policies sometimes gets tough, but you can always work around them with views / functions that bypass them.
There are a number of options. You can publish the WAL replication stream and write downstream services that react off of that. I wrote a library in Haskell [0] that can read such a stream. Libraries exist in Python and other languages as well.
Or you can take advantage of the `LISTEN/NOTIFY` feature in Postgres and use an extension to push events out to various queues or webhooks [1].
You put an API gateway in front of this and send those requests to your application server. Generally, though, you should try to keep your business logic in triggers and stored procedures.
Writing all of my business logic in triggers and stored procedures seems like a high cost to avoid writing crud. All the value comes from business logic. Making the business logic easier to read and work with is usually my top priority.
Whether it makes sense to put business logic in procedures really depends on what's involved in "business logic".
For data-heavy apps with very complex schemas, your code can actually get more readable inside stored procedures. I used procedures for a content creation pipeline/headless CMS app that had flexible workflow templates (those also lived in the DB). It works quite well, and is simpler and more readable than most common backend code and much faster to develop.
If you need to trigger async stuff, like network access or data processing, one pattern I enjoy is writing to a log-ish table that gets picked up and processed later by a separate service. This is nice for emails and other types of notifications: your users get a log of sent emails, and you can very easily do it inside a transaction.
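A sketch of that pattern with made-up names; the trigger/NOTIFY part is optional if the worker just polls:

    -- written inside the same transaction as the business change
    CREATE TABLE outgoing_email (
      id        bigserial PRIMARY KEY,
      recipient text        NOT NULL,
      subject   text        NOT NULL,
      body      text        NOT NULL,
      queued_at timestamptz NOT NULL DEFAULT now(),
      sent_at   timestamptz
    );

    -- optionally wake the worker immediately instead of polling
    CREATE FUNCTION notify_email() RETURNS trigger LANGUAGE plpgsql AS $$
    BEGIN
      PERFORM pg_notify('email_queued', NEW.id::text);
      RETURN NEW;
    END $$;

    CREATE TRIGGER outgoing_email_notify
      AFTER INSERT ON outgoing_email
      FOR EACH ROW EXECUTE FUNCTION notify_email();  -- EXECUTE PROCEDURE on PG < 11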
If the complexity is on the querying side, then you can get a lot of mileage from views. If you need caching for that, materialized views really help.
However: if you're doing complex numeric/algorithmic stuff, rely on external libraries, or just have plain too much business logic... then doing things in a separate service is definitely better, no question about it.
First of all, you usually don't put your business logic in your frontend. This makes your entire argument fall flat on its face.
If we then take your argument and instead just apply it to, say, using JS on the backend.. well, JS is not going anywhere any time soon.
pl/pgsql is absolutely awful and it is a disservice to other programming languages to call it a programming language. Writing actual, complicated business logic using this and maybe some mix of JS (because why not pull that into the database?? great idea) is just going to make every single person who has to maintain this in the future hate your guts.
I'm truly glad I don't have to work with anyone who genuinely thinks such a design would be a good idea.
> First of all, you usually don't put your business logic in your frontend. This makes your entire argument fall flat on its face.
the comment makes sense if one interprets "frontend" as "all the shit in front of the database". which is a reasonable interpretation in the context of something like postgrest, if not perhaps satisfying to you.
> pl/pgsql is absolutely awful
who cares
> Writing actual, complicated business logic using this and maybe some mix of JS (because why not pull that into the database?? great idea)
there are a number of other languages available for writing postgresql procedures, and you can add new ones
This is the reason why postgrest is mostly a solution looking for a problem to solve. For example, the last time I had a great use case, I ran into the problem that I needed good integration with our identity service and didn't want to do the workaround that the project suggests.
The issue I have with these automatic "backend from your DB" services is:
* For the simple case, they seem like they are as good as a simple backend service, albeit easier to set up.
* For more complex cases (e.g. external dependencies, complex business logic), they seem so much more difficult to make work that I can't imagine anyone _not_ setting up a separate service to handle those scenarios. Maybe you can do everything with stored procedures, but Hasura, Postgraphile, et al. all seem to suggest that you should stand up a separate service for this.
* Once you've got a separate service where your complexity lives, that's going to be the meat of your project.
* Is it really worth it to introduce a separate dependency & pattern just to scaffold the easy parts more quickly? The consistency of having a single data access pattern between the client and all parts of your service seems like it would outweigh not needing to write a little code to proxy queries through your backend by far.
I guess it really depends on the app - if it's really just a view layer, or if the RUD* has very little logic to it and it's expected that the complexity of the backend logic really won't grow over time (never seen this...) then sure, it could be a fit.
Until then it seems like yet another MVP optimization that gets thrown away as soon as the app needs to grow up.
> * For the simple case, they seem like they are as good as a simple backend service, albeit easier to set up.
Agreed.
> * For more complex cases (e.g. external dependencies, complex business logic), they seem so much more difficult to make work that I can't imagine anyone _not_ setting up a separate service to handle those scenarios. Maybe you can do everything with stored procedures, but Hasura, Postgraphile, et al. all seem to suggest that you should stand up a separate service for this.
You should stand up a separate service for this regardless of your choice of web server.
A web server should not be handling complex business logic cases, for many reasons, but one obvious one is synchronous wait times. PostgREST or not, bigger jobs should go into a work queue; enqueueing them should take just a few milliseconds.
The second reason a work queue is better is that it is more secure. Your web process can be given the absolute minimal privileges, say INSERT only on the queue table. The queue worker can have more privileges. The important upside is that user facing systems cannot take "complex business logic" actions by way of security enforcement.
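Roughly what that looks like in SQL, with made-up role and table names:

    CREATE TABLE job_queue (
      id      bigserial PRIMARY KEY,
      kind    text  NOT NULL,
      payload jsonb NOT NULL,
      done_at timestamptz
    );

    REVOKE ALL ON job_queue FROM PUBLIC;
    GRANT INSERT ON job_queue TO web_anon;           -- the web role can only enqueue
    GRANT USAGE  ON SEQUENCE job_queue_id_seq TO web_anon;
    GRANT SELECT, UPDATE ON job_queue TO job_worker; -- only the worker processes jobs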
> * Once you've got a separate service where your complexity lives, that's going to be the meat of your project.
I agree, but that will happen regardless of your web server.
> * Is it really worth it to introduce a separate dependency & pattern just to scaffold the easy parts more quickly?
Quickness is only one tiny advantage of PostgREST. Native security on database objects, streaming responses without buffering, using every available Postgres feature (that no other database supports) without third-party plugins for your web framework... that's just the beginning, but yeah, it's also a hell of a lot quicker and easier to use.
> Postgraphile, et al. all seem to suggest that you should stand up a separate service for this.
PostGraphile maintainer here; with the exception of recommending job queues for work that can/should be completed asynchronously (which is a good idea no matter what server you use), I do not recommend setting up a separate service for this kind of thing. PostGraphile is highly extensible; you can implement most things in JS (or TypeScript) natively in Node.
I find the idea of PostgREST very intriguing at first sight.
Someone around with experience running this in production? Did you find any major pain points when integrating in a client app (let's say React, Svelte or alike)?
The trouble I felt in the past with the concept of the DB as source of truth was managing business logic (e.g. in stored procedures) in a way that works efficiently for a distributed team.
It works pretty well with frontend frameworks, as long as you keep the interfaces simple and don't try to cheat and put business logic into the frontend.
IMO the source of truth for views and stored procedures should be a git repository, where developers change them in .sql files, and the CI tests them and updates them in the databases.
I tried it via Supabase (open source Firebase clone that uses PostgREST under the hood). At the time there was no join support which was incredibly limiting, and we ended up using Hasura instead.
Overall though, the user experience was generally very good and we still use it for some smaller standalone components on otherwise static websites (e.g. a mailing list signup form) where you don't need complex joins.
A running joke in one of the companies I know was that whenever backend developers messed up something, frontend developers would raise an idea to have it replaced by PostgREST :-)
There are Python clients available[1]. For example, with postgrest-py[2] you can do:

    from postgrest_py import PostgrestClient  # import path per the postgrest-py docs; may differ between versions

    async with PostgrestClient("http://<yourhost>.com") as client:
        r = await client.from_("countries").select("*").execute()
        countries = r.json()
One advantage is that you don't have to worry about handling raw PostgreSQL connections or pooling (PostgREST does that for you); you only make HTTP requests. And if you want more complex SQL you can create functions or views.
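For example (sticking with the hypothetical `countries` table above, and assuming it has a `name` column), a function like this gets exposed by PostgREST as POST /rpc/search_countries and is reachable from the client the same way:

    CREATE FUNCTION search_countries(q text)
    RETURNS SETOF countries
    LANGUAGE sql STABLE AS $$
      SELECT * FROM countries WHERE name ILIKE '%' || q || '%';
    $$;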
A similar project built with intention around this idea is Tarantool[1]. I never hear much about it, but if you're interested in compute close to your data, this is definitely something that would warrant consideration.
Can someone give a 2-3 sentence summary of what this is? Some kind of javascript API to Postgres? Not an ORM though? Is the idea to have client connections to it from user browsers or anything like that? That sounds scary.
PostgREST is a standalone web server that turns your PostgreSQL database directly into a RESTful API. The structural constraints and permissions in the database determine the API endpoints and operations.
Is the idea to hit the database directly from the web browser, without a server component? Do people actually do this?
I mean, I can see the appeal, but I feel that having a layer of indirection between your database and your API is useful to keep the database somewhat normalized while making your API endpoints logically useful for the problem domain.
But, I haven’t been in the web programming game in over 5 years and pretty much stuck with plain old rails back then, so I think I’m just out of touch in general.
This is not a Postgres extension. It is a standalone server application (built with Haskell), so you have 3 pieces: Postgres itself, PostgREST, and a frontend layer/app/site (it doesn't have to be a web browser, btw). The intention is to replace all of the dumbass CRUD APIs that we normally build for each project. With that capability you can most likely build 95% of what you need on the frontend side. This assumes you will put all of your extra business logic in the frontend. Whether or not that is the right thing to do is another discussion, but this project can be super helpful, especially for internal projects that are never on the public web. Or go for it, just triple check that you aren't opening yourself up to attacks.
So no, it doesn't expose your Postgres instance directly to the internet, that would be nuts. It's just a server application that essentially generates an endpoint for each table and enforces all the rules and constraints of the schema (as closely as possible) without you having to code all those things by hand. That's the short version; it has a ton of other great features.
I recommend you read the docs, they do a nice job of explaining what it does and where it fits.
This still sounds nuts. It lets the web client (controlled by the user, which in this context is pronounced "attacker") do any DB queries/updates that the backend application in a normal app stack could. Who wants to write their whole app in browser JS anyway? Even if you do want to do that, do you end up having to write the sensitive parts as stored procedures to stop the client from pwning the database?
As a solo dev/data person, an appeal of PostgREST is that I can effectively encode the database schema and the API schema in the same step. And operationally speaking, it's easy to figure out how the API is configured without having to look at application code or application configs -- just look to the database.
A simple translation layer between the database and the API (to address your worries about indirection) is to create a materialized view for each API endpoint. Need to expose a new column? Easy, just redefine the view. Isolation through schema usage can go really far here also.
Not sure if I would put a production system on PostgREST, but I'd love to hear others' experience if they put it in prod ever.
To be more specific, you can add that layer of indirection in your database. Have a schema called something like "api", and only let it contain views/functions/procedures that reference one or more "real" schemas underneath. Point PostgREST at that. Now you can change your data model without breaking your API contract, and can do so transactionally which is a benefit you don't get with other indirection layers.
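A minimal sketch of that layout (schema, table, and role names are just examples):

    CREATE SCHEMA api;

    -- the view is the public contract; the "real" table lives elsewhere
    CREATE VIEW api.products AS
      SELECT id, name, price
      FROM internal.products
      WHERE discontinued = false;

    GRANT USAGE  ON SCHEMA api TO web_anon;
    GRANT SELECT ON api.products TO web_anon;

    -- point PostgREST's schema setting at "api" so only this schema
    -- is ever exposed over HTTP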
This is a good approach even when you do have a traditional API middleware layer, but why have more than one?
Biggest downside is many folks are SQL-shy so the middleware devs may not want to or may not be able to transition in this way.
There are a couple of key features that PostgREST provides that I have not seen in other web frameworks. That's not to say others don't exist, but there are definitely some modeling and security anti-patterns that have unfortunately become deeply entrenched in many frameworks I've had to work with:
1. Your web framework logging into the database as a superuser.
This is huge: "kitchen-sink" frameworks tend to log into the database as a superuser, typically so they can "do migrations" from some SQL-degenerate ORM.
PostgREST (as documented) logs into the database as a user with NO privileges at all, and switches roles (with SET ROLE) to the role defined in the signed JWT token. If you never sign a token with a superuser role, users can never run queries as a superuser. You can also enforce host based security so that your web servers can't even SET ROLE even if they had a bogus signed token.
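Roughly the setup the PostgREST docs describe (the role names here are the conventional ones, not required):

    -- the connection role: no privileges of its own, can only switch roles
    CREATE ROLE authenticator LOGIN NOINHERIT NOSUPERUSER NOCREATEDB NOCREATEROLE;  -- plus a password in practice
    CREATE ROLE web_anon NOLOGIN;
    CREATE ROLE web_user NOLOGIN;

    GRANT web_anon TO authenticator;
    GRANT web_user TO authenticator;

    -- PostgREST connects as "authenticator" and runs
    -- SET ROLE web_user (or web_anon) based on the verified JWT "role" claim.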
2. Buffering full database query responses as garbage-collected "objects" and then serializing them to JSON.
Another problem proxy frameworks have is that they compose a SQL query, run it (typically as a superuser, sigh), then buffer the entire response in memory, typically as "objects" that need to be immediately garbage collected, and then render it into JSON in yet another buffer! Only then is the response streamed to the browser.
Postgres speaks JSON, and can produce rows of data as JSON natively. This data can be streamed directly to the browser before the query completes, which avoids a huge cost: Django workers typically need several gigabytes of RAM, proportional to the largest possible query response your database can produce. PostgREST does no buffering, parsing, or rendering; it just streams the JSON response directly from the database. Processes typically need only one or two hundred megabytes of RAM at most.
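For instance (hypothetical table), the database itself can produce the exact JSON body the client receives:

    SELECT coalesce(json_agg(p), '[]'::json)
    FROM (SELECT id, name, price FROM products) AS p;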
3. Tables and views are elements of an object-oriented type system, not just storage mechanisms.
Postgres is an Object-Oriented Relational Database Management System (OORDBMS). Tables are not just places you stick stuff, they are types and rows of data are instances of those types. Views are types composed of sub-types, which can in turn be views or tables. Postgres rewrites a query until all the views are expanded, like macros, until it gets one big query you never have to ever see.
PostgREST leverages all of this power of the database, you can model in the language that is the best for modeling data, SQL.
I'm using sandman2[0] as a wrapper for psql and mysql databases. How is it different, and are there any features/differentiating factors I should consider for my future wrappers?
Does sandman support joining, filtering to fetch only some columns, full-text search, authentication to use different roles based on a token, or defining callable functions in SQL?
Looking at the sandman docs those were some features that I've used a lot in postgrest but weren't mentioned in sandman.
PostgREST is used inside every Supabase project. We have >50K projects now[0], some of them making (tens of) millions of API requests through PostgREST daily.
It's a fantastic piece of tech, especially when coupled with PostgreSQL Row Level Security. We have some benchmarks here in case you're concerned about performance: https://github.com/supabase/benchmarks/issues/2
[0] on the hosted platform; we don't add telemetry to self-hosted installs.
Sorry to hijack the thread a bit, but yesterday I watched a Supabase tutorial on row level security and I'm very intrigued to try out more. What I couldn't figure out with a quick search is what the best practices are for knowing the current user's permissions before hitting a permission denied error, e.g. to show/hide certain actions in the frontend. Do I have to duplicate the logic somewhere else, or am I missing something that lets me get them from Postgres/Supabase easily?
I see what you mean - yes, these rules would live separately from RLS. However, we are also about to add this functionality to the Supabase Dashboard, and we will open source our solution to every Supabase project. (We will come out with a very general RBAC/ABAC solution.)
You don't write code directly in the DB, just as you don't write code directly in a live Python web app. Even though you can edit production files live, you don't do this. Instead you make changes in version control and deploy. The same goes for the DB: change Flyway scripts, version control them, and do a deployment.
But it would be much better if Postgres (or any SQL DBMS) were a git remote where you push to deploy the triggers and stored procedures.
There are hundreds of "schema upgraders", all based on their own flavor of XML, JSON, or text files and their associated naming conventions, precisely because the canonical SQL way of doing it is limping.
Why add it as a concern to a database engine when it's down to the individual use case? I mean, there's hundreds of database migration tools out there, all you need to do - as a developer, that's your job - is to add some automation to your git repository and database server's master to run a migration. Set up a cron job to pull every X minutes, set up a post-merge hook to run those migrations. Simples. Just use straight SQL files for these migrations.
I'd rather say that there are "hundreds of schema upgraders" because there are hundreds of ORMs instead: it's usually a database migration tool (or two) per ORM.
SQL DBMS come with an obvious text-only schema management tool called SQL-in-text-files. If you don't need anything esoteric, number them in order of application (001-base.sql, 002-add-customer-model.sql, 003-trigger-on-name-update.sql...), and you are good to go. Any VCS can deal with those properly.
I've used plenty of saner version control systems than Git, and I wouldn't want a database to restrict me to it (even though I am usually "forced" to use Git nowadays).
> If you don't need anything esoteric, number them in order of application (001-base.sql, 002-add-customer-model.sql, 003-trigger-on-name-update.sql...)
I would just suggest people use database migration tools from the start. This "autoincrement" method breaks down easily when there are multiple developers working on the same project.
Sure, though the two obvious problems are solved easily: make developers "reserve" a number ahead of time, and keep a list of applied revisions in the DB table.
I was on a team doing this 15 years ago, and we did some pretty hardcore DB migrations on a large, complex database without any fuss with >30 developers working on it at the same time.
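The bookkeeping part is tiny, something like this (column names are arbitrary), with each numbered .sql file inserting its own row in the same transaction so a migration is either fully applied and recorded or not at all:

    CREATE TABLE IF NOT EXISTS schema_migrations (
      version    int PRIMARY KEY,     -- the reserved number, e.g. 003
      filename   text NOT NULL,
      applied_at timestamptz NOT NULL DEFAULT now()
    );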
One thought I had is to put the schema in a flat file and put the migration on the commits (with mechanical checking that the two agree). I may've prototyped it at some point, I don't recall - I'm pretty sure I never used it in anger.
I'm not sure why you're getting downvoted, this is a fine question.
One approach is to use Point In Time Recovery. When you run your migration, you take note of the LSN (the point in the WAL stream) just before you make your changes. Then you can roll back to the point in time right before you applied the migration.
Note that you will need some other mechanism for restoring any data you added after the migration point.
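A rough sketch of the mechanics (PostgreSQL 10+ names):

    -- note the WAL position right before applying the migration
    SELECT pg_current_wal_lsn();   -- e.g. 0/16B3748

    -- to roll back, restore a base backup with
    --   recovery_target_lsn = '0/16B3748'
    -- in the recovery settings and replay WAL up to that point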
While general best practice is to version your changes in git and deploy migrations with something like Flyway, I've also run into the problem of needing some form of 'audit log' to make sure no DDL changes are being made in prod (and if they are, to move them over into version control).
I have a migration something like this one: https://www.alibabacloud.com/help/doc-detail/169290.html which uses a trigger on `ddl_command_end` in PG to copy the query that made the DDL change from `pg_stat_activity` into a new audit schema. It can definitely help with maintenance and finding out what happened when.
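A hedged sketch of what such an audit trigger can look like (schema/table names made up; creating event triggers requires superuser):

    CREATE SCHEMA IF NOT EXISTS audit;

    CREATE TABLE IF NOT EXISTS audit.ddl_log (
      executed_at timestamptz NOT NULL DEFAULT now(),
      username    text        NOT NULL DEFAULT current_user,
      command_tag text        NOT NULL,
      query       text        NOT NULL
    );

    CREATE OR REPLACE FUNCTION audit.log_ddl() RETURNS event_trigger
    LANGUAGE plpgsql AS $$
    BEGIN
      INSERT INTO audit.ddl_log (command_tag, query)
      VALUES (tg_tag, current_query());
    END $$;

    CREATE EVENT TRIGGER ddl_audit ON ddl_command_end
      EXECUTE FUNCTION audit.log_ddl();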
./pg_schema_dump.sh breaks down the schema into an entity-per-file structure at ./sql/schema, while ./db_init.sh knows how to create a fresh database schema from this dump.
the per-file breakdown allows you to nicely version the schema in git.
One way I've approached this in the past is to dump the database schema after each schema modification and use version control on those dumps. There's a Python project out there (apologies: I can't recall its name right now) that leverages pg_dump and its options to make this easier. Not perfect, but usable. And better than nothing!
Ha - I find unsubstantiated, parroted mantras like this to be a team lead smell :)
In the same way that Agile development has become a mindless cult of rituals that has little to nothing left of the original manifesto's vision, code reviews peppered with thoughtless "code smell" comments have become the bane of development.
Indeed, on a code review that kind of comment is bad. We have SonarQube for that kind of nonsense.
The reviewer should explain why, and ideally show how it could be done better.
Maybe. An app I inherited used this as its interface to the database. The main issue was that it wasn't really sophisticated enough to do the queries they wanted, so they wrote a bunch of stored procedures (in JavaScript?!) and used it to call them. This fairly rapidly led to the database getting overloaded.
I guess it's maybe a smell because you're limiting yourself from SQL, which is very powerful, down to the PostgREST set of abstractions. Maybe that's okay for your app now, and maybe it always will be. But in this case, it wasn't.
What does PostgREST take away from you compared to SQL? Sure, it has limitations in what it can API-ify, but couldn't you stick anything more exotic in a function or procedure?
It's not something I'd grab quickly, but at the same time, it looks great for doing 'pure' CRUD applications. You'll probably need serious database administration skills though, I've yet to see anyone do anything with database account and permission systems that isn't straight root access from an application.
I investigated this and a few others for quickly spinning up a back office app, and settled on postgraphile. The endpoint that is exposed from the service is authenticated and a role is then set on the request object which postgraphile then uses to set the role for the database transaction. With appropriate roles set in the database, my back office users now have fine-grained permissions for reading/writing only to tables that I want them to.
It seems to me that it's the main limitation of this solution - authorization is limited to table scope so if you can update any row, you can update all rows for all users. Same for data querying.
That's not true; you can use row level security (RLS) to control access (both reading and writing) on a per-row basis. You can think of it as similar to an implicit "where" clause that automatically gets added to all requests.
Not the author of the post, but I guess the author meant that using JavaScript to communicate directly with the PostgreSQL database via PostgREST is an "architectural code smell": in this case the database can easily be hacked, and everybody could write to / remove anything from such an unprotected database...
Next time this instinct strikes you try doing 30 seconds of research. I’ve not used it but have briefly read about PostgREST and every single complaint you’ve listed here is inaccurate.
here's a short video explaining how this project can let a front-end developer unleash PostgreSQL's power without having to write backend code.
https://drive.google.com/file/d/1ncp00Mb7L3TO9TIRgNEF6pRdt-b...