> caches SHOULD first normalize request content to remove semantically insignificant differences, thereby improving cache efficiency
This feels like a bad idea, since (a) different caches will support different content types and normalize them in different ways, leading to unexpected changes in behavior and (b) some servers may behave differently depending on something that a cache considers to be a "semantically insignificant" distinction. I'm not sure, in other words, if I trust caches to get this right.
It seems like it might be better to require clients to submit requests in a "pre-canonicalized" form, or to have caches allow this behavior but disable it by default.
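For what it's worth, a client-side pre-canonicalization step is easy to sketch, assuming a JSON query body (names are hypothetical, and real canonicalization would need to handle the content type in question):

```python
import json

def canonicalize(body: str) -> str:
    """Normalize a JSON query body: parse, then re-serialize with
    sorted keys and no insignificant whitespace, so bodies that differ
    only cosmetically produce the same bytes (and thus the same cache key)."""
    return json.dumps(json.loads(body), sort_keys=True, separators=(",", ":"))

# Two bodies that differ only in whitespace and key order...
a = canonicalize('{"limit": 10, "q": "foo"}')
b = canonicalize('{"q":"foo","limit":10}')
assert a == b  # ...canonicalize to the same form
```

If clients did this before sending, caches could key on the raw body bytes and skip normalization entirely.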
Future standards or non-standard systems MAY use different encodings; conformant implementations MUST NOT alter the sequence of bytes. They MAY perform a validation check and add additional headers.
My OCD is really happy that the symmetry is restored. I always felt that GET with the optional-but-actually-forbidden request body stands out. Once we transition from GET to QUERY, all HTTP transactions will be header+body in one direction, followed by header+body in the other.
> QUERY requests are both safe and idempotent with regards to the resource identified by the request URI. That is, QUERY requests do not alter the state of the targeted resource. However, while processing a QUERY request, a server can be expected to allocate computing and memory resources or even create additional HTTP resources through which the response can be retrieved.
The possible creation of extra HTTP resources (response resources?) seems to me contrary to idempotency. That seems more like the territory of POST.
If two identical QUERY requests might produce different response resources, how to square that with the fact that QUERY will be cacheable?
> The possible creation of extra HTTP resources (response resources?) seems to me contrary to idempotency.
If two repetitions of a QUERY request create the same extra HTTP resource(s), then it can be idempotent.
Idempotent means you can't tell the difference between 1 or N requests, not that you can't tell the difference between 0 and 1. Think about PUT, which is also idempotent.
GET is (and all methods, including all safe and idempotent methods, are) allowed to have side effects, per the spec. Safe and idempotent are not mathematical constructs as defined in HTTP, they are more “business” constructs.
I know what you mean. It feels like we're missing an idea of scope for resources. If there was some kind of transaction scope or session scope or something, then a QUERY could create resources within that scope, so we could know that in the long run, it has no side effects. But that would be antithetical to the idea of statelessness perhaps.
Or maybe we just need distributed garbage collection for URLs.
> GET is allowed to have side effects, just not beyond the first invocation of a given request.
GET can have side effects, and there is no difference between the first and subsequent invocations (because it is safe as well as idempotent). Were it idempotent but not safe, it could have side effects that the client was accountable for on the first request, but no different ones of that kind for subsequent uses.
The way I look at it is that the system must continue to meet its requirements (whatever they might be) whether it gets one GET request or many in response to a single action within the user agent (clicking a link, submitting a form, script making a request, etc.). In general, logging two requests instead of one does not violate any requirements and in fact logging every request, even duplicates, is the expected behavior. Adding the same item to a list twice in response to a single UI interaction, on the other hand, would not give the desired effect.
> The possible creation of extra HTTP resources (response resources?) seems to me contrary to idempotency.
A GET request might create additional (or modify existing) resources, say if the API exposed its own log via HTTP.
Both safe and idempotent are less expensive than one might naively think in the HTTP spec (which is good, because the naive understanding, while aesthetically seductive, isn't very practical at all.)
Some quotes from the relevant bits of RFC 7231:
“This definition of safe methods does not prevent an implementation from including behavior that is potentially harmful, that is not entirely read-only, or that causes side effects while invoking a safe method. What is important, however, is that the client did not request that additional behavior and cannot be held accountable for it.”
“The purpose of distinguishing between safe and unsafe methods is to allow automated retrieval processes (spiders) and cache performance optimization (pre-fetching) to work without fear of causing harm. In addition, it allows a user agent to apply appropriate constraints on the automated use of unsafe methods when processing potentially untrusted content.”
“Like the definition of safe, the idempotent property only applies to what has been requested by the user; a server is free to log each request separately, retain a revision control history, or implement other non-idempotent side effects for each idempotent request.”
“Idempotent methods are distinguished because the request can be repeated automatically if a communication failure occurs before the client is able to read the server's response.”
> In HTTP idempotent is where the state of the server remains unchanged
No, it's not. That's closer to “safe” than “idempotent” (safe also implies idempotent, but not the other way around), but even then it is not quite right, because even safe methods are allowed to have side effects; there is guidance about the kind and impact of side effects they shouldn't have.
So that means that, without a cache, repeating a QUERY might create two response resources but, with a cache, only one will be created. I find that odd. My understanding of HTTP idempotency is that it's more of a "whole-server" concept (excepting perhaps things like creation of log entries and metrics). Always creating a new resource for each request seems contrary to that.
A way to square creation of response resources with idempotency could be: the second identical QUERY that arrives should always reuse the result resource created by the first QUERY.
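A minimal sketch of that reuse strategy, assuming the server derives the result resource's name from a hash of the (canonicalized) request body, so repeating an identical QUERY reuses the same resource instead of minting a new one (all names hypothetical):

```python
import hashlib

class QueryHandler:
    """Sketch: idempotent QUERY handling via content-addressed results."""

    def __init__(self):
        self.results = {}  # result-resource store: location -> result

    def handle_query(self, body: bytes) -> str:
        # Content-address the result resource by the request body.
        key = hashlib.sha256(body).hexdigest()[:16]
        location = f"/results/q-{key}"
        if location not in self.results:
            self.results[location] = self.run(body)  # compute at most once
        return location  # e.g. returned via 303 See Other

    def run(self, body: bytes):
        return f"results for {body!r}"  # stand-in for real query execution

h = QueryHandler()
first = h.handle_query(b"select a from foo")
second = h.handle_query(b"select a from foo")
assert first == second          # identical QUERYs share one resource
assert len(h.results) == 1      # N requests, the effect of 1
```

With this scheme, N identical QUERYs have exactly the effect of one, which is all HTTP idempotency asks for.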
If I QUERY the current price of a stock, and then someone else sends an identical QUERY ten seconds later, they might get a different result. This is not because QUERY isn't idempotent.
I think that, when talking about idempotency, there's the implicit assumption that the "rest of the world" stays the same while the sequence of operations is performed.
rfc2616 says:
> Methods can also have the property of "idempotence" in that (aside from error or expiration issues) the side-effects of N > 0 identical requests is the same as for a single request.
Idempotency is not about "you get the same result", it's about the effects of your http request on the server. Notice that the definition you quoted is in terms of side-effects, not results.
If a request changes the state of the server and another identical request changes the state of the server in a different way, it's not idempotent.
If a request doesn't change the state of the server at all it is idempotent, even if subsequent requests might get different responses (e.g. the stock quote example in my previous post).
If a request changes the state of the server but repeated identical requests don't have any different effect it is also idempotent. For example, DELETE is idempotent because DELETE-ing something N times is the same as deleting it one time.
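A toy illustration of that DELETE property (hypothetical in-memory store): the end state after N deletes is the same as after one, even though the responses may differ.

```python
store = {"/items/1": "payload"}  # hypothetical resource store

def delete(path: str) -> int:
    """Idempotent DELETE: state after N deletes equals state after one.
    (Status codes may differ - 204 then 404 - but the effect is the same.)"""
    existed = store.pop(path, None) is not None
    return 204 if existed else 404

codes = [delete("/items/1") for _ in range(3)]
assert store == {}               # same end state as a single DELETE
assert codes == [204, 404, 404]  # responses differ; effects don't
```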
I think this is pointing to the problem with your definition of 'idempotent'. Idempotency simply means that any number of additional identical requests will have the same effect on the state of the resource, not that they will have no effect. (And by 'have the same effect', we mean 'produce the same state', not 'alter state in the same way' - effects are algebraic projections.)
That's why it's called idempotent - 'doing the same' - rather than impotent.
As I read it, I think the idea there is to allow a pattern where the resulting resource refers to other resources that somehow encode the contents of the QUERY request body in their URL (or even a redirect to such a resource). For example, the result of QUERY is a page with an HTML table of the data, which also includes a server-side-rendered chart of the same data as an external image.
[Edit: the return of a redirect to a URL that somehow encodes the query is even given as an example in section 4.2]
> idempotent with regards to the resource identified by the request URI
That means that a QUERY request can change the state of the server, for example by creating new resources; there's exactly one resource it's not allowed to change.
That has always been the case ... requests get logged, and if the server exposes its access logs over HTTP, that's one thing for which a request won't be idempotent
Idempotent etc in the HTTP specs has always been more or less an attempt at a promise to the client "you should be able to repeat this request if you're not sure about success/failure without anyone claiming to implement HTTP being able to throw the book at you".
A resource is defined by a path, so if you have a `QUERY /documents` or `QUERY /albums` endpoint, the resource is all documents or albums that you are searching across, so it cannot add one of those items (like `POST /album`). It is possible that this could affect some other resource (e.g. an audit trail), which would mean that a `QUERY /logs/audit` endpoint must not add an audit log entry per the idempotent requirement.
Hum... You are complaining about a request having the side effect that a server may fork another process to answer it? There's not really much anybody can do about that.
It worked fine for him because he used curl, which allows GET with a body. But I was using Paw (similar to Postman), which refused to send it. I mentioned the issue to him, to which the reply was along the lines of "it's a non-issue, just use curl". I kid you not, 1 week after this coworker left for another job I fixed the service to accept POST requests.
If QUERY was around I'm sure I could've made a stronger case to fix it sooner.
Elasticsearch also encourages GET with body. But a request payload is undefined, according to the RFC:
A payload within a GET request message has no defined semantics;
sending a payload body on a GET request might cause some existing
implementations to reject the request.
I'm very happy about this proposal. The only sad thing is that it has come so late, after so many tools and protocols (e.g. GraphQL) already abuse POSTs for this use case.
I agree. It takes me back to arguments I had with my PM when I worked for a small SaaS close to 10 years ago. I had to use POST for a query API because of the limitations around GET & URL encoding of the parameters for the exact reasons outlined in TFA. She insisted it be a GET until I showed real, existing client queries that couldn't be handled. Only then did she relent. Same PM also insisted I send results of queries as a list of objects in JSON, instead of a more compact tabular format, because tables aren't REST-y. I lost that battle, and the serialized results of queries were an order of magnitude larger than they needed to be...
> She insisted it be a GET until I showed real, existing client queries that couldn't be handled... Same PM also insisted I send results of queries as a list of objects in JSON, instead of a more compact tabular format, because tables aren't REST-y.
I think I'm on the side of the PM with this one on both counts. You sound like someone who really cares about efficiency, performance, and edge cases -- a proper engineer. But PMs are supposed to bring us down to earth and say that simplicity and maintainability are more important than saving bytes and to not waste time fixing things that aren't broken.
In a past life I spent so much effort optimizing our stack to lower our AWS bill until a PM sat me down with the company's finances and showed me the teeeeny little bar that was our cloud expenses and then legitimately 20x taller bar that was salaries and basically said that spending money to buy back my or my team's time was more important.
This is cool and all, but why not just expand the scope of GET requests in newer HTTP standards? Maybe have an X-GET-QUERY header to indicate the type of GET request? The problem I see with a new method is that it isn't just webservers that need to support it, it is also webapps. Ideally this would be transparent to the webapp (which would just see really big arrays of GET params). The user-agents (browsers) would ideally support this transparently, whereas with a new method the JS/HTML would need to explicitly support it.
Creators of new parameters to be used in the context of application
protocols:
1. SHOULD assume that all parameters they create might become
standardized, public, commonly deployed, or usable across
multiple implementations.
2. SHOULD employ meaningful parameter names that they have reason to
believe are currently unused.
3. SHOULD NOT prefix their parameter names with "X-" or similar
constructs.
Note: If the relevant parameter name space has conventions about
associating parameter names with those who create them, a parameter
name could incorporate the organization's name or primary domain name
(see Appendix B for examples).
One of the issues that QUERY solves is that POST is overloaded and is being used for purposes beyond its intended responsibility. Shifting that overloading to GET feels to me like just another hacky approach. I prefer the well-defined, single responsibility that QUERY brings and restores to POST.
Once browsers offer support, the web app+ server would need to support it. Both of those are in control of the devs. The real problem lies in getting infrastructure teams to update the expensive F5 load balancers, and PA/CP/TP firewalls to process the requests. Those aren’t in control of the devs (unless they’re operating together well as a team)
With an official read-only header on POST instead, all middleware and proxies would have automatic support, and this could be adopted in months instead of years or decades...
Because unknown headers are already passed through safely in existing implementations, whereas unknown methods are handled in a variety of different ways
> The QUERY method provides a solution that spans the gap between the use of GET and POST. As with POST, the input to the query operation is passed along within the payload of the request rather than as part of the request URI. Unlike POST, however, the method is explicitly safe and idempotent, allowing functions like caching and automatic retries to operate.
Is this really worth a change to every HTTP client library out there to support this? The limited applications that really need this can easily use POST and document their own semantics around this.
If anything the trend with GraphQL is to ignore HTTP verbs outright because they are limited and inexpressive beyond simple CRUD tasks.
Definitely. The HTTP spec has a gap that's being filled with a hack, albeit a widely accepted and implemented one. QUERY removes ambiguity, aids self-documentation of APIs, and improves caching.
It also seems trivial to fallback to POST for backwards compatibility, no? I'm not sure it needs every lib to be updated before devs can gain value from this.
A payload within a GET request message has no defined semantics;
sending a payload body on a GET request might cause some existing
implementations to reject the request.
Right. The HTTP RFCs have been backing off gently from the initial position that implementations should not send bodies with GETs and that the semantics of the GET request were defined purely in the request URI.
But presumably no-one is brave/foolhardy enough actually to redefine GET as having a semantic body because a bazillion different implementations (clients, servers and middle boxes) probably become non-compliant.
> redefine GET as having a semantic body because a bazillion different implementations (clients, servers and middle boxes) probably become non-compliant.

So what, actually? Apps that didn't use a GET body will not care anyway, and apps that will use an HTTP GET body will be tested anyway. So, unless somebody downgrades the HTTP server, what could be the problem?
Aside from how much easier it is to identify whether a component supports QUERY than which forms of GET it supports: GET and QUERY (like PUT and DELETE) have similar guarantees but different meanings, and are sometimes (but not always) useful against the same resource for different purposes. OPTIONS lets you tell the availability of each if they are different methods, but not whether one is GET without a body and the other is GET with a body.
> Implementations are free to use any format they wish on both the request and response.
The samples should include some non-SQL, completely made up ones, as I think a lot of people are going to fixate on the SQL-like syntax and its associated problems.
Doesn’t prescribe one, no, but when it is over HTTP, it’d be perfectly reasonable to have it accept QUERY for non-mutating requests, like it can currently use GET or POST.
There's no way to do client-side caching with this, which seems like a fatal omission — in any given situation where you would consider using QUERY, it'll almost always be more efficient to put the query in the parameters of a GET request.
I am unable to understand - what is the difference between GET and QUERY? Just that in QUERY you can send parameters in the request body? Do we need a new method for that?
Yes, because there are various assumptions about GET that won't fit if GET can suddenly contain a request body. For example, existing caching servers may continue to cache content based only on the URL and headers, ignoring the request body entirely, producing bad results. Additionally, there may be some more subtle problems related to the Content-Length header, which is supposed to NEVER be sent for GET, but would be required for QUERY (a request with a body needs either a Content-Length header or chunked Transfer-Encoding, while requests that can't contain a body shouldn't send either).
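To make the wire-format point concrete, here's roughly what a QUERY request might look like on the wire (a sketch; the Content-Type and query syntax are made up — per the draft, implementations can use any format they wish):

```python
def build_query_request(host: str, path: str, body: bytes) -> bytes:
    """Assemble a raw HTTP/1.1 QUERY request. Unlike GET, the body is
    semantically significant, so a Content-Length (or chunked
    Transfer-Encoding) is needed to frame it."""
    head = (
        f"QUERY {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        f"Content-Type: application/sql\r\n"   # hypothetical media type
        f"Content-Length: {len(body)}\r\n"
        f"\r\n"
    )
    return head.encode("ascii") + body

req = build_query_request("example.org", "/contacts", b"select a from foo")
assert req.startswith(b"QUERY /contacts HTTP/1.1\r\n")
assert b"Content-Length: 17\r\n" in req
```

A GET request would carry the same information in the request-target's query string instead, with no body and no Content-Length at all.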
Because then you don't know if something that supports GET supports the new broader definition or the old definition, whereas whether it does or does not support QUERY is more clear.
Also because a different method means OPTIONS tells you information about what is supported, while overloading GET would not.
And because “same guarantees” doesn't mean “means the same thing”; PUT and DELETE have the same guarantees (idempotent but not safe), but we don't use PUT with no body for DELETE.
I'm fairly certain that (technically) nothing's stopping you from doing so. However, there are so many libraries/clients/etc... that do not allow it that it would be almost impossible to patch them all. Adding a new method and having libraries add it and support it properly would be better.
The HTTP Query method is problematic: every request to a web server is by definition a query, so it is at a minimum poorly named. Second, most queries are not idempotent and the return value can and will change. In other words, YAGNI.
I don’t agree that "every request to a web server is by definition a query", similar to how not every SQL statement is a query. In terms of command-query separation, commands may return information about the execution of the command; the fact that they return something doesn’t make them a query. For example, an SQL UPDATE statement may return how many rows were updated, or that some error occurred; that doesn’t make it a query.
I didn't define request, I gave examples of possible kinds of requests. There are requests for information, and there are requests for action (and there may potentially be still other kinds of requests — it's an open enumeration). My point is that queries are requests for information, and not requests for action. Therefore "query" is a proper subset of "request", and thus not every request is a query.
HTTP is not just for the web. In fact, the vast majority of HTTP traffic doesn't involve the browser at all.
The examples are realistic and useful. E.g., Clickhouse uses POST methods for queries, and a ridiculous `&readonly=2` parameter to differentiate modifying queries from readonly SELECT queries.
> QUERY requests are both safe and idempotent with regards to the resource identified by the request URI.
Is that really what you want from a query operation? I read 'idempotent' as implying that result sets don't change over time, which would be surprising behavior for queries for most database-like things.
It's probably also worth mentioning that SQL's SELECT isn't idempotent in the way HTTP means it, because of the existence of session state, pessimistic locking, and the requirements of higher isolation levels. It would be useful for an RFC to define 'idempotent' in a way that clearly addressed these issues (and, for that matter, the larger topic of sessions/transactions) more clearly.
> When doing so, caches SHOULD first normalize request content to remove semantically insignificant differences, thereby improving cache efficiency
Unfortunately, again when you look at SQL by comparison, queries are not purely expressions of what to return. Practically, they also encode how to compute the query (either explicitly through hints, or implicitly through things like join order). These behaviors are weird, tricky, and change version-to-version.
> The QUERY method is subject to the same general security considerations as all HTTP methods as described in
As another commenter said, this is quite incomplete. Query parameter injection, DoS by locking, DoS by exploiting work the database needs to do to ensure isolation, DoS by extremely expensive query, etc.
> 4.2. Simple QUERY with indirect response (303 See Other)
At least the examples here are naive - most applications don't want query result sets to be easily accessible to others. The semantics of authn and authz need to be really crisp here to make sure that attackers can't access the location of other queries' result sets purely by guessing.
At least a "SHOULD use auth" or "SHOULD have large, unguessable, names" would be valuable here.
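E.g., a server could mint unguessable result locations along these lines (a sketch using Python's secrets module; the path scheme is hypothetical, and you'd still want authn/authz on top rather than relying on the name alone):

```python
import secrets

def result_location() -> str:
    """Mint an unguessable URL for a query result resource, so locations
    can't be enumerated; 32 random bytes ~ 256 bits of entropy."""
    return f"/results/{secrets.token_urlsafe(32)}"

a, b = result_location(), result_location()
assert a != b                          # fresh name each time
assert a.startswith("/results/")
```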
But if you GET /posts/123 and then do it again, and in between, the author updated the post, you’d expect to get the latest version of the post, no? That doesn’t make it non-idempotent, because your GET requests did not change the state at all.
I'm pretty sure timwis and zinekeller in this thread (and detaro in the sibling thread) are all saying the same thing. Idempotency implies the request in question does not change the state, not that the state would not have changed because of other operations in the meantime. GET and QUERY are meant to be idempotent but whether they really are in practice depends on how they've been implemented.
> But if you GET /posts/123 and then do it again, and in between, the author updated the post, you’d expect to get the latest version of the post, no?
Not necessarily - plenty of systems offer no such guarantees, and the Web is by design eventually consistent [0]. This is what content expiration and various other cache control mechanisms are for - it's not always so important to get the latest version of a document. For example, the HN logo or index.html can probably be safely cached for days, since they are very unlikely to change, and even if they do, it's unlikely to have a major problem if someone only sees the new version after a few days.
[0] Note that, at the extreme, due to special relativity, there is no absolute notion of "latest version" on the scale of geographically distributed computers: it's physically impossible to say if a request made in China to a server in the USA happened before or after a change on the server, if they happened close enough together - order of tens of milliseconds, an eternity in compute time.
Idempotency only refers to state changes from subsequent invocations of the call.
A command to add a user to the set of users that have upvoted a post would be idempotent. Because you can run it 20x and only the first call affects anything. A command to increase the upvote count for a comment by +1 would not be idempotent.
But idempotency is relevant because of two things:
1) is it safe to automatically retry the request? - this meshes well with what you're saying
2) is it safe to return a cached version of the response, instead of sending the request again? Idempotence in your sense is necessary but not sufficient for this case - hence the various content expiration and If-Modified-Since etc. headers.
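The set-vs-counter distinction from the upvote example can be shown in a few lines (a hypothetical sketch):

```python
upvoters = set()  # idempotent design: record WHO has upvoted
count = 0         # non-idempotent design: just increment a counter

def upvote_set(user: str) -> None:
    upvoters.add(user)  # repeating has no further effect

def upvote_count() -> None:
    global count
    count += 1          # every repetition changes state again

# Simulate the same request being retried 20 times:
for _ in range(20):
    upvote_set("alice")
    upvote_count()

assert len(upvoters) == 1  # 20 calls, same end state as 1 call
assert count == 20         # 20 calls, 20 distinct state changes
```

Only the set-based version is safe to retry automatically after a communication failure, which is exactly why the spec cares about the distinction.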
> I read 'idempotent' as implying that result sets don't change over time, which would be surprising behavior for queries for most database-like things.
That's not what idempotent means in HTTP.
> As another commenter said, this is quite incomplete. Query parameter injection, DoS by locking, DoS by exploiting work the database needs to do to ensure isolation, DoS by extremely expensive query, etc.
Is application-dependent and applies to all other HTTP methods too.
> A sequence is idempotent if a single execution of the entire sequence always yields a result that is not changed by reexecution of all, or part, of that sequence.
Which isn't, because of isolation, true in general of database queries. Obviously this is in context of RFC2616 saying that sequences of idempotent HTTP operations may not be idempotent in themselves, but that definition seems very incomplete in the context of database queries.
> Is application-dependent and applies to all other HTTP methods too.
Sure. But I don't think that's a good argument in the modern world. Over the last 22 years, we've learned a lot about the security concerns of running secure systems, and it seems reasonable to include those concerns in a section labelled "security considerations". SQL injection is a classic security bug, and should be a key concern of any reasonable new standard for sending queries between systems.
A full security section should probably also mention cache timing side-channels, locking-related covert channels, and other similar concerns that come up when you increase the semantic power of HTTP. It's not that POST doesn't have these concerns, it's that we've learned in the last two decades that they are real problems for many kinds of real systems.
HTTP idempotence is only concerned with effects of the request, not the result returned.
RFC7231:
> A request method is considered "idempotent" if the intended effect on the server of multiple identical requests with that method is the same as the effect for a single such request.
note the on the server.
(or old specs, 2616: Methods can also have the property of "idempotence" in that (aside from error or expiration issues) the side-effects of N > 0 identical requests is the same as for a single request. - again, side-effects, not responses)
How does a read-only database query being repeated cause a change in the database?
You're right, and I was fuzzy about what I meant. I didn't mean (although I wasn't clear) that QUERY would have to return the whole result set over time - clearly that's beyond the scope of HTTP's definition of idempotency.
However, because of the existence of isolation and locking concerns in databases, even fairly simple queries are not idempotent. RFC2616 goes to some effort to (fuzzily, unfortunately) talk about sequences of operations, which would be useful here.
It seems as if the idea behind this proposal is to help out database folks. If so, that is misguided. POST is a better implementation than QUERY (or GET) at least for SQL databases. Here's why.
In SQL this is a query:
SELECT a, b, c FROM foo LIMIT 1
But this is also a "query" in many if not most connectivity APIs.
INSERT INTO foo VALUES (1, 2, 3)
Most client libraries don't know and don't care about the content of the query. It's the database's job to parse it and do the right thing. The difference between the above queries is that the first one returns a result set and the second returns an update count. Here's a simple example using Python and the clickhouse-driver library.
# An UPDATE to the database
client.execute('INSERT INTO iris SELECT * FROM another_iris_table')
# A harmless "query"
result = client.execute('SELECT COUNT(*) FROM iris')
print(result)
For this to work you need to use something underneath that is generic and works regardless of output. POST does this already. The clickhouse-driver does not use the HTTP protocol, though other ClickHouse drivers do. I'm just using it as an example of why you need a protocol that can handle any type of SQL "query" the same way on the wire. Otherwise the client will have to have a SQL parser to figure out which one to use. (Some clients actually do that but they are a very small minority.)
IMO this helps a lot more people than just database folks. Any web application which implements a fairly granular search/filtering mechanism for its resources may run into the URL character limit with GETs. QUERY sounds much more appropriate here than what most applications do today (i.e. abuse POSTs).
In that case it's not helpful to tie it to SQL. As my examples demonstrated, it's pretty useless for SQL database connectivity. If you are looking for a general query mechanism it would make more sense to have something that looks like GET with a body.
ClickHouse also supports GET as a verb. In addition to URL length issues the query needs to be URL-encoded which makes it difficult to read and debug.
p.s., It's interesting to see my post downvoted. It's more productive to show why it's wrong. I've worked on DBMS connectivity for over 30 years.
> The non-normative examples in this section make use of a simple, hypothetical plain-text based query syntax based on SQL with results returned as comma-separated values. This is done for illustration purposes only. Implementations are free to use any format they wish on both the request and response.
The examples in section 4 are just that, examples. They are not intended to be the only format a query may take. The issue with POST that QUERY solves is lack of an idempotency constraint.