
The only problem with this approach that I have encountered is that it requires defining fairly arbitrary limits that are either so small that some user will inevitably run into them or so large that you're not getting as much protection as you think.

NB I don't know what the answer to this is, but pretty much any time a system I have been involved with has contained a "reasonable" limit, people have wanted more than that pretty quickly!




The problem you describe is not actually specific to this approach; the same would apply to any Size check, or to any limit in general.

In fact, I think the problem is less relevant to a Quantity check than to a Size check, since an order of magnitude of headroom above normal usage goes a long way.

For example, do you know of anyone who receives 10,000 attachments per email? That limit would be far beyond any reasonable usage and yet provide decent protection at the same time.
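As a rough illustration of the kind of Quantity check I mean (purely a sketch; the cap and the use of Python's stdlib email API are my own choices, not anything prescribed):

    # Illustrative Quantity check: give up once a hard cap on MIME parts
    # is exceeded, long before memory or CPU become a problem.
    MAX_ATTACHMENTS = 10_000  # far above any reasonable usage

    def count_attachments(message):  # an email.message.Message
        count = 0
        for part in message.walk():
            if part.get_content_disposition() == "attachment":
                count += 1
                if count > MAX_ATTACHMENTS:
                    raise ValueError("too many attachments")
        return count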


If you choose an arbitrary limit orders of magnitude above normal use, then you probably don't have any protection. Most systems are scaled to reasonable use, so an additional 1000x load in one dimension could bowl the system over.

Even defining "normal use" is intractable. For instance, most docker layers are a few MB, but some people are deploying third-party software packaged as a container with 10 GB in a single layer. You can't fix their container. They can't fix their container. Your definition of reasonable changes, and you bump your maximum to 1 TB.

Then someone is trying to deploy docker containers that run VMs, which have 1.5 TB images. It's to interface with legacy systems that are infeasibly difficult to improve. But the VHD is a single file, so now you have a single-layer maximum size of 1.5 TB. And since a 10 GB body size is a possible attack vector in and of itself, what's the security benefit of having any maximum size limit at this point?

It's the wrong approach. Instead, your system should gracefully handle objects of arbitrary size. Security should come from cryptographically enforced access controls and quotas.
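A minimal sketch of the quota idea, with made-up numbers and a hypothetical per-user byte budget:

    # Hypothetical per-user quota: an upload is accepted if the account has
    # room for it, with no global "maximum object size" at all.
    def within_quota(used_bytes, quota_bytes, upload_bytes):
        return used_bytes + upload_bytes <= quota_bytes

    print(within_quota(100 * 2**30, 2**40, 200 * 2**30))  # True: a 200 GiB layer fits a 1 TiB quota
    print(within_quota(900 * 2**30, 2**40, 200 * 2**30))  # False: would exceed the quota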


It's essentially the arbitrariness of that 10,000 number that I'm queasy about.

e.g. When I thought of a "reasonable" maximum number of attachments on an email I'd probably say 64.


"When I thought of a reasonable maximum number of attachments on an email I'd probably say 64."

We should not be talking of a "reasonable" maximum at all.

Rather, the maximum should be orders of magnitude away from the "reasonable".

At this distance, any arbitrariness quickly fades into meaninglessness.

Detached from the "reasonable" usage, the maximum then becomes informed by processing cost and complexity analysis, which are more concrete.


This assumes (and likely correctly assumes, in most cases) that 'reasonable limit' and 'probable system limit' are orders of magnitude apart.

I think that may not always be true though. Especially for highly variable workloads.

How many requests per second does your site handle on average vs. how many can it handle? What about spikes?

If your previous peak is at 10 rps and you can probably handle 20 rps, when should you start dropping connections to keep the site up?

Most systems are designed to have small margins, for cost purposes, so I imagine this sort of situation might come up quite a bit.
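As a toy example of that trade-off (the 10/20 rps figures are the ones from above; the 90% shedding threshold is an invented policy, not a recommendation):

    # Toy load shedding: start dropping connections when load approaches the
    # measured capacity rather than the historical peak.
    CAPACITY_RPS = 20        # what the site can probably handle
    SHED_FRACTION = 0.9      # invented policy: shed at 90% of capacity

    def should_shed(current_rps):
        return current_rps >= SHED_FRACTION * CAPACITY_RPS

    print(should_shed(10))   # False: the previous peak is fine
    print(should_shed(19))   # True: start dropping to keep the site up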


I would disagree: a sensible limit should be enforced, one that cannot overwhelm the machine even when a multi-threaded server is processing its maximum number of emails.

The correct response to a breach of this limit would be a reply email explaining the limit and why the email was rejected.

The user could then send multiple emails with their attachments, and the system can be sized to handle e.g. 24 threads processing 64 attachments of a maximum size of e.g. 4096 KB each.

That's how you keep the system sized to your hardware and ensure maximum throughput for all users.
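Working the sizing through with those example numbers (24 threads, 64 attachments, 4096 KB each):

    # Worst case if every thread is processing a maximal email at once:
    threads = 24
    attachments_per_email = 64
    max_attachment_kb = 4096

    worst_case_kb = threads * attachments_per_email * max_attachment_kb
    print(worst_case_kb // (1024 * 1024), "GiB")   # 6 GiB, a knowable ceiling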


"That's how you ensure a system sized to your hardware and ensure maximum throughput for all users."

Sure, I think we are actually in agreement. That's exactly what I meant by "processing cost and complexity analysis".


It can be a random number between 64 and 10,000, inclusive; the arbitrariness of it is the point: there is a bound; the (loop/size/whatever) is not unbounded.


But that's just from the 'security' perspective - if it is a low number then users are likely to hit it and complain, and for bug tracing I suspect everyone would want the number to be constant.


> The only problem with this approach that I have encountered is that it requires defining fairly arbitrary limits that are either so small that some user will inevitably run into them or so large that you're not getting as much protection as you think.

Some sort of lazy or other on-demand evaluation can help a lot here. You set the limits large, and then only evaluate the part of it that's actively being accessed.
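For example (a sketch using Python's EmailMessage API; wrapping it in a generator means payloads are only decoded for the parts the caller actually reaches):

    # Lazy sketch: the attachment limit can be set very high, because the
    # expensive work (decoding payloads) only happens for parts that are
    # actually iterated over.
    def attachment_payloads(message):  # an email.message.EmailMessage
        for part in message.iter_attachments():
            yield part.get_filename(), part.get_payload(decode=True)

    # A caller that only wants the first attachment never decodes the rest:
    # name, data = next(attachment_payloads(msg))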


> The only problem with this approach that I have encountered is that it requires defining fairly arbitrary limits that are either so small that some user will inevitably run into them or so large that you're not getting as much protection as you think.

Would an answer be to have limits that are tunable? I.e., something like "Attachment limit exceeded. Enter new temporary attachment limit for re-scanning: _____"?
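Roughly something like this (a hypothetical sketch; the default of 64 and the override mechanism are made up):

    # Hypothetical tunable limit: the default is conservative, and a rejected
    # scan can be retried with an explicit, temporary override.
    DEFAULT_ATTACHMENT_LIMIT = 64

    def scan(parts, limit=DEFAULT_ATTACHMENT_LIMIT):
        if len(parts) > limit:
            raise ValueError(f"attachment limit {limit} exceeded")
        return [len(p) for p in parts]   # stand-in for the real scan

    parts = ["attachment"] * 100
    try:
        scan(parts)
    except ValueError:
        scan(parts, limit=128)           # temporary, user-supplied limit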


I've never done this before (I also pick magic numbers out of the air), but presumably the idea is to look at the statistics of the data distribution and, Pareto-style, pick an upper bound that covers most of it?

Is this data available? One simple example: how long does a name field in a database need to be?
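Something like this is what I have in mind (a sketch; the percentile, the headroom factor, and the sample data are arbitrary picks of mine):

    # Pick the limit from observed usage: take a high percentile of the real
    # distribution and add an order of magnitude of headroom on top.
    import statistics

    def pick_limit(observed_lengths, percentile=99, headroom=10):
        cutoff = statistics.quantiles(observed_lengths, n=100)[percentile - 1]
        return int(cutoff * headroom)

    name_lengths = [7, 12, 9, 23, 41, 15, 8, 64, 11, 30]   # made-up sample
    print(pick_limit(name_lengths))                        # a few hundred chars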



