I tried Lambda for a use-case that I had in 2018: we published Polls and Predictions to people watching the 2018 World Cup and set the vote callback URL to a function on AWS Lambda.
It failed spectacularly during our load-testing because the ramp-up was far too slow. We needed to go from 0 to 100,000 incoming requests/second in about 20 seconds.
We had to switch to an Nginx/Lua/Redis endpoint because Lambda was just completely unusable for us. It would have cost us $27,000/month to pre-provision 10,000 concurrent executions...
What is it that people actually use Lambda for?
It works pretty well for tasks that run infrequently (no need to keep a full application deployed), out of band (no need for a rapid response; we'll email you), or that can consume excessive resources (and would otherwise threaten other apps running on the same server).
For us the perfect use case is producing PDF reports: they aren't run that often; they can be emailed a few minutes later; and when too many run at the same time, they steal all the RAM on the server.
Not entirely true. The largest use case we see for Lambda is as the compute service for an API backend provisioned with API Gateway (so a full application backend), with synchronous responses over HTTP (so a rapid response within milliseconds), doing mundane tasks that store data in a database just like a regular web application.
The perfect use case we see repeatedly is high-volume HTTP requests to API Gateway endpoints that trigger Lambda functions which respond in well under a second, depending on how much work the function does.
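For concreteness, a minimal sketch of such a backend handler, assuming the standard API Gateway proxy integration (the handler name and the greeting logic are made up; only the event/response shape is the usual proxy format):

    import json

    # Lambda handler behind an API Gateway proxy integration: the event
    # carries the HTTP request, and the returned dict becomes the HTTP response.
    def handler(event, context):
        name = (event.get("queryStringParameters") or {}).get("name", "world")
        return {
            "statusCode": 200,
            "headers": {"Content-Type": "application/json"},
            "body": json.dumps({"message": f"hello {name}"}),
        }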
Reads as the thing AWS wants people to use Lambda for.
Running Lambda as a back-end is one of the most expensive options (especially when high load comes in, as a parent comment pointed out). So no question, that's what they want to sell.
I consult and have encountered many people who build data intake pipelines using Lambda. The most common pattern is to have Lambdas feeding into a data lake somewhere, often Redshift.
It is a very common pattern for IoT applications also and well-suited for it. The workloads are often bursty and unpredictable.
But for many things, I believe it is still preferable to stick with a traditional web app or API. It is still early days for serverless and there is a lot of hype around it. As with most hyped things, people see it as a silver bullet and don't think through the tradeoffs they are making.
Everything from direct exposure to HTTP traffic (via API Gateway), to very spiky loads where we don't pay for hardware in between spikes. We also have a kind of "hybrid pipeline" that is a mix of both HTTP requests from clients and SQS messages. The HTTP part sends messages to SQS, which triggers lambdas asynchronously, so we get retries for free.
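A rough sketch of that hybrid pattern, assuming boto3; the queue URL and the downstream processing function are placeholders, not taken from the comment above:

    import json
    import boto3

    sqs = boto3.client("sqs")
    QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/ingest"  # placeholder

    # HTTP-facing function (behind API Gateway): just enqueue and return.
    def http_handler(event, context):
        sqs.send_message(QueueUrl=QUEUE_URL, MessageBody=event["body"])
        return {"statusCode": 202, "body": "queued"}

    # SQS-triggered function: records arrive in batches, and raising an
    # exception returns the batch to the queue, which is where the
    # "free retries" come from.
    def queue_handler(event, context):
        for record in event["Records"]:
            payload = json.loads(record["body"])
            process(payload)  # hypothetical downstream work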
Your load doesn't strike me as being out of the ordinary; I suspect you hit one of the following:
- Slow (cold) starts, which means AWS was unable to re-use an already warm container to run the function a second time. This is particularly true with Java, although they've massively improved it recently.
- A dependency on another service which would have caused the slow start (loading a config is a common unexpected culprit)
- You mention 2018, so at that time running a lambda within a VPC was notoriously slow to start
- You had a limit on your AWS account (default is 1k concurrent I believe)
There are an enormous number of people using Lambdas as part of very complex workloads, including customer-facing platforms. Your specific use case of 100,000 incoming requests per second within 20 seconds is actually far more extreme than most users' use cases; Lambda can do 3,000 instantly and then adds more capacity over time, unless you contact AWS and ask them to raise the limits, in which case it can do a lot more.
There are very many users of Lambda that run very high volume production workloads (billions of requests a month) and save an enormous amount of money doing so compared to traditional techniques. A lot of that saving comes from the ability to create solutions in days as opposed to weeks or even months and the fact that for most use cases the capacity exists to handle the volume most users need.
Cron is a scheduler that executes shell commands. Lambda represents the shell command itself, right? What are you using to specify when something should run?
Lambda can trigger off a bunch of AWS events. One of those event types is CloudWatch Events, mostly used for triggering in response to things like CPU activity or other metrics. One of the CloudWatch events you can specify is a plain old cron schedule:
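A sketch of wiring such a schedule up with boto3 (the rule name, cron expression and function ARN are placeholders; the resource policy that lets events.amazonaws.com invoke the function is omitted):

    import boto3

    events = boto3.client("events")
    lambda_arn = "arn:aws:lambda:us-east-1:123456789012:function:nightly-job"  # placeholder

    # Create a scheduled rule (every day at 03:00 UTC)...
    events.put_rule(
        Name="nightly-job-schedule",
        ScheduleExpression="cron(0 3 * * ? *)",
        State="ENABLED",
    )
    # ...and point it at the Lambda function.
    events.put_targets(
        Rule="nightly-job-schedule",
        Targets=[{"Id": "nightly-job", "Arn": lambda_arn}],
    )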
I recently built a facade for payment gateway callbacks that puts messages into SQS to reduce loss of requests hitting the main API. (It costs less than $1 per month to handle all requests, and the number of lost requests in the last 6 months has gone from ~15 out of ~10k to 0.)
Not AWS, but I use Cloud Functions in Firebase Hosting to redirect dynamic URLs to static website paths.
Of course, in this case I have to use it, because Firebase Hosting is static, so this is the only way to handle dynamic URLs if the simple rules of firebase.json are not enough.
At waiterio.com we use it for plain old web services.
Essentially one service = one Lambda.
Like a dyno on Heroku, but much cheaper and much more scalable.
We used the Serverless Framework until a few months ago, and now we are migrating to AWS CDK.
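As a rough illustration of the "one service = one Lambda" layout in CDK (v1-style Python; the stack name, service names, asset paths and handler are all made up):

    from aws_cdk import core, aws_lambda as _lambda

    class ServicesStack(core.Stack):
        def __init__(self, scope, construct_id, **kwargs):
            super().__init__(scope, construct_id, **kwargs)
            # One function per service, each packaged from its own directory.
            for service in ("orders", "menus", "billing"):  # hypothetical services
                _lambda.Function(
                    self, f"{service}-fn",
                    runtime=_lambda.Runtime.PYTHON_3_8,
                    code=_lambda.Code.from_asset(f"services/{service}"),
                    handler="app.handler",
                )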
>Also, I haven't seen anybody version control lambda code, even in compliance environments, so something to investigate.
We have anything that gets deployed to Lambda go through a Jenkins job. It pulls code from GH and builds/uploads it to Lambda. Very few people have the ability to manually edit Lambda jobs through AWS. While it isn't a perfect solution, it's worked pretty well for us.
> Also, I haven't seen anybody version control lambda code, even in compliance environments, so something to investigate.
Huh, how is it any different from building and deploying to anything else?
I store the code in Git, build it in TeamCity, and deploy it to S3 using Octopus Deploy, which updates the Lambda to point to the new versioned zip.
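The last step of that kind of pipeline can be reproduced with boto3 directly (the function name, bucket and key below are placeholders): point the function at the versioned zip in S3 and publish an immutable version.

    import boto3

    lam = boto3.client("lambda")
    # Swap the function's code for the versioned artifact and publish a version.
    resp = lam.update_function_code(
        FunctionName="report-generator",            # placeholder
        S3Bucket="my-build-artifacts",              # placeholder
        S3Key="report-generator/1.4.2/bundle.zip",  # versioned zip from CI
        Publish=True,
    )
    # Optionally move a "live" alias to the freshly published version.
    lam.update_alias(
        FunctionName="report-generator",
        Name="live",
        FunctionVersion=resp["Version"],
    )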
Just to add a note about version-controlling Lambda code: this is entirely possible using the available tooling, with something like the Serverless Framework at serverless.com.
I guess that I was expecting AWS Lambda to scale up immediately in the same way the ALB or SQS scales up with demand. That assumption was very wrong.
Edit: my AWS support rep explained it: Without pre-provisioned capacity, it would take 34 minutes to scale our Lambda functions to the peak concurrency we needed. And it would cost several thousand dollars.
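For reference, pre-provisioning capacity is a single API call (or the equivalent console/IaC setting); the function name, alias and count here are placeholders. The catch is that you pay for the provisioned capacity whether or not it's used, which is where the cost quoted above comes from.

    import boto3

    lam = boto3.client("lambda")
    # Keep 10,000 execution environments initialized for the "live" alias.
    lam.put_provisioned_concurrency_config(
        FunctionName="vote-callback",          # placeholder
        Qualifier="live",                      # a published version or alias
        ProvisionedConcurrentExecutions=10000,
    )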
Completely insane cost. Also, EFS is ridiculously expensive.
I've found EFS enticing in theory but painfully slow and riddled with issues in practice. In the past I've tried it thinking "it's basically an EBS volume I can mount on > 1 EC2 instance," only to find terrible read performance and misc. low-level NFS errors.
Dunno your exact requirements or when you last tried it, but they did boost EFS's read speed (they claim by 400% [1]) as of this April, so it might be worth looking into again if you're still trying to find a solution.
You should try one of the various NetApp options. They're on the Marketplace, but you get significantly better performance and a platform that's got several decades of NAS experience.
I found it adequate to work around limitations of another system I was using. We were using a container coordinator thing (Convox) that couldn't attach EBS volumes, nor could it limit concurrent access to exactly one replica. So I used EFS which worked OK. I kept an eye on how we were using the burst credits, and picked a filesystem size that gave us enough IOPS. All in all, it worked fine (but I was perfectly happy to move off of it to EBS, of course).
Serverless is a bit like Stone Soup [1]. This I guess is the point at which the Tramp says: "Now if you just add a few onions it really helps the flavour..."
I recall Joyent's solution to this (similar) problem where you have an object stored somewhere (e.g. S3) and you want to use that object in a container, but you have to copy it over HTTP or something to do any work on it and the object could be very large.
With Joyent's Manta [1] you would spin up a container right where an object is stored (instead of bringing the objects to the container via NFS). It also has MapReduce support.
Ummm. Linux’s NFS client includes a kernel page cache.
You can just mmap or read the file without doing anything else. That is zero or one memcpy overhead.
S3 clients have to copy the data over the network, assemble the TCP packets, decrypt and checksum for SSL, and then memcpy the result. That's at a minimum. They may be doing other work, like verifying the S3 checksum, or allocating memory to store the object.
They have to do that once per lambda process, again, at a minimum. They might do it once per lambda invocation.
I wonder how Amazon bills DRAM if multiple lambdas mmap the same thing read-only.
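To make the mmap point concrete, a sketch of reading a shared file from an EFS mount inside a handler; the mount path and file are hypothetical. Warm invocations in the same container benefit from the kernel page cache rather than re-fetching over the network.

    import mmap

    EFS_PATH = "/mnt/shared/model.bin"  # hypothetical EFS mount + file

    def handler(event, context):
        # Plain file I/O against the NFS mount; pages are faulted in on
        # demand and cached by the kernel for subsequent reads.
        with open(EFS_PATH, "rb") as f:
            mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
            header = bytes(mm[:16])
            mm.close()
        return {"header_bytes": len(header)}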
Ummm, I'm pretty sure that before data from a remote NFS server makes it into the kernel cache, it too has to be copied over the network, assembled from TCP packets, possibly decrypted (krb5p) and verified (krb5i) with NFS over Kerberos (otherwise you would have no confidentiality/integrity), and moved into newly allocated memory. Sure, once it is in the kernel cache and the data is not modified, there may be just "Is this handle still up to date?" remote calls, but you could achieve a similar cache with object storage.
This is a horrible idea. This gives lambda functions shared mutable state to interfere with each other, with very brittle semantics compared to most databases (even terrible ones).
Depending on the use case it can be a good option (some are listed in the article). If the dev doesn't get the disadvantages the rest of the architecture will hardly be correct anyway
I need this specifically as part of a state machine. Most of the steps involve a Lambda loading and unloading CSV data between S3, Redshift, and Aurora, where no local storage is needed. The last step, where we had to download the files locally and compress multiple files together, was done manually via a script because they were greater than 512 MB.
We were just about to put the script in Fargate (serverless Docker) and run an ECS task as part of the state machine. Now we don't have to.
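A rough sketch of what that last step can look like now that the function can see an EFS mount (the bucket, keys and mount path are placeholders): pull the CSVs onto EFS, zip them there, and upload the archive back to S3.

    import os
    import zipfile
    import boto3

    s3 = boto3.client("s3")
    MOUNT = "/mnt/exports"  # hypothetical EFS mount path

    def handler(event, context):
        keys = event["keys"]  # list of S3 keys passed in by the state machine
        archive = os.path.join(MOUNT, "export.zip")
        with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
            for key in keys:
                local = os.path.join(MOUNT, os.path.basename(key))
                s3.download_file("my-export-bucket", key, local)  # placeholder bucket
                zf.write(local, arcname=os.path.basename(key))
        s3.upload_file(archive, "my-export-bucket", "archives/export.zip")
        return {"archive": "archives/export.zip"}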
This - if you have to fetch data from or output data to outside of the AWS ecosystem, the 512 MB /tmp limit pushes you into the additional (relative) complexity of having to run on Fargate pretty quickly. Just had to deal with this for a content ingest job involving pulling a couple GB of data from an FTP server, processing it, and pushing it into an RDS database on an hourly basis. Would have been super simple if the file was on S3 already.
Lambda only gives you 512 MB of local "scratch space". If it had to provision and de-provision gigs of space at each invocation, it would probably cause longer start and shutdown times.
I worked with EFS (but not Lambda) in 2017-2018 when migrating an app to AWS - an app which included a bunch of random application code that assumed it could read or write to a network file system. Having EFS as a migration target to replace on-prem CIFS was relatively pleasant, which removed the need to rewrite a bunch of the application code. S3 would have been a reasonable replacement, but that would have required weeks or months of rewrites to hunt down filesystem calls and rework them to use a simpler object store API.
One thing that tripped us up at the time was EFS not supporting encryption in transit, but this was fixed in early 2018 when EFS began supporting the use of stunnel to wrap the underlying NFS connection in TLS. https://docs.aws.amazon.com/efs/latest/ug/encryption-in-tran...
It reads as if this Lambda-integrated EFS works out of the box with encryption in transit.
I’m genuinely curious about the factors that made it OK to have an unencrypted protocol into 2018. Is the AWS infrastructure already encrypting at another layer? Nobody worries about attacks on data that’s only transiting AWS?
AWS has stated that internal traffic, at least within a VPC, is "tamper proof". I don't recall any specific details about that, whether it is the case across regions, accounts, etc.
Back when I was working on this (for a large org), they regarded anything in AWS as fundamentally lower trust than their own on-premises corporate networks, so there was a security requirement to do encryption in transit for all TCP connections -- this was for an application within the VPC with no public traffic.
It's not so much about trusting AWS not to pry into your data. More that the attack surface of your data as presented to AWS' other—potentially hostile—customers is (presumably) significantly reduced by not having the bytes flying around AWS' infrastructure in the clear.
To put it mildly, Amazon has a lot more folks a lot smarter than me thinking about these tradeoffs, so I assume they had good reason to think it was fine the first time around. I'm just surprised that the state-of-the-art for cloud-hosted services doesn't presume building on TLS from day 1, so I'd love to know what those reasons are.
I think Google's internal network was unencrypted in 2013 [1], allowing the NSA to tap it. After the Snowden leaks showed that they were being tapped, Google moved to encrypting data on their internal links. I wouldn't be surprised if Amazon made a similar decision.
Right, but this is in the context of AWS, where Amazon has got thousands of customers running their own code in its data centers. I'd (naively?) assumed that any novel protocols to ship customer data back and forth between hosts in that environment would have been built on TLS from day 1.
OMG THIS IS SO AMAZING I HAVE BEEN WANTING THIS FOR AN ENTIRE YEAR NOW. (I've been using Lambda to do massively distributed compile jobs, but had reached the throughput limits I could achieve with distcc-like techniques doing local preprocessing, and so was looking at doing limited synchronization of my codebase to S3 to then either link against the compiler or, for other tools and to let me use the gold standard compiler I want, do C runtime injection to make it so that when files are opened I pull them from S3... but that entire process sucked and doesn't really solve the general purpose problem: this does; this lets me trivially do the moral equivalent of make -j1000 and have all of the random sub-jobs get executed in lambda functions and have the compile complete nearly instantaneously. I can even have those jobs just directly share state and do "exactly what you'd expect" with respect to the inter-dependency stuff <- which like, is a tradeoff, but one that fits well with how most projects are already designed when using make... I'm so pumped to go back and work on that project again.)
It wasn't obvious to me whether this is somehow mounting EFS over something other than NFS (the post never says the words NFS). When people ask "Should I use Lambda / Google Cloud Functions / Cloud Run against my NFS server?", my response isn't "How would you set that up?", it's "Be careful: cleaning up NFS locks held by clients that have gone away is fairly painful, and you have none of the mechanisms to make sure it exits properly."
Alternatively, you can mount without locking, and then you get one of the comments downthread about "and now you've given functions shared mutable state but with bad primitives".
tl;dr: Cool! ... But, how does this handle NFS locking?
This is NFS and locking is fully supported (the blog includes an example). Because EFS implements the NFS 4.0/4.1 protocols, locks are lease-based and there isn't a need to clean them up. In the unlikely event of a client crash where the client still held locks, they will automatically expire once the associated lease expires.
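An illustrative (not the blog's) sketch of that, using a standard POSIX advisory lock over the mount; the lock path and the work inside the critical section are hypothetical. If the client dies while holding the lock, it is released once its NFS lease expires.

    import fcntl

    LOCK_PATH = "/mnt/shared/job.lock"  # hypothetical file on the EFS mount

    def handler(event, context):
        with open(LOCK_PATH, "w") as lock_file:
            fcntl.lockf(lock_file, fcntl.LOCK_EX)  # blocks until the lock is free
            do_exclusive_work(event)               # hypothetical critical section
        # lock is released when the file is closed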
Is anybody using Lambda to run huge MapReduce jobs? Do people still use Hadoop?
Doesn't this basically just let you have something like HDFS for running large distributed computations with some shared state, without having to reach for S3 or redis?
I know at least one team at Amazon that runs map-reduce-style jobs with Lambda, but now that Athena supports user-defined functions, I'd personally be inclined to use it instead of EFS or S3 + Lambda.
We use Lambda with S3 as intermediate storage between different steps, a sort of multi-stage, map-only MapReduce. And we still use Hadoop on premise :)
This is a sharp knife to hand people -- because EFS is just NFS, it uses NFS for security/isolation. Everything that can mount a given volume needs to agree on which Unix users are which, and you need to make sure to completely lock down root access, otherwise you can't enforce any kind of data isolation.
If your use case can deal with one EFS volume per isolation boundary, you can use IAM to control who can mount what volume, which might be easier to reason about.
The EFS/Lambda integration uses EFS Access Points, which allow you to enforce a specific POSIX identity and directory for NFS operations. You can also use IAM policies to require that specific IAM roles/users use a specific access point.
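A sketch of creating such an access point with boto3 (the file system ID, uid/gid and path are placeholders): every NFS operation made through it is performed as the enforced POSIX user, rooted at the given directory.

    import boto3

    efs = boto3.client("efs")
    efs.create_access_point(
        FileSystemId="fs-12345678",            # placeholder
        PosixUser={"Uid": 1001, "Gid": 1001},  # identity enforced on all access
        RootDirectory={
            "Path": "/lambda",
            "CreationInfo": {"OwnerUid": 1001, "OwnerGid": 1001, "Permissions": "750"},
        },
    )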
I stand corrected. I kind of knew that since I've been struggling with identity and access control on a home network, and have been trying to muddle through so far without setting up kerberos. I know there are people who have no trouble setting this stuff up, but I'm not one of them.
There is certainly some additional complexity here over the basic Lambda serverless setup. But I think the console-driven configuration as outlined in these posts often makes it harder to see what the core concepts really are.
With infrastructure-as-code tools, this can be a little clearer. At Pulumi we wrote a post earlier today on configuring the infrastructure needed to use EFS with Lambda, and it boils down to just a few concepts and a couple dozen lines of infrastructure code.
Some of the complexity here also comes from the fact that EFS is a general purpose managed NFS service, instead of a fully-abstracted Lambda-specific feature. That does add a little additional up-front complexity, but means you can use EFS across all sorts of different compute in AWS - not just Lambda.
Ooh. I wonder if EFS is compatible with SQLite's NFS mode.
More seriously, this is huge. Unix pipes over shared NFS have always been my big data platform of choice (since before the cloud, or even Google MapReduce). Things have finally come full circle.
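A minimal (and cautious) sketch of what that would look like: SQLite will happily open a database file on the mount, but its safety over NFS hinges on the file locking behaving correctly, so treat it as an experiment. The path and schema are hypothetical.

    import sqlite3

    # Open a database file that lives on the EFS/NFS mount.
    conn = sqlite3.connect("/mnt/shared/app.db")  # hypothetical path
    conn.execute("CREATE TABLE IF NOT EXISTS votes (id INTEGER PRIMARY KEY, choice TEXT)")
    conn.execute("INSERT INTO votes (choice) VALUES (?)", ("yes",))
    conn.commit()
    conn.close()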
I agree that we will see all of those, except the last one. It's no surprise that we're bending around, that's normal. But when we get back to the "feature parity with the past" things will still look very different. We're talking about fully managed systems here that you can compose together to get parity with what you have today with unmanaged systems.
I also don't think the time limit on lambdas will extend much further than it is today. Not sure on that one.
I'm imagining the full circle looking like web frameworks "compiling" their controller methods down into lambda functions. e.g. you deploy your Django application to AWS Lambda and it automatically fires a lambda function when a view method is hit via a route. This is already happening where people are deploying static sites backed by lambda function APIs.
> e.g. you deploy your Django application to AWS Lambda and it automatically fires a lambda function when a view method is hit via a route.
Is that a bad thing in your opinion? Examples like that were very much a Day 1 use case, or at least intent, for a lot of AWS Lambda. When we were working on CloudFront Lambda@Edge it was explicitly the intent to move the compute out of the traditional data center or EC2 instance. We wanted to enable customers to run their entire application in this fully managed 'serverless' aspect where AWS can optimize along multiple axes on their behalf.
Internally Amazon is a land of APIs, RPC, and federated services/business logic. Getting to the point where each API action, or internal method, would be hosted as a Lambda was entirely a goal.
Source: Principal at AWS. Not in this space directly anymore, but spent time working on CloudFront & Lambda@Edge.
I suspect they called out block storage as that's normally what you'd use for reasonable performance from a DB engine. As in that'd be the next step 'down the stack', not what was released today.
This is misdirection by naming, like how "University of Phoenix" sounds similar to Arizona State University, Phoenix. "Lambda functions" sounds like anonymous functions, but actually refers to a proprietary interface to AWS Lambda™. They named it this way so that readers confuse AWS Lambda™ with programming lambdas.
If anyone is initially fooled by the naming, that illusion is broken very quickly after just a few seconds of looking into it, so I doubt this was an intentional act to trick people.
Lots of companies do what Amazon does. See University of Phoenix vs. ASU Phoenix campus.
Is it stupid? Yes, but advertising is generally geared toward human stupidity. More specifically, it's geared toward pointy-haired bosses who are being presented with AWS Lambda™ and thinking to themselves, "well, I heard a lot about lambdas, are these what my team meant?"
Can we please change the title? I was really afraid that AWS might have lost it and called the new file system "new", which must be the least practical name for anything from an SEO standpoint.
I second this motion. I was going to make the title more sensible when I submitted it, but given this is discouraged and often rolled back nowadays (as I often see on https://hackernewstitles.netlify.app/), I decided I'd let it happen in due course.
"lambda functions" not "lambda expressions" - It is funny (or not) how Amazon redefines how a term is used by naming a product. The words actually make no sense, but people do not care and say It anyway "lambda function".
To me it sounds like you say twice, that you have a procedure or lambda.
A lambda ("expression" is often not even said, instead simply "this lambda here ...") in many programming languages is already an anonymous procedure and often an anonymous function. So saying "lambda function" makes it sound like "procedure function" or "function function" or "lambda lambda".