What is serverless - Episode 2: Scaling your application

Written by Edward Hewitt in Engineering on April 24, 2019

In Episode 2 of our Serverless series, Sadek, from Prismic, and Guillermo, from Zeit, discuss how Serverless allows you to scale your application, both when you experience peaks of traffic and when you only have a few people coming to your app.

If you missed Episode 1, make sure to check out the full playlist so that you can see everything that Sadek and Guillermo had to say about Serverless. It's a full, in-depth discussion, so it's worth watching the whole series.

Highlights:

  • Why Serverless usually gets associated with scalability: independent resource allocation for concurrently occurring requests.
  • What would happen with a serverfull model, with an example of image optimization.
  • Serverless functions: a concurrency model that always scales, i.e. each invocation is independent of the others.
  • The challenge of initial loading time: making cold instantiations fast by bundling your code with its dependencies at build time.
  • Example: deploying a multi-page Next.js website, where pages scale independently.

Sadek: Guillermo, so in the first video we introduced serverless a bit. You talked about what it is, and I thought we could now talk about the scalability of serverless and why it's scalable.

Going back to the fundamental principle, when a request comes in, your code gets executed and you're not writing the code for the server behind it.

Guillermo: Yeah, I mean that's one of the things that always gets associated with serverless. It's a more scalable paradigm. So I think it has to do with a couple things. The first one is the concurrency model of serverless. So it's this idea that, going back to the fundamental principle, when a request comes in, your code gets executed and you're not writing the code for the server behind it.

If a request comes in at the same exact time, the two requests are not going to go to the same instance of your function.

So you're writing only the code for when the request comes in. So, if a request comes in, a function is instantiated and the code is run. If, at the very same time, another request comes in, because it's an API or you're doing HTML rendering or whatever, the two requests are not going to go to the same instance of your function. Instead, a new function instance is created concurrently. So what this does is make resource allocation completely independent for each one of your incoming requests and your incoming customers.
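As a rough sketch of what that independence means in code, assuming a Node.js-style (req, res) handler (the exact platform wiring is an assumption, not something from the episode):

```js
// Module-level state belongs to one function instance, not to the app
// as a whole. Two requests arriving at the same exact time may each get
// a fresh instance, and therefore a fresh copy of this counter.
let requestsSeenByThisInstance = 0;

module.exports = (req, res) => {
  requestsSeenByThisInstance += 1;
  res.end(`This instance has handled ${requestsSeenByThisInstance} request(s)`);
};
```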

Sadek: Or at least, if people develop it that way, you can make them completely independent?

Guillermo: Correct. So, with servers, what would happen, and we've seen this countless times, is, let's say you create a server to do something that uses a lot of CPU, like converting an image from JPEG to PNG. The developer can be very successful with a serverfull model: you create a server with, for example, Express.js or any other server framework, you create your routes, and your route invokes ImageMagick, for example, to convert the image.

So you test it, and you're like, "Oh, I boot it up, boot up my server, localhost:3000..."

Sadek: Works on my machine.

Maybe with as few as three or four requests, you didn't contemplate that your server resources are now shared between all your incoming requests.

Guillermo: It works correctly on my machine. And it converts the image. Great. And then you launch it. And maybe with as few as three or four requests, you didn't contemplate that your server resources are now shared between all your incoming requests. I think when you're dealing with databases, you may not hit this problem until you get more load, or you might not hit it immediately, but with things that are a little more CPU-heavy, the problem hits you pretty much immediately.
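To make the serverfull example concrete, here is a hedged sketch of the kind of setup Guillermo describes: an Express.js route that shells out to ImageMagick. The route, file names, and error handling are illustrative assumptions, not code from the episode.

```js
const express = require('express');
const { execFile } = require('child_process');
const path = require('path');

const app = express();

// A CPU-heavy route: every concurrent request competes for this one
// server's CPU, which is exactly the shared-resources problem above.
app.get('/convert', (req, res) => {
  // Illustrative: invoke ImageMagick's `convert` to turn a JPEG into a PNG.
  execFile('convert', ['input.jpg', 'output.png'], (err) => {
    if (err) return res.status(500).send('conversion failed');
    res.sendFile(path.join(__dirname, 'output.png'));
  });
});

// "Boot up my server, localhost:3000" — works on my machine.
app.listen(3000);
```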

Sadek: Right.

Guillermo: Your server just breaks down entirely. With serverless functions, what ends up happening is, whether you have a lot of traffic or not, you have a concurrency model that always scales, because each invocation is independent of the others when they're happening at the same exact time.

Sadek: But the problem here is, does it mean that you have to start up the function each time you get a request?

Guillermo: Yes.

Sadek: Which is loading time, right?

Guillermo: Yeah, so there's this hot and cold problem that always gets discussed, and there's no escaping it. I think one has to embrace that things can be cold, because, as you correctly point out, you might try to pre-warm a function or something like that, but then any concurrent one is also going to be instantiated from scratch every time. So the solution that actually scales really well is to make your cold instantiations really fast.

Sadek: How can you do that?

Guillermo: This is possible. The way that we do it on our platform is that all your entry points into your code, all your JS files, for example, become discrete functions. So, for example, you may have a directory called /api and you say, "Hey, I'm going to create /api/users.js." You can very easily tell our system, "Hey, this is a function. It uses Node.js, or it uses TypeScript, or it uses Go." And we build only that as a function, instead of trying to bring your entire repo into the function. And when JS doesn't have to load a lot of code, it tends to boot up really fast.
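For instance, a minimal /api/users.js along those lines might look like the sketch below; the exported (req, res) shape follows common Node.js conventions (the exact platform contract is an assumption), and the data is made up.

```js
// api/users.js — one file, one entry point, one discrete function.
// Only this file and whatever it require()s gets built into the function.
module.exports = (req, res) => {
  res.setHeader('Content-Type', 'application/json');
  res.end(JSON.stringify([{ id: 1, name: 'Ada' }]));
};
```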

Sadek: So you break it into smaller bits?

What our platform does is, at build time, we run the equivalent of the bundling process that developers use for code they send to the web browser. So your function gets not just your code, but all its dependencies bundled together.

Guillermo: Yeah. And it's about giving the developer the framework for not falling into the trap of putting a lot of stuff inside a function. But there are also things like, for example, everyone knows that node_modules gets really, really big. So what our platform does is, at build time, we run the equivalent of the bundling process that developers use for code they send to the web browser. So your function gets not just your code, but all its dependencies bundled together.

And what's interesting as well is that we don't bundle your development dependencies, your test dependencies. It actually makes your code a lot more secure: you don't ship things accidentally. And it makes it faster; you can't even tell that a cold boot is happening. The interesting thing is that developers have been worrying about this already, because you worry about how big the JS is...
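As an illustration of that build-time step, here is a sketch using esbuild as a stand-in bundler; the episode doesn't name the tool the platform actually uses, so treat this as one way to approximate the idea rather than the platform's implementation.

```js
// Bundle one entry point into a single self-contained function, the way
// a browser bundle inlines its dependencies. devDependencies and test
// code never make it in, because only what the entry point actually
// require()s is included.
const esbuild = require('esbuild');

esbuild.build({
  entryPoints: ['api/users.js'],
  bundle: true,
  platform: 'node',
  outfile: 'dist/api/users.js',
}).catch(() => process.exit(1));
```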

Sadek: Yeah, of course. It's a very important thing when you give it to a web browser.

Guillermo: Of course. So that's what I was saying. It's kind of the same model. What's beautiful about it is that we're backporting all the best practices of cold boots that we've developed for web browsers.

Sadek: Right.

Guillermo: And we know that certain cold boots tend to be slower when you ship to Android on 2G. So I think this problem is solved. It's just that we now have to carry these practices over... and make them universal.

Sadek: Right. Cool. So imagine I have an app and I have different pages. Can I consider that each page is a function?

Guillermo: Correct. So, when you deploy Next.js, we make sure that each of your pages, which describes a React component tree and is an entry point into your application, gets seamlessly compiled into a serverless function. So you're not really even changing your code.
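So a plain Next.js page like the sketch below (the page content is made up) needs no changes to be deployed as its own serverless function:

```js
// pages/about.js — each file under pages/ describes a React component
// tree and is an entry point; on a serverless deployment, each page is
// compiled into its own function without code changes.
import React from 'react';

export default function About() {
  return <h1>About us</h1>;
}
```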

Sadek: Nice. But that also means that if there is a page that is unpopular, it's not taking up memory in any function.

All the pages, as they get accessed, boot up on demand. And they all scale independently as well.

Guillermo: Correct. So, all the pages boot up on demand as they get accessed. And they all scale independently as well. So you might have pages that are more popular, and you're going to see a lot more invocations of those functions.

Sadek: Right.

Guillermo: That means also scaling up but also scaling down.

Sadek: Yes. If your website gets some peaks, it scales up, but if you...

Guillermo: Yeah. We see this all the time: people's websites might be more popular in certain countries, at certain times of the day, and serverless just handles all of that naturally.

Sadek: Right. Cool. Well, in the next video, we'll talk more about deploying globally.

Guillermo: Cool.

Edward Hewitt

Content Strategist. If the devs have their way, Edward will one day be replaced by a Prismic feature.