The 5 Layers of PaaS
Ask a cloud-savvy developer what PaaS is, and you will get an answer like this:
A PaaS is a cloud service that lets developers deploy applications into the cloud without having to manage the underlying infrastructure layer.
A year or two ago, PaaS systems were monolithic. A single vendor or solution, like Heroku, would provide one system that handled all aspects of PaaS. But things are changing. With a plethora of Open Source tools like Docker, Packer, Serf, CoreOS, Dokku, and Flynn, it is now possible to build your own PaaS. But what exactly makes up a PaaS?
I will take a functional approach to defining PaaS by asking: what are the things that a PaaS does?
PaaS can be viewed as a workflow with several functional phases. Each phase accomplishes a specific goal in the process of moving an application onto a production platform. The phases are not necessarily serial steps. They may run in parallel, and not necessarily in the order listed below.
The five functional phases of a PaaS are:
- Deployment
- Provisioning
- Lifecycle management
- Service management
- Reporting and monitoring
1. Deployment
The deployment phase is responsible for moving an application from its source (typically a developer's machine) to the PaaS. Some of the common ways of doing this include:
- Running a git remote on the PaaS and handling git push events from clients. (Heroku, OpenShift, Flynn, and Dokku all use this method. Elastic Beanstalk uses a variation on this.)
- Sending the code as a bundle (often a gzipped tar). Cloud Foundry uses this method, as does Stackato.
- Compiling the code locally and copying the resulting executable to the PaaS.
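To make the bundle approach concrete, here is a minimal sketch of packing an app directory into a gzipped tar archive and listing its contents on the receiving side. The function names `make_bundle` and `list_bundle` are hypothetical. Nothing here reflects how Cloud Foundry or Stackato actually build or upload their bundles; it only shows the general idea using Python's standard library.

```python
import io
import tarfile
from pathlib import Path


def make_bundle(app_dir: str) -> bytes:
    """Pack every file under app_dir into an in-memory gzipped tar archive."""
    root = Path(app_dir)
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for path in sorted(root.rglob("*")):
            if path.is_file():
                # Store paths relative to the app root so the bundle
                # unpacks cleanly wherever the PaaS extracts it.
                archive.add(str(path), arcname=path.relative_to(root).as_posix())
    return buffer.getvalue()


def list_bundle(bundle: bytes) -> list[str]:
    """Return the file names in a bundle, as a receiver might inspect them."""
    with tarfile.open(fileobj=io.BytesIO(bundle), mode="r:gz") as archive:
        return sorted(archive.getnames())
```

A client would send the bytes returned by `make_bundle` to the PaaS over whatever transport that PaaS accepts; the transport itself is outside this sketch.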
When a PaaS receives a deployment, it kicks off processes to move that app into a running state. The exact order of those processes varies, so I will keep them in the order in which they appeared above.
2. Provisioning
In the provisioning phase, the PaaS sets up the infrastructure necessary for running the app. "Infrastructure" is a broad and sometimes nebulous term, but here are some common provisioning targets:
- Setting up containers and/or compute instances
- Configuring networking
- Installing or configuring operating system services (e.g. Apache)
- Installing or configuring libraries (e.g. Ruby Gems)
Many PaaS systems spread provisioning responsibilities across multiple tools. One tool may create a compute instance, while another tool may install libraries. But all are sharing the same responsibility: create the environment in which the application will run.
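The idea of several tools sharing one responsibility can be sketched as a list of small functions that each contribute to a single environment. Everything in this example is hypothetical (the `Environment` record, the three provisioner functions, and the resource strings they append); real tools would call a cloud API or a package manager instead of appending strings.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Environment:
    """Hypothetical record of what has been provisioned for one app."""
    app: str
    resources: List[str] = field(default_factory=list)


# Each "tool" is modeled as a function that adds its part to the environment.
Provisioner = Callable[[Environment], None]


def create_container(env: Environment) -> None:
    env.resources.append(f"container:{env.app}")


def configure_network(env: Environment) -> None:
    env.resources.append("network:port-8080")


def install_libraries(env: Environment) -> None:
    env.resources.append("libraries:ruby-gems")


def provision(app: str, tools: List[Provisioner]) -> Environment:
    """Run every tool in turn; together they build one environment."""
    env = Environment(app=app)
    for tool in tools:
        tool(env)
    return env
```

Calling `provision("blog", [create_container, configure_network, install_libraries])` returns an `Environment` whose `resources` list records the contribution of each tool in the order the tools ran.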
3. Lifecycle Management
Once the PaaS has a copy of the app as well as an environment capable of running the app, it needs to manage the execution of the app. This is lifecycle management. Common tasks of lifecycle management include:
- Starting the app
- Monitoring the app's running state
- Monitoring or reporting on the app's resource consumption
- Restarting an app upon failure
- Stopping or restarting the app on command
Some minimal PaaS systems offer only basic lifecycle management (e.g. start and stop), while highly sophisticated ones may include autoscaling, auto-throttling, and hot (zero-downtime) deployments.
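As a rough illustration of the basic tasks (start, monitor, restart upon failure), here is a minimal supervisor sketch. The `supervise` function and its `max_restarts` parameter are hypothetical, and the sketch treats any non-zero exit code as a failure. It leaves out resource monitoring, stop-on-command, and everything a sophisticated PaaS would add.

```python
import subprocess
from typing import List


def supervise(command: List[str], max_restarts: int = 3) -> List[int]:
    """Start the app, wait for it to exit, and restart it when it fails.

    Returns the exit code of every run, oldest first.
    """
    exit_codes: List[int] = []
    for _attempt in range(max_restarts + 1):
        process = subprocess.Popen(command)  # start the app
        exit_codes.append(process.wait())    # block until it stops
        if exit_codes[-1] == 0:
            break                            # clean exit: nothing to restart
    return exit_codes
```

An app that always fails is started once and then restarted `max_restarts` times before the supervisor gives up. An app that exits cleanly is run only once.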
4. Service Management
Not all PaaS systems perform this phase. In fact, I would go so far as to say that it is not a mandatory piece of PaaS, but it certainly is useful when present.
All PaaS systems run applications (that is, after all, what they're for). But some go a step beyond and provide services that may be attached to an application. These services run outside of the application container or compute instance.
Services might include:
- Networked file systems
- Message queues
- Aggregated logging
"Old guard" systems (like Cloud Foundry) share a service (e.g. MySQL) across multiple applications. Some of the newer container-based approaches like CoreOS may supplant this model by making it simpler to run services in specially-designated containers. (Check out the Serf project for a similar approach.)
Why don't all PaaS systems need this layer? One reason is that many cloud providers already have comparable services in the form of DBaaS, MQaaS, and so on.
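The shared-service model, where one service instance is attached to several applications, might be sketched like this. The `ServiceManager` class, its methods, and the example URL are all hypothetical. In particular, handing the app its settings as a `MYSQL_URL`-style variable is an assumption made for illustration, not a description of how any particular PaaS binds services.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SharedService:
    """One service instance (e.g. MySQL) used by several applications."""
    name: str
    url: str
    bound_apps: List[str] = field(default_factory=list)


class ServiceManager:
    """Hypothetical registry that attaches services to applications."""

    def __init__(self) -> None:
        self._services: Dict[str, SharedService] = {}

    def register(self, name: str, url: str) -> None:
        self._services[name] = SharedService(name, url)

    def bind(self, app: str, service_name: str) -> Dict[str, str]:
        """Attach a service to an app and return the settings the app needs."""
        service = self._services[service_name]
        if app not in service.bound_apps:
            service.bound_apps.append(app)
        # Assumption: the app learns the service location through an
        # environment-style variable such as MYSQL_URL.
        return {f"{service_name.upper()}_URL": service.url}

    def apps_using(self, service_name: str) -> List[str]:
        return list(self._services[service_name].bound_apps)
```

Because the service lives outside any one application's container, binding a second app to `"mysql"` returns the same URL and simply adds that app to the list of users.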
5. Reporting and Monitoring
This final phase is the most banal. Most of the application's lifecycle is not spent on deployment or provisioning or service management. It's spent running. During an application's life, there are many interesting things that can occur. There are lifecycle events that we'd like to know about, like restarts. There are environmental conditions of interest, like resource utilization and system performance. And, of course, there is application data that we would like to monitor, like log files and application metrics.
Many, but by no means all, PaaS platforms provide at least some level of reporting. Here are some examples:
- Amazon Elastic Beanstalk integrates with Amazon CloudWatch, and also aggregates system log files per application.
- ActiveState Stackato provides a web console with copious logs, and can show real-time statistics about an application and its surrounding environment.
- Heroku can optionally send events to a Loggly backend (which is a service).
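Per-application aggregation, in its simplest form, amounts to grouping records by app and by kind. The sketch below assumes a hypothetical record shape of (app name, event kind, message). The kinds used in the usage note ("lifecycle", "log") are illustrative and are not taken from any of the platforms listed above.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Each record is (app name, event kind, message). The shape is an assumption.
Record = Tuple[str, str, str]


def aggregate(records: List[Record]) -> Dict[str, Dict[str, List[str]]]:
    """Group messages first by application, then by event kind."""
    report: Dict[str, Dict[str, List[str]]] = defaultdict(lambda: defaultdict(list))
    for app, kind, message in records:
        report[app][kind].append(message)
    # Convert to plain dicts so callers get ordinary, comparable values.
    return {app: dict(kinds) for app, kinds in report.items()}
```

Given records for two apps, for example a "lifecycle" restart event and a "log" line for one and a single "log" line for the other, `aggregate` returns one entry per app with the messages kept in arrival order under each kind.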
Conclusion: PaaS and Mini-PaaS
As we've seen, each functional phase of PaaS can be done to greater or lesser degrees of complexity. Old guard PaaS systems often come feature-packed. But with PaaS building blocks like Docker, Flynn, and CoreOS, building a special-purpose, tailored mini-PaaS is not out of the question. Just take a look at Deis and Dokku for solutions with varying degrees of complexity.
Technologies in alphabetical order:
- Cloud Foundry
- CloudWatch
- Elastic Beanstalk