Five Reasons for Doing Drupal Development on a VM
In years past, I did my development on a local machine and then pushed my work to a remote server for testing. About two years ago, though, I changed my setup: I began using virtual machines instead of physical servers. With a VM configured for Drupal, I could do my Drupal development locally and then do more thorough testing on the virtual machine.
In this article, I give five reasons why I believe Drupal development can be enhanced through using VMs.
1. Replicate the Real Server(s)
What I am advocating is using a VM that acts like a (production) server. If the server has only one site on it, the VM should have only one. If the server hosts six sites and you are developing on all six, the VM should also host all six.
Why all the bother? Why not just create a single VM for all projects and toss them all in?
Because the idea is to make the VM as close to production as possible, so that you catch the same sorts of bugs and quirks that are going to crop up on the server. Many configuration problems can be caught and dealt with in this way. Bugs that show up on, say, a Linux server but not on a Mac workstation can be caught earlier when you run your tests on a Linux VM instead of the local workstation OS.
Here are the things I tend to try to replicate:
- The server OS: I try to install exactly the same version that production runs.
- The app stack: I try to install the same versions of Apache, PHP, MySQL, and so on.
- Custom configurations: To unearth those config file bugs, I try to copy php.ini and the relevant Apache configs, too.
- Supporting libraries: PEAR and PECL packages play a crucial role in PHP development. I install the same ones the server has.
Things I sometimes try to replicate:
- RAM or disk space: On apps where memory or storage plays a crucial role, I try to mirror this in the VM. Normally, though, I set these properties based on host OS limitations.
- Other servers: Sometimes I need to test out database replication, multi-server caching, or proxying. In such cases, I sometimes set up other VMs to mimic these facilities. Sometimes I will configure a VM to do both parts (e.g. running Varnish on my VM, even though in production Varnish has its own server cluster). Typically, though, I do these only as I need to.
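One way to keep the items above honest is to write down what production runs and compare it against the VM. The following is only a minimal sketch of that idea: the component names and version strings are made-up examples, and how you collect them (by hand or by script) is left open.

```python
def diff_stacks(production, vm):
    """Return {component: (production_version, vm_version)} for every mismatch.

    A component that is missing on one side is reported with None on that side.
    """
    mismatches = {}
    for component in sorted(set(production) | set(vm)):
        prod_version = production.get(component)
        vm_version = vm.get(component)
        if prod_version != vm_version:
            mismatches[component] = (prod_version, vm_version)
    return mismatches


# Hypothetical inventories, for illustration only.
production = {"os": "Ubuntu 10.04", "apache": "2.2.14", "php": "5.3.2", "mysql": "5.1.41"}
vm = {"os": "Ubuntu 10.04", "apache": "2.2.14", "php": "5.2.10", "mysql": "5.1.41"}

for component, (prod_version, vm_version) in diff_stacks(production, vm).items():
    print(f"{component}: production={prod_version} vm={vm_version}")
```

The same comparison works for PEAR and PECL package lists if you record those in the dictionaries as well.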
2. Catch Stealth Dependencies
In the fabulous book The Productive Programmer, Neal Ford discusses the problem of software dependencies that "creep into" a project through the IDE or other workstation tools. The problem, in a nutshell, is that developers tend to install all sorts of things on their local workstations -- many of which are not directly related to the software being produced. Yet once they are installed, developers will use them.
What happens when a developer has some PEAR package installed, uses the API, and then deploys the resulting code to the production server (without first installing the same PEAR package)?
The developer need not have been careless to push this code to a server by accident. After all, he or she may not even realize that a dependency has been added. The function was just there! All the developer did was use it!
Drupal sometimes makes this problem worse, as modules do not have a way of uniformly declaring a dependency on a PEAR or PECL package. It may "just work" on a local workstation that already has the library, and perhaps totally fail in production.
Using a VM for development will help catch those stealth dependencies (after all, the code won't work on the VM until you install the necessary libraries, at which time you file a ticket to install those dependencies on production). It can also help identify subtle differences between different versions of core app software (like Apache or PHP).
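Since Drupal modules cannot declare these dependencies uniformly, one option is to keep your own list and check it on the VM. The sketch below assumes the PHP command-line binary is available and relies on `php -m`, which prints the loaded PHP extensions. It covers compiled and PECL extensions only, not PEAR packages, and the list of required extensions is a hypothetical example.

```python
import subprocess


def parse_php_modules(output):
    """Parse the text printed by `php -m` into a set of lowercase module names."""
    modules = set()
    for line in output.splitlines():
        line = line.strip()
        # Skip blank lines and section headers such as "[PHP Modules]".
        if not line or line.startswith("["):
            continue
        modules.add(line.lower())
    return modules


def missing_extensions(required, installed):
    """Return the required extension names that are not in the installed set."""
    return sorted(name for name in required if name.lower() not in installed)


def installed_php_modules():
    """Run `php -m` on the current machine (needs the php CLI on the PATH)."""
    result = subprocess.run(["php", "-m"], capture_output=True, text=True, check=True)
    return parse_php_modules(result.stdout)


# Hypothetical project requirements and a canned sample of `php -m` output.
REQUIRED = ["gd", "pdo_mysql", "memcache"]
SAMPLE_OUTPUT = "[PHP Modules]\nCore\ngd\nPDO\npdo_mysql\n\n[Zend Modules]\n"

print(missing_extensions(REQUIRED, parse_php_modules(SAMPLE_OUTPUT)))
```

On the VM itself you would pass `installed_php_modules()` instead of the canned sample, and anything reported as missing becomes the ticket you file for production.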
3. Isolation of Resources
Sometimes I crash my servers. I'm glad that that no longer crashes my workstation. Sometimes I configure Varnish to eat up all my memory. I'm glad that only eats up all the VM's memory. Sometimes I get MySQL thrashing.
Since the VM's resources are isolated from the underlying OS, I can still work on my machine even if the server is pulling a massive DB load. And I can kill the server if I have to -- without the inconvenience of restarting the workstation. In fact, it's nice to be able to run a kernel upgrade without having to drop out of IRC. The two are separated.
4. Easy Rollbacks
Most VM software provides tools for "snapshot and rollback." Take a picture of the current state of the VM. Then, at a later time, revert to that snapshot.
I've found this tremendously useful for various benchmarking tests, where I can reliably reset test environments.
Of course, it's also a nice feature when an install goes haywire, a database corrupts, or you accidentally run rm -rf /bin. (Oh, yes I did!)
Finally, all of the VM layers I have used recently allow the creation of images from which clone VMs can be built. I've used this to create a base system and share it with other developers or quickly ramp up a new project.
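The snapshot-and-rollback cycle can be scripted so that every benchmark run starts from the same state. This article does not depend on any particular VM product, so the sketch below simply assumes VirtualBox and its `VBoxManage` command-line tool. The VM name and snapshot name are hypothetical, and the script only prints the commands unless you turn off the dry run.

```python
import subprocess


def snapshot_commands(vm_name, snapshot_name):
    """Commands that record the current state of the VM."""
    return [["VBoxManage", "snapshot", vm_name, "take", snapshot_name]]


def rollback_commands(vm_name, snapshot_name):
    """Commands that stop the VM, revert to the snapshot, and start it again.

    The poweroff step assumes the VM is currently running.
    """
    return [
        ["VBoxManage", "controlvm", vm_name, "poweroff"],
        ["VBoxManage", "snapshot", vm_name, "restore", snapshot_name],
        ["VBoxManage", "startvm", vm_name, "--type", "headless"],
    ]


def run_all(commands, dry_run=True):
    """Print each command, or execute it when dry_run is False."""
    for argv in commands:
        if dry_run:
            print(" ".join(argv))
        else:
            subprocess.run(argv, check=True)


# Hypothetical names, for illustration only.
run_all(rollback_commands("drupal-dev", "clean-install"))
```

Other VM software has its own equivalents, so treat the command names here as one example rather than the required tooling.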
5. Mimic Deployment and Other Maintenance Tasks
The final nicety is that VMs provide a way to test deployment and maintenance tools. Often, deployment scripts or configurations will be written for just one task: Push from staging to production. Few eyes see the code, and since the code is run relatively rarely, bugs often remain unnoticed over several releases (especially on a large site where small problems don't always present themselves in an obvious way).
Use Drush to deploy to production? Then configure your host and VM for Drush deployment and run the same procedure. Run a custom script? Retool the script to work on the VM, too. It's better to find out about flaws during routine development than during the 2AM maintenance window.
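For a custom script, retooling mostly means making the target a parameter instead of hard-coding production. Here is a minimal sketch of that idea, assuming an rsync-based push. The host names, user, and paths are hypothetical, and a real deployment would likely have more steps (database updates, cache clears, and so on).

```python
# Hypothetical deployment targets; only the host and path differ.
TARGETS = {
    "vm": {"user": "deploy", "host": "drupal-vm.local", "path": "/var/www/drupal/"},
    "production": {"user": "deploy", "host": "www.example.com", "path": "/var/www/drupal/"},
}


def build_deploy_command(target, source_dir="./build/"):
    """Build the rsync command that pushes source_dir to the named target."""
    if target not in TARGETS:
        raise ValueError(f"unknown deployment target: {target}")
    settings = TARGETS[target]
    destination = f"{settings['user']}@{settings['host']}:{settings['path']}"
    # -a preserves permissions and timestamps, -z compresses the transfer,
    # and --delete removes files on the target that no longer exist locally.
    return ["rsync", "-az", "--delete", source_dir, destination]


# The exact same code path is exercised for the VM and for production.
print(" ".join(build_deploy_command("vm")))
print(" ".join(build_deploy_command("production")))
```

Because the VM and production share one code path, a bug in the script shows up during routine development pushes to the VM rather than during the real release.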