Microbenchmarking PHP: Switch Statements are Slow

Aug 18 2011

There are very good uses for switch statements. They can be great for nesting logic. But sometimes they are used as a way to (essentially) map a name to a value. In such a scenario, the body of each case is just a simple assignment.

After noticing this pattern a lot recently, I thought I'd benchmark it against the obvious replacement candidate: a hash table lookup. PHP arrays are more like ordered hash tables. For that reason, they provide very fast random access. Are they faster than a switch statement?

I ran two tests. In the first case, I assumed no default. In the second, I assumed a default. The conclusion: Arrays are faster. Read on for the details.

Benchmarking Switch vs. Array index

I took the basic case from what I see as the most typical use of a switch statement: About five different case statements, each of which has a distinct value. In the first test, there is no default case.

No default case:

  <?php

  $iterations = 10000;
  $options = array('apple', 'banana', 'carrot', 'date', 'endive');
  $color = NULL;

  // Fill an array with random keys. This ensures
  // that (a) we use the same keys, and (b)
  // slowness in the randomizer doesn't impact the
  // loops (which can happen if entropy collection kicks in)
  $samples = array();
  for ($i = 0; $i < $iterations; ++$i) {
    $samples[] = $options[rand(0, 4)];
  }

  // Test a switch statement.
  $start_switch = microtime(TRUE);

  for ($i = 0; $i < $iterations; ++$i) {
    $option = $samples[$i];
    switch ($option) {
      case 'apple':
        $color = 'red';
        break;
      case 'banana':
        $color = 'yellow';
        break;
      case 'carrot':
        $color = 'orange';
        break;
      case 'date':
        $color = 'brown';
        break;
      case 'endive':
        $color = 'green';
        break;
    }
  }
  $end_switch = microtime(TRUE);

  $total_switch = $end_switch - $start_switch;
  printf("Switch:\t%0.6f sec to process %d" . PHP_EOL, $total_switch, $iterations);

  // Test an array lookup.
  $start_map = microtime(TRUE);
  $map = array(
    'apple' => 'red', 
    'banana' => 'yellow', 
    'carrot' => 'orange',
    'date' => 'brown', 
    'endive' => 'green'
  );
  for ($i = 0; $i < $iterations; ++$i) {
    $option = $samples[$i];
    $color = $map[$option];
  }
  $end_map = microtime(TRUE);

  $total_map = $end_map - $start_map;
  printf("Map:\t%0.6f sec to process %d" . PHP_EOL, $total_map, $iterations);
  ?>

https://gist.github.com/7232042a82c8975d7a16

I compare two methods.

First, I test the basic switch. Again, the point of comparison is evaluating an average case of using switch to do assignments.

Second, I test doing the same lookups against a PHP array, which is very quick at random access by key because it works like a hash table.

Results

I ran numerous iterations of the test, and the average indicates that using a PHP array is about twice as fast as using a switch. Here's the output of a representative run of the script above:

  Switch: 0.004895 sec to process 10000
  Map:    0.002009 sec to process 10000
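
The averaging itself is not part of the script above. As a rough sketch of how repeated runs could be averaged, assuming PHP 5.3 or later for closures: average_time() is a hypothetical helper, and wrapping each loop in a closure adds a little call overhead that the original script does not have, so the absolute numbers will not match.

```php
<?php
// Sketch only: average_time() is a hypothetical helper, not part of the
// original benchmark. It runs $fn $runs times and returns the mean
// wall-clock time per run in seconds.
function average_time($fn, $runs = 20) {
  $total = 0.0;
  for ($r = 0; $r < $runs; ++$r) {
    $start = microtime(TRUE);
    $fn();
    $total += microtime(TRUE) - $start;
  }
  return $total / $runs;
}

// Build the sample keys once so both methods see the same input.
$options = array('apple', 'banana', 'carrot', 'date', 'endive');
$samples = array();
for ($i = 0; $i < 10000; ++$i) {
  $samples[] = $options[mt_rand(0, 4)];
}

$map = array(
  'apple' => 'red',
  'banana' => 'yellow',
  'carrot' => 'orange',
  'date' => 'brown',
  'endive' => 'green',
);

// Time the switch version.
$switch_avg = average_time(function () use ($samples) {
  foreach ($samples as $option) {
    switch ($option) {
      case 'apple':
        $color = 'red';
        break;
      case 'banana':
        $color = 'yellow';
        break;
      case 'carrot':
        $color = 'orange';
        break;
      case 'date':
        $color = 'brown';
        break;
      case 'endive':
        $color = 'green';
        break;
    }
  }
});

// Time the array lookup version.
$map_avg = average_time(function () use ($samples, $map) {
  foreach ($samples as $option) {
    $color = $map[$option];
  }
});

printf("Switch avg:\t%0.6f sec" . PHP_EOL, $switch_avg);
printf("Map avg:\t%0.6f sec" . PHP_EOL, $map_avg);
```

Averaging over many runs smooths out one-off spikes from the operating system or other processes, which matter a lot when each run takes only a few milliseconds.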

Benchmarking with a default value

But wait! One nicety of a switch statement is the ability to set a default. The first benchmark doesn't measure that. Let's try it and see if the array approach still wins.

Default case:

  <?php

  $iterations = 10000;
  $options = array('apple', 'banana', 'carrot', 'date', 'endive');
  $color = NULL;

  // Fill an array with random keys. This ensures
  // that (a) we use the same keys, and (b)
  // slowness in the randomizer doesn't impact the
  // loops (which can happen if entropy collection kicks in)
  $samples = array();
  for ($i = 0; $i < $iterations; ++$i) {
    $samples[] = $options[rand(0, 4)];
  }

  // Test a switch statement.
  $start_switch = microtime(TRUE);

  for ($i = 0; $i < $iterations; ++$i) {
    $option = $samples[$i];
    switch ($option) {
      case 'apple':
        $color = 'red';
        break;
      case 'banana':
        $color = 'yellow';
        break;
      case 'carrot':
        $color = 'orange';
        break;
      case 'date':
        $color = 'brown';
        break;
      // case 'endive':
      //   $color = 'green';
      //   break;
      default:
        $color = 'green';
    }
  }
  $end_switch = microtime(TRUE);

  $total_switch = $end_switch - $start_switch;
  printf("Switch:\t%0.6f sec to process %d" . PHP_EOL, $total_switch, $iterations);

  // Test an array lookup.
  $start_map = microtime(TRUE);
  $map = array(
    'apple' => 'red', 
    'banana' => 'yellow', 
    'carrot' => 'orange',
    'date' => 'brown', 
    //'endive' => 'green'
  );
  for ($i = 0; $i < $iterations; ++$i) {
    $option = $samples[$i];
    if (isset($map[$option])) {
      $color = $map[$option];
    }
    else {
      $color = 'green';
    }
  }
  $end_map = microtime(TRUE);

  $total_map = $end_map - $start_map;
  printf("Map:\t%0.6f sec to process %d" . PHP_EOL, $total_map, $iterations);
  ?>

https://gist.github.com/32f43f6c3e0abbfe40fe

Results

Unsurprisingly, the additional if/else and isset() added a little bit of time to the array-based method. But not enough to make it slower than the switch.

Somewhat more surprising was the fact that changing one value from a case to a default made the switch seem slightly faster. Over the numerous runs I did, the switch was consistently faster with the default than it was without. Usually, it wasn't a lot faster (it tended to hover around 0.0038 sec), but... it was faster. Yet I never produced a case where the switch was faster than the array lookup.

  Switch: 0.003257 sec to process 10000
  Map:    0.002677 sec to process 10000
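
On newer PHP versions, the isset()/else pair can be written more compactly. The sketch below assumes PHP 7.0 or later, where the null coalescing operator (??) is available; color_for() is a hypothetical helper used only for illustration, and this variant is not one of the cases timed above.

```php
<?php
// Assumes PHP 7.0+ for the ?? operator. color_for() is a hypothetical
// helper, not part of the original benchmark.
$map = array(
  'apple' => 'red',
  'banana' => 'yellow',
  'carrot' => 'orange',
  'date' => 'brown',
);

function color_for($option, array $map, $default = 'green') {
  // Like isset(), ?? falls back to the default when the key is missing
  // or its value is NULL.
  return $map[$option] ?? $default;
}

echo color_for('apple', $map), PHP_EOL;   // prints "red"
echo color_for('endive', $map), PHP_EOL;  // prints "green"
```

The behavior matches the isset() check in the script: a missing key yields the default without raising a notice.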

Commentary

Matt Farina suggested that the test isn't totally fair. The $map really should be declared outside the timer, as that more accurately reflects the fact that the switch is parsed and built outside of the timer. That might offer a tiny performance boost to the array version.
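
A minimal sketch of that adjustment to the first test: the only change is that $map is built before the timer starts, so the timed region covers nothing but the lookups. The timings it prints will vary by machine and PHP version.

```php
<?php
$iterations = 10000;
$options = array('apple', 'banana', 'carrot', 'date', 'endive');
$color = NULL;

$samples = array();
for ($i = 0; $i < $iterations; ++$i) {
  $samples[] = $options[rand(0, 4)];
}

// Build the map before starting the clock, so array construction
// is not counted against the lookup loop.
$map = array(
  'apple' => 'red',
  'banana' => 'yellow',
  'carrot' => 'orange',
  'date' => 'brown',
  'endive' => 'green',
);

$start_map = microtime(TRUE);
for ($i = 0; $i < $iterations; ++$i) {
  $option = $samples[$i];
  $color = $map[$option];
}
$end_map = microtime(TRUE);

$total_map = $end_map - $start_map;
printf("Map:\t%0.6f sec to process %d" . PHP_EOL, $total_map, $iterations);
```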

My aversion to using switch statements to assign values has always been at a higher level. I don't like the fact that they take up both horizontal and vertical space. They look ugly, and they add more "brain overhead" to read.

Switch statements also seem to be somewhat error prone. Developers sometimes forget break statements, which can result in hard-to-locate bugs.
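
To make the failure mode concrete, here is a small illustration. color_with_bug() is a hypothetical function written for this example, with the break deliberately left out of the first case.

```php
<?php
// Hypothetical function: the 'apple' case is missing its break.
function color_with_bug($option) {
  $color = NULL;
  switch ($option) {
    case 'apple':
      $color = 'red';
      // No break here, so execution falls through into the next case.
    case 'banana':
      $color = 'yellow';
      break;
    case 'carrot':
      $color = 'orange';
      break;
  }
  return $color;
}

echo color_with_bug('apple'), PHP_EOL;   // prints "yellow", not "red"
echo color_with_bug('carrot'), PHP_EOL;  // prints "orange"
```

PHP raises no error or warning for the fall-through, which is why this kind of bug can be hard to track down. An array map has no equivalent pitfall, since each key has exactly one value.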

For the record, when it comes to switch vs if/elseif/else, the two are about the same: https://gist.github.com/d1fe59a23daa33aaf6fe (Note that I tested against a very small set of options, as that is the common use case.)

Finally, I will state once again that I am not claiming that switch statements are no good, worthless, or should never be used. Rather, I'm pointing out that for a very common set of circumstances switch statements should not be used.


