a++: Using PHP strings in unusual ways
Here's a PHP question for you: When the following code is executed, what will the output be (or will the code fail)?
<?php
$char = 'a';
print ++$char . PHP_EOL;
?>
The answer: The above will print a single line.
b
The reason for this is that an increment operation on a string increments the letter. In short, it treats strings (in this context) more like a char type as in C or Java. Well, it treats them sorta like a char. PHP has some special... behaviors when it comes to strings and incrementing operations. In this article, I'll show some of those behaviors.
Expanding on the first example, what would we expect to happen when we execute this?
<?php
$char = 'z';
print ++$char . PHP_EOL;
?>
Does it produce an error? Or print a '{' (the next sequential ASCII char)? Nope -- neither of these. The code above will print aa.
Why is this? When we begin with a letter and increment, once it passes z it will continue to aa, and then ab, ac, ad, and so forth. To illustrate, we can do something like this:
<?php
$char = 'a';
$i = 0; // Counter; initialized so the loop condition is well defined.
while ($i < 1000) {
  printf('%d. %s', ++$i, ++$char);
  print PHP_EOL;
}
?>
This will print 1000 strings in sequence, beginning with b and ending with alm.
Here's an excerpt of the output:
1. b
2. c
3. d
4. e
5. f
6. g
7. h
8. i
9. j
10. k
11. l
12. m
13. n
14. o
15. p
16. q
17. r
18. s
19. t
20. u
21. v
22. w
23. x
24. y
25. z
26. aa
// Near the boundary from two to three characters
698. zw
699. zx
700. zy
701. zz
702. aaa
703. aab
704. aac
705. aad
// And the end...
995. alh
996. ali
997. alj
998. alk
999. all
1000. alm
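The jumps at 26 and 702 follow from simple counting: there are 26 one-letter strings and 26 × 26 = 676 two-letter strings, so, starting from a, the 26th increment produces aa and increment number 26 + 676 = 702 produces aaa. As a quick check, the sketch below counts increments until the string grows in length. The variable names are my own.

```php
<?php
// Count how many increments it takes, starting from 'a', before the
// string first reaches two, three, and four characters.
$s = 'a';
$increments = 0;
$firstOfLength = []; // length => [string, number of increments]

while (strlen($s) < 4) {
  $before = strlen($s);
  ++$s;
  ++$increments;
  if (strlen($s) > $before) {
    $firstOfLength[strlen($s)] = [$s, $increments];
  }
}

foreach ($firstOfLength as $length => [$string, $count]) {
  printf('%d characters: %s after %d increments', $length, $string, $count);
  print PHP_EOL;
}
```

This should report aa after 26 increments, aaa after 702, and aaaa after 18278 (26 + 676 + 17576).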
The example above illustrates the way a character responds to an increment operation. And it's not just individual characters that respond to incrementing. You can increment just about any string of alphabetic characters:
<?php
$char = 'aa';
print ++$char . PHP_EOL;
$char = 'matt';
print ++$char . PHP_EOL;
?>
This will print, first, ab, and then matu (in the latter case, it increments the t to a u). Even spaces in strings don't seem to impact the incrementing: matt butcher will become matt butches.
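Incrementing also preserves letter case and carries through digits, which is how the PHP manual describes its Perl-style string increment. Here is a small sketch; the sample strings are my own picks:

```php
<?php
// Increment each sample string once and record the result.
$samples = ['Az', 'zz', 'a9', 'Zz'];
$results = [];
foreach ($samples as $sample) {
  $next = $sample;
  ++$next;
  $results[$sample] = $next;
  printf('%s becomes %s', $sample, $next);
  print PHP_EOL;
}
```

The results are Ba, aaa, b0, and AAa: a carry moves one position to the left, and when it runs off the front of the string a new character of the same kind is added.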
Interestingly, decrementing characters this way does not work (though we will see another way of accomplishing something like it in a few moments):
<?php
$char = 'z';
print --$char . PHP_EOL;
$char = 'a';
print --$char . PHP_EOL;
?>
The above will print z and a, respectively. In other words, the decrement operation seems to have no effect on characters.
Looping with foreach and for
If you can increment characters, it seems likely that certain other numerically oriented PHP tools also work with characters. One that springs immediately to mind is the range() function.
The range() function takes a starting value and an ending value, and generates an array that begins with the first value and ends with the second, containing all appropriate values in between.
Here's a simple example:
<?php
$array = range(1, 12);
print_r($array);
?>
Running the code above will dump an array whose contents look like this:
Array
(
[0] => 1
[1] => 2
[2] => 3
[3] => 4
[4] => 5
[5] => 6
[6] => 7
[7] => 8
[8] => 9
[9] => 10
[10] => 11
[11] => 12
)
So range generated an array whose values began with the start (1), in increments of one, up to the ending value (12).
Does something like this work with characters? Yes.
<?php
foreach (range('a', 'z') as $char) {
print $char . PHP_EOL;
}
?>
As you might expect, this will print twenty-six lines, beginning with a and ending with z.
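Incidentally, you can accomplish the same thing with a standard for loop:
<?php
// Stop once the value wraps past 'z' to 'aa'. A condition of $i < 'z' would
// skip z, and $i <= 'z' would keep going, because 'aa' <= 'z' is true.
for ($i = 'a'; $i !== 'aa'; ++$i) {
  print $i . PHP_EOL;
}
?>
This uses the more traditional method of incrementing using the ++ operator.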
However, strings inside of these control structures exhibit some behaviors that the previous examples would not have suggested:
<?php
foreach (range('a', 'aa') as $char) {
print $char . PHP_EOL;
}
for ($i = 'a'; $i < 'aa'; ++$i) {
print $i . PHP_EOL;
}
?>
Based on earlier examples, we would expect 27 lines of output from each loop, beginning with a, passing z, and ending with aa. Instead, what we get is just this:
a
a
(That's one line for each of the two loops.)
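The two loops stop early for different reasons. The range() function looks only at the first character of each argument, so range('a', 'aa') is the same as range('a', 'a'). The for loop stops because < compares the strings character by character: 'a' < 'aa' is true, but 'b' < 'aa' is false, so the loop ends after one pass.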
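If the goal really is to walk from a through z to aa, one option is to stop on equality rather than on a less-than comparison. The helper below is a hypothetical sketch: string_sequence() and its $limit safety cap are my own names, not PHP built-ins.

```php
<?php
// Hypothetical helper: collect strings from $start up to and including $end
// by repeated incrementing. $limit guards against an $end that is never hit.
function string_sequence(string $start, string $end, int $limit = 10000): array {
  $sequence = [];
  $current = $start;
  while (count($sequence) < $limit) {
    $sequence[] = $current;
    if ($current === $end) {
      break;
    }
    ++$current;
  }
  return $sequence;
}

$letters = string_sequence('a', 'aa');
print count($letters) . PHP_EOL;  // 27
print end($letters) . PHP_EOL;    // aa

// The comparisons that end the for loop above after one pass:
var_dump('a' < 'aa');  // bool(true)
var_dump('b' < 'aa');  // bool(false)
```

Comparing with === sidesteps the string-ordering rules entirely, at the cost of needing a cap in case the end value is never reached.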
Fun with range()
There are some surprising aspects to the evaluation done by range(). First, range() can perform descending as well as ascending alphabetic ranges. For example, we can easily go from z to a:
<?php
foreach (range('z', 'a') as $char) {
print $char . PHP_EOL;
}
?>
This produces the predictable z, y, ... a sequence.
But range() seems to operate on the ASCII numeric values of characters (instead of following the string-ish behavior we saw above). So you can elicit more exotic ranges:
<?php
foreach (range('A', 'z') as $char) {
print $char . PHP_EOL;
}
?>
This produces all ASCII characters between ASCII A and ASCII z:
A
B
C
D
E
F
G
H
I
J
K
L
M
N
O
P
Q
R
S
T
U
V
W
X
Y
Z
[
\
]
^
_
`
a
b
c
d
e
f
g
h
i
j
k
l
m
n
o
p
q
r
s
t
u
v
w
x
y
z
Notice that after Z, our loop goes through several non-alphabetic characters, beginning with [ and going through `. Then the loop iterates over the lowercase letters, a to z.
So what's the deal with those six non-alphabetic characters in the middle? Anyone who has worked with C or other lower-level languages may recognize what is happening. The range() function is returning an array of ASCII characters in their numeric order.
That is, in the ASCII character set, A is represented by the decimal (base-10) number 65. Z has the value of 90. Keep in mind that those are for the capital letters. Lowercase letters begin with 97 (a) and end at 122 (z). You may notice the gap between Z and a. In the ASCII character set, that gap is filled with [, \, ], ^, _, and `. (Other special characters appear either before 65 or after 122.)
If you have used the ord() and chr() functions in PHP, this should make sense. The ord() function takes a character and returns its ASCII decimal value (e.g. ord('A') == 65). The chr() function takes a number and returns the corresponding ASCII character (e.g. chr(65) == 'A').
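To see those numbers directly, here is a minimal sketch that uses ord() and chr() to list the characters sitting between Z and a. The variable names are mine.

```php
<?php
// Walk the ASCII codes strictly between 'Z' (90) and 'a' (97).
$gap = [];
for ($code = ord('Z') + 1; $code < ord('a'); ++$code) {
  $gap[$code] = chr($code);
}

foreach ($gap as $code => $character) {
  printf('%d => %s', $code, $character);
  print PHP_EOL;
}
```

This prints six lines, from 91 => [ through 96 => `.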
The range() function does something akin to returning a range between ord('A') and ord('z'), with each value being a character instead of an integer. Thus, we could accomplish the same task more explicitly with code like this:
<?php
for($i = ord('A'); $i <= ord('z'); ++$i) {
print chr($i) . PHP_EOL;
}
?>
The code above produces exactly the same output as we saw when we did foreach (range('A', 'z') as $char).
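One way to check that claim, rather than comparing the output by eye, is to build both arrays and compare them. A small sketch, with variable names of my own choosing:

```php
<?php
// Build the same list two ways: directly with range(), and by mapping
// chr() over the numeric range of ASCII codes.
$viaRange = range('A', 'z');
$viaOrd = array_map('chr', range(ord('A'), ord('z')));

var_dump($viaRange === $viaOrd);   // bool(true)
print count($viaRange) . PHP_EOL;  // 58 (codes 65 through 122)
```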
The fact that the range() function works this way means that we can produce some interesting ranges:
<?php
foreach (range('!', '/') as $char) {
print $char . PHP_EOL;
}
?>
This will loop from ASCII character 33 to ASCII character 47, generating output like this:
!
"
#
$
%
&
'
(
)
*
+
,
-
.
/
Ranges can also decrement. Just as range(9, 1) will produce an array beginning with 9 and decrementing down to 1, so we can do:
<?php
foreach (range('f', 'a') as $char) {
print $char . PHP_EOL;
}
?>
The above would produce:
f
e
d
c
b
a
Thus, while $char-- does not appear to work, decrementing can still be accomplished using the range() function.
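For a single letter, another way to get the effect of a decrement is to go through ord() and chr(). This is only a sketch: previous_letter() is a hypothetical helper of mine, it handles one ASCII letter at a time, and it returns null rather than wrapping around at a or A.

```php
<?php
// Hypothetical helper: return the letter before $letter, or null if
// $letter is not a single ASCII letter or is already 'a' or 'A'.
function previous_letter(string $letter): ?string {
  if (preg_match('/\A[A-Za-z]\z/', $letter) !== 1) {
    return null;
  }
  if ($letter === 'a' || $letter === 'A') {
    return null;
  }
  return chr(ord($letter) - 1);
}

print previous_letter('z') . PHP_EOL;  // y
print previous_letter('B') . PHP_EOL;  // A
var_dump(previous_letter('a'));        // NULL
```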
One final feature of range() is its ability (as of PHP 5) to take a third parameter, an integer which indicates the step. The step determines how much to add to the current value on each iteration. E.g. a step of 2 will effectively grab every other value. All odd numbers between one and ten can be printed like this:
<?php
foreach (range(1, 10, 2) as $char) {
print $char . PHP_EOL;
}
?>
This will print 1, then add 2 and print the result (3), then add 2 more and print that (5), and so on through 9.
Stepping works with letters as well. Here we can print every third letter, starting with a:
<?php
foreach (range('a', 'z', 3) as $char) {
print $char . PHP_EOL;
}
?>
The results will look like this:
a
d
g
j
m
p
s
v
y
First, a is printed, then every third letter after a is printed.
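A step can be combined with a descending range too. As far as I know, range() treats a positive step as a magnitude when the start is greater than the end, so a step of 3 counts downward here:

```php
<?php
// Every third letter, counting down from z.
$letters = range('z', 'a', 3);
print implode(' ', $letters) . PHP_EOL;  // z w t q n k h e b
```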
How is this useful?
So how is any of this incrementing characters stuff useful, anyway? Certainly, there are a host of cases that I am not creative enough to imagine... but here is one easy case.
One web page element that I've implemented enough times in my life is the so-called Alphabet Wheel. An Alphabet Wheel is a list of linked letters, beginning with a and ending with z. Typically, it is used as a method of paginating information.
With range(), we can generate an Alphabet Wheel in just a few lines:
<?php
foreach (range('a', 'z') as $char) {
print "<a href='http://example.com/index.php?letter=$char'>$char</a><br/>";
}
?>
This code loops through the range and prints link after link, each pointing to a different letter of the alphabet.
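A slightly more defensive variant might build the markup as a string and escape the values it interpolates. This is a sketch under assumptions: alphabet_wheel() is a hypothetical helper of mine, and the base URL is the same placeholder used above.

```php
<?php
// Hypothetical helper: build an Alphabet Wheel as a string of links.
// $baseUrl is expected to end with the query parameter name and "=".
function alphabet_wheel(string $baseUrl): string {
  $links = [];
  foreach (range('a', 'z') as $char) {
    // Escape for the URL and for HTML. This changes nothing for plain
    // letters, but keeps the helper safe if the range is ever widened.
    $href = htmlspecialchars($baseUrl . rawurlencode($char), ENT_QUOTES);
    $label = htmlspecialchars($char, ENT_QUOTES);
    $links[] = "<a href='$href'>$label</a>";
  }
  return implode(' ', $links);
}

print alphabet_wheel('http://example.com/index.php?letter=') . PHP_EOL;
```

Returning a string instead of printing inside the loop makes the wheel easier to drop into a template or to test.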
That wraps up this article. We've covered both incrementing characters and getting arrays of characters with the range() function. These rarely used features of PHP can make certain repetitive tasks easier.