Tuesday, 12 February 2008

When PHP string comparisons go wrong

I'm a little ashamed to admit that I didn't know about this before, but I've just never come across it. When PHP compares two strings with the == operator, it converts them to numbers if they both appear to be numeric. Only the === operator, which compares types as well, avoids the conversion. The comparison operator docs state:

If you compare an integer with a string, the string is converted to a number. If you compare two numerical strings, they are compared as integers. These rules also apply to the switch statement.

Keeping that in mind, take a look at this code:
if ('000E00080001' == '000E0008') {
    echo 'equal';
}
This statement evaluates to TRUE, but I wasn't sure why.

Even reading the comparison operator docs doesn't make it clear, but they do link to a very important piece of information about strings. In particular, the section on string conversion to numbers states:
The string will evaluate as a float if it contains any of the characters '.', 'e', or 'E'. Otherwise, it will evaluate as an integer.

And the truth shall set you free!

The letter "E" in my two strings is treated as the start of an exponent. In both cases, the number is "0" followed by an exponent. When PHP compares those two strings, it compares them as "zero times ten to the power of ..." because they are both considered numbers. So in this case, it is comparing zero with zero, and the comparison is TRUE.
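To see the values involved, here is a small sketch that casts each string to a float by hand. The variable names are mine, and the cast only approximates what == does internally, but it shows the zero on each side:

```php
<?php
// Cast each string to a float to see the number PHP reads from it.
$long  = (float) '000E00080001'; // 0 x 10^80001
$short = (float) '000E0008';     // 0 x 10^8

// Both values are zero, whatever the exponent says.
var_dump($long, $short);

// Comparing the two floats gives the same answer as the original ==.
var_dump($long === $short);
```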

This also works with decimal points. This evaluates to TRUE as well:
if ('0000.' == '0000.0') {
    echo 'equal';
}
And just to confirm what is going on, adding a character that is not considered part of a number results in the expected behavior. This evaluates to FALSE because PHP no longer considers these values numbers:
if ('000E000F0001' == '000E000F') {
    echo 'equal';
}
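One way to check which strings PHP treats as numbers is is_numeric(). This is only a rough sketch with variable names of my own choosing, and is_numeric() is a stand-in for the check that == performs rather than the check itself:

```php
<?php
// is_numeric() reports whether a whole string reads as a number.
$withExponent = is_numeric('000E0008'); // digits, "E", digits: a valid float
$withLetterF  = is_numeric('000E000F'); // "F" cannot be part of a number

var_dump($withExponent, $withLetterF);

// Because these operands are not numeric, == compares them as plain strings.
$sameString = ('000E000F0001' == '000E000F');
var_dump($sameString);
```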
And, if the initial portion of our number is not zero, the exponent modifies the value. We are no longer comparing "zero times ten to the power of ..." (which is always zero) in the following example; we are comparing "one times ten to the power of ...", so it evaluates to FALSE:
if ('001E00080001' == '001E0008') {
    echo 'equal';
}
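Casting these two strings by hand shows how far apart the values are. This is a sketch under the assumption that an exponent this large overflows to INF, which is what I see in current PHP versions; the variable names are my own:

```php
<?php
// With a non-zero leading digit, the exponent changes the value.
$small = (float) '001E0008';     // 1 x 10^8
$large = (float) '001E00080001'; // 1 x 10^80001, far beyond the float range

var_dump($small);
var_dump(is_infinite($large));

// The two numbers differ, so the comparison fails.
var_dump($small == $large);
```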
So what is the solution? You can either use strcmp() to compare your strings (it returns 0 when they are identical), or use the === operator to compare types as well. When PHP compares types, it will not try to convert strings to numbers.
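Here is a quick side-by-side of the three approaches on the original pair of strings. The variable names are just for illustration:

```php
<?php
$a = '000E00080001';
$b = '000E0008';

$loose     = ($a == $b);             // numeric comparison: 0 == 0
$strict    = ($a === $b);            // same type and same characters required
$viaStrcmp = (strcmp($a, $b) === 0); // strcmp() returns 0 only for identical strings

var_dump($loose);     // bool(true)
var_dump($strict);    // bool(false)
var_dump($viaStrcmp); // bool(false)
```

Note the explicit === 0 on the strcmp() result. Testing the return value loosely would bring back the kind of implicit comparison we are trying to avoid.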

Notch one up for the === identity operator, which we use exclusively in MySource4. We even go as far as banning type-insensitive and implicit comparisons using PHP_CodeSniffer. If you want to do the same, you can get PHP_CodeSniffer from PEAR and use the included Squiz standard.