Thursday 14 February 2008

PHP_CodeSniffer == JS_CodeSniffer?

There has been one item on my PHP_CodeSniffer todo list for a long time: getting PHP_CodeSniffer to enforce coding standards for JavaScript files. I've finally found the time to get going with this project and I've made good progress over the last couple of days.

My local copy of PHP_CodeSniffer now uses the file extension to determine which tokenizer to use to parse each file. I've moved the tokenizing code out of PHP_CodeSniffer_File into a PHP tokenizer and have added a new JS tokenizer. Once the file is tokenized, all existing sniffs can be run on it, but obviously most of the existing ones are only going to work correctly for PHP files.
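As a rough illustration of the idea, choosing a tokenizer from the file extension could look something like the sketch below. The function name, the extension map and the fallback to the PHP tokenizer are all assumptions made for this example, not the actual PHP_CodeSniffer code.

```php
<?php
// Hypothetical helper: map a file's extension to a tokenizer name.
function chooseTokenizer($path)
{
    // Assumed mapping; the real list of extensions may differ.
    $tokenizers = array(
        'php' => 'PHP',
        'inc' => 'PHP',
        'js'  => 'JS',
    );

    $extension = strtolower(pathinfo($path, PATHINFO_EXTENSION));

    // Assumption: unknown extensions fall back to the PHP tokenizer.
    return isset($tokenizers[$extension]) ? $tokenizers[$extension] : 'PHP';
}

echo chooseTokenizer('lib/Widget.js'), "\n";
echo chooseTokenizer('index.php'), "\n";
```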

The next step is to look at all the existing sniffs and identify those that are PHP only and those that would also work for JS files. Once that happens, each will be flagged using a protected member var so developers of coding standards know which tokenizers a sniff supports. I'll only be doing this for the Squiz and Generic standards to start with as I'm sure PEAR has no desire to start writing a JavaScript coding standard.
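To give a feel for what such a flag might look like, here is a minimal sketch. The class name, the $supportedTokenizers property name and the supportsTokenizer() method are hypothetical and only illustrate the idea; the final implementation may well differ.

```php
<?php
// Hypothetical sniff; class, property and method names are illustrative only.
class Example_Sniffs_LineLengthSniff
{
    // Tokenizers this sniff can work with (assumed property name).
    protected $supportedTokenizers = array('PHP', 'JS');

    // Report whether the sniff can run on files parsed by the given tokenizer.
    public function supportsTokenizer($tokenizer)
    {
        return in_array($tokenizer, $this->supportedTokenizers, true);
    }
}

$sniff = new Example_Sniffs_LineLengthSniff();
var_dump($sniff->supportsTokenizer('JS'));
var_dump($sniff->supportsTokenizer('CSS'));
```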

If you have written your own sniffs, you don't need to change anything to keep your existing sniffs and standards working. You only need to make changes if you want to start checking JavaScript files as well.

My overall goal is to get PHP_CodeSniffer to check all file types that PHP web developers would commonly use, including JavaScript, CSS, HTML and XML. It's going to take a while to get there, but JavaScript files are easily the most complex files to parse in that list, so I'm well on the way. I'm hoping to get this change into CVS sometime next week and get a release candidate out after I've written a decent number of JS sniffs.

Tuesday 12 February 2008

When PHP string comparisons go wrong

I'm a little ashamed to admit that I didn't know about this before, but I've just never come across it. When PHP compares two strings with the == operator, it converts them to numbers if they both appear to be numbers. The conversion is skipped if you use the === operator, which compares types as well. The comparison operator docs state:

If you compare an integer with a string, the string is converted to a number. If you compare two numerical strings, they are compared as integers. These rules also apply to the switch statement.

Keeping that in mind, take a look at this code:
if ('000E00080001' == '000E0008') {
    echo 'equal';
}
This statement evaluates to TRUE, but I wasn't sure why.

Even reading the comparison operator docs doesn't make it clear, but they do link to a very important piece of information about strings. In particular, the section on string conversion to numbers states:
The string will evaluate as a float if it contains any of the characters '.', 'e', or 'E'. Otherwise, it will evaluate as an integer.

And the truth shall set you free!

The letter "E" in my two strings is treated as the start of an exponent. In both cases, the number is zero followed by an exponent, so PHP reads each string as "zero times ten to the power of ...", which is always zero. Because both strings are considered numbers, PHP ends up comparing zero with zero, and the result is TRUE.
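A quick way to see this for yourself is to ask PHP what it thinks each string is worth. This is just a small sketch using the two strings from the example above; the variable names are mine.

```php
<?php
$a = '000E00080001';
$b = '000E0008';

// Both strings look like numbers to PHP because of the "E".
var_dump(is_numeric($a));
var_dump(is_numeric($b));

// Cast to float, each one is zero times ten to some power, which is zero.
var_dump((float) $a);
var_dump((float) $b);

// So the loose comparison treats them as equal.
var_dump($a == $b);
```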

This also works with decimal points. This evaluates to TRUE as well:
if ('0000.' == '0000.0') {
    echo 'equal';
}
And just to confirm what is going on, adding a character that is not considered part of a number results in the expected behavior. This evaluates to FALSE because PHP no longer considers these values numbers:
if ('000E000F0001' == '000E000F') {
    echo 'equal';
}
And if the initial portion of our number is not zero, the exponent changes the value. In the following example we are no longer comparing "zero times ten to the power of ..." (which is always zero); we are comparing "one times ten to the power of ..." with two different exponents, so it evaluates to FALSE:
if ('001E00080001' == '001E0008') {
    echo 'equal';
}
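To make the difference concrete, here is a small sketch that casts the two strings from the last example to floats. The variable names are mine; the first exponent is so large that the value overflows the float range.

```php
<?php
$big   = (float) '001E00080001'; // 1 x 10^80001, far beyond the float range
$small = (float) '001E0008';     // 1 x 10^8

// The first value overflows to infinity; the second is an ordinary float.
var_dump(is_infinite($big));
var_dump($small);

// Different numeric values, so the loose comparison fails.
var_dump('001E00080001' == '001E0008');
```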
So what is the solution? You can either use strcmp() to compare your strings, or use the === operator to compare types as well. When PHP compares types, it will not try to convert strings to numbers.
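Here is a minimal sketch of both options side by side, reusing the strings from the first example (the variable names are mine):

```php
<?php
$a = '000E00080001';
$b = '000E0008';

// Loose comparison: both strings are converted to numbers first.
var_dump($a == $b);

// Strict comparison: the strings are compared as strings.
var_dump($a === $b);

// strcmp() returns 0 only when the two strings are identical.
var_dump(strcmp($a, $b) === 0);
```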

Notch one up for using the === operator, which we use exclusively in MySource4. We even go as far as banning type-insensitive and implicit comparisons using PHP_CodeSniffer. If you want to do the same, you can get PHP_CodeSniffer from PEAR and use the included Squiz standard.

Monday 4 February 2008

PHP_CodeSniffer 1.0.1 released

I've just uploaded PHP_CodeSniffer version 1.0.1, which contains 6 bug fixes and adds some new sniffs into the Squiz and MySource standards after some lobbying by the MySource4 team. The coding standard we use for MySource4 now has 94 different sniffs, each containing up to 5 different checks, so it is getting fairly strict. The code should be pretty enough to frame and stick on your wall!

One important change has been made to the PEAR standard in this release. A recent RFC that asked PEAR developers to vote on forcing protected member vars to be prefixed with an underscore got me looking for the relevant sniff in PHP_CodeSniffer. (I didn't see the call for votes, but I would have voted not to prefix protected vars with an underscore.) While PEAR decided against adding this new standard, I did realise that PHP_CodeSniffer was not currently enforcing the existing standard. The 1.0.1 release adds a new sniff to the PEAR standard to ensure only private member vars are prefixed with an underscore.

You can view the full changelog, and download the release, on the package download page.