Greg Sherwood: January 2007

Tuesday, 30 January 2007

Long error messages now wrap in PHP_CodeSniffer

I added a new feature to PHP_CodeSniffer yesterday that wraps error and warning messages to 80 characters. I wanted to blog about this feature because the patch was submitted to me by Endre Czirbesz, a user of PHP_CodeSniffer. I made some minor modifications to the patch to bring it up-to-date with the latest CVS code, but it's safe to say that this feature would not exist without Endre's help.

Here is an example of some error messages without wrapping (you'll have to forgive this Blogger template for cutting the content short):

FILE: FunctionClosingBraceSpaceUnitTest.inc
--------------------------------------------------------------------------------
FOUND 14 ERROR(S) AND 0 WARNING(S) AFFECTING 6 LINE(S)
--------------------------------------------------------------------------------
2 | ERROR | Missing file doc comment
3 | ERROR | Missing function doc comment
3 | ERROR | Opening function brace should be on a new line.
3 | ERROR | Function name "MyFunction1" is invalid; consider "myFunction1" instead
8 | ERROR | You must use "/**" style comments for a function comment
8 | ERROR | Opening function brace should be on a new line.
8 | ERROR | Function name "MyFunction2" is invalid; consider "myFunction2" instead
--------------------------------------------------------------------------------

And here are the same errors with wrapping:

FILE: FunctionClosingBraceSpaceUnitTest.inc
--------------------------------------------------------------------------------
FOUND 14 ERROR(S) AND 0 WARNING(S) AFFECTING 6 LINE(S)
--------------------------------------------------------------------------------
2 | ERROR | Missing file doc comment
3 | ERROR | Missing function doc comment
3 | ERROR | Opening function brace should be on a new line
3 | ERROR | Function name "MyFunction1" is invalid; consider "myFunction1"
  |       | instead
8 | ERROR | You must use "/**" style comments for a function comment
8 | ERROR | Opening function brace should be on a new line
8 | ERROR | Function name "MyFunction2" is invalid; consider "myFunction2"
  |       | instead
--------------------------------------------------------------------------------

Even without truly excessive error messages, the new wrapped format is much easier to read or send in an email notification. It also allows for more descriptive error messages in the future, if required.
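For anyone curious how this kind of wrapping can be done, here is a minimal sketch. It is written in Python for illustration and is not the PHP code from the patch; the function name format_row and the exact column layout are assumptions made for this example, and only the 80-character width comes from the feature itself.

```python
import textwrap


def format_row(line_no, severity, message, width=80):
    """Format one report row, wrapping the message so no row exceeds `width`."""
    prefix = f"{line_no} | {severity} | "
    # Continuation rows keep the column separators but leave the cells blank.
    blank = " " * len(str(line_no)) + " | " + " " * len(severity) + " | "
    # Wrap only the message text, to whatever space is left after the prefix.
    chunks = textwrap.wrap(message, width=width - len(prefix)) or [""]
    rows = [prefix + chunks[0]]
    rows += [blank + chunk for chunk in chunks[1:]]
    return rows


message = 'Function name "MyFunction1" is invalid; consider "myFunction1" instead'
for row in format_row(3, "ERROR", message):
    print(row)
```

With the long function name message from the report above, this produces two rows, the second carrying only the word "instead" under the message column.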

You can get the latest PHP_CodeSniffer with this feature in it from CVS right now, or wait until the next official release, which is probably still a few weeks away.

MySource 4.0 WebDAV update

Today, finally, I managed to get both Windows and OS X to open and edit MS Office documents via WebDAV. It's been a real pain, but I'm finally in a position where I can start migrating all my hard-coded logic into MySource 4.0. The only problem: MySource 4.0 isn't ready for it yet.

I really need to wait until the Project system is developed before I start trying to get WebDAV working within MySource 4.0. A lot of decisions need to be made about what folders will be made available and at what URLs. I'm hoping that work on the Project system will begin next week, so I'm looking at about mid/late Feb before WebDAV integration begins.

So what are my impressions of WebDAV? Well, I think it is a fantastic idea and should allow for much easier content editing and mass uploading of content. But it does have its limits. It is really most suited to editing files (like Word Documents) rather than HTML content, which makes up most of the content in the average website. Sure, you can edit the HTML of a page, but you lose all the tools the CMS provides, like internal linking and content reuse.

I also love OS X even more now! WebDAV support in OS X is just fantastic, and is really the way I assumed all operating systems would implement WebDAV. OS X "maps" a WebDAV server to make it look like an external drive. All saving and locking is handled at the OS level, so any application on OS X can support WebDAV by doing... well, nothing. This allows me to edit text files in my favorite editor rather than a specific "WebDAV enabled" one.

The down-side? OS X loves writing "._" files everywhere, and I've needed to support these files to an extent to make sure all applications work correctly. Obviously, I don't want them written to the CMS, so I store them in a temporary directory and manipulate them there.
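The routing decision for those files is simple enough to sketch. This is a hedged illustration in Python, not the PHP from my WebDAV server; storage_path_for, the cms_root argument and the scratch directory name are all hypothetical, and the sketch does no path sanitising.

```python
import posixpath
import tempfile


def is_appledouble(path):
    """True if the last path segment is an OS X "._" companion file."""
    return posixpath.basename(path).startswith("._")


def storage_path_for(request_path, cms_root, temp_root=None):
    """Pick where a WebDAV write should land (hypothetical layout)."""
    if temp_root is None:
        temp_root = posixpath.join(tempfile.gettempdir(), "webdav_dotfiles")
    relative = request_path.lstrip("/")
    # "._" files never reach the CMS; they live in a scratch directory instead.
    root = temp_root if is_appledouble(request_path) else cms_root
    return posixpath.join(root, relative)
```

The client still sees its "._" files where it expects them, because reads for those paths would be served from the same scratch directory.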

Windows is... different. Explorer supports WebDAV through Web Folders. You can browse a Web Folder, but you can't open and edit any file you want. The suggested way for using Web Folders and WebDAV is to copy the remote file to your local machine, edit it, and then re-upload it. You can also drag files directly into the Web Folder to upload them. To do any real editing, you need to use an application that supports editing via WebDAV itself. The OS will not do it for you.

What does this mean? Well, for one thing, the number of applications you can use for editing is greatly limited. I can't, for example, edit a text file in Notepad because Notepad doesn't know how to edit a WebDAV document. I can use WebDAV-enabled applications like the MS Office suite or Dreamweaver to edit content, so at least the big names have support.

The other thing that really annoyed me was that Windows, when creating a Web Folder, will send an OPTIONS request to the root of the server. I work on a shared development server, and my URL is something like delta.squiz.net/~gsherwood/webdav_server.php. However, I can't use this URL to create a Web Folder because the OPTIONS request that goes to delta.squiz.net is ignored, so Windows thinks the server is invalid. To get around this, I've had to get a new domain configured in Apache that sets the document root to my WebDAV directory. An alias has also been configured to point all traffic on my new domain to a single PHP file: my WebDAV server. It's a pain, but it works.
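To make the requirement concrete, here is a minimal sketch of a server that answers OPTIONS on every path, the root included. It is Python rather than the PHP I actually use, and the class name, helper name and the list of methods in the Allow header are assumptions for illustration. The DAV header itself is the standard way a server advertises WebDAV support.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def options_response_headers():
    """Headers a WebDAV-aware client looks for in an OPTIONS reply."""
    return [
        ("DAV", "1,2"),  # compliance classes: 1 = basic, 2 = locking
        ("Allow", "OPTIONS, GET, PUT, DELETE, PROPFIND, LOCK, UNLOCK"),
        ("Content-Length", "0"),
    ]


class DavOptionsHandler(BaseHTTPRequestHandler):
    """Answers OPTIONS the same way for "/" and for any deeper path."""

    def do_OPTIONS(self):
        self.send_response(200)
        for name, value in options_response_headers():
            self.send_header(name, value)
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def make_server(port=0):
    """Bind to localhost only; port 0 lets the OS pick a free port."""
    return HTTPServer(("127.0.0.1", port), DavOptionsHandler)
```

The point is only that whatever handles the root of the domain has to give this answer, which is why routing every request on the new domain to the one script works.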

Bad points aside, it was Microsoft that helped me get WebDAV working in the first place. I found the MSDN article on the WebDAV LOCK method very informative. I also configured a WebDAV server running on IIS and analysed the requests that two Windows boxes were sending to each other using Ethereal (I highly recommend this approach if it is available to you). Plus, seeing a PHP app that runs on Linux integrate so closely with Windows is a beautiful sight, so I can't be completely against Microsoft on this one.

Tuesday, 23 January 2007

WebDAV support scheduled for MySource 4.0 alpha release

We've been having a think about the features we want to add to MySource 4.0 after the alpha release. WebDAV support was on the list of nice-to-haves, but the current demand (both internal and external to Squiz) for integrated content editing has pushed it into the alpha release.

So I've been trying to get WebDAV support going in MySource 4.0. To say it's been a pain in the arse is putting it mildly.

I started playing with WebDAV in MySource Matrix quite a long time ago. I got to the point where I could browse a MySource Matrix system via WebDAV, but never implemented content editing. I ran out of time and needed some client funding to keep the project going. It never arrived, so the project was shelved.

I started back on WebDAV yesterday and got myself back to the point where I needed to implement content editing today. As of now, I can safely say that today has been one of the most horrible in memory, and I still don't have content editing via WebDAV!

The thing that really bothers me is that every WebDAV client likes to do things differently. There is also a severe lack of examples for WebDAV support in PHP, and those I do have don't actually work for content editing. So I think I'm going to need to do a lot more reading and implement a lot of the protocol myself. That wouldn't normally annoy me, but now I'm on a deadline, and the clock is ticking.

Tuesday, 16 January 2007

Who will provide decoupled content management services? MySource 4.0 will.

CMS Watch has posted a new article about decoupling content management services. The article talks about the ability for customers to use a CMS but also use 3rd party products to provide services that the CMS would normally offer natively.

Mediasurface, Stellent, Documentum and many other vendors have successfully decoupled their repository search capabilities from their underlying CMS. Other observers have asked for decoupling of other services like security, and I have often seen customers asking if they can use their own choice of workflow software, version management software, and so on, instead of using features embedded in the CMS.

The article goes on to say:

To allow decoupling (or even loose coupling) of services, most of them [CMS vendors] would need to re-architect their products.

Very true.

To use MySource Matrix as an example, if we were asked to integrate a 3rd party workflow system into the product, the only solution would be a very messy trigger-based setup or a custom CMS product for that client. Workflow is so tightly integrated into the core product that removing it, or replacing it, is not practical.

Enter MySource 4.0.

One of the primary goals we had during the early planning phase of MySource 4.0 was to completely decouple the core. Services like workflow, versioning, status, metadata and permissions have been designed to be independent services with either no coupling or very loose coupling to each other. This goal was focussed on providing customised CMS solutions to our clients, allowing them to remove or replace these core services.

For example, workflow adds a level of complexity to the publishing process. If a customer does not require complex workflow rules, the workflow system can be removed completely without affecting the functionality of other core services. Similarly, if the default workflow engine does not work for a customer's organisation, a new engine can be written and the default can be replaced.
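As a rough sketch of what "remove or replace" means in code, consider the following. It is written in Python purely for illustration and says nothing about how MySource 4.0 is actually implemented; every class and method name here is hypothetical.

```python
from typing import Optional, Protocol


class WorkflowEngine(Protocol):
    """Anything with this one method can act as the workflow service."""

    def can_publish(self, asset: str, user: str) -> bool: ...


class ApprovalWorkflow:
    """Stand-in for a default engine: only listed approvers may publish."""

    def __init__(self, approvers):
        self.approvers = set(approvers)

    def can_publish(self, asset, user):
        return user in self.approvers


class Publisher:
    """Core publishing code that knows nothing about a specific engine."""

    def __init__(self, workflow: Optional[WorkflowEngine] = None):
        self.workflow = workflow

    def publish(self, asset, user):
        # With no engine installed, publishing is unrestricted.
        if self.workflow is not None and not self.workflow.can_publish(asset, user):
            return False
        return True
```

Here Publisher() works with workflow removed entirely, and Publisher(ApprovalWorkflow(["alice"])) swaps an engine in without any change to the publishing code.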

This core architectural feature makes MySource 4.0 ideally suited to the integration of 3rd party services into the core. We have already anticipated support for 3rd party search engines, but it will be interesting to see what other core CMS services our customers would like to replace.

Monday, 15 January 2007

MySource Matrix 3.12.0 released

Version 3.12.0 (stable) of MySource Matrix was released today. The additional week between the expected and actual release dates was used to make the release ready for the new dual licensing scheme; MySource Matrix is now available under both the GPL and the new SSV (Squiz Supported Version) licence.

In addition, the core (GPL) version of 3.12.0 now contains a significant number of new modules that were previously only available through the commercial module package. The entire CMS package, minus the Remote Content asset, and the entire News package are now freely available in the download.

Some of the new asset types included in the download are the Custom Form, the Asset Builder, the Online Poll and the RSS Feed. This is an incredible increase in functionality available in the free download.

If you are an existing user of MySource Matrix, make sure you view the updated licence page on the Squiz website for more information about the new licensing scheme, and the updated modules section of the MySource Matrix website for more information about which modules are only available under the SSV.

And of course, you can download MySource Matrix from the downloads page.

UPDATE: you can view the 3.12.0 press release on the Squiz website.

Friday, 12 January 2007

PHP_CodeSniffer DocBook documentation

I started writing the documentation for PHP_CodeSniffer today. Before I can release a stable version, I have to make sure there is some documentation in the PEAR manual (peardoc).

The catch? The PEAR manual is written in DocBook XML and is pretty hard to get working, at least on OS X. Chapter 21 of the PEAR manual describes how to write documentation and the software you need to install to get it working. The section Required software reads:

Unfortunately, installing that software can be difficult under some circumstances. If you are unable to get it working, don't use that as an excuse for not writing documentation. There are two test servers that automatically download peardoc from CVS and build the manual. Any parsing errors your changes cause will show up in the logs the next time the build happens:

That didn't fill me with confidence.

Checking out peardoc from CVS was easy, but of course, my first attempt at trying to compile the manual failed; peardoc couldn't find my DSSSL stylesheets. I was not prepared to commit any documentation unless I could test it locally.

Luckily for me, I started playing around with peardoc 12 months ago, when work was a little less busy. Apparently, I had been successful in installing OpenJade and the DSSSL stylesheets. The one thing missing was the command to tell peardoc where those stylesheets were.

The Testing documentation section of the PEAR manual says that the following commands will build peardoc on your local machine:

peardoc$ autoconf
peardoc$ ./configure --with-lang=en
peardoc$ make html

I'll describe that as mostly right. If peardoc doesn't know where your stylesheets are, you'll need to tell it, like this:

peardoc$ autoconf
peardoc$ ./configure --with-lang=en --with-dsssl=/path/to/docbook-dsssl/
peardoc$ make html

I got the DSSSL stylesheets from a Fink package called docbook-dsssl-nwalsh, so my stylesheet path is /sw/share/sgml/dsssl/docbook-dsssl-nwalsh/.

After several minutes, the PEAR manual was finally mine.

I started by porting over the docs about PHP_CodeSniffer that I had put up on the MySource Matrix website. This was my first real attempt at writing in DocBook format. I'll post about my impressions when I've finished writing all the docs, but first impressions are pretty good.

So now my first set of docs has been committed to peardoc. Now to wait for the next test manual generation to see if I did things correctly. It's a nervous wait!

UPDATE: the docs generated correctly and are now available in the PEAR manual.

PHP_CodeSniffer 0.3.0 (beta) released

I released version 0.3.0 of PHP_CodeSniffer through PEAR today. It's been quite a while since the last release, but that is only because I've been hard at work adding a lot of new sniffs and doing a lot of work testing the existing ones.

The biggest change in this version is the addition of a new coding standard. PHP_CodeSniffer did come with the PEAR coding standard loaded by default, but it now comes with the Squiz coding standard as well. I wasn't sure how the addition of an external coding standard was going to be received, so I hadn't really considered making the Squiz standard part of the core download. I decided to add the standard in the end because I think it is important to offer users some choice over the coding standards they use, and the Squiz standard provides a great many additional sniffs that can be used together or in a user-defined standard.

The PEAR standard also (finally) got a set of doc comment tests that check for file, function and class doc comments. This is a really great milestone for PHP_CodeSniffer development as these were the last sniffs that needed to be written before I could consider a beta release.

So for that reason, PHP_CodeSniffer is now finally in beta. I'm not planning any additional development for the RC or stable release. Obviously, bugs will be fixed if they are submitted, but no new features are planned for development. I really want to stabilise what we have and get out version 1.0.0 so more people can start using it in their projects.

Tuesday, 9 January 2007

MySource Matrix 3.12.0 delayed

The release of MySource Matrix version 3.12.0, the first stable release on the 3.12 branch, has been delayed by one week. Some last-minute development was planned and implemented on Friday, with some minor changes made both yesterday and today.

This new development does not affect any core functionality, so we have no need for a month-long RC2 phase.

The new release date for 3.12.0 is Monday the 15th Jan, 2007.

On a related note, versions 3.8.10 and 3.10.5 were released yesterday. Version 3.8.10 is the final regular release on the 3.8 stable branch and contains 8 bug fixes. Version 3.10.5 is a regular maintenance release for the 3.10 stable branch and contains more than 20 bug fixes as well as a couple of minor feature additions.

Thursday, 4 January 2007

PHP array of objects, passed by reference; giving me a headache!

With all the Squiz sniffs committed, I started running PHP_CodeSniffer over itself to check for any coding standard errors, and to give it a test run on some real code after such a large amount of dev. I wasn't expecting any problems; all the unit tests were passing and I hadn't done any reworking of the core.

What I got was an exception thrown almost immediately, and on a file I'd checked plenty of times previously with older versions of PHP_CodeSniffer. Before I explain the problem, let me explain how PHP_CodeSniffer creates objects.

The main PHP_CodeSniffer object is created, goes through a coding standard directory (e.g. PEAR) and creates an object for each sniff it finds in there. It then places that sniff object into an array, indexed by the type of tokens that the sniff wants to listen for. What you end up with is basically a big array of objects, stored in a private member variable called $_listeners.
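The shape of that structure can be sketched as follows. This is an illustrative Python version, not the PHP source; the two sniff classes and the build_listeners function are made up for the example, although T_FUNCTION and T_CLASS are real PHP token names.

```python
from collections import defaultdict


class FunctionCommentSniff:
    tokens = ["T_FUNCTION"]


class BraceSniff:
    tokens = ["T_FUNCTION", "T_CLASS"]


def build_listeners(sniff_classes):
    """Create one object per sniff and index it by each token it listens for."""
    listeners = defaultdict(list)
    for sniff_class in sniff_classes:
        sniff = sniff_class()  # a single instance per sniff
        for token in sniff.tokens:
            listeners[token].append(sniff)
    return dict(listeners)


listeners = build_listeners([FunctionCommentSniff, BraceSniff])
```

Note that the single BraceSniff object ends up under both token types, so the same object can appear in the structure more than once.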

When PHP_CodeSniffer starts processing a file, it creates a PHP_CodeSniffer_File object and passes it the $_listeners array. The PHP_CodeSniffer_File object then stores that array in its own private member var (also called $_listeners) and uses those listeners to process tokens as they are found. Some sniffs actually ask the PHP_CodeSniffer_File object to modify the $_listeners array for the current file.

When PHP_CodeSniffer has finished processing a file, it destroys the PHP_CodeSniffer_File object and creates a new instance of the class for the next file in the processing queue. It passes another copy of the $_listeners array to the new object so it can process its file.

And therein lies the problem: the passing of the $_listeners array from PHP_CodeSniffer to the PHP_CodeSniffer_File object.

Through debugging, it was pretty easy to work out that the problem was with object references. The $_listeners array passed to the PHP_CodeSniffer_File object should have been the same each time, but an MD5 of the serialized array showed that the array was different each time. There was a really simple reason for this: in PHP 5, copying or passing an array does not copy the objects inside it. Every copy of the array still refers to the same sniff objects, so an array of objects effectively behaves as if it were passed by reference.

Easy fix: clone the array and pass in the clone each time. Unfortunately, you can't clone an array of objects in PHP. So what's the solution? Iterate over the array and clone each object as it is found.

Forget that! If I'm going to be creating new objects through cloning each time, I may as well just recreate the array each time. It ends up being less work in the long run due to the way the array is structured and the fact that the same object can appear in the array more than once.

It's probably worth noting at this point that the array actually contains pointers to the objects and not the objects themselves, which is why the memory requirements of this array are not out of control and also why PHP can't clone the array: it can't clone the reference to the object, only the object itself.
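The same effect is easy to reproduce outside PHP. The sketch below uses Python, where a list of objects behaves much like a PHP 5 array of object handles for this purpose; the Sniff class and process_file function are stand-ins invented for the demonstration, not PHP_CodeSniffer code.

```python
import copy


class Sniff:
    def __init__(self):
        self.seen = []  # per-file state the sniff accumulates


def process_file(listeners, filename):
    """Simulate a file run that changes the sniff objects it was handed."""
    for sniff in listeners:
        sniff.seen.append(filename)


master = [Sniff()]

# Copying the container gives a new list holding the SAME Sniff object.
first_copy = list(master)
process_file(first_copy, "a.php")
leaked = list(master[0].seen)  # the master copy now carries state from a.php

# Copying every object as well is the only way to isolate the next file.
isolated = copy.deepcopy(master)
process_file(isolated, "b.php")
```

After the first run the master list has been changed through its copy, which is the behaviour the MD5 check exposed. Only the deep copy leaves the master untouched by the second file.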

So I rewrote the way the $_listeners array gets to the PHP_CodeSniffer_File object, and after a fair bit of tweaking the unit tests were all passing again and PHP_CodeSniffer was running nicely over itself.

So what did I change? The PHP_CodeSniffer object now just recurses the coding standard directory and locates valid sniff files. It then includes the class and places the class name into a single-dimensional array, $_listeners. It passes a copy of this array to the PHP_CodeSniffer_File object, which then loops over the array and recreates the old $_listeners array structure for itself to use. Once the file has finished processing, the object, and the $_listeners array, are destroyed.
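In outline, the new arrangement looks something like this. Again this is a Python sketch under assumptions rather than the real PHP: CommentSniff and SourceFile are invented names, and the sketch passes class objects around where the PHP version passes class names as strings.

```python
class CommentSniff:
    tokens = ["T_FUNCTION"]

    def __init__(self):
        self.state = []  # anything the sniff records while checking a file


class SourceFile:
    """Builds its own sniff objects from a flat list, then discards them."""

    def __init__(self, filename, sniff_classes):
        self.filename = filename
        self.listeners = {}
        for sniff_class in sniff_classes:
            sniff = sniff_class()  # fresh instance for this file only
            for token in sniff.tokens:
                self.listeners.setdefault(token, []).append(sniff)


# The flat list holds no objects, so sharing it between files is harmless.
SNIFF_CLASSES = [CommentSniff]

first = SourceFile("a.php", SNIFF_CLASSES)
second = SourceFile("b.php", SNIFF_CLASSES)
first.listeners["T_FUNCTION"][0].state.append("changed while checking a.php")
```

Because each file object creates its own sniffs, nothing one file does to a sniff can be seen by the next file.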

But why were the unit tests passing? When the unit tests run, PHP_CodeSniffer only creates and adds a single sniff object to the $_listeners array, and only checks a single file with that sniff. This is to ensure that each sniff's test file is only recording problems with the particular sniff it belongs to and not generating other random errors from the rest of the standard. Because of this, the sniffs were never reused and never caused any problems.

So PHP_CodeSniffer now has a lot more object creation happening, but the speed and memory usage seem to be comparable to the old version. The changes aren't in CVS yet, but they will be once I've finished checking the rest of PHP_CodeSniffer for other obscure errors.

Tuesday, 2 January 2007

Squiz coding standard finally in PHP_CodeSniffer

I finally finished moving all the Squiz coding standard sniffs from SVN to PEAR CVS today. I also fixed up the 3 failing tests that have been bugging me for weeks. They weren't as hard to fix as I anticipated, which was a real bonus.

There are still 4 sniffs that need to be written before version 1.0 of the Squiz Coding Standard can be considered complete. The sniffs that check the file, class, member var and function doc comments are yet to be written. I sent around an email to the MySource 4.0 team today with a suggested commenting layout, so it shouldn't be long until those are sorted out too.

On a related note, the commenting sniffs for the PEAR coding standard are looking very close to being completed. A few minor adjustments are being made to the unit tests to cover off some checks that were not being tested (the code coverage reports really came through and did their job on that one). They look like being completed some time tomorrow.

I'm holding off on releasing another version of PHP_CodeSniffer until after all the sniffs (both PEAR and Squiz) are in place. That will be a really good time to upgrade the package's status from alpha to beta, finally.
