Showing posts with label WebDAV. Show all posts

Thursday, 8 November 2007

Unique lock tokens in MySource4

We've recently moved onto designing the new MySource4 locking system. My work on locking during the WebDAV development led me to the idea of unique lock tokens, which I had not previously considered for the locking system. I never saw a real need for lock tokens to be unique each time a lock is acquired on a resource, but now I do.

In MySource Matrix, each lock you acquire has a unique system-wide lock token that consists of the asset ID and a lock type (e.g., attributes or linking). Each time that lock is acquired, the same lock token is generated.

MySource4 uses an alternative system where the lock token is always unique over time by using a UUID. This means that when you acquire a lock, MySource4 will give you this lock token and you will need to supply it back to MySource4 when you save. Because these lock tokens are randomly generated, they cannot practically be guessed. This is the same way WebDAV clients work.

The real difference in these two systems is trust.

In MySource Matrix, if you are logged in as the same user that acquired the lock, you can use that lock to save content. You do not need to supply a lock token so you can log into two different browser sessions and happily lock and save content in each.

In MySource4, if you acquire a lock in one browser, you cannot reuse that lock in another. The editing interface (acting as the client here) has to send the unique lock token back to MySource4 to save the content. The editing interface in the second browser will not know the correct lock token, and so the save will fail.
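The difference between the two token schemes can be sketched in a few lines of Python. This is only an illustration, not the MySource code: the `matrix_style_token` function, the `UniqueTokenLocks` class and the token format are all made up for the example.

```python
import uuid
from typing import Optional


def matrix_style_token(asset_id: str, lock_type: str) -> str:
    """Deterministic token: the same asset and lock type always give the same value."""
    return f"{asset_id}:{lock_type}"


class UniqueTokenLocks:
    """Hypothetical lock store that issues a fresh UUID each time a lock is acquired."""

    def __init__(self) -> None:
        self._tokens = {}  # asset ID -> token of the current lock holder

    def acquire(self, asset_id: str) -> Optional[str]:
        """Return a new token, or None if the asset is already locked."""
        if asset_id in self._tokens:
            return None
        token = str(uuid.uuid4())
        self._tokens[asset_id] = token
        return token

    def can_save(self, asset_id: str, token: str) -> bool:
        """A save is only allowed when the caller supplies the current token."""
        return self._tokens.get(asset_id) == token

    def release(self, asset_id: str, token: str) -> bool:
        """Release the lock if the token matches; report whether it was released."""
        if self.can_save(asset_id, token):
            del self._tokens[asset_id]
            return True
        return False
```

A second browser session logged in as the same user never receives the token, so `acquire` returns `None` for it and `can_save` fails for any value it makes up.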

That's really all behind the scenes, and users don't need to worry about that when they lock content. In fact, users never have to explicitly lock content in MySource4, so even though the locking system is more complicated, the end-user experience is greatly simplified. It does stop one important conflict from occurring though: two editors logged in as the same user overwriting each other's content.

Although it is highly discouraged, editors will sometimes log into the system using the same username and password. Even worse, this can happen with the root account. When this happens, the locking system has no way to differentiate between the two editors, and so the locks of one become the locks of the other. This can lead to a situation where both editors are editing the same content at once and the last one to save wins.

In MySource4, once the first editor locks a piece of content, the second editor will see the content as locked because they do not have the correct lock token. So even though these two editors are logged in as the same user, they are treated as different users by the locking system.

This small change will reduce the chance of editing conflicts, including any editing that might take place in a replicated editing environment.

Friday, 5 October 2007

Deleting folders in WebDAV

I was coding up support for the DELETE request in WebDAV and came across yet another difference between Windows and OS X. I had a good run of things "just working" on both operating systems, but I guess it had to come to an end eventually.

Let's say I have the following directory:

Images
|_ img1.png
|_ img2.png
|_ img3.png

When you delete the directory in OS X, the following requests are sent:

DELETE Images/img3.png
DELETE Images/img2.png
DELETE Images/img1.png
DELETE Images

When you delete the directory in Windows, the following requests are sent:

DELETE Images

Spot the difference?

I'd rather have a different DELETE request sent for each file and folder, allowing me to delete the temporary files along with the DB entries much more easily. Having said that, I'm not totally against "the Windows way". It is better for performance but it will leave a lot of temp files hanging around because the request would take too long if I went through and deleted each one. That's not really a problem because they will be cleaned up eventually, or reused.

Luckily, OS X sends the requests from the bottom up, allowing me to process both methods in the same way. So on OS X, temp files will be deleted; on Windows, they will not be.
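Here is a rough Python sketch of why one handler can serve both clients. It only models the stored records as a set of paths (temp file cleanup is left out), and the `delete_resource` name and the set-based store are assumptions made for the example.

```python
def delete_resource(store: set, path: str) -> list:
    """Remove a path and everything beneath it; return the paths that were removed."""
    target = path.rstrip("/")
    prefix = target + "/"
    removed = sorted(p for p in store if p == target or p.startswith(prefix))
    store.difference_update(removed)
    return removed


def make_store() -> set:
    """The example directory from above, as a flat set of paths."""
    return {"Images", "Images/img1.png", "Images/img2.png", "Images/img3.png"}


# OS X: children first, then the folder. By the time the last request
# arrives, the folder is already empty.
osx_store = make_store()
for request_path in ["Images/img3.png", "Images/img2.png", "Images/img1.png", "Images"]:
    delete_resource(osx_store, request_path)

# Windows: a single request for the folder removes the children as well.
windows_store = make_store()
delete_resource(windows_store, "Images")
```

Both stores end up empty, so the handler does not need to know which client sent the request.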

It's not the individual processes that get to me. It's the fact that there are (again) two different ways of doing things.

...and don't get me started on Vista!

Sunday, 30 September 2007

MySource 4.0 WebDAV update

In a previous post, I mentioned some WebDAV improvements that I was thinking of implementing in MySource4. I've finally got around to trying some of them out, with positive results. The improvement I'm currently making to MySource4's WebDAV server is the ability to add files and folders.

Adding this feature was a lot easier than I thought, but the functionality it provides is much more impressive. I'm now able to drag files from my local machine into the WebDAV folder. I can also drag entire folders and have them imported along with their sub-folders and files. This makes mass-importing of files and folders as easy as copy/paste and is going to be an important part of any document management system.

The files and folders that are imported exist only in WebDAV and are not imported into MySource4 because I can't trust most WebDAV requests. When an administration interface is added to MySource4, I'll put something into it that allows the changes made in WebDAV to be committed to the MySource4 system. For now though, all imported files and folders can be viewed and edited by all editors using the WebDAV interface, so it works sort of like a staging area.

As much pain as WebDAV has caused me in the past, I've fallen deeply in love with it again!

Thursday, 14 June 2007

MySource 4.0 internal demo

Today, I gave the first internal demonstration of MySource 4.0 to Squiz. A lot of people have been waiting a long time to get a look at our progress and it was great to finally be able to show them what we've been working on.

The first part of the demonstration was MySource 4.0 itself. I showed the installation process and created a simple five page website with an imported design. The inline content editing interfaces were shown with raw HTML text boxes only. Finally, I demonstrated MySource 4.0's WebDAV integration by editing a Word Document that was uploaded through the content editing interface.

The second part was a batch processing test. I added and deleted a web path and role to 135,000 pages that we had previously imported into a test system. Adding took around 26 seconds. Deleting took around 6 seconds. To compare that to MySource Matrix, we would be acting upon somewhere between 540,000 and 800,000 assets and both adding and deleting would take somewhere between 10 and 24 hours, depending on hardware and load.

The last part of the demonstration was a preview of Viper, the MySource 4.0 WYSIWYG editor. This was really exciting and drove home our goal of inline content editing. Unlike other WYSIWYG editors, Viper is written in JavaScript but does not use the browser's built-in editable region. The result is an editor where we have complete control over how it works, how it looks and the HTML it produces.

We are now busy preparing features for the first public demonstration at the MySource Matrix International User Conference 2007 in September.

Friday, 30 March 2007

MySource 4.0 WebDAV ideas

I've been thinking about how to implement some new features and improvements into the MySource 4.0 WebDAV server. All these changes would go in after the alpha release and user feedback, but I thought I'd try and list them in one place so I don't forget about them.

Adding new files and folders
The problem with adding (and deleting) files and folders is actually editing. That sounds a bit weird, so let me explain.

When you edit files, the application or file system decides on the process it will use to save the file. Most applications don't make things easy by using a simple LOCK then PUT (a new version) then UNLOCK. Most applications I test with move the current file and create temp files and directories before finally renaming a temp file to replace the existing file we are editing. This means that I get a lot of DELETE, MOVE, PUT and MKCOL requests coming through. If I actually processed all of these requests in the CMS, assets would be moving all over the place, and often being replaced by new assets with the same name while they end up in the trash.

To solve this problem, I don't action any of these requests when asked. Instead, I keep a record of where files and folders should be. This allows me to pretend that files have been moved, created or deleted without having to actually do it. Then when a temp file is moved to the location of an existing file, I update the file's contents in the CMS.
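A minimal Python sketch of that record-keeping idea follows. It is an illustration under assumptions, not the real server: the `VirtualTree` class, its method names and the dictionary standing in for the CMS are all hypothetical.

```python
class VirtualTree:
    """Tracks where the client thinks files are without touching the CMS."""

    def __init__(self, cms_files: dict) -> None:
        self.cms = cms_files   # path -> bytes, a stand-in for real CMS assets
        self.pending = {}      # path -> bytes, files that exist only in WebDAV
        self.hidden = set()    # CMS paths the client believes are gone

    def put(self, path: str, data: bytes) -> None:
        # Never trusted as a real create; just remember the content.
        self.pending[path] = data

    def delete(self, path: str) -> None:
        if path in self.pending:
            del self.pending[path]
        elif path in self.cms:
            self.hidden.add(path)  # pretend it is gone

    def move(self, src: str, dst: str) -> None:
        if src in self.pending:
            data = self.pending.pop(src)
            if dst in self.cms:
                # A temp file landed on an existing file: this is the real save.
                self.cms[dst] = data
                self.hidden.discard(dst)
            else:
                self.pending[dst] = data
        elif src in self.cms and src not in self.hidden:
            # Pretend the CMS file moved: hide it and keep a copy under the new name.
            self.pending[dst] = self.cms[src]
            self.hidden.add(src)


# One possible save sequence from an application that shuffles temp files around.
tree = VirtualTree({"report.doc": b"old"})
tree.move("report.doc", "backup.tmp")
tree.put("temp.tmp", b"new")
tree.move("temp.tmp", "report.doc")
tree.delete("backup.tmp")
```

After the sequence, the only change the CMS stand-in has seen is new content for `report.doc`. Nothing was moved, trashed or recreated along the way.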

That is a lot of background to basically say that I can never trust a PUT request. There is no way of knowing if it is a real request to create a new file or if it is just an application trying to save an existing file. Forget sessions and cookies to store information about what is going on. They are not supported in WebDAV.

So here is my idea. When someone drags a file into a folder, I pretend that it exists. I don't currently show that file in the folder, but I could. If I do show the file, I can also allow it to be edited. For users of WebDAV, it would appear that the file has been created in the CMS, but it really only exists in the WebDAV database. To commit the file to the CMS, a user could go to some location in MySource 4.0 and indicate that a pending WebDAV file should be turned into a real File asset.

Deleting existing files and folders
For the same reason as adding files and folders is a problem, deleting is a problem as well. There is no way to tell if the user deleted the file or if the application deleted it while saving.

In the same way that added files can be shown, deleted files can be hidden. In fact, I already do hide them because it causes OS X's Finder to go crazy if it thinks a file should be deleted but it is still there (fair enough I guess). Again, a user could go to a location in MySource 4.0 and indicate that files marked as deleted should be moved to the trash within the CMS.
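The show/hide and commit steps could look something like the Python sketch below. The function names and the dictionaries standing in for the CMS, the trash and the WebDAV database are assumptions for the example only.

```python
def visible_paths(cms: dict, pending: dict, hidden: set) -> list:
    """What a WebDAV client is shown: CMS files minus hidden ones, plus pending ones."""
    return sorted((set(cms) - hidden) | set(pending))


def commit(cms: dict, trash: dict, pending: dict, hidden: set) -> None:
    """Apply the changes a user has approved: trash hidden files, create pending ones."""
    for path in sorted(hidden):
        if path in cms:
            trash[path] = cms.pop(path)
    hidden.clear()
    cms.update(pending)
    pending.clear()


cms = {"a.doc": b"A", "b.doc": b"B"}
trash = {}
pending = {"c.doc": b"C"}   # dragged in via WebDAV, not yet a real File asset
hidden = {"b.doc"}          # deleted via WebDAV, not yet trashed in the CMS

listing_before_commit = visible_paths(cms, pending, hidden)
commit(cms, trash, pending, hidden)
```

Until `commit` runs, the CMS itself is untouched. Only the listing a WebDAV client sees reflects the adds and deletes.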

Authentication and permissions
Forcing a user to authenticate before connecting to the WebDAV server is dead easy. However, you do need to re-authenticate every time a request is made because sessions (cookies) are not supported. That isn't a problem, but it is something to remember when this functionality is added.
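As a sketch of what re-authenticating on every request means in practice, here is a small Python example. It assumes HTTP Basic authentication, which is my assumption for the illustration and not a decision that has been made; the function names and the plain-text `users` dictionary are made up too.

```python
import base64
import binascii
import hmac
from typing import Optional, Tuple


def parse_basic_auth(header_value: Optional[str]) -> Optional[Tuple[str, str]]:
    """Extract (username, password) from an Authorization header, or None."""
    if not header_value:
        return None
    scheme, _, encoded = header_value.partition(" ")
    if scheme.lower() != "basic":
        return None
    try:
        decoded = base64.b64decode(encoded.strip(), validate=True).decode("utf-8")
    except (binascii.Error, UnicodeDecodeError):
        return None
    username, separator, password = decoded.partition(":")
    if not separator:
        return None
    return username, password


def authenticate(headers: dict, users: dict) -> Optional[str]:
    """Run on every request: there is no session to remember a previous login."""
    credentials = parse_basic_auth(headers.get("Authorization"))
    if credentials is None:
        return None
    username, password = credentials
    expected = users.get(username)
    # Plain-text passwords are for illustration only.
    if expected is not None and hmac.compare_digest(expected.encode(), password.encode()):
        return username
    return None
```

Every handler would call `authenticate` first and answer with a 401 when it returns `None`.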

Similarly, checking permissions on assets before displaying them is not going to be hard. Where I do see things getting a little interesting is the display of projects and project folders. They don't have any specific permissions applied to them, so who do we show them to? We could say that any user can see the projects in which their account exists, or we could say that they see projects where they can read at least one item in the project. The first method is quick, the second is slow. Unsurprisingly, I'm leaning towards the first option.

Temporary items on OS X
I noticed, completely by accident, that if I told Word on OS X that the Temporary Items folder didn't exist, it would save the file using fewer requests. The figures are something like 30 requests versus 22. Instead of moving temporary files around, it would just create backup copies in the same location as the existing file.

I do have support for the Temporary Items folder in the WebDAV server, but I'm wondering if taking out that support might actually improve performance on OS X. Word on Windows doesn't suffer from the same problem as it chooses to save files differently (of course).

I'd like to do some testing to see if this could be a possible improvement to the WebDAV server.

Tuesday, 27 March 2007

MySource 4.0 WebDAV complete

Well, complete enough for the alpha release anyway. There is no authentication or permission checking in there, and you can't add and remove files and folders. I didn't leave these features out because they were hard, but because I don't want to finish planning this server without some feedback from users first. I have some ideas about how I want those features to work, so I'm interested to see if they go down well.

I'll write another post with my ideas in it later. This one is all about the fact that the server finally works. I didn't have to change any of the functionality to get it working on Windows, but it has been quite a wait to test it.

The problems I had were interesting.

First, I was using Lighttpd as the web server. Everything worked fine on OS X, but when saving a Word document on Windows, Lighttpd would receive a very specific GET request from Word and drop it. That is, it would return a 400 (bad request) error without ever asking my script to process the request. This caused Word to always open the file as read-only. There was nothing I could do about this error. I couldn't get Word to change the request and I couldn't get Lighttpd to accept it.

The server admins then set me up with Apache running the php5-cgi module (our Apache already runs mod-php4, so we couldn't just use mod-php5). Everything was working great with my existing saved connections until I tried creating a new connection for testing. The very first OPTIONS request that gets sent to see what version of the WebDAV protocol is being used was being handled by Apache without asking my script to process the request. Apparently, there was a bug in Apache where it handled the OPTIONS request for all CGI scripts. The version of Apache on our dev box could not be updated for a few weeks, so we had to find another way.
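For context, this is roughly what a WebDAV server is expected to say in reply to that first OPTIONS request, sketched in Python. The `DAV` header lists the compliance classes (1 for basic WebDAV, 2 when locking is supported), and `MS-Author-Via` is a header that Microsoft clients look for. The function name and the exact list of methods are assumptions for the example, not what my script sends.

```python
def options_response(supports_locking: bool = True) -> tuple:
    """Build a (status, headers) pair for a WebDAV OPTIONS request."""
    methods = ["OPTIONS", "GET", "HEAD", "PUT", "DELETE",
               "PROPFIND", "MKCOL", "MOVE", "COPY"]
    dav_classes = "1"
    if supports_locking:
        methods += ["LOCK", "UNLOCK"]
        dav_classes = "1, 2"
    headers = {
        "DAV": dav_classes,            # compliance classes the client checks
        "Allow": ", ".join(methods),   # methods this resource accepts
        "MS-Author-Via": "DAV",        # hint for Microsoft clients
        "Content-Length": "0",
    }
    return 200, headers
```

If the web server answers OPTIONS itself, none of these headers are sent, which would explain why a new connection could not be created.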

The solution turned out to be FastCGI on Apache. Every request made it through to my script and saving/browsing was working on both OS X and Windows XP.

It's been a long road, but the functionality this server provides will be a fantastic addition to MySource 4.0. I'm looking forward to trying it out on some real data when the alpha is released.

Saturday, 10 March 2007

WebDAV test server

While trying to work out why I couldn't save Microsoft Word documents from OS X, it occurred to me that a good way to get it working was to look at what other WebDAV servers do and try to copy them. I used this technique with Ethereal and a Windows Server 2003 box to get my prototype server working earlier in the year. The only problem this time was that Microsoft Word on OS X couldn't save files on a Windows Server 2003 WebDAV share running under IIS, and that's a big problem for me.

I'm not sure why saving doesn't work, but I tried it on two different Macs using two different versions of Microsoft Word and both failed. On a Windows XP machine, saving works fine. I needed to find another WebDAV server to use for testing, or configure one myself. My limited access to server hardware meant that I was going to have to go with the first option.

Enter the WebDAV Testing Server. This is a free public WebDAV server that runs Apache with mod_dav to provide WebDAV functionality. Its primary goal is to provide a test server for developers writing WebDAV clients, but it also happens to come in handy when writing your own server.

Luckily for me, this test server can read and write Microsoft Word documents on OS X just fine, so I used Ethereal to compare the requests and responses and I can now (finally) save Microsoft Word docs from OS X into MySource 4.0. As an added bonus, my problems with saving from Photoshop have gone away as well. One of the changes made today must have caused it to start working, so things are looking up.

Saturday, 3 March 2007

MySource 4.0 WebDAV server started

I've been waiting about a month for MySource 4.0 development to reach a point where it made sense to begin porting my stand-alone WebDAV server to MySource 4.0. This week, it finally caught up.

Currently, my WebDAV server can display the projects in your system as folders. Three project folders are listed under each: Documents, Images and Movies. The server currently only supports the viewing of Folder and File assets under these folders, but the mime type of each File asset is determined, so they appear as different file types in the OS.

I've had to put the ID of the asset in the path for now (which means it also appears in the folder/file name) because we don't have URLs assigned to project folders yet. For example, a folder name would be something like [23] Holiday Photos. Once these URLs go in, I'll be able to use the web path of the asset instead of the ID. For example, a folder name would be something like holiday_photos. I'm thinking of making this an option though, because some people may prefer seeing the name rather than the web path, even if the name does have the asset ID on the front. It's also one less query in most cases, so that is a bonus too. It's just a lot harder to remember a URL with /[23]%20Holiday%20Photos/ in it than /holiday_photos/ if you want to navigate to that folder directly.
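A small Python sketch of the ID-in-the-name approach: build the path segment from the asset ID and name, then read the ID back out when a request comes in. The function names and the regular expression are assumptions for the example.

```python
import re
from typing import Optional
from urllib.parse import quote, unquote

# Matches names such as "[23] Holiday Photos".
ID_PREFIX = re.compile(r"^\[(\d+)\] (.+)$")


def folder_segment(asset_id: int, name: str) -> str:
    """Build the URL path segment for a folder, keeping the brackets readable."""
    return quote(f"[{asset_id}] {name}", safe="[]")


def asset_id_from_segment(segment: str) -> Optional[int]:
    """Recover the asset ID from a requested path segment, if it carries one."""
    match = ID_PREFIX.match(unquote(segment))
    return int(match.group(1)) if match else None


segment = folder_segment(23, "Holiday Photos")
```

Here `segment` is `[23]%20Holiday%20Photos`, and `asset_id_from_segment(segment)` gives back 23 without a lookup by name. A web path such as `holiday_photos` carries no ID, so it would need a lookup instead.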

The next thing to tackle for WebDAV is the editing of files, which includes working out a permanent solution for locking. Supporting the creation of new files and folders would be the next logical step, before finally moving the server into MySource 4.0 and converting it to a System with Actions and Channels.

Tuesday, 30 January 2007

MySource 4.0 WebDAV update

Today, finally, I managed to get both Windows and OS X to open and edit MS Office documents via WebDAV. It's been a real pain, but I'm finally in a position where I can start migrating all my hard-coded logic into MySource 4.0. The only problem: MySource 4.0 isn't ready for it yet.

I really need to wait until the Project system is developed before I start trying to get WebDAV working within MySource 4.0. A lot of decisions need to be made about what folders will be made available and at what URLs. I'm hoping that work on the Project system will begin next week, so I'm looking at about mid/late Feb before WebDAV integration begins.

So what are my impressions of WebDAV? Well, I think it is a fantastic idea and should allow for much easier content editing and mass uploading of content. But it does have its limits. It is really most suited to editing files (like Word Documents) rather than HTML content, which makes up most of the content in the average website. Sure, you can edit the HTML of a page, but you lose all the tools the CMS provides, like internal linking and content reuse.

I also love OS X even more now! WebDAV support in OS X is just fantastic, and is really the way I assumed all operating systems would implement WebDAV. OS X "maps" a WebDAV server to make it look like an external drive. All saving and locking is handled at the OS level, so any application on OS X can support WebDAV by doing... well, nothing. This allows me to edit text files in my favorite editor rather than a specific "WebDAV enabled" one.

The down-side? OS X loves writing "._" files everywhere, and I've needed to support these files to an extent to make sure all applications work correctly. Obviously, I don't want them written to the CMS, so I store them in a temporary directory and manipulate them there.
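Spotting these files is simple enough. A Python sketch, where the function names and the "temp" and "cms" labels are assumptions for the example:

```python
import posixpath


def is_dot_underscore(path: str) -> bool:
    """True when the last path segment is one of OS X's "._" companion files."""
    return posixpath.basename(path.rstrip("/")).startswith("._")


def storage_target(path: str) -> str:
    """Decide where a written file should live: the temporary directory or the CMS."""
    return "temp" if is_dot_underscore(path) else "cms"
```

For example, `storage_target("/Documents/._report.doc")` returns `"temp"` while `storage_target("/Documents/report.doc")` returns `"cms"`.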

Windows is... different. Explorer supports WebDAV through Web Folders. You can browse a Web Folder, but you can't open and edit any file you want. The suggested way for using Web Folders and WebDAV is to copy the remote file to your local machine, edit it, and then re-upload it. You can also drag files directly into the Web Folder to upload them. To do any real editing, you need to use an application that supports editing via WebDAV itself. The OS will not do it for you.

What does this mean? Well for one thing, the number of applications you can use for editing is greatly limited. I can't, for example, edit a text file in Notepad because Notepad doesn't know how to edit a WebDAV document. I can use WebDAV enabled applications like the MS Office suite or Dreamweaver to edit content, so at least the big names have support.

The other thing that really annoyed me was that Windows, when creating a Web Folder, will send an OPTIONS request to the root of the server. I work on a shared development server, and my URL is something like delta.squiz.net/~gsherwood/webdav_server.php. However, I can't use this URL to create a Web Folder because the OPTIONS request that goes to delta.squiz.net is ignored, so Windows thinks the server is invalid. To get around this, I've had to get a new domain configured in Apache that sets the document root to my WebDAV directory. An alias has also been configured to point all traffic on my new domain to a single PHP file: my WebDAV server. It's a pain, but it works.

Bad points aside, it was Microsoft that helped me get WebDAV working in the first place. I found the MSDN article on the WebDAV LOCK method very informative. I also configured a WebDAV server running on IIS and analysed the requests that two Windows boxes were sending to each other using Ethereal (I highly recommend this approach if it is available to you). Plus, seeing a PHP app that runs on Linux integrate so closely with Windows is a beautiful sight to see, so I can't be completely against Microsoft on this one.

Tuesday, 23 January 2007

WebDAV support scheduled for MySource 4.0 alpha release

We've been having a think about the features we want to add to MySource 4.0 after the alpha release. WebDAV support was on the list of nice-to-haves, but the current demand (both internal and external to Squiz) for integrated content editing has pushed it into the alpha release.

So I've been trying to get WebDAV support going in MySource 4.0. To say it's been a pain in the arse is putting it mildly.

I started playing with WebDAV in MySource Matrix quite a long time ago. I got to the point where I could browse a MySource Matrix system via WebDAV, but never implemented content editing. I ran out of time and needed some client funding to keep the project going. It never arrived, so the project was shelved.

I started back on WebDAV yesterday and got myself back to the point where I needed to implement content editing today. As of now, I can safely say that today has been one of the most horrible in memory, and I still don't have content editing via WebDAV!

The thing that really bothers me is that every WebDAV client likes to do things differently. There is also a severe lack of examples for WebDAV support in PHP, and those I do have don't actually work for content editing. So I think I'm going to need to do a lot more reading and implement a lot of the protocol myself. That wouldn't normally annoy me, but now I'm on a deadline, and the clock is ticking.