No More Feeds for WordPress?

Tuesday, January 17th, 2006

Let’s do away with the doom and gloom: In spite of recent context-deprived accounts, WordPress isn’t anywhere near ditching any of its feed formats. But what is all this talk about removing feeds from our favorite blogging tool? Let’s have a rational look…

WordPress currently supports four standard feed formats for syndication: RSS 0.92, RSS 1.0 (aka RDF), RSS 2.0 (aka RSS2), and Atom 0.3. All of these feeds are at least vague implementations of XML, and allow applications (feed aggregators) to gather data about your posts without actually visiting your site. This can be done because these formats are machine-readable - that is, a computer/server can be configured to automatically visit your feed and parse out the data that it provides into usable chunks.
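To make "machine-readable" concrete, here is a minimal sketch of what an aggregator does with a feed. It is written in Python rather than anything WordPress ships, and the sample feed, the `example.com` URLs, and the `parse_items` helper are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# A tiny hand-written RSS 2.0 document standing in for a blog's feed.
# The titles and URLs are invented for this example.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <item>
      <title>First post</title>
      <link>http://example.com/?p=1</link>
      <guid>http://example.com/?p=1</guid>
    </item>
    <item>
      <title>Second post</title>
      <link>http://example.com/?p=2</link>
      <guid>http://example.com/?p=2</guid>
    </item>
  </channel>
</rss>"""


def parse_items(feed_xml):
    """Return one dict per <item>, holding the fields an aggregator needs."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title"),
            "link": item.findtext("link"),
            "guid": item.findtext("guid"),
        })
    return items


if __name__ == "__main__":
    for entry in parse_items(SAMPLE_FEED):
        print(entry["title"], "->", entry["link"])
```

A real aggregator would fetch the XML over HTTP on a schedule and cope with all four formats; the point here is only that the posts come out as structured data without anyone loading the site in a browser.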

There are certain problems with the feeds that WordPress provides, and whether these really are problems depends on your perspective. But let me tell you what I see from a coder’s point of view, and why you should consider listening to my argument.

Each of the feeds that WordPress provides is supplied by a distinctly separate script. Essentially, the inner workings of WordPress churn on each request and regurgitate a raw spew of data representing the last few posts that you’ve written. Then the script that you’ve requested - be it RSS, RDF, or Atom - figures out how to format that raw data into something that looks like a feed.

At its core, the problem is that these files are not easily maintained because they are separate. Why is this?

Every time you want to add functionality, like using a true GUID (Globally Unique ID - an identifier that is unique for each post in a feed), you need to change not just one file, but four.
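The alternative I have in mind can be sketched roughly like this. It is Python, not WordPress code, and every name in it (`build_item`, `render_rss2_item`, `render_atom_entry`, the `urn:example-post:` GUID scheme) is an assumption invented for the example. The idea is that the GUID rule lives in one shared step, and each format only decides how to wrap fields that are already prepared.

```python
from xml.sax.saxutils import escape, quoteattr


def build_item(post):
    """Shared step: turn a raw post into the fields every feed format needs."""
    return {
        "title": post["title"],
        "link": post["url"],
        # The GUID rule is defined here and nowhere else (hypothetical scheme).
        "guid": "urn:example-post:%d" % post["id"],
    }


def render_rss2_item(item):
    """Wrap the shared fields as an RSS 2.0 <item> fragment."""
    return (
        "<item><title>%s</title><link>%s</link>"
        '<guid isPermaLink="false">%s</guid></item>'
        % (escape(item["title"]), escape(item["link"]), escape(item["guid"]))
    )


def render_atom_entry(item):
    """Wrap the same shared fields as an Atom <entry> fragment."""
    return (
        "<entry><title>%s</title><link href=%s/><id>%s</id></entry>"
        % (escape(item["title"]), quoteattr(item["link"]), escape(item["guid"]))
    )


if __name__ == "__main__":
    post = {"id": 7, "title": "Feeds & formats", "url": "http://example.com/?p=7"}
    item = build_item(post)
    print(render_rss2_item(item))
    print(render_atom_entry(item))
```

These are fragments, not complete valid feeds. With this shape, changing how GUIDs are generated means editing `build_item` once, and no format can be accidentally left behind.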

When WordPress 2.0 was released, the approved/standard method of looping through each post was augmented. As such, all four feeds should have been updated with this new method to ensure that they are up to date. Were they all updated? No.

In fact, it has been a month since the last update to any of the feeds, when a small glitch with a GUID was fixed. (It had been fixed in the other feeds, but this one had been missed.) Before that, none of the feed scripts had been updated for four months. Surely software that has seen a major revision in that time should have had some review in this area. Maybe it was perfect. I’m not sure, but that seems strange to me.

People have said that RDF and RSS 0.92 are “simply too old”, and maybe that’s not fair. Some formats are being actively developed, but since our little blog software development world doesn’t seem to keep up with those innovations, we don’t see any benefit. Some folks would like to see Atom support upgraded from 0.3 to 1.0, and I am among them. It seems a disservice to WordPress to allow these feed formats to languish in disrepair, whether they are superseded by newer formats or not.

A seemingly separate issue: A suggestion that I agreed with on the wp-hackers mailing list earlier this month was additional support for comment metadata, including transferring comment handling into a centralized API. Currently, if you want to access comment information from inside plugin code, there isn’t an efficient method to get it apart from requesting it directly from the database. As a result, no filters can be applied to that information, and none of the information comes out of any cache, so it’s very difficult to write good, efficient plugins that affect comments.
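As a rough illustration of what a centralized comment API buys you, here is a sketch in Python. Nothing in it is an existing WordPress function; `CommentStore`, `add_filter`, and `get_comments` are hypothetical names, and the "database" is just a callable passed in by the caller.

```python
class CommentStore:
    """Single entry point for comment data, with a cache and filters."""

    def __init__(self, fetch_from_db):
        # fetch_from_db is any callable: post_id -> list of comment dicts.
        self._fetch = fetch_from_db
        self._cache = {}
        self._filters = []

    def add_filter(self, func):
        """Register a function that takes a comment dict and returns one."""
        self._filters.append(func)

    def get_comments(self, post_id):
        # Hit the database only on the first request for a given post.
        if post_id not in self._cache:
            self._cache[post_id] = self._fetch(post_id)
        # Copy each comment so filters never modify the cached originals.
        comments = [dict(c) for c in self._cache[post_id]]
        for func in self._filters:
            comments = [func(c) for c in comments]
        return comments


if __name__ == "__main__":
    def fake_db(post_id):
        print("querying database for post", post_id)
        return [{"author": "ann", "text": "nice post"}]

    store = CommentStore(fake_db)
    store.add_filter(lambda c: dict(c, text=c["text"].capitalize()))
    print(store.get_comments(1))
    print(store.get_comments(1))  # served from the cache this time
```

A plugin that goes straight to the database gets neither the cache nor the filters, which is exactly the situation described above.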

Likewise, dealing with categories directly is a programmatic problem. Sure, you can do it, but it involves an arcane process that is best not attempted by any but veteran plugin developers. Function accessors to this data would prove an immense improvement for plugin development, if not for general overall efficiency and readability.

Making these changes is a good idea, but would affect large sections of code. It’s not something that would happen overnight. It could not be planned for the upcoming 2.0.1 service release, and I only suggested it as a future enhancement. Nonetheless, feeds may be significantly affected by these changes.

Changes in significant areas of the software such as these would require correspondingly extensive testing. And as might have been noticed during our pains from the 2.0 release, WordPress could stand to have a bit more thorough testing generally. It doesn’t need the strain of testing formats for which we don’t even have a developer advocate strong enough to volunteer for coding and testing. Not only would someone have to adopt each of the four feed files, but we would need testers to focus on the changes in those areas. I don’t see enough people who care - maybe you’re out there?

In that vein, the suggestion was proffered that to reduce some of the development load we should retire some of the feed formats that aren’t as well used. That way, we can concentrate on perfecting the feeds that we do offer, improving the architecture in general, and opening up development to others who could supply their own feed formats. Sure, we who agreed this was a good idea were looking at the feeds as a method for a site owner to syndicate his site - not as a way for services to scrape data from your blogs.

One of the main complaints that I’ve heard is that removing RDF would cripple services that rely on the data it provides; that RDF is somehow more data-packed than the other formats available in a base install. I’m not sure how this is possible considering that they all are served the same data, but that’s irrelevant.

WordPress is at its core blogging software. Its target user is a blogger. Do most bloggers know that they’re publishing their data in feeds that are being scraped by services that would be mad if those feeds were suddenly cut off? I’m not sure what service requires one of the feed formats we’re suggesting for retirement; one that can’t use one of the other formats. And as a blogger, are you really concerned about those services that you didn’t even know existed? I, for one, am unnerved that my data would be used that way, and look forward to turning RDF off on principle.

Besides that, nobody is suggesting that WordPress be restricted from producing RDF or any other of the feed types. It is good design to consolidate the method of producing feeds to reduce the overall code size, but that implies better, more flexible architecture. A new architecture would allow the production of additional, even more robust feeds that the current architecture doesn’t support; something that we could do via plugin that we would have a hard time doing now. And certainly, just as with the geographic coordinate functionality that was removed from WordPress 1.2 for 1.5, a plugin would be available upon release to reinstitute those feeds for those who wanted to keep them.
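Here is one way the "retire from core, restore by plugin" idea could look, sketched in Python. The registry and its functions (`FEED_RENDERERS`, `register_feed`, `render_feed`) are hypothetical, not a description of any planned WordPress API, and the renderers return placeholder text rather than real feed markup.

```python
# Hypothetical registry mapping a feed name to the function that renders it.
FEED_RENDERERS = {}


def register_feed(name, renderer):
    """Make a feed format available under the given name."""
    FEED_RENDERERS[name] = renderer


def render_feed(name, posts):
    """Render posts in the named format, or fail clearly if it is absent."""
    if name not in FEED_RENDERERS:
        raise KeyError("no feed format registered as %r" % name)
    return FEED_RENDERERS[name](posts)


# Core ships only the formats it commits to maintaining and testing.
def core_rss2(posts):
    return "rss2 feed with %d posts" % len(posts)


register_feed("rss2", core_rss2)


# A plugin can put a retired format back without touching core.
def plugin_rdf(posts):
    return "rdf feed with %d posts" % len(posts)


def activate_rdf_plugin():
    register_feed("rdf", plugin_rdf)


if __name__ == "__main__":
    posts = [{"title": "First post"}, {"title": "Second post"}]
    print(render_feed("rss2", posts))
    activate_rdf_plugin()
    print(render_feed("rdf", posts))
```

Core only has to test what it registers itself, and anyone who cares about another format can maintain it on their own schedule.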

I’ve heard people say essentially, “It ain’t broke, don’t fix it.” Maybe not, but if we don’t fix it when we update other parts of WordPress, then it surely will break. Let’s fix it right and plan ahead to make it better instead of patching it all up piecemeal.

And as far as bogus threats of the Semantic Web people skipping out on WordPress for omitting a feed format are concerned, I can’t believe that anyone could find as flexible a product for producing the needed feeds as WordPress, even if WordPress had never produced RDF. (Hey, isn’t Semantic Web a pipe dream anyway?)

So where does that leave us? Lesser-used feeds should be retired from WordPress to benefit development of the feeds we keep and the coverage of code that needs testing. Having core code that all works is more important than making sure any pet formats are included.

In the end, we’ve only heard the word of one of the Automattic guys on the issue, and he’s not too enthusiastic about it. I’ve learned that in order to get anything major done in WordPress you need two things: running code and the support of a developer with commit access. We “anti-RDF” people have neither so far, but we’re still talking about a solution that works best for everyone. So let’s not all panic.

It sure would be helpful if people who care about these formats would step out and take care of them, though.