Content Syndication

published 2009-06-12

The internet is an indispensable resource, providing quick access to everything from news and stocks to weather and sports. Many people bookmark the sites they read regularly, and making the rounds of these sites becomes a daily ritual, whether it’s over morning coffee or during a workday lunch break. But keeping up with all the sites in your bookmarks can be a daunting task. Some websites update on regular schedules, others update randomly throughout the day, and still others update so infrequently that you never know when an update will occur. Various solutions have been tried in the past, ranging from the failed PointCast news-delivery screensaver to the notify-by-email systems popular on many professional news sites.

For many folks who spend most of their day online, content syndication is the only way to keep up with the tide of information updates. Rather than manually browsing all of your bookmarked sites, you use a computer program to do that boring work for you and prepare a complete list of the latest information. If a website hasn’t updated yet, you don’t need to waste your time checking it. If all of your favorite websites have updated, you can read all of that fresh content in a single sitting, without waiting for the pages to load or looking at the annoying advertisements. And you can be more efficient by skimming the headlines of new stories to decide whether each one is something you want to read now, save for later, or skip altogether.

You may have seen small orange feed buttons on websites. These buttons are links to eXtensible Markup Language, or XML, versions of the web pages we read. Although XML documents can be read by human beings (with a little effort), they’re designed to be read and processed by computer programs. Clicking one in your browser will produce different results depending on which web browser you’re using. The links aren’t really for readers, but for programs called aggregators.

Aggregators, or feed readers as they’re also called, are the programs that do all the work of visiting and collecting web updates for you. When new content is available, the aggregator fetches the XML data and makes it available for you to read in a friendly way. Aggregators generally list new items first, so you can quickly skim all your feeds, or subscribed sites, in the order in which they were updated. There are many different aggregators available.
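To make the "new items first" idea concrete, here is a minimal sketch in Python. The feed names, titles, and dates are invented for illustration, and a real aggregator would pull them from the feeds themselves. It assumes each item carries a publication date in the usual email-style format that RSS feeds use.

```python
from email.utils import parsedate_to_datetime

# Hypothetical items collected from two feeds; names, titles and dates are invented.
items = [
    {"feed": "Example News", "title": "Morning headlines",
     "published": "Fri, 12 Jun 2009 08:00:00 GMT"},
    {"feed": "Example Weather", "title": "Weekend forecast",
     "published": "Fri, 12 Jun 2009 11:30:00 GMT"},
    {"feed": "Example News", "title": "Market update",
     "published": "Thu, 11 Jun 2009 17:45:00 GMT"},
]


def newest_first(entries):
    """Return entries sorted by publication date, most recent first."""
    return sorted(
        entries,
        key=lambda entry: parsedate_to_datetime(entry["published"]),
        reverse=True,
    )


for entry in newest_first(items):
    print(entry["published"], "|", entry["feed"], "|", entry["title"])
```

Items from different feeds end up interleaved in one list, which is what lets you skim everything in a single pass.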

You, the reader, tell your aggregator that you want to subscribe to a site’s feed. Most of the time, you simply enter the URL of the website you want to follow and your aggregator will automatically locate the XML version of the content. Then, the aggregator will periodically check for and fetch any new updates.
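How does an aggregator find the XML version from a plain website address? One common approach is to look in the page's HTML for a link tag that advertises a feed. The sketch below shows that idea using only Python's standard library. The class name, function name, and sample page are made up for this example, and real aggregators handle many more edge cases.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# MIME types commonly used to advertise RSS and Atom feeds.
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkFinder(HTMLParser):
    """Collect the URLs of <link rel="alternate"> tags that point at a feed."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rel = (attr.get("rel") or "").lower().split()
        mime = (attr.get("type") or "").lower()
        href = attr.get("href")
        if "alternate" in rel and mime in FEED_TYPES and href:
            # Resolve relative links against the page address.
            self.feeds.append(urljoin(self.base_url, href))


def find_feeds(html, base_url):
    """Return every feed URL advertised in the given HTML page."""
    finder = FeedLinkFinder(base_url)
    finder.feed(html)
    return finder.feeds


# An invented page, standing in for whatever the aggregator downloaded.
sample_page = """
<html><head>
<title>Example Blog</title>
<link rel="stylesheet" href="/style.css">
<link rel="alternate" type="application/rss+xml" title="RSS" href="/feed.xml">
</head><body>...</body></html>
"""

print(find_feeds(sample_page, "http://example.com/blog/"))
```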

Some aggregators are programs that run on your computer. For these to work, your desktop or laptop computer needs to be on and connected to the internet. If you use one aggregator on your PC at home and another on your PC at work, you may have to skip past the content you’ve already read, which rather defeats the purpose of aggregating the content to begin with! Thankfully, there are also several web-based solutions. The advantage of using a web-based aggregator like Google Reader, Bloglines or NewsIsFree is that their computers do all the work of polling websites and obtaining updates, and you can read your list of feeds from wherever you may be, using nothing more than your web browser. This means that you can check your list of feeds while eating breakfast at home, and then see a complete list of updated information during your lunch break at the office. Web-based aggregators also work well with smartphones, allowing you to keep up with your feeds while on the go!

Content syndication saves website readers time, and it can also save website operators money. Visitors using a web browser to read a website request and receive the entire page every time: text, graphics, layout information, advertising, and anything else that might be on (or in) the page. Web browsers can cache (or “remember”) some of this information, but there are a variety of reasons why this doesn’t always work, and the visitor ends up downloading almost everything from that page on every visit, regardless of whether there’s anything new (advertising is often the cause, as the advertisements change every time the page is loaded). Aggregators visiting a site’s feed first check the timestamp of the XML feed: if it hasn’t been updated since the last visit, the aggregator stops right there. When updates are available, aggregators receive only the content from the website, without advertising, background images, navigation buttons, and the like. Combined, these two savings can prevent vast amounts of unnecessary traffic, allowing content producers to reach their audience without incurring astronomically expensive web hosting bills.
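In practice, this timestamp check is usually done with an HTTP "conditional GET": the aggregator sends back the values it saved on its last visit, and the server answers with status 304 (Not Modified) and no body if nothing has changed. Here is a rough sketch of how that might look in Python. The function names are my own, and the network function is shown for illustration only; the final line just prints the headers that would be sent.

```python
import urllib.error
import urllib.request


def conditional_headers(etag=None, last_modified=None):
    """Build request headers from the validators saved on the previous visit."""
    headers = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def fetch_if_changed(url, etag=None, last_modified=None):
    """Return (body, etag, last_modified), or None when the server says 304."""
    request = urllib.request.Request(
        url, headers=conditional_headers(etag, last_modified)
    )
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return (
                response.read(),
                response.headers.get("ETag"),
                response.headers.get("Last-Modified"),
            )
    except urllib.error.HTTPError as err:
        if err.code == 304:  # Not Modified: nothing new, so nothing to download
            return None
        raise


# Headers an aggregator would send if it last saw the feed at this (invented) time.
print(conditional_headers(last_modified="Fri, 12 Jun 2009 08:00:00 GMT"))
```

The aggregator keeps the returned ETag and Last-Modified values and passes them back in on its next visit. Not every server supports these headers, in which case the full feed is simply downloaded each time.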

Several tangential benefits also arise from using content syndication. First, syndication-specific search engines look through feeds, allowing you to maintain a constantly updated list of links to information in as close to real time as currently possible. Second, the machine-readable format of feeds makes it possible to create mashups using syndicated data, which is much easier than personally visiting the pages to copy and paste the bits you want. Third, syndication can be used to include content from other sites in your own website.
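To give a feel for why machine-readable feeds make mashups easy, here is a small Python sketch that pulls the headlines and links out of an RSS 2.0 document. The sample feed is invented, and the sketch assumes a plain RSS 2.0 layout with no namespaces; other formats would need slightly different element names.

```python
import xml.etree.ElementTree as ET

# An invented RSS 2.0 feed, standing in for one downloaded from a real site.
sample_rss = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example News</title>
    <link>http://example.com/</link>
    <description>Invented sample feed</description>
    <item>
      <title>First story</title>
      <link>http://example.com/stories/1</link>
    </item>
    <item>
      <title>Second story</title>
      <link>http://example.com/stories/2</link>
    </item>
  </channel>
</rss>"""


def headlines(rss_text):
    """Return a list of (title, link) pairs, one per item in the feed."""
    root = ET.fromstring(rss_text)
    return [
        (item.findtext("title"), item.findtext("link"))
        for item in root.iter("item")
    ]


for title, link in headlines(sample_rss):
    print(title, "->", link)
```

Once the items are ordinary Python tuples, combining them with data from another feed or dropping them into your own page is straightforward.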

Of course, there are plenty of challenges with content syndication. The biggest challenge is the machine-readable format used for the feeds. Although feeds are written in XML, there are a variety of popular dialects of that language. Some feed-reading programs can speak them all, while others are limited to just one or two. There’s a joke that succinctly explains the situation: “The great thing about standards is that there are so many to choose from!”

The most common syndication format is RSS (which stands for Really Simple Syndication, or Rich Site Summary, or maybe RDF Site Summary). The single name is itself a little misleading, because there are nine different versions of RSS. The history of RSS is complicated, with several competing parties vying to establish the definitive standard. In common practice, only two or three of these formats are regularly used, but even that’s too many.

The other dominant syndication format is Atom, a community-driven format that tries to avoid many of the perceived shortcomings of RSS. Atom isn’t (yet) as widely supported as RSS, but it’s quickly gaining ground. Lengthy discussions rage on about these competing standards.
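For the curious, the dialects are usually easy to tell apart by looking at the outermost XML element. The sketch below is a simplified illustration of that idea in Python. The function name is my own, and it only recognizes the three broad families mentioned above rather than every RSS version.

```python
import xml.etree.ElementTree as ET

# XML namespaces used by Atom 1.0 and by the RDF-based flavors of RSS.
ATOM_NS = "http://www.w3.org/2005/Atom"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def detect_format(feed_text):
    """Guess the feed family from the root element of the document."""
    root = ET.fromstring(feed_text)
    if root.tag == "rss":
        return "RSS " + root.get("version", "(unknown version)")
    if root.tag == "{%s}RDF" % RDF_NS:
        return "RDF-based RSS"
    if root.tag == "{%s}feed" % ATOM_NS:
        return "Atom 1.0"
    return "unknown"


# Tiny invented documents, just enough to show each root element.
print(detect_format('<rss version="2.0"><channel/></rss>'))
print(detect_format('<feed xmlns="http://www.w3.org/2005/Atom"/>'))
print(detect_format(
    '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"/>'
))
```

A feed reader that "speaks them all" does something like this first, then hands the document to the parsing code for that dialect.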

Thankfully, services exist that translate between syndication formats, so you don’t need to worry about which format is winning the debate. Web-based feed readers should all be smart enough by now to handle any feed format, so you, the reader, shouldn’t need to think about this too much. If you’re a content producer, it’s worth spending some time familiarizing yourself with the various formats, so that you know what you’re offering to your readers.

Another issue with content syndication is that the syndication source (i.e., the website offering the feed) chooses whether to provide the full content of new posts or just an excerpt. Advertising-driven websites will often provide just an excerpt, or teaser, to tickle your fancy and get you to load the webpage in your browser, where you’ll see the ads displayed there. Some such sites will send out only the first couple of sentences in their feeds, which may or may not provide enough information for you to determine whether it’s worth your time to follow the link to the story. Other sites will carefully craft meaningful summaries of new items, which you can quickly skim to decide whether to read the in-depth report.
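The "first couple of sentences" approach is easy to automate, which is part of why it’s so common. Here is a naive sketch of it in Python. The function name and sample post are invented, and the sentence splitting is deliberately simple (it will stumble on abbreviations like "Dr."), so treat it as an illustration rather than how any particular site does it.

```python
import re


def make_excerpt(text, sentences=2):
    """Return the first few sentences of text, using a naive sentence split."""
    # Split on whitespace that follows a period, exclamation or question mark.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return " ".join(parts[:sentences])


# An invented post body.
post = (
    "The city council met on Tuesday. Members voted on the new budget. "
    "The debate ran late. A full report follows."
)

print(make_excerpt(post))
```

Whether two sentences tell you enough depends entirely on how the story was written, which is exactly the problem with automatic teasers.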

Many syndication feeds come from personal weblogs, but big businesses are recognizing the value of the technology too. Reuters offers feeds for its news items. The BBC offers categorized news feeds. Microsoft offers feeds for developer resources. Apple offers RSS feeds for its iTunes Music Store to display new releases and top-rated songs or albums.

I’ve been using Google Reader as my aggregator for some time now, and have been thoroughly pleased with it. It offers a nice suite of features, good performance, and the traditional simple Google user interface. If you’re not yet using an aggregator, or are unhappy with the one you’re using, consider giving Google Reader a shot.

What are you waiting for? Start aggregating!