Displaying an RSS feed with PHP

Introduction

RDF Site Summary documents, or RSS, are an easy way to syndicate content onto your web site. More and more web sites are offering RSS feeds of their content that you can use to add substance to your own site.

RSS is an XML format that was designed by Netscape. Since their inception, RSS documents have gained wide popularity and are used by many major news portals and other web sites.

Today we will take a look at the basic structure of an RSS document and then show you how to use PHP to display that document on your site.

RSS Structure

Before we start grabbing news feeds from all over the place, we should have an understanding of how RSS is structured. That way, if we run into problems with our PHP code, we will be able to debug more easily.

RSS is an XML application, so if you are at all familiar with XML, the RSS format should be very simple to you. (Even if you are not familiar with XML, you will notice at least some similarities with HTML.) If you would like to further research the structure of the RSS document, you can read the specifications at the following URL: http://web.resource.org/rss/1.0/spec.

Note: The techsoftcenter.com URLs used in the following examples do not actually exist.

XML Declaration

Every RSS document ought to begin with an XML declaration. This identifies the document as being XML. If you need to specify a particular character encoding, you can do so within this declaration.

RDF Element

All the data for the document will go between the RDF tags. This element declares the namespace for RDF, so that all the other elements are recognized and treated correctly. The RDF element is opened with the <rdf:RDF> tag, which carries the namespace declarations.

All the other data for the document comes next, ending with the closing </rdf:RDF> tag.
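
As a rough illustration, the outer shell of an RSS 1.0 document (the XML declaration plus an empty RDF element) can be held in a PHP string and checked for well-formedness with PHP's XML extension. The namespace URIs are the ones given in the RSS 1.0 specification; the variable names are made up for this sketch.

```php
<?php
// Minimal shell of an RSS 1.0 document: the XML declaration followed by
// an empty RDF element. Variable names are only for this sketch.
$shell = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns="http://purl.org/rss/1.0/">
  <!-- channel, image and item elements go here -->
</rdf:RDF>
XML;

$parser = xml_parser_create();
// xml_parse() returns 1 when the input is well-formed XML.
$wellFormed = (xml_parse($parser, $shell, true) === 1);

echo $wellFormed ? "well-formed\n" : "not well-formed\n";
```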

Channel Element

The channel element is our source of information about the site the RSS feed comes from. This element is found within the RDF element. It will commonly have information such as title, link, and description of the site. It can also contain a URL to an image for the site.

Within the opening channel tag itself, the attribute rdf:about must occur. This attribute normally is the URL of the RSS document itself.

Image Element

The image element is also found within the RDF element, and complements the image tag found in the channel element (above). If you specify the image element, you must also have an image tag in the channel element. Both must specify the same URL.

When using the image element, you must specify the rdf:about attribute. This should be the URL of the image. You must also specify a title, which will be used as the “alt text” for the image.

Also required are a URL for the image to link to and a URL for the image itself. The image URL must match the one provided in the rdf:about attribute.

Item Element

The item element is where we finally get into the meat of the content. Here you will find the individual items with their respective titles and descriptions. As with the previous elements, the item element requires the inclusion of the rdf:about attribute. This attribute should match the URL in the link sub-element.

Unlike the previously mentioned elements, there are normally multiple item elements. The RSS 1.0 specification recommends no more than fifteen item elements per document.
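
To tie the elements together, here is a small made-up feed held in a PHP string, followed by a check of the URL rules described above. The paths under techsoftcenter.com, the titles, and the variable names are all invented for this sketch. It is trimmed to the elements covered in this article; a complete RSS 1.0 channel also carries an items element listing its entries.

```php
<?php
// Hypothetical sample feed; every URL and title here is a placeholder.
$feed = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns="http://purl.org/rss/1.0/">
  <channel rdf:about="http://www.techsoftcenter.com/news.rdf">
    <title>TechSoftCenter</title>
    <link>http://www.techsoftcenter.com/</link>
    <description>Sample channel description.</description>
    <image rdf:resource="http://www.techsoftcenter.com/logo.gif"/>
  </channel>
  <image rdf:about="http://www.techsoftcenter.com/logo.gif">
    <title>TechSoftCenter</title>
    <url>http://www.techsoftcenter.com/logo.gif</url>
    <link>http://www.techsoftcenter.com/</link>
  </image>
  <item rdf:about="http://www.techsoftcenter.com/story1.html">
    <title>First story</title>
    <link>http://www.techsoftcenter.com/story1.html</link>
    <description>Short summary of the first story.</description>
  </item>
</rdf:RDF>
XML;

$parser = xml_parser_create();
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0); // keep tag case
xml_parser_set_option($parser, XML_OPTION_SKIP_WHITE, 1);   // drop blank text
xml_parse_into_struct($parser, $feed, $nodes);

// Walk the flat node list and record each top-level element's rdf:about
// attribute together with the text of its sub-elements.
$about = array();
$child = array();
$channelImage = null;
$current = null;
foreach ($nodes as $node) {
    $tag = $node['tag'];
    if ($node['type'] === 'open' && in_array($tag, array('channel', 'image', 'item'), true)) {
        $current = $tag;
        $about[$tag] = $node['attributes']['rdf:about'];
    } elseif ($node['type'] === 'close' && $tag === $current) {
        $current = null;
    } elseif ($node['type'] === 'complete' && $current !== null) {
        if ($tag === 'image') {
            // The empty image tag inside the channel element.
            $channelImage = $node['attributes']['rdf:resource'];
        } elseif (isset($node['value'])) {
            $child[$current][$tag] = $node['value'];
        }
    }
}

$imageMatches = ($channelImage === $about['image'])
    && ($about['image'] === $child['image']['url']);
$itemMatches = ($about['item'] === $child['item']['link']);

echo 'image URLs agree: ' . ($imageMatches ? 'yes' : 'no') . "\n";
echo 'item rdf:about matches link: ' . ($itemMatches ? 'yes' : 'no') . "\n";
```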

Parsing an RSS document

Now that you have an understanding of RSS document structure, let’s take a look at some code that will parse and display it.

The following code is a collection of five functions that were added to the Code Gallery by uncleozzy. We will take a look at them one by one and examine how they work.

Globals

The functions we are about to look at require some variables in the global scope. The first thing that must be done is to declare and initialize those variables.
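
A minimal sketch of that setup might look as follows. The three variable names are the ones used throughout this article; the initial values are an assumption, and $_depth is kept as a plain counter here for simplicity.

```php
<?php
// Assumed reconstruction of the shared parser state.
$_item  = array(); // text collected for the element currently being read
$_depth = 0;       // current nesting level (a plain counter in this sketch)
$_tags  = array(); // stack of the tag names that are currently open
```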

Function initArray()

This function initializes the $_item array by ensuring that all the proper keys are in place and that they all point to an empty string. The author makes very liberal use of this function.

This function will be called each time an opening tag is found for an image, item, or channel. It is also used after the closing tag of each and at the outset of the entire parsing routine. (This may be tending towards overkill.)
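
A function along these lines would do the job. The list of keys is an assumption based on the sub-elements described in the structure section, not necessarily the author's exact list.

```php
<?php
$_item = array();

// Reset $_item so that every expected key exists and holds an empty string.
// The key list is assumed for this sketch.
function initArray()
{
    global $_item;
    $_item = array(
        'title'       => '',
        'link'        => '',
        'description' => '',
        'url'         => '',
    );
}

initArray();
$_item['title'] = 'Left over from a previous element';
initArray(); // wipes the stale title again
var_dump($_item['title'] === ''); // bool(true)
```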

Function startElement()

When using the XML functionality of PHP, you must specify a function to be called each time an opening tag is encountered. This function serves that purpose.

As you can see by examining this function, if an opening tag is found for an item, channel, or image element, the initArray function is called. Then, whether one of those opening tags was found or not, the $_depth array is incremented and the name of the opening tag is pushed onto the $_tags array.
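
The behaviour just described could be sketched like this. The $trace array and the bare-bones closing handler exist only to make the demonstration visible, $_depth is a plain counter rather than an array, and case folding is switched off so that tag names arrive exactly as written in the document.

```php
<?php
$_item  = array();
$_depth = 0;
$_tags  = array();
$trace  = array(); // demo only: snapshot of the tag stack after each opening tag

// Stand-in for initArray(); the key list is assumed.
function initArray()
{
    global $_item;
    $_item = array('title' => '', 'link' => '', 'description' => '', 'url' => '');
}

// Called by the parser for every opening tag.
function startElement($parser, $name, $attrs)
{
    global $_depth, $_tags, $trace;
    if (in_array($name, array('item', 'channel', 'image'), true)) {
        initArray();
    }
    $_depth++;
    array_push($_tags, $name);
    $trace[] = $_depth . ': ' . implode('/', $_tags); // demo only
}

// Bare-bones closing handler so the stack unwinds; display is left out.
function endElement($parser, $name)
{
    global $_depth, $_tags;
    array_pop($_tags);
    $_depth--;
}

$parser = xml_parser_create();
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
xml_set_element_handler($parser, 'startElement', 'endElement');
xml_parse(
    $parser,
    '<channel><title>News</title><link>http://www.techsoftcenter.com/</link></channel>',
    true
);

echo implode("\n", $trace), "\n";
// 1: channel
// 2: channel/title
// 2: channel/link
```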

Function endElement()

Just as you must specify a function to be called for opening tags, you must also specify a function that is called when a closing tag is found. The endElement function handles the actual display of our data. As each closing tag is encountered, this function displays the data that corresponds to that tag. First though, it pops the top element off of the $_tags array, then it decrements the $_depth array. As mentioned earlier, it also calls the initArray function after displaying the pertinent data.
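
Here is one way such a handler might look, reduced to the item case. The HTML markup, the sample values, and the $_item keys are assumptions. To keep the sketch short, the handler is called by hand with the state it would see at a closing item tag, rather than through a real parser.

```php
<?php
// State as it might look just before "</item>" is reached (values invented).
$_tags  = array('rdf:RDF', 'item');
$_depth = 2;
$_item  = array(
    'title'       => 'First story',
    'link'        => 'http://www.techsoftcenter.com/story1.html',
    'description' => 'Short summary.',
    'url'         => '',
);

// Stand-in for initArray(); the key list is assumed.
function initArray()
{
    global $_item;
    $_item = array('title' => '', 'link' => '', 'description' => '', 'url' => '');
}

// Called by the parser for every closing tag.
function endElement($parser, $name)
{
    global $_item, $_depth, $_tags;
    array_pop($_tags); // this element is no longer open
    $_depth--;
    if ($name === 'item') {
        echo '<p><a href="' . htmlspecialchars($_item['link']) . '">'
            . htmlspecialchars($_item['title']) . '</a><br>'
            . htmlspecialchars($_item['description']) . "</p>\n";
    }
    if (in_array($name, array('item', 'channel', 'image'), true)) {
        initArray(); // start the next element with a clean slate
    }
}

// Call the handler by hand, the way the XML parser would on "</item>".
ob_start();
endElement(null, 'item');
$html = ob_get_clean();
echo $html;
```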

Function parseData()

This function is where the data is actually stored into the $_item array. When there is data that needs to be parsed, this function is invoked. Data is basically anything that is not an element tag.

The first thing this function does is determine if the data is only whitespace. If it is, it does not bother to store it in the array. If the data does contain useful information, it is stored in the $_item array.
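
A sketch of such a handler follows. It assumes that the text is filed under the name of the innermost open tag, which is a guess at how the original decides where each piece of data belongs. The two small tag handlers are only there to maintain the $_tags stack for the demonstration.

```php
<?php
$_item = array('title' => '', 'link' => '', 'description' => '', 'url' => '');
$_tags = array();

// Minimal tag handlers that only maintain the stack of open tags.
function startElement($parser, $name, $attrs)
{
    global $_tags;
    array_push($_tags, $name);
}

function endElement($parser, $name)
{
    global $_tags;
    array_pop($_tags);
}

// Character data handler: store text under the innermost open tag.
function parseData($parser, $data)
{
    global $_item, $_tags;
    if (trim($data) === '') {
        return; // indentation and line breaks between tags
    }
    $tag = end($_tags);
    if (isset($_item[$tag])) {
        // Append, because the parser may deliver one text node in pieces.
        $_item[$tag] .= $data;
    }
}

$parser = xml_parser_create();
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
xml_set_element_handler($parser, 'startElement', 'endElement');
xml_set_character_data_handler($parser, 'parseData');
xml_parse(
    $parser,
    "<item>\n  <title>Tips &amp; tricks</title>\n"
    . "  <link>http://www.techsoftcenter.com/tips.html</link>\n</item>",
    true
);

print_r($_item);
```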

Function parseRDF()

This function is the wrapper function for all the others. When using this function, you do not need to worry about calling any others.

It starts off by creating the parser, and then it calls the initArray function. The functions for handling the opening and closing tags, and the data itself, are then registered. Next, the function proceeds to open the file specified, parse it, and verify that it is a valid RSS document. The file is then closed and the memory from the parser freed.
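
Putting the pieces together, the whole set might be sketched as below. The function names follow this article, but the bodies are assumptions rather than the original Code Gallery code. Two deliberate simplifications: "valid" is reduced to "well-formed XML", and a depth guard is added so that an image tag nested inside the channel element does not wipe the channel data. The sample feed is made up and written to a temporary file so the sketch runs without a network connection.

```php
<?php
// Sketch of the five functions working together (bodies are assumptions).
$_item  = array();
$_depth = 0;
$_tags  = array();

function initArray()
{
    global $_item;
    $_item = array('title' => '', 'link' => '', 'description' => '', 'url' => '');
}

function startElement($parser, $name, $attrs)
{
    global $_depth, $_tags;
    // Depth guard (added in this sketch): only direct children of rdf:RDF
    // reset the array, so an <image/> tag inside <channel> is left alone.
    if ($_depth === 1 && in_array($name, array('channel', 'image', 'item'), true)) {
        initArray();
    }
    $_depth++;
    array_push($_tags, $name);
}

function endElement($parser, $name)
{
    global $_item, $_depth, $_tags;
    array_pop($_tags);
    $_depth--;
    if ($_depth !== 1) {
        return; // a sub-element such as title, or the root itself
    }
    if ($name === 'channel') {
        echo '<h3>' . htmlspecialchars($_item['title']) . "</h3>\n";
    } elseif ($name === 'item') {
        echo '<p><a href="' . htmlspecialchars($_item['link']) . '">'
            . htmlspecialchars($_item['title']) . '</a><br>'
            . htmlspecialchars($_item['description']) . "</p>\n";
    }
    initArray();
}

function parseData($parser, $data)
{
    global $_item, $_tags;
    $tag = end($_tags);
    if (trim($data) !== '' && isset($_item[$tag])) {
        $_item[$tag] .= $data; // append: text may arrive in several pieces
    }
}

function parseRDF($file)
{
    global $_depth, $_tags;
    $_depth = 0;
    $_tags  = array();
    $parser = xml_parser_create();
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0); // keep tag case
    initArray();
    xml_set_element_handler($parser, 'startElement', 'endElement');
    xml_set_character_data_handler($parser, 'parseData');
    $fp = @fopen($file, 'r');
    if ($fp === false) {
        return false;
    }
    $data = stream_get_contents($fp);
    fclose($fp);
    $ok = (xml_parse($parser, $data, true) === 1); // well-formedness only
    if (PHP_VERSION_ID < 80000) {
        xml_parser_free($parser); // only needed before PHP 8
    }
    return $ok;
}

// Usage: write a trimmed, made-up feed to a temporary file and parse it.
$path = tempnam(sys_get_temp_dir(), 'rss');
file_put_contents($path, '<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/">
  <channel rdf:about="http://www.techsoftcenter.com/news.rdf">
    <title>TechSoftCenter</title>
    <link>http://www.techsoftcenter.com/</link>
  </channel>
  <item rdf:about="http://www.techsoftcenter.com/story1.html">
    <title>First story</title>
    <link>http://www.techsoftcenter.com/story1.html</link>
    <description>Short summary.</description>
  </item>
</rdf:RDF>');
ob_start(); // capture the echoed HTML so it can be reused
$ok   = parseRDF($path);
$html = ob_get_clean();
unlink($path);
echo $html;
```

Passing a URL instead of a local path should work the same way on installations where allow_url_fopen is enabled.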

Example Use

As I mentioned earlier, using this set of functions is as easy as calling the parseRDF function. All you need to do is pass a URL for the RSS document, or a path to a local file.

Conclusion

Parsing and displaying RSS feeds on your web site has never been easier than with this set of functions. As you probably noticed, you can customize the display of the feed by modifying the endElement function. You can easily adapt items from the RSS feed to the look and feel of your site.

The next thing to think about is how to create your own RSS feed. With the basic RSS document structure under your belt, this should not be too difficult.