GreyBeard Inc.

    
    
     

Clean URLs, mod_rewrite, and PHP

     Clean URLs are nice. And it's nice to be nice. So we always use clean URLs. Most rewrite examples found online look something like this:

RewriteRule ^topic/banana/$ filename.php?topic=banana

This puts an entry in the $_GET array with the name "topic" and the value "banana". Very nice. But it does not scale well. For a big site with a lot of possible URL combinations, the number of rules becomes a problem, and it is a pain to keep adjusting or adding to them. And unless I am mistaken, the more rules you have, the more overhead for mod_rewrite (generally speaking). A web application or site has to analyze the URL arguments in PHP anyway, so with a more general type of rewrite regex we can take a nicer approach. The word for today, by the way, appears to be nice.

     The idea is by no means revolutionary, but I have found it to be an indispensable way to handle clean URLs for a site. So much so that it becomes easier to code all the URLs on the site in a clean style. The following rewrite rules supply up to six URL arguments to PHP, named a, b, c, d, e, and f in the $_GET array.

RewriteRule ^([^/]+)/$ index.php?a=$1
RewriteRule ^([^/]+)/([^/]+)/$ index.php?a=$1&b=$2
RewriteRule ^([^/]+)/([^/]+)/([^/]+)/$ index.php?a=$1&b=$2&c=$3
RewriteRule ^([^/]+)/([^/]+)/([^/]+)/([^/]+)/$ index.php?a=$1&b=$2&c=$3&d=$4
RewriteRule ^([^/]+)/([^/]+)/([^/]+)/([^/]+)/([^/]+)/$ index.php?a=$1&b=$2&c=$3&d=$4&e=$5
RewriteRule ^([^/]+)/([^/]+)/([^/]+)/([^/]+)/([^/]+)/([^/]+)/$ index.php?a=$1&b=$2&c=$3&d=$4&e=$5&f=$6

     These rules assume that the URL ends in a trailing slash, so be sure to include the appropriate matching rule to append it (or adjust this pattern). Adding more URL arguments is easy: extend the pattern by one more "([^/]+)/" group before the closing "$" and add &g=$7, &h=$8 and so on to the end of the line. Note that mod_rewrite backreferences only go up to $9, so this style of rule tops out at nine arguments.

This type of system requires that the order of the URL arguments be maintained, because it is an argument's position, rather than its key name in the $_GET array, that tells us what its value means. We can use both static keywords and dynamic values to form a URL that is logical and easy to understand. For example:

/products/kitchen/gravy_boat/

In this case we can code logic to check for "products" in $_GET['a'] and, if found, analyze $_GET['b'] for a known product category. We know "kitchen" is intended to be a category because it sits in the URL directly after the "products" value (this being something that is decided when the URLs are thought out). If that's the end of the URL we can display a list of the products in the kitchen category, but we can also check for a valid product in $_GET['c'] and, if found, display the product detail page for that item instead. So effectively, with only 3 URL arguments we can easily provide 3 different pages: a /products/ page that contains the top level "all products" or sale products or whatever, the product category list, and the product detail page.
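As a rough sketch of that logic, something like the following would do. The function name, the returned page names, and the $catalogue array are made up for illustration; a real site would most likely look categories and products up in a database instead.

```php
<?php
// Hypothetical catalogue: category => list of product slugs.
$catalogue = [
    'kitchen' => ['gravy_boat', 'whisk'],
    'garden'  => ['trowel'],
];

/**
 * Decide which products page to show from the positional arguments.
 * $args is expected to look like $_GET, with the keys 'a', 'b' and 'c'.
 * Returns a page name (all page names here are illustrative).
 */
function resolve_products_page(array $args, array $catalogue): string
{
    if (($args['a'] ?? null) !== 'products') {
        return 'not_found';          // not a /products/... URL at all
    }

    $category = $args['b'] ?? null;
    $product  = $args['c'] ?? null;

    if ($category === null) {
        return 'product_index';      // /products/
    }
    if (!is_string($category) || !isset($catalogue[$category])) {
        return 'not_found';          // unknown category
    }
    if ($product === null) {
        return 'category_list';      // /products/kitchen/
    }
    if (!in_array($product, $catalogue[$category], true)) {
        return 'not_found';          // unknown product in a known category
    }
    return 'product_detail';         // /products/kitchen/gravy_boat/
}
```

In index.php this would be called as resolve_products_page($_GET, $catalogue), and the returned name used to pick a template. Unknown categories and products fall through to a "not found" result, which is one reasonable choice among several.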

    Using this URL rewrite method, the URL structure of the site becomes more strictly enforced, which in the end makes the URLs themselves more logical and hence more meaningful. I find that using the first URL argument value as the primary "switch" for different sections of a site is logical both for the programmer and for the site visitor whose eyes happen to wander up to the address bar (or for Googlebot!).
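A minimal sketch of that first-argument "switch" might look like this. The section names and the route_section function are hypothetical, and it assumes a request for the site root reaches index.php with no "a" argument at all.

```php
<?php
/**
 * Pick a site section from the first URL argument.
 * $args is expected to look like $_GET; section names are illustrative.
 */
function route_section(array $args): string
{
    // A missing 'a' means no rewrite rule matched, i.e. the site root.
    $section = $args['a'] ?? '';

    switch ($section) {
        case '':
            return 'home';
        case 'products':
            return 'products';
        case 'about':
            return 'about';
        default:
            return 'not_found';   // anything we did not plan a URL for
    }
}
```

Each section can then look at $_GET['b'], $_GET['c'] and so on in whatever way makes sense for it, so the URL layout of the whole site is decided in one place.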

Nice.

