Experimenting with Googlebot

In my previous post ‘Blogs are fundamentally flawed…‘ I observed that, more often than not, search results direct a user to an index-style page containing the post rather than to the post’s ‘permalink’ location. This makes for a poor user experience: on a busy blog the post has almost certainly moved down or off the index page since it was spidered. Google in particular appeared to be the worst offender.
Discussing the subject with Gerry, we concluded this is most likely down to Google’s PageRank technology, where index-style pages carry a higher value than the post pages themselves. To get around this he suggested manipulating ‘robots’ directives within the index-style pages themselves.
On Google’s “Information for Webmasters” help page I found that, when spidering, Googlebot looks for special ‘robots.txt’ directives and meta tags addressed to it alone. This meant I could single out Googlebot for these directives without affecting other search engines (which don’t exhibit the problem as much).
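For illustration only (these paths are hypothetical, not my actual configuration), a ‘robots.txt’ rule addressed to Googlebot alone would look like this; the per-page alternative is a “googlebot” meta tag, which is the route I take below:
[code]# Hypothetical example - not my real robots.txt
User-agent: Googlebot
Disallow: /category/
Disallow: /2006/01/[/code]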
I basically want Google to ‘FOLLOW’ links on all pages, but not to ‘INDEX’ index-style pages such as categories and date-based archives. The desired effect is that Google can find all posts as before but simply ignores the index-style pages themselves. Implementing this is quite simple; I modified my theme’s “header.php” file, inserting the following code in the “head” section:
[php]<?php if ( !is_single() && !is_page() && !is_home() )
	echo "<meta name=\"googlebot\" content=\"NOINDEX,FOLLOW\" />\n";
?>[/php]
This reads almost literally: if this is not a single post view, not a page view, and not the home page, emit the “meta…” tag. Although the home page is an index-style page, I am reluctant to add ‘NOINDEX’ to it because I don’t want it disappearing from search results. 😉
Now the long wait for the changes to be reflected in Google’s results.
Updated 24th January 2006 – Gerry pointed out this can be optimised using De Morgan’s Law 😛
[php]<?php if ( !( is_single() || is_page() || is_home() ) )
	echo "<meta name=\"googlebot\" content=\"NOINDEX,FOLLOW\" />\n";
?>[/php]
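For anyone wondering, De Morgan’s Law here is simply the equivalence !A && !B && !C ≡ !(A || B || C). A quick standalone sketch (illustrative only, not part of the theme) verifies it over every combination of inputs:
[php]<?php
// De Morgan's Law: !A && !B && !C is equivalent to !(A || B || C).
// Check all eight true/false combinations.
foreach (array(false, true) as $a) {
	foreach (array(false, true) as $b) {
		foreach (array(false, true) as $c) {
			assert((!$a && !$b && !$c) === !($a || $b || $c));
		}
	}
}
echo "Equivalence holds for every combination.\n";
?>[/php]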


Comments

2 responses to “Experimenting with Googlebot”

  1. >- Gerry pointed out this can be optimised using De Morgan’s Law
    any examples of what he means by that? De Morgan’s laws are a way to switch from positive to negative logic, or vice versa.
    The reason to use De Morgan’s laws is to reduce the condition set: if (reduced condition here). Most times you end up using a truth table, or better techniques like a K-map for hard problems; however, some cases are trivial, like the following:
    Serial line detector: based on the value 0 or 1 and the current state, a specific operation needs to happen.
    This fits positive logic.
    Parallel line detector: where several bits are considered + the current state.
    Negative logic is more appropriate.
    If done with positive logic, and depending on how many bits, the designer would have to write every possible case for when to change to the next state. However, using negative logic the designer only has to explicitly state which cases (which will be a lot fewer) would cause the circuit to stay in the current state (the sketch below makes this concrete).
    By the way, very nice content you’ve got here. (favorite-worthy)
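A small, purely illustrative PHP sketch of the positive/negative logic trade-off described above (the four-bit detector and its single ‘hold’ case are invented for the example):
[php]<?php
// Hypothetical 4-bit parallel line detector, illustrative only.
// Suppose the circuit holds its current state only when all bits are 0.
// Positive logic would enumerate all 15 advancing combinations;
// negative logic states the single holding case and negates it.
function advances(array $bits) {
	return !($bits === array(0, 0, 0, 0)); // negative logic
}

var_dump(advances(array(0, 0, 0, 0))); // bool(false) - hold current state
var_dump(advances(array(1, 0, 1, 0))); // bool(true)  - move to next state
?>[/php]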
