A major item that a lot of sites overlook, and a lot of SEOs don’t really understand, is the “site search” function. I’m going to give you a scenario that three of our clients have faced, with similar results.
Let’s say you own a real estate site with listings pulled from the MLS database via IDX or some other listing server. None of the content your on-site search results display is in any way different from what every other real estate site online shows. That means that even if Yahoo reports 20,000 of your pages as indexed, fewer than 1% of them have a good chance of showing up unless someone searches on the exact title of one of your listing pages.
This idea also holds true for pagination issues in shopping carts and blog pages, but we won’t be delving into those two items in this post. So you basically have two choices (I’m assuming your site isn’t homes.com, realtor.com, or one of the other large nationwide realty sites):
1. You can write unique titles, meta descriptions and listing descriptions for each listing, which would take you the rest of your life.
2. You can exclude all your search results from the search engine spiders by adding an exclusion for your on-site search results subfolder to your robots.txt file.
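If you want to sanity-check the second option before you rely on it, here is a minimal sketch using Python's standard `urllib.robotparser`. It assumes your search results live under a subfolder named /search/ and that your robots.txt contains only the two lines shown; the subfolder name, the rules, the sample URLs, and the `is_crawlable` helper are all illustrative assumptions, not anything taken from a real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents. The /search/ subfolder name is an
# assumption made for this example; use whatever folder holds your results.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search/
"""


def is_crawlable(url: str, robots_txt: str = ROBOTS_TXT, agent: str = "*") -> bool:
    """Return True if the given robots.txt rules let `agent` fetch `url`."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Illustrative URLs only: the index page, one main search page,
    # and an individual search result inside the excluded subfolder.
    for url in (
        "http://www.yoursite.com/index.php",
        "http://www.yoursite.com/search.html",
        "http://www.yoursite.com/search/results?name=something",
    ):
        print(url, "->", "crawlable" if is_crawlable(url) else "blocked")
```

Under these assumed rules the index page and a page like search.html stay open to spiders, while anything inside /search/ is blocked. Individual search engines may interpret robots.txt slightly differently from this parser, so treat it as a quick check rather than a guarantee.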
What if your search results get displayed using something like this: www.yoursite.com/index.php?search=action&name=something
Can you use the robots.txt file to exclude that search result, or any search result on the index.php page, without also excluding your index page from the bots? The answer in this example is no, at least not cleanly: a rule that disallows /index.php blocks your index page along with every search result served from it. But if you’re going to be deindexing your search pages anyway, you can move your search results to a separate subfolder called /search/ or something similar. Then you can use your robots.txt file to keep the spiders out of all of your search listings while still retaining one main search page, perhaps at search.html. The exclusion would look exactly like this in your robots.txt file: