
All About Site Indexation and the Crawl Toward Real-Time Search

Aug 14, 2009 / by Scott Cowley

Recently, there has been a lot of buzz about real-time search, but is it necessary? First, let's look at the current state of search and crawl.*

Unless your site is decidedly authoritative, like CNN.com, you're most likely to get crawled when Google follows links from more authoritative sites that point to your own. Over time, your site will end up on a particular crawling schedule.

The lengthening or shortening of the crawl schedule, with blogs especially, is largely determined by the amount of new content found on the site each time it's crawled. In the chart below, the diagonal lines represent getting crawled by the search engine and the ominous black spots represent posting new content. In this case, if you haven't posted in a while, you've probably worked up a fairly large interval between crawls. If you suddenly return to posting on a consistent schedule, over time the crawl interval will be narrowed until your content gets indexed soon after posting.

In essence, you can and should train Google to index your site more frequently by posting new content regularly or by getting new backlinks to your site.

[Chart: posting and indexing over time]
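The way a crawl schedule tightens or loosens can be sketched as a toy simulation. This is only an illustration of the idea described above, not how Google actually schedules crawls: the halve-on-new-content and double-on-nothing rule, the function name `simulate_crawls`, and all of the interval values are assumptions made up for the example.

```python
def simulate_crawls(post_days, total_days, start_interval=8.0,
                    min_interval=1.0, max_interval=32.0):
    """Toy model of an adaptive crawl schedule (assumed rule, not Google's).

    Each crawl checks whether anything was posted since the previous crawl.
    New content halves the wait until the next crawl; no new content doubles it.
    Returns a list of (crawl_day, interval_after_crawl) tuples.
    """
    posts = sorted(post_days)
    interval = start_interval
    day = 0.0
    last_crawl = -1.0  # so a post on day 0 counts as new on the first crawl
    history = []
    while day <= total_days:
        found_new = any(last_crawl < p <= day for p in posts)
        if found_new:
            interval = max(min_interval, interval / 2)
        else:
            interval = min(max_interval, interval * 2)
        history.append((day, interval))
        last_crawl = day
        day += interval
    return history


# A blog that posts every day versus one that never posts.
steady = simulate_crawls(range(0, 60), total_days=59, start_interval=16.0)
idle = simulate_crawls([], total_days=100, start_interval=8.0)
print("steady blog, final crawl interval:", steady[-1][1])
print("idle blog, final crawl interval:", idle[-1][1])
```

In this sketch the steady poster is crawled at the minimum interval within a few visits, while the idle site drifts out to the maximum gap between crawls.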

Real-time indexation is just what it sounds like. Content is indexed and searchable immediately upon publication. None of the big three engines are there yet.

[Chart: posting and indexing in real time]

Is real-time indexing by search engines (and hence real-time search) inevitable? It's starting to appear so.

Twitter is already considered to be real-time, though it's far from a genuine search engine. Microsoft seems to have tweaked Bing to place higher value on more recent news. In tests, Google Caffeine, the search giant's new underlying infrastructure, seems to index a lot more pages and give higher placement to the newest content than the current version does. And Facebook's FriendFeed acquisition suggests they're definitely eyeing the real-time search space.

Real-time search helps anybody who reads or writes content with a short shelf-life. If you post about an in-progress disaster, a celebrity death, or a limited-time offer, your content is hot one minute, cold the next, so quick indexation by search engines means that your content will be found while it's still relevant. You would probably gain a good amount of site traffic just by riding the wave and capitalizing on long-tail searches, regardless of how frequently you post.

The real-time search goal has plenty of obstacles. Real-time indexation takes a mountain of computing power. Plus, algorithmically, how do you consistently showcase an on-scene Twitterer's play-by-play updates over the Huffington Post's side commentary during a crisis? Or do you? You can't use backlinks as a determinant, because brand-new content hasn't had time to earn any. Authority is negligible. One practical solution would be to house real-time search separately from regular search, just like Google News is separate from the primary index. Regardless, real-time search is only as valuable as the relevance of the top-ranking content and is likely to look different from today's version.

Until we get there, the most important thing you can do now is get your site as close as possible to real-time indexation using the available SEO techniques.

  • Create good content on a consistent schedule, apply other relevant SEO tactics to optimize your site, and build up your authority
  • Create sitemaps for your site so search engines know which pages to crawl
  • Use nofollow on links to non-critical pages as a way of shining a light on the more important ones
  • Submit your site and content to directories and social bookmarking sites
  • Work on building links from more authoritative sites pointing to your own
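As an illustration of the sitemap item in the list above, here is a minimal sketch that builds a sitemap in the sitemaps.org XML format using Python's standard library. The function name `build_sitemap` and the example.com URLs and dates are made up for the example; a real sitemap would list your own pages and is usually saved as sitemap.xml at the site root.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML as a string.

    pages: iterable of (url, lastmod) pairs, with lastmod as 'YYYY-MM-DD'.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages, for illustration only.
example_pages = [
    ("https://www.example.com/", "2009-08-14"),
    ("https://www.example.com/blog/real-time-search", "2009-08-14"),
]
print(build_sitemap(example_pages))
```

Keeping the lastmod dates accurate gives crawlers a hint about which pages have changed since their last visit.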

*For clarification, crawling (or spidering) is the method search engines use to populate their data repositories so people can search using their websites. It involves running programs called bots (or spiders) that go from link to link scouring web pages and returning information to be indexed.
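To make the link-to-link idea concrete, here is a rough sketch of a spider walking a tiny, made-up link graph held in memory. The site names and the `crawl` function are hypothetical, and a real bot would fetch pages over the network, respect robots.txt, and do far more; this only shows how following links from a well-known site leads a crawler to smaller ones.

```python
from collections import deque


def crawl(link_graph, seed):
    """Visit pages breadth-first by following links, starting from seed.

    link_graph: dict mapping a page name to the list of pages it links to.
    Returns the pages in the order the spider reached them.
    """
    seen = {seed}
    queue = deque([seed])
    order = []
    while queue:
        page = queue.popleft()
        order.append(page)  # a real engine would hand the page to the indexer here
        for target in link_graph.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return order


# Hypothetical mini-web: the big site links to your blog, nobody links to "orphan".
links = {
    "bignews.example": ["yourblog.example", "otherblog.example"],
    "yourblog.example": ["yourblog.example/new-post", "bignews.example"],
    "otherblog.example": [],
    "orphan.example": ["yourblog.example"],
}
print(crawl(links, "bignews.example"))
```

Note that the page with no inbound links is never reached in this sketch, which is why backlinks from sites that already get crawled matter so much.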


Written by Scott Cowley

He graduated from BYU with a B.S. in marketing and returns occasionally to guest lecture on social media, blogging, and SEO. He has experience managing successful SEO campaigns for a variety of clients. He is very involved with Social Media Club in Salt Lake City.
