All About Site Indexation and the Crawl Toward Real-Time Search

Recently, there has been a lot of buzz about real-time search, but is it necessary? First, let’s look at the current state of search and crawl.*

Unless your site is decidedly authoritative, like CNN.com, you’re likely to get crawled as Google indexes more authoritative sites that are linking to your own. Your site will end up on a particular crawling schedule.

The lengthening or shortening of the crawl schedule, with blogs especially, is largely determined by the amount of new content found on the site each time it’s crawled. In the chart below, the diagonal lines represent getting crawled by the search engine and the ominous black spots represent posting new content. In this case, if you haven’t posted in a while, you’ve probably worked up a fairly large interval between crawls. If you suddenly return to posting on a consistent schedule, the crawl interval will narrow over time until your content gets indexed soon after posting.

In essence, you can and should train Google to index your site more frequently by posting new content regularly or by getting new backlinks to your site.

[Chart: post and index]
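
The narrowing interval described above can be pictured with a toy model. This is only a sketch: the halving and doubling rule, the function name `simulate_crawl_intervals`, and every number in it are assumptions made for illustration, not how Google or any other engine actually schedules crawls.

```python
def simulate_crawl_intervals(post_days, total_days, start_interval=8,
                             min_interval=1, max_interval=16):
    """Toy model (an assumption, not a real engine's rule): halve the wait
    after a crawl that finds new posts, double it after one that finds none."""
    posts = sorted(post_days)
    interval = start_interval
    last_crawl = 0
    next_crawl = start_interval
    history = []  # (crawl day, new posts found, interval until next crawl)
    while next_crawl <= total_days:
        # Count posts published since the previous crawl.
        found = sum(1 for day in posts if last_crawl < day <= next_crawl)
        if found:
            interval = max(min_interval, interval // 2)
        else:
            interval = min(max_interval, interval * 2)
        history.append((next_crawl, found, interval))
        last_crawl = next_crawl
        next_crawl += interval
    return history


if __name__ == "__main__":
    # Hypothetical blog: silent for 16 days, then one post every day.
    daily_posts = range(17, 41)
    for day, found, interval in simulate_crawl_intervals(daily_posts, 40):
        print(f"day {day:2d}: {found} new post(s), next crawl in {interval} day(s)")
```

In this made-up run the gap between crawls first stretches during the quiet period, then shrinks step by step once daily posting resumes, until each post is picked up within a day.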

Real-time indexation is just what it sounds like. Content is indexed and searchable immediately upon publication. None of the big three engines are there yet.

[Chart: real-time post and index]

Is real-time indexing by search engines (and hence real-time search) inevitable? It’s starting to appear so.

Twitter is already considered to be real-time, though it’s far from a genuine search engine. Microsoft seems to have tweaked Bing to place higher value on more recent news. In tests, Google Caffeine, the search giant’s new under-the-hood infrastructure, seems to be indexing a lot more pages and giving higher placement to the newest content than the current version does. And Facebook’s FriendFeed acquisition suggests they’re definitely eyeing the real-time search space.

Real-time search helps anybody who reads or writes content with a short shelf-life. If you post about an in-progress disaster, a celebrity death, or a limited-time offer, your content is hot one minute, cold the next, so quick indexation by search engines means that your content will be found while it’s still relevant. You would probably gain a good amount of site traffic just by riding the wave and capitalizing on long-tail searches, regardless of how frequently you post.

The real-time search goal has plenty of obstacles. Real-time indexation takes a mountain of computing power. Plus, algorithmically, how do you consistently showcase an on-scene Twitterer’s play-by-play updates over the Huffington Post’s side commentary during a crisis? Or do you? You can’t use backlinks as a determinant, because brand-new content hasn’t had time to earn any. Authority is negligible. One practical solution would be to house real-time search separately from regular search, just as Google News is separate from the primary index. Regardless, real-time search is only as valuable as the relevance of the top-ranking content and is likely to look different from today’s version.

Until we get there, the most important thing you can do now is get your site as close as possible to real-time indexation using the available SEO techniques.

  • Create good content on a consistent schedule, apply other relevant SEO tactics to optimize your site, and build up your authority
  • Create sitemaps for your site so search engines know which pages to crawl
  • Use NoFollow attributes (rel="nofollow") on links to non-critical pages as a way of shining a light on the more important ones
  • Submit your site and content to directories and social bookmarking sites
  • Work on building links from more authoritative sites pointing to your own
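
For the sitemap item in the list above, here is a minimal sketch of generating a sitemap file with Python's standard library. The function name `build_sitemap` and the example.com URLs are made up for illustration. The element names (`urlset`, `url`, `loc`, `lastmod`) and the namespace come from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from (url, lastmod) pairs.

    lastmod is a YYYY-MM-DD string, or None to leave the element out.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, lastmod in pages:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url  # special characters are escaped on output
        if lastmod:
            ET.SubElement(entry, "lastmod").text = lastmod
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Hypothetical pages; replace with your own URLs.
    print(build_sitemap([
        ("https://www.example.com/", "2009-09-01"),
        ("https://www.example.com/blog/real-time-search", None),
    ]))
```

Save the output as sitemap.xml at the root of your site. You can then point search engines to it, for example with a Sitemap line in robots.txt.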

*For clarification, crawling (or spidering) is the method search engines use to populate their data repositories so people can search using their websites. It involves running programs called bots (or spiders) that go from link to link scouring web pages and returning information to be indexed.
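
To make the link-to-link idea concrete, here is a toy crawler. It is a sketch under heavy assumptions: the "web" is a hypothetical four-page site held in a dictionary (`FAKE_WEB`), nothing is fetched over the network, and real bots add politeness rules, scheduling, and much more.

```python
from collections import deque
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch):
    """Breadth-first crawl; `fetch` returns a page's HTML, or None if missing."""
    index = {}  # url -> HTML handed off "to be indexed"
    queue = deque([start_url])
    seen = {start_url}
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue  # broken link: nothing to index
        index[url] = html
        parser = LinkExtractor()
        parser.feed(html)
        for link in parser.links:
            if link not in seen:  # visit each page only once
                seen.add(link)
                queue.append(link)
    return index


# Hypothetical site held in memory instead of fetched over HTTP.
FAKE_WEB = {
    "/": '<a href="/blog">Blog</a> <a href="/about">About</a>',
    "/blog": '<a href="/">Home</a> <a href="/blog/post-1">Post</a>',
    "/about": "<p>No links here.</p>",
    "/blog/post-1": '<a href="/missing">Broken</a>',
}

if __name__ == "__main__":
    for url in crawl("/", FAKE_WEB.get):
        print("indexed", url)
```

Note that a page only gets found if some already-crawled page links to it, which is why links from sites that are crawled often matter so much.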

6 Comments

  1. Jacob Stoops says

    Real-time search & indexing is an interesting concept and it most certainly is the future. I wonder how search engines will know how or where to rank a page/article immediately on publish?

    Ought to be fun to watch.

  2. Matt Inertia says

    Nice article!

    Not got anything to add really, other than: don’t add nofollows to internal links. Try to accomplish the same thing by simply changing your site navigation.

  3. Scott Cowley says

    @Jacob: That’s a great question, which is why I think they shouldn’t be lumped together. I wonder if real-time really is the future though. It’s certainly valuable and the hype machine has convinced us that we want it, but just like social media, it has the potential to be a major distraction.

    @Matt: Optimizing the site navigation is a great suggestion. Using NoFollow on internal pages doesn’t help channel link juice, as we’ve heard, but for indexing purposes it means that the spider will at least index the ones that matter by blocking the ones that don’t.

  4. Alex says

    Hi mate,

    This would be the end of SEO a year from now, as Google is targeting social + local + personalized results. There won’t be any point in optimizing for Twitter real-time results because they won’t stay on the page for long; the links would change every minute.

    For SEO its dead end but for USERS its awesome!!
    Local results + blog results + real time results + image results + video results combine would have a huge impact on rankings.

    Google would 100% determine all the queries by local trends and highly generated keyword research traffic which goog want to show up in the first box results.

    Moving to social and optimizing it to get traffic is the only solution. Rankings are dead; what matters is direct traffic and communicating in real time with real-time users, which is really a lot of work.

    Directories will fade away
    LINK building will fade away
    SOCIAL & Content is the king.

    Regards
    Alex

  5. Alex Zagoumenov says

    Scott, thx a lot for an easy to follow and valuable write-up! Are nofollow attributes still critical to direct the juice flow today? Thanks in advance!

  6. says

    I agree. I think SEO as we know it will not die, but it will certainly take a bit of a back seat in the years to come. If you think about it, though, search engine results should be made up of the best, most popular sites, not those that have the most keywords or backlinks, etc.
