Most of the popular web search engines prioritize the sites that they index by two major criteria: content and links. The indexable content (i.e. content viewable in the source code) gives the search engine spiders material which they can store in their databases and sort according to relevance. The links within a site guide the spiders through the site’s architecture. In many instances, web developers overlook how much links contribute to a site’s search engine visibility and traffic.
Links and Site Structure
A key goal in designing a web site’s architecture is to make sure that every page can be reached from every other page, either directly or indirectly. While most developers handle this when designing the site navigation system, some pages may inadvertently be left “orphaned”. If no other page on the site links to a given page, the spiders cannot reach it by following the site’s links.
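One way to catch orphaned pages is to treat the site as a graph and walk it from the home page, the way a spider would. The sketch below is only an illustration: the `site` dictionary, its page names, and the `find_orphans` helper are hypothetical, and a real audit would build the link map by crawling the site or reading its files.

```python
from collections import deque

def find_orphans(links, home):
    """Return pages that cannot be reached by following links from `home`."""
    reachable = {home}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in reachable:
                reachable.add(target)
                queue.append(target)
    # Every page known to the site, whether it appears as a source or a target.
    all_pages = set(links) | {t for targets in links.values() for t in targets}
    return sorted(all_pages - reachable)

# Hypothetical link map: each page maps to the pages it links to.
site = {
    "/index.html": ["/about.html", "/products.html"],
    "/about.html": ["/index.html"],
    "/products.html": ["/products/widget.html"],
    "/products/widget.html": [],
    "/old-promo.html": ["/index.html"],  # links out, but nothing links to it
}

orphans = find_orphans(site, "/index.html")
print(orphans)  # ['/old-promo.html']
```

Note that a page linking out to the rest of the site is still an orphan if nothing links in to it.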
Links and JavaScript
Some developers prefer to generate links with JavaScript routines rather than with plain HTML. While these JavaScript links may improve the user experience, they can hamper the site’s search engine ranking. Because spiders behave like the most technically primitive browsers, they will either fail to parse some JavaScript methods or give these links less significance than they deserve.
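To see why script-driven links go missing, it helps to look at a page the way a very simple spider might: reading only the `href` attributes of anchor tags and running no scripts. The following is a minimal sketch under that assumption, not a model of any particular search engine; the `LinkExtractor` class and the sample markup are made up for illustration.

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collects plain HTML links the way a script-blind spider might."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Skip anchors with no href and pseudo-links that only run a script.
        if href and not href.lower().startswith("javascript:"):
            self.links.append(href)

# Hypothetical page with one HTML link and two script-driven links.
page = """
<a href="/products.html">Products</a>
<a href="javascript:goTo('/specials.html')">Specials</a>
<span onclick="window.location='/contact.html'">Contact</span>
"""

parser = LinkExtractor()
parser.feed(page)
print(parser.links)  # ['/products.html']
```

Only the plain HTML link is discovered; the other two destinations exist for visitors but are invisible to this kind of parser.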
Links and Flash
Many designers and developers spend dozens of man-hours creating the most dynamic, most eye-catching and most dazzling Flash animations. These interactive movies can bring a site to life for a visitor. However, they can also be death to a search engine spider. Spiders typically don’t “see” Flash (or other dynamic browser plug-ins), so they won’t follow any links contained within the animations. A primitive site, with strong content and logical links, can bring in traffic rivaling that of the 1977 premiere of “Star Wars”, while a site with better effects can earn less respect from search engines than audiences gave Jar Jar Binks.
Links and Robots
One of the more problematic issues facing both web site developers and SEO marketers is the use of the “robots.txt” file. This file allows the developer to permit or block search engine spiders from crawling specific files or directories within a site. For instance, if the developer wanted to block a spider from crawling its error pages, the robots.txt file would look like this:
User-agent: *
Disallow: /error
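The error-page rule above can be checked before it is deployed. As a rough sketch, Python's standard `urllib.robotparser` module can parse the two lines and report whether a URL may be fetched; the `www.example.com` addresses are placeholders, not part of the original example.

```python
from urllib.robotparser import RobotFileParser

rules = RobotFileParser()
rules.parse([
    "User-agent: *",
    "Disallow: /error",
])

# Rules match by path prefix, so everything under /error is blocked.
error_ok = rules.can_fetch("*", "http://www.example.com/error/404.html")
index_ok = rules.can_fetch("*", "http://www.example.com/index.html")
print(error_ok, index_ok)  # False True
```

Because matching is by prefix, this rule would also block a page such as /errors.html; writing the path as /error/ narrows it to the directory.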
The developer can also write rules for a specific spider (e.g. Googlebot), granting it access to a file or folder while shutting out the other spiders. The robots.txt file also lets developers set permissions for individual files within a blocked directory:
User-agent: Googlebot
Disallow: /comics/DC/
Allow: /comics/DC/Batman.html
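When an Allow and a Disallow rule both match a URL, the outcome depends on how the spider resolves the overlap. The sketch below assumes the "most specific rule wins" convention described in RFC 9309, where the longest matching path takes precedence and Allow wins a tie. It is a simplified illustration only: the `is_allowed` helper is hypothetical, it ignores wildcards, and individual spiders may behave differently.

```python
def is_allowed(rules, path):
    """Evaluate (directive, prefix) pairs for one user-agent group.

    Assumes longest-match precedence; wildcards are not handled.
    """
    best_len = -1
    verdict = True  # no matching rule means the path may be crawled
    for directive, prefix in rules:
        # An empty prefix matches nothing; skip rules that do not apply.
        if not prefix or not path.startswith(prefix):
            continue
        allow = directive.lower() == "allow"
        # The longer prefix wins; on a tie, Allow wins.
        if len(prefix) > best_len or (len(prefix) == best_len and allow):
            best_len = len(prefix)
            verdict = allow
    return verdict

googlebot_rules = [
    ("Disallow", "/comics/DC/"),
    ("Allow", "/comics/DC/Batman.html"),
]

print(is_allowed(googlebot_rules, "/comics/DC/Batman.html"))     # True
print(is_allowed(googlebot_rules, "/comics/DC/Superman.html"))   # False
print(is_allowed(googlebot_rules, "/comics/Marvel/index.html"))  # True
```

Under this convention the order of the two lines does not matter; a parser that stops at the first matching rule would instead block Batman.html as written.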
To keep spiders from indexing a specific page or following its links, developers can add a “robots” meta tag to the page’s head:
<meta name="robots" content="noindex,nofollow">
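A quick way to confirm what a page tells the spiders is to read its robots meta tag back out of the HTML. This is a minimal sketch assuming the standard directive spellings `noindex` and `nofollow`; the `RobotsMetaReader` class and the sample markup are illustrative, not taken from any crawler.

```python
from html.parser import HTMLParser

class RobotsMetaReader(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            content = attrs.get("content") or ""
            # Directives are comma-separated and case-insensitive.
            self.directives.update(
                d.strip().lower() for d in content.split(",") if d.strip()
            )

reader = RobotsMetaReader()
reader.feed('<head><meta name="robots" content="noindex,nofollow"></head>')

may_index = "noindex" not in reader.directives
may_follow = "nofollow" not in reader.directives
print(may_index, may_follow)  # False False
```

A misspelled directive would simply not match either check here, which is a reasonable hint that spiders may ignore it as well.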
The cardinal rule for SEO links should be, “When in doubt, spell it out.” The spiders will index the links, the site will gain more traffic, the client will make more money, and the developer will be in high demand. Everybody wins!