Understanding the Search Services
Search services can generally be categorized into two types of sources: directories and search engines. Many people confuse the two terms, often referring to Yahoo! as a search engine. (Yahoo! is a directory.)
The reason for the confusion is understandable. People see a Search button on a web site and assume that when they click the button, they are using a search engine. Both Yahoo! and Google have search boxes, as shown in Figure 1.1.
Figure 1.1 Although both Yahoo! and Google enable people to search, the information they provide in their search results is different.
The search services use two main sources to obtain their listings. The first type of search service is called a directory, and a directory uses human editors to manually place web sites or web pages into specific categories. A directory is commonly called a "human-based" search engine.
The other type of search service is called a search engine, and a search engine uses special software robots, called spiders or crawlers, to retrieve information from web pages. This type of search service is called a "spider-based" or "crawler-based" search engine.
Many search services are a hybrid of a search engine and a directory. A hybrid search service usually gets most of its listings from one source; thus, hybrid search services are classified according to the main source used. If a hybrid search service gets its primary results from a directory and its secondary results from a search engine, the search service is generally classified as a directory.
MSN Search is classified as a directory. Its primary results come from the LookSmart database, and its secondary (fall-through) results currently come from Inktomi, a search engine.
Most search engine marketers label both search engines and directories as "search engines," even though search engines and directories have unique characteristics. Web site owners need to understand the differences between the two terms because the strategies for getting listed well in search engines are quite different from the strategies for getting listed well in directories.
What differentiates a search engine from a directory is that the directory databases consist of sites that have been added by human editors. Search engine databases are compiled through the use of special software robots, called spiders, to retrieve information from web pages.
Search engines perform three basic tasks:
Search engine spiders find and fetch web pages, a process called crawling or spidering, and build lists of words and phrases found on each web page.
Search engines keep an index (or database) of the words and phrases they find on each web page they are able to crawl. The part of the search engine that places the web pages into the database is called an indexer.
Search engines then enable end users to search for keywords and keyword phrases found in their indices. Search engines try to match the words typed in a search query with the web page that is most likely to have the information for which end users are searching. This part of the search engine is called the query processor.
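The three tasks above can be sketched in a few lines of Python. This is only a toy illustration, not how any particular search engine is built: the page text stands in for what a spider has already fetched, and the names PAGES, tokenize, build_index, and search are hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical page text standing in for documents a spider has already fetched.
PAGES = {
    "http://www.companyname.com/index.html": "Organic teas and herbal teas",
    "http://www.companyname.com/about.html": "About our organic tea company",
}


def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Indexer: map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


def search(index, query):
    """Query processor: return the URLs that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


index = build_index(PAGES)
print(sorted(search(index, "organic teas")))
```

Notice that the query is answered entirely from the index. The sketch does exact word matching only, so a page that says "tea" is not returned for the query "teas."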
How do search engines begin finding web pages? The usual starting points are lists of heavily used servers from major Internet service providers (ISPs), such as America Online, and the most frequently visited web sites, such as Yahoo!, the Open Directory, LookSmart, and other major directories. Search engine spiders will begin crawling these popular sites, indexing the words on every single page of a site and following every link found within a site. This is one of the major reasons it is important for a web site to be listed in the major directories.
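To see why links matter, consider the following minimal sketch of link-following. It assumes a tiny, made-up link graph (LINKS) instead of real HTTP requests, and the crawl function is a simple breadth-first traversal chosen for illustration; actual spiders are far more elaborate.

```python
from collections import deque

# Hypothetical link graph: each URL maps to the URLs it links to.
LINKS = {
    "http://directory.example/": ["http://www.companyname.com/"],
    "http://www.companyname.com/": [
        "http://www.companyname.com/about.html",
        "http://www.companyname.com/teas.html",
    ],
    "http://www.companyname.com/about.html": ["http://www.companyname.com/"],
    "http://www.companyname.com/teas.html": [],
    "http://www.companyname.com/orphan.html": [],
}


def crawl(seeds, links):
    """Visit the seed URLs, then every page reachable by following links."""
    seen = set(seeds)
    queue = deque(seeds)
    order = []
    while queue:
        url = queue.popleft()
        order.append(url)
        for target in links.get(url, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return order


crawled = crawl(["http://directory.example/"], LINKS)
for url in crawled:
    print(url)
```

In this sketch, the page orphan.html is never reached because nothing links to it, while the rest of the site is found through the single directory listing.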
What Is a URL?
A uniform resource locator (URL) is an address referring to the location of a file on the Internet. In terms of search engine marketing, it is the address of an individual web page element or web document on the Internet.
Many people believe a URL is the same as a domain name or home page, but this is not so. Every web document and web graphic image on a web site has a URL. The syntax of a URL consists of three elements:
The protocol, or the communication language, that the URL uses.
The domain name, or the exclusive name, that identifies a web site.
The pathname of the file to be retrieved, usually related to the pathname of a file on the server. The file can contain any type of data, but only certain files, usually an HTML document or a graphic image, are interpreted directly by most browsers.
For example, the URL for a home page is commonly written as follows: http://www.companyname.com/index.html.
The http:// is the protocol (Hypertext Transfer Protocol).
The www.companyname.com is the domain name.
The index.html is the pathname. In this example, it is a Hypertext Markup Language (HTML) document named index.
The URL for an About Us page for a company called TranquiliTeas is commonly written as this: http://www.tranquiliteasorganic.com/about.html.
The http:// is the protocol.
The www.tranquiliteasorganic.com is the domain name.
The about.html is the pathname.
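The three elements of a URL can be separated with Python's standard urllib.parse module. The helper name url_parts and the dictionary keys are chosen here for illustration only.

```python
from urllib.parse import urlsplit


def url_parts(url):
    """Split a URL into the three elements described above."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,   # for example, "http"
        "domain": parts.netloc,     # for example, "www.companyname.com"
        "pathname": parts.path,     # for example, "/index.html"
    }


print(url_parts("http://www.companyname.com/index.html"))
print(url_parts("http://www.tranquiliteasorganic.com/about.html"))
```

Note that the parser reports the pathname with its leading slash (/about.html), and that the two pages have different URLs even when they share a domain name.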
As a general rule of thumb, whenever you see Add URL or Submit URL to the search engines, remember that every web page has a unique URL.
Figure 1.2 outlines the search engine crawling process for a single web page.
Figure 1.2 How search engines crawl web pages.
Because search engine spiders are continuously crawling the web, their indices are constantly receiving new and updated data. Search engines generally update their indices every four to six weeks.
The search engine index contains full-text indices of web pages. Thus, when you perform a search query on a search engine, you are actually searching this full-text index of retrieved web pages, not the web itself.
To determine the most relevant URL for a search query, most search engines take the text information on a web page and assign a "weight" to the individual words and phrases on that page. An engine might give more "weight" to the number of times that a word appears on a page. An engine might assign more "weight" to words that appear in the title tags, meta tags, and subheadings. An engine might assign more "weight" to words that appear at the top of a document. This assigning of "weight" to a set of words on a web page is part of a search engine's algorithm, which is a mathematical formula that determines how web pages are ranked. Every search engine has a different formula for assigning "weight" to the words and phrases in its index.
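The idea of assigning "weight" can be illustrated with a small scoring sketch. The weights in FIELD_WEIGHTS are invented for this example; as noted, real formulas are confidential and differ from engine to engine, so treat this as one possible shape of such a formula and nothing more.

```python
import re

# Hypothetical weights, invented for illustration only.
FIELD_WEIGHTS = {"title": 3.0, "headings": 2.0, "body": 1.0}


def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(page, query):
    """Add up weight x occurrences of the query words in each part of the page."""
    query_words = set(tokenize(query))
    total = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        words = tokenize(page.get(field, ""))
        total += weight * sum(1 for word in words if word in query_words)
    return total


page_a = {"title": "Organic Tea", "headings": "Our teas", "body": "We sell organic tea."}
page_b = {"title": "Welcome", "headings": "", "body": "We sell organic tea and coffee."}

print(score(page_a, "organic tea"))
print(score(page_b, "organic tea"))
```

Both pages mention the phrase in their body text, but under these assumed weights page_a scores higher because the words also appear in its title.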
Search engine algorithms are kept highly confidential and change almost every day. Thus, no search engine optimization expert can ever claim to know an exact search engine algorithm at a specified point in time.
Submission Forms Versus Natural Spidering
Search engines also add web pages through submission forms, generally labeled as Add URL or Submit URL. The Submit URL form enables web site owners to notify the search engines of a web page's existence and its URL.
Unfortunately, unethical search engine marketers (called spammers) created automated submission tools that bombard submission forms with thousands of URLs. These URLs point to poorly written and constructed web pages that are of no use to a web site owner's target audience.
Most of the major search engines state that 95 percent of submissions made through the Add URL form are considered spam.
Because of the overwhelming spam problems, submitting a web page through an Add URL form does not guarantee that the search engines will accept your web page. Therefore, it is generally more beneficial for web pages to be discovered by a search engine spider during its normal crawling process.
However, a search engine optimization expert can do the following:
Ensure that targeted words and phrases are placed in a strategic manner on the web pages, no matter what the current algorithms are.
Ensure that spiders are able to access the web pages.
The key to understanding search engine optimization is comprehending Figure 1.2. Why? Because search engine spiders are always going to index text on web pages, and they are always going to find web pages by crawling links from web page to web page, from web site to web site. Anything that interferes with the process outlined in Figure 1.2 will negatively impact a site's search engine positions. If a search engine spider is not able to access your web pages, those pages will not rank well. If a search engine can access your web pages but cannot find your targeted keyword phrases on those web pages, those pages also will not rank well.
With a pay-for-inclusion model, a search engine includes pages from a web site in its index in exchange for payment. The pay-for-inclusion model is beneficial to search engine marketers and web site owners because (a) they know their web pages will not be dropped from a search engine index, and (b) any new information added to their web pages will be reflected in the search engines very quickly.
This type of program guarantees that your submitted web pages will not be dropped from the search engine index for a specified period of time, generally six months or a year. To keep your guaranteed inclusion in the search engine's index, you must renew your payment.
Submitting web pages in a pay-for-inclusion program does not guarantee that the pages will appear in top positions. Thus, it is best that pages submitted through pay-for-inclusion programs be optimized.
Search engine marketers find pay-for-inclusion programs save them considerable time and expense because a web page cannot rank if it is not included in the search engine index. Furthermore, pay-for-inclusion programs enable dynamic web pages to be included in the search engine index without marketers having to implement costly workarounds.
In contrast to pay-for-inclusion models, a pay-for-placement search engine guarantees top positions in exchange for payment. With pay-for-placement search engines, participants bid against each other to obtain top positions for specified keywords or keyword phrases. Typically, the higher the bid, the higher the web page ranks.
Participants are charged every time a person clicks through from the search results to their web sites. This is why pay-for-placement search engines are also referred to as "pay-per-click" search engines.
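The bidding and per-click charges can be sketched as follows. The participants and amounts are made up, bids are expressed in whole cents, and the sketch assumes that each click is charged at exactly the bid amount, which may not match the billing rules of any particular service.

```python
# Hypothetical bids for one keyword phrase, in cents per click.
BIDS = [("site-a", 25), ("site-b", 40), ("site-c", 10)]


def rank_bids(bids):
    """Order participants by bid, highest first."""
    return sorted(bids, key=lambda item: item[1], reverse=True)


def click_cost(bid_cents, clicks):
    """Total charge, assuming every click is billed at the bid amount."""
    return bid_cents * clicks


ranking = rank_bids(BIDS)
for position, (site, bid) in enumerate(ranking, start=1):
    print(position, site, bid)

# Cost of 12 clicks for the top bidder, in cents.
print(click_cost(ranking[0][1], 12))
```

Even a modest bid adds up quickly as click volume grows, which is why these programs need to be monitored closely.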
Many pay-for-placement search engines have excellent distribution networks, and the top two or three positions are often displayed in other search engines and directories. Paid placement advertisements are generally marked on partnered sites as "Featured Listings," "Sponsored Links," and so on.
If no one bids on a particular search term, the free, fall-through results are generally displayed from a search engine partner. For example, currently, the fall-through results for Overture.com come from Inktomi.
Participating in pay-for-placement programs can get expensive. Part 3, "Page Design Workarounds," discusses how to best utilize this type of service.
Search Engine Optimization Strategies
Search engine optimization is the process of designing, writing, coding (in HTML), programming, and scripting your entire web site so that there is a good chance that your web pages will appear at the top of search engine queries for your selected keywords. Optimization is a means of helping your potential customers find your web site.
To get the best overall, long-term search engine visibility, the following components must be present on a web page:
Text component
Link component
Popularity component
All the major search engines (Google, FAST Search, MSN Search, and other Inktomi-based engines) use these components as part of their search engine algorithm. Figure 1.3 illustrates the "ideal" web page that is designed and written for the search engines.
Figure 1.3 Known search engine algorithm components: text, link, and popularity.
Very few web pages can attain the "ideal" match for all search engine algorithms. In reality, most web pages have different combinations of these components, as illustrated in Figure 1.4.
Figure 1.4 Web site comparisons.
Sites perform well in the search engines overall when they have (a) all the components on their web pages and (b) optimal levels of all the components.
Text Component: An Overview
Because the search engines build lists of words and phrases on URLs, it naturally follows that to do well on the search engines, you must place these words on your web pages in the strategic HTML tags.
The most important part of the text component of a search engine algorithm is keyword selection. For your target audience to find your site on the search engines, your pages must contain keyword phrases that match the phrases your target audience is typing into search queries.
After you have determined the best keyword phrases to use on your web pages, you will need to place them within your HTML tags. Different search engines do not place emphasis on the same HTML tags. For example, Inktomi places some emphasis on meta tags; Google ignores meta tags. Thus, to do well on all the search engines, it is best to place keywords in all the HTML tags possible, without keyword stuffing. Then, no matter what the search engine algorithm is, you know that your keywords are optimally placed.
Keywords need to be placed in the following places:
Title tags
Visible body text
Meta tags
Graphic images (the alternative text)
The title tag and the visible body text are the two most important places to insert keywords because all the search engines index and place significant "weight" on this text.
Keywords in Your Domain Name
Many search engine marketers believe that placing keywords in your domain name and your filenames affects search engine positioning. Some search engine marketers believe that this strategy gives a significant boost, whereas others believe that the boost is minuscule.
One reason people believe the position boost is significant is that the words or phrases matching the words you typed in a query are highlighted when you view the search results. This occurrence is called search-term highlighting or term highlighting.
Search engines and directories might use term highlighting for usability purposes. The process is done dynamically using a highlighting application. This application simply takes your query words and highlights them in the search results for quick reference. Term highlighting merely indicates that query terms were passed through the application. In other words, in search results, just because a word is highlighted in your domain name does not necessarily mean that the domain name received a significant boost in search results.
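A highlighting application of this kind can be approximated in a few lines. The function name highlight and the use of square brackets as the "highlight" are assumptions made for this sketch; it is not the code any search service actually runs.

```python
import re


def highlight(text, query, start="[", end="]"):
    """Wrap every occurrence of a query word in the given markers."""
    words = [re.escape(word) for word in query.split()]
    if not words:
        return text
    pattern = re.compile("(" + "|".join(words) + ")", re.IGNORECASE)
    return pattern.sub(lambda match: start + match.group(0) + end, text)


print(highlight("www.tranquiliteasorganic.com", "organic teas"))
print(highlight("Organic teas from small farms", "organic teas"))
```

The sketch marks the letters "teas" and "organic" inside the domain name simply because they match the query text. Nothing in it looks at how the page was ranked.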
Many other factors determine whether a site will rank, and the three components (text, link, and popularity) have more impact on search engine visibility than using a keyword in a domain name.
Link Component: An Overview
The strategy of placing keyword-rich text in your web pages is useless if the search engine spiders have no way of finding that text. Therefore, the way your pages are linked to each other, and the way your web site is linked to other web sites, does impact your search engine visibility.
Even though search engine spiders are powerful data-gathering programs, HTML coding or scripting can prevent a spider from crawling your pages. Examples of site navigation schemes that can be problematic include the following:
Poor HTML coding on all navigation schemes: Browsers (Netscape and Explorer) can display web pages with sloppy HTML coding; search engine spiders are not as forgiving as browsers.
Image maps: Many search engines do not follow the links inside image maps.
Frames: Google, Inktomi, and Lycos follow links on a framed site, but the manner in which pages display in search results is not ideal.
Dynamic or database-driven web pages: Pages that are generated through scripts or databases, or that have a ?, &, $, =, +, or % in the URL, pose problems for search engine spiders. URLs with CGI-BIN in them can also be problematic.
Flash: Currently, only Google and FAST Search can follow the links embedded in Flash documents. The others cannot.
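As a quick self-check for the dynamic-URL item above, a short script can flag URLs that contain the characters listed. The function name spider_warnings and the sample URLs are hypothetical, and a flagged URL is only a candidate for closer review, not proof that a spider will skip it.

```python
# Characters named above as potentially problematic in URLs.
PROBLEM_CHARS = set("?&$=+%")


def spider_warnings(url):
    """Return a list of reasons a URL might be hard for a spider to crawl."""
    warnings = []
    found = sorted(char for char in PROBLEM_CHARS if char in url)
    if found:
        warnings.append("contains " + " ".join(found))
    if "cgi-bin" in url.lower():
        warnings.append("contains cgi-bin")
    return warnings


print(spider_warnings("http://www.companyname.com/about.html"))
print(spider_warnings("http://www.companyname.com/cgi-bin/catalog?id=7&cat=tea"))
```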
Therefore, when designing web pages, be sure to include a navigation scheme so that the spiders have the means to record the words on your web pages. Usually that means having two forms of navigation on a web site: one that pleases your target audience visually and one that the search engine spiders can follow.
Popularity Component: An Overview
The popularity component of a search engine algorithm consists of two subcomponents:
Link popularity
Click-through or click popularity
Attaining an optimal popularity component is not as simple as obtaining as many links as possible to a web site. The quality of the sites linking to your site holds more weight than the quantity of sites linking to your site. Because Yahoo! is one of the most frequently visited sites on the web, a link from Yahoo! to your web site carries far more weight than a link from a smaller, less visited site.
To develop effective link popularity to a site, the site should be listed in the most frequently visited directories. Yahoo!, LookSmart, and the Open Directory are examples of the most frequently visited directories.
More importantly, it can boost your search engine position if a directory that is associated with a search engine lists your site. For example, a site that is listed in LookSmart can be given higher visibility in an MSN Search.
Obtaining links from other sites is not enough to maintain optimal popularity. The major search engines and directories are measuring how often end users are clicking the links to your site and how long they are staying on your site and reading your web pages. They are also measuring how often end users return to your site. All these measurements constitute a site's click-through popularity.
The search engines and directories measure both link popularity (quality and quantity of links) and click-through popularity to determine the overall popularity component of a web site.
If a single page (web page 1) ranks well in the search engines and end users click the links to that web page and browse your site, web page 1's popularity level increases. If a different web page (web page 2) ranks well in the search engines for a different keyword phrase, web page 2's popularity level increases. The total page popularity of your site will increase your overall site's online visibility.
One of the reasons that a site's home page is more important than any other web page is that search engines assign a higher "weight" to it. In all likelihood, the home page is going to be the URL listed in the major directories, and the home page has more links to it from within the web site.
Figure 1.6 illustrates the popularity within a web site. Pages with more links pointing to them have a higher page popularity "weight."
Figure 1.6 How search engines measure web page popularity.
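The idea that pages with more links pointing to them carry more "weight" within a site can be approximated by counting inbound links. The site map below (SITE_LINKS) is invented, and a plain count is a deliberate simplification: it ignores link quality and click-through data entirely.

```python
from collections import Counter

# Hypothetical site: each page maps to the pages it links to.
SITE_LINKS = {
    "/index.html": ["/about.html", "/teas.html"],
    "/about.html": ["/index.html"],
    "/teas.html": ["/index.html", "/about.html"],
    "/contact.html": ["/index.html"],
}


def inbound_link_counts(links):
    """Count how many other pages on the site link to each page."""
    counts = Counter({page: 0 for page in links})
    for source, targets in links.items():
        for target in set(targets):
            if target != source:
                counts[target] += 1
    return counts


counts = inbound_link_counts(SITE_LINKS)
for page, total in counts.most_common():
    print(page, total)
```

In this made-up site, the home page collects the most inbound links, which matches the observation that the home page usually has more links to it from within the web site.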
Figure 1.7 illustrates the popularity of a web site, which search engines do not always measure. Search engines measure a web page's popularity; a web site owner also will measure a web site's popularity. Sites with more links pointing to them have a higher site popularity "weight."
Figure 1.7 How web site owners measure web site popularity.
Because popularity consists of multiple subcomponents and these subcomponents are always fluctuating, the popularity measurement is dynamic and cumulative.
All search engine marketing campaigns should begin with the popularity component because all the major search engines measure popularity as a part of their search engine algorithms. The quickest way to achieve an initial, effective popularity component is to have your site listed in what search engines consider reliable sources: the major directories.
Web directories use human editors to create their listings. When you submit a site to be included in a directory, a human editor reviews your site and determines whether to include your site in the directory. Human editors also discover sites on their own through searching or browsing the web.
Every web page (or site) listed in a directory is categorized in some way. The categories are typically hierarchical in nature, branching off into different subcategories. Web searchers can find sites in directories by browsing categories, or they can perform a keyword search for information.
For example, a company that sells "organic teas" might be listed in this Yahoo! category: Business and Economy > Shopping and Services > Food and Drinks > Drinks > Tea > Organic. If we place the categories in a vertical hierarchy, it will look like this:
Business and Economy
Shopping and Services
Food and Drinks
Drinks
Tea
Organic
In this example, the top-level category is called "Business and Economy." A subcategory of "Business and Economy" is "Shopping and Services." A subcategory of "Shopping and Services" is "Food and Drinks," and so on. As we move down (drill down) the category structure, notice that the categories get more and more specific.
A company that sells "herbal teas" might be listed in a different Yahoo! category: Business and Economy > Shopping and Services > Food and Drinks > Drinks > Tea > Herbal. Let's place this categorization into a vertical hierarchy:
Business and Economy
Shopping and Services
Food and Drinks
Drinks
Tea
Herbal
A company that sells a variety of teas might be listed in a less specific Yahoo! category:
Business and Economy
Shopping and Services
Food and Drinks
Directories are structured in this manner to make it easier for their end users to find sites.
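The category paths above are simple hierarchies, and a short sketch makes the structure explicit. The function names category_levels, format_hierarchy, and shared_depth are hypothetical helpers written for this example; they do not reflect how any directory stores its categories.

```python
ORGANIC = "Business and Economy > Shopping and Services > Food and Drinks > Drinks > Tea > Organic"
HERBAL = "Business and Economy > Shopping and Services > Food and Drinks > Drinks > Tea > Herbal"


def category_levels(path):
    """Split a category path into its levels, top level first."""
    return [part.strip() for part in path.split(">")]


def format_hierarchy(path, indent="  "):
    """Render a category path as an indented vertical hierarchy."""
    levels = category_levels(path)
    return "\n".join(indent * depth + name for depth, name in enumerate(levels))


def shared_depth(path_a, path_b):
    """Count how many levels two category paths have in common from the top."""
    depth = 0
    for a, b in zip(category_levels(path_a), category_levels(path_b)):
        if a != b:
            break
        depth += 1
    return depth


print(format_hierarchy(ORGANIC))
print(shared_depth(ORGANIC, HERBAL))
```

The two tea categories share every level except the last, which is what "drilling down" to a more specific category means.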
Web pages are generally displayed in directories with a Title and a Description. The Title and Description originate either from the directory editors themselves (upon reviewing a site) or are adapted from site owner submissions. It is important to remember that directories do not necessarily use the HTML <title> tag or the description contained in your site's meta tags.
Because most web directories tend to be small, directory results are often supplemented with additional results from a search engine partner. These supplemental results are commonly referred to as fall-through results. In fact, many people mistakenly believe that their sites are listed in a directory when they are actually appearing in the fall-through results from a search engine.
Directories usually differentiate their directory listings and their fall-through listings. If you perform a keyword search on a directory, the directory results might appear under a heading titled "Web Site Matches" or "Reviewed Web Sites." Sites that are listed in directories generally have a category displayed with them.
When a web directory fails to return any results, fall-through results from a search engine partner are usually presented as the primary results. Fall-through results are typically labeled "Web Page Matches" or something similar.
One way you can tell if your site is listed in a directory is to perform a keyword search on your company name or URL. If you see a "Powered by Google" or "Powered by Inktomi" label near your web site listing, then in all likelihood, your site is listed in the search engine fall-through results but not in the directory (see Figure 1.8).
Figure 1.8 The "Powered by Google" image indicates that the search results came from the search engine (Google), not the directory (Yahoo!).
Finally, directories tend to list web sites, not individual web pages. A web site is a collection of web pages that generally focuses on a specific topic. In other words, a web page is part of a web site. A directory is most likely to list only your domain name (http://www.companyname.com), not individual web pages. In contrast, search engines can list all individual web pages from an entire web site, not just a single home page.
If a particular web page (or set of pages) within a site contains unique, valuable information about a particular topic, that page can be listed in a different directory category. Glossaries and how-to tips are examples of content-rich sections of web sites that can receive additional directory listings.
Paid Submission Programs
A search engine or directory that uses a paid submission program charges a submission fee to process a request to be included in its index. Payment of the submission fee guarantees that your site will be reviewed within a specified period of time (generally 48 hours to 1 week).
If you want to have individual, content-rich web pages included in separate categories, in most cases, you must pay an additional submission fee for another review. Some directories accept content-rich pages without payment, but directory editors generally do not review these pages as quickly as the paid submissions.
The main advantage of paid submission is speed. You know your web site is being reviewed quickly, and, if the editors find your site acceptable, your site is added to the directory database quickly. Furthermore, after your site is added to the directory, the listing gives your site a significant popularity boost in the search engines. Yahoo! is an example of a directory that has a paid submission program.
How Directories Rank Web Sites
When you perform a keyword search in a directory, the search results are displayed in order of importance. Top directory listings are based on the following criteria:
The directory category
The web site's title
The web site's description
If the words you searched for appear in a category name, the category name appears at the top of a directory's search results. For example, if we searched for "organic teas" on Yahoo!, the category that has both the word "organic" and the word "tea" appears at the top of the search results, as shown in Figure 1.9.
Figure 1.9 Matching category results for the phrase "organic teas" in Yahoo!.
Immediately following the category listings are Sponsor Matches, which are pay-for-placement advertisements (see Figure 1.10).
Figure 1.10 Paid advertisements appearing in Yahoo! search results.
If the words in a search query do not appear in a directory category, the search results display sites that use these words in their titles and descriptions. Figure 1.11 shows the results of scrolling down the Yahoo! search results page.
Figure 1.11 Web site matches in Yahoo! for the phrase "organic teas."
Sites that have keywords in the category name, title (company name), and description are displayed at the top of the page. Figure 1.11 shows how Yahoo! provides access to some web sites that it feels are directly relevant to the search.
Sites that have keywords in their company name and description appear next, and sites that have only keywords in the description appear after that.
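The ordering just described can be expressed as a simple tiered sort. The listings below are made up (only TranquiliTeas comes from the earlier example), matching is done on plain lowercase substrings, and the tier numbers are an assumption of this sketch, not a documented directory formula.

```python
# Hypothetical directory listings.
LISTINGS = [
    {"title": "Leaf and Cup", "category": "Drinks > Tea",
     "description": "Loose-leaf organic tea"},
    {"title": "TranquiliTeas Organic Tea", "category": "Drinks > Tea > Organic",
     "description": "Organic tea from small farms"},
    {"title": "Organic Tea Shop", "category": "Drinks > Tea",
     "description": "Organic tea blends"},
]


def contains_all(text, words):
    """True if every query word appears somewhere in the text."""
    text = text.lower()
    return all(word in text for word in words)


def tier(listing, words):
    """Lower tiers are displayed first."""
    in_category = contains_all(listing["category"], words)
    in_title = contains_all(listing["title"], words)
    in_description = contains_all(listing["description"], words)
    if in_category and in_title and in_description:
        return 0
    if in_title and in_description:
        return 1
    if in_description:
        return 2
    return 3


def rank_listings(listings, query):
    """Sort listings by tier for the given query."""
    words = query.lower().split()
    return sorted(listings, key=lambda listing: tier(listing, words))


for listing in rank_listings(LISTINGS, "organic tea"):
    print(listing["title"])
```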
How Directory Editors Evaluate Web Sites
Directory editors look at a submitted web site to determine (a) whether unique, quality content is present on the web site, and (b) how this content is presented. Great content is the most important element of any web site, and that content needs to be delivered to your target audience in the most effective way possible. Figure 1.12 illustrates the directory submission process.
Figure 1.12 How directory editors evaluate a web site.
Directory editors are looking for particular characteristics before including a site in the directory. We discuss those characteristics next.
Directory editors do not want to place sites with identical information in the same category. Thus, before you submit your site to a directory, check out the other sites in your targeted category. Make sure your site contains unique information so that it will add value to that directory category.
You can point out any unique content to the directory editor using your description or the extra comments field in the submission form.
Most Appropriate Category
To select the most appropriate category (or categories) for your web site, type your selected keywords in the directory search box and study the results. If multiple categories appear, view many of the web sites listed under each category. Your site's actual content must accurately reflect the category or categories you wish to be listed under and be similar to the other sites listed in those categories.
You will probably be listed under the same categories your competitors are listed under, though it is important to understand that from directory editors' perspectives, your site belongs in a category that they deem appropriate, not necessarily in a category in which you believe your target audience is searching.
Editors want legitimate organizations and companies listed in their commercial categories. They do not want a small start-up company that will not be around next year. This would result in a dead link to a URL in the directory.
Having a virtual domain (http://www.companyname.com) is an indication that you are a legitimate organization or business. Having all your contact information (address, telephone number, fax number, and email address) readily available on your site is also an indication that you have a legitimate business. Directory editors will perform a WHOIS lookup (http://www.netsol.com/cgi-bin/whois/whois) to see if the information there matches the information you gave in your submission form.
If you have an e-commerce site, directory editors are looking for such items as secure credit card processing (for sites that accept credit cards), a return policy or a money-back guarantee (for sites that sell products), and a physical address, not a post office box.
The description you submit to directory editors should accurately reflect the content of your web site. Directory editors should be able to determine that the description is accurate just by viewing your home page.
For example, if you sell organic tea on your web site and you specialize in three types of tea (oolong, black, and green teas), those three specialties should be obvious to an editor just by viewing your home page. Furthermore, if directory editors navigate your site or perform a search on a site search engine, they should easily be able to find the pages that show the items used in your description.
Part 2 of this book, "How to Build Better Web Pages," details how to write effective directory descriptions.