How Search Engines Rank Webpages
Search for anything using your favorite crawler-based search engine. Nearly
instantly, the search engine will sort through the millions of pages it knows
about and present you with ones that match your topic. The matches will even be
ranked, so that the most relevant ones come first.

Of course, the search engines don't always get it right. Non-relevant pages make
it through, and sometimes it may take a little more digging to find what you are
looking for. But, by and large, search engines do an amazing job.

As WebCrawler founder Brian Pinkerton puts it, "Imagine walking up to a librarian
and saying, 'travel.' They're going to look at you with a blank face."

OK -- a librarian's not really going to stare at you with a vacant expression.
Instead, they're going to ask you questions to better understand what you are
looking for.

Unfortunately, search engines don't have the ability to ask a few questions to
focus your search, as a librarian can. They also can't rely on judgment and past
experience to rank web pages, in the way humans can.

So, how do crawler-based search engines go about determining relevancy, when
confronted with hundreds of millions of web pages to sort through? They follow a
set of rules, known as an algorithm. Exactly how a particular search engine's
algorithm works is a closely-kept trade secret. However, all major search engines
follow the general rules below.

Location, Location, Location...and Frequency
One of the main
rules in a ranking algorithm involves the location and frequency of keywords on
a web page. Call it the location/frequency method, for short.

Remember the librarian mentioned above? They need to find books to match your
request of "travel," so it makes sense that they first look at books with travel
in the title. Search engines operate the same way. Pages with the search terms
appearing in the HTML title tag are often assumed to be more relevant than others
to the topic.

Search engines will also check to see if the search keywords appear near the top
of a web page, such as in the headline or in the first few paragraphs of text.
They assume that any page relevant to the topic will mention those words right
from the beginning.

Frequency is the other major factor in how search engines determine relevancy. A
search engine will analyze how often keywords appear in relation to other words
in a web page. Pages where the keywords appear more frequently are often deemed
more relevant than other web pages.
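
The location/frequency method described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the weights, and the cutoff of 50 words for "near the top" are invented for the example, and no real search engine's algorithm is being reproduced here.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def location_frequency_score(title, body, query,
                             title_weight=3.0, top_weight=2.0,
                             frequency_weight=100.0, top_words=50):
    """Score one page for a query using keyword location and frequency.

    All weights are hypothetical illustration values, not real engine settings.
    """
    terms = set(tokenize(query))
    title_words = tokenize(title)
    body_words = tokenize(body)
    if not terms or not body_words:
        return 0.0

    score = 0.0
    for term in terms:
        # Location: the term appears in the page title.
        if term in title_words:
            score += title_weight
        # Location: the term appears near the top of the page.
        if term in body_words[:top_words]:
            score += top_weight
        # Frequency: occurrences relative to all words on the page.
        score += frequency_weight * body_words.count(term) / len(body_words)
    return score


def rank_pages(pages, query):
    """Return page names sorted from highest to lowest score."""
    scored = [(location_frequency_score(page["title"], page["body"], query), name)
              for name, page in pages.items()]
    return [name for _, name in sorted(scored, reverse=True)]


# Two made-up pages for the librarian's "travel" request.
pages = {
    "travel-guide": {"title": "Travel guide",
                     "body": "Travel tips for cheap travel in Europe."},
    "cooking-blog": {"title": "Cooking blog",
                     "body": "Recipes we found during travel."},
}
print(rank_pages(pages, "travel"))  # ['travel-guide', 'cooking-blog']
```

The first page wins because "travel" appears in its title and makes up a larger share of its words. Changing the weights changes the ordering, which hints at why different engines can return different results for the same search.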
Spice In The Recipe
Now it's time to qualify the location/frequency method described above. All the
major search engines follow it to some degree, in the same way cooks may follow a
standard chili recipe. But cooks like to add their own secret ingredients. In the
same way, search engines add spice to the location/frequency method. Nobody does
it exactly the same, which is one reason why the same search on different search
engines produces different results.

To begin with, some search engines index more web pages than others. Some search
engines also index web pages more often than others. The result is that no two
search engines have exactly the same collection of web pages to search through.
That naturally produces differences when comparing their results.

Meta tags are what many web designers mistakenly assume are the "secret" to
propelling their web pages to the top of the rankings. However, not all search
engines read meta tags. In addition, those that do read meta tags may choose to
weight them differently. Overall, meta tags can be part of the ranking recipe,
but they are not necessarily the secret ingredient.

Search engines may also penalize pages or exclude them from the index, if they
detect search engine "spamming." An example is when a word is repeated hundreds
of times on a page, to increase the frequency and propel the page higher in the
listings. Search engines watch for common spamming methods in a variety of ways,
including following up on complaints from their users.

Off The Page Factors
Crawler-based search engines have plenty
of experience now with webmasters who constantly rewrite their web pages in an
attempt to gain better rankings. Some sophisticated webmasters may even go to
great lengths to "reverse engineer" the location/frequency systems used by a
particular search engine. Because of this, all major search engines now also make
use of "off the page" ranking criteria.

Off the page factors are those that webmasters cannot easily influence. Chief
among these is link analysis. By analyzing how pages link to each other, a search
engine can determine both what a page is about and whether that page is deemed to
be "important" and thus deserving of a ranking boost. In addition, sophisticated
techniques are used to screen out attempts by webmasters to build "artificial"
links designed to boost their rankings.

Another off the
page factor is clickthrough measurement. In short, this means that a search engine
may watch what results someone selects for a particular search, then eventually
drop high-ranking pages that aren't attracting clicks, while promoting lower-ranking
pages that do pull in visitors. As with link analysis, systems are used to compensate
for artificial clicks generated by eager webmasters.
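
To make the link analysis idea from this section more concrete, here is a minimal Python sketch. The text does not say which method any particular engine uses, so the example assumes a simplified PageRank-style iteration: each page repeatedly passes a share of its score to the pages it links to. The function name, the damping value of 0.85, the iteration count, and the sample link graph are all assumptions made for illustration. Ignoring self-links stands in, very crudely, for screening out "artificial" links.

```python
def link_scores(links, damping=0.85, iterations=50):
    """Estimate page importance from a dict of page -> list of linked pages.

    A simplified PageRank-style iteration; parameters are illustrative only.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    if n == 0:
        return {}
    scores = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_scores = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            # Ignore self-links and count each linked page only once.
            targets = set(links.get(page, [])) - {page}
            if targets:
                share = damping * scores[page] / len(targets)
                for target in targets:
                    new_scores[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * scores[page] / n
                for other in pages:
                    new_scores[other] += share
        scores = new_scores
    return scores


# A made-up link graph: "spam" links to itself, but nobody else links to it.
links = {
    "home": ["about", "news"],
    "about": ["home"],
    "news": ["home"],
    "spam": ["spam", "home"],
}
scores = link_scores(links)
for page in sorted(scores, key=scores.get, reverse=True):
    print(page, round(scores[page], 3))
```

In this toy graph "home" ends up with the highest score because three other pages link to it, while "spam" stays at the baseline because its only inbound link is the one it gives itself.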