Glossary of Terms
Although we maintain the currency of our glossary, some of the terms and definitions presented below are subject to expiration or change due to the evolving nature of the industry.
AdjacencyA property of the relationship between words in a search engine (or directory) query. Search engines often allow users to specify that words should be next to one another or somewhere near one another in the Web pages searched.
Agent Name DeliveryThe process of sending search engine spiders to a tailored page, yet directing your visitors to what you want them to see. This is done using server side includes (or other dynamic content techniques). SSI, for example, can be used to deliver different content to the client depending on the value of HTTP_USER_AGENT. Most normal browser software packages have a user agent string which starts with "Mozilla" (coined from Mosaic and Godzilla). Most search engine spiders have specific agent names, such as "Gulliver", "Infoseek sidewinder", "Lycos spider" and "Scooter".
By switching on the value of HTTP_USER_AGENT (a process known as agent detection), different pages can be presented at the same URL, so that normal visitors will never see the page submitted to search engines (and vice versa).
In practice this is somewhat simplistic. Some search engines pretend to be "plain Mozilla" browsers to prevent use of agent name delivery. Effective use of agent name delivery can be very difficult, and may not even work.
How do you spot agent name delivery at work? This is quite difficult, as the owners of Web pages using agent name delivery can control what you see! You may be able to guess that a page is using this technique if it appears to be indexed incorrectly or the title or description don't match the page you see, but this could also have been achieved by switching pages after the relevant search engine has indexed it. If you really want to see the search engines' tailored version of a page, write a program (e.g. a Perl script) to retrieve the URL with HTTP_USER_AGENT set to each of the strings used by the search engine spiders. If agent name delivery is in use, one or more of the retrieved pages will be different to the others.
See also hidden text and Bait-and-Switch and IP delivery.
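The check described above (retrieving one URL under several agent names and comparing the results) can be sketched as follows. This uses Python rather than the Perl the entry mentions, and it is only an illustration: the agent strings are shortened stand-ins (real spiders send longer, version-specific strings), and `fetch`, `group_by_content` and `looks_agent_dependent` are hypothetical helper names. A dynamic page may also differ between requests for unrelated reasons, so a difference is a hint rather than proof.

```python
import hashlib
import urllib.request

# Shortened, illustrative agent strings; the exact strings sent by real
# spiders are not given in this glossary and are assumptions here.
AGENT_STRINGS = ["Mozilla/4.0", "Scooter", "Gulliver", "ArchitextSpider"]


def fetch(url, user_agent):
    """Retrieve url while sending the given User-Agent request header."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read()


def group_by_content(url, agents=AGENT_STRINGS, fetcher=fetch):
    """Map a digest of each distinct page body to the agents that received it."""
    groups = {}
    for agent in agents:
        digest = hashlib.sha256(fetcher(url, agent)).hexdigest()
        groups.setdefault(digest, []).append(agent)
    return groups


def looks_agent_dependent(groups):
    """More than one distinct body suggests the page varies by agent name."""
    return len(groups) > 1
```

Calling `group_by_content("http://www.example.com/")` and passing the result to `looks_agent_dependent` reports whether any agent received a different page. The `fetcher` parameter exists so the comparison logic can be tried without network access.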
AltaVistaAltaVista is a popular search engine with one of the largest databases on the Web, indexing more than 140 million pages. Its main URL is http://www.altavista.com. Until 1998, this search engine provided the search facility for Yahoo. AltaVista indexes all the words in a Web page, and new pages are normally added to the database fairly quickly, within 72 hours. You are asked to submit just the main page of your site. The AltaVista spider will then explore your site and index a representative sample of the pages. Some problems with spamming have been noticed. The use of keyword Meta tags is penalized. AltaVista places various alternative options before its search results, including suggested questions (using the Ask Jeeves service) and RealNames links. Paid entries are beginning to appear at the start of the search results.
AOL NetFindThe default search engine for users of the AOL Internet service provider, and hence a busy site. Its URL is http://www.netfind.com. It is essentially the same engine as Excite.
AppletA small program, often written in Java, which usually runs in a Web browser, as part of a Web page. It is possible that the use of such a program may cause spiders and robots to stop indexing a page.
ArchitextSpiderThe name of the Excite search engine's spider.
Ask JeevesAn advanced Meta search engine that can be asked questions in English. This service is also in use at AltaVista. http://www.askjeeves.com.
Bait-and-SwitchThe provision of one page for a search engine or directory and a different page for other user agents at the same URL. Two methods exist: agent name delivery and IP delivery.
Boolean searchA search allowing the inclusion or exclusion of documents containing certain words through the use of operators such as AND, NOT and OR.
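As a rough illustration of how AND, OR and NOT include or exclude documents, the sketch below maps the three operators onto set operations over a tiny word index. The sample documents and the helper names `build_index` and `match` are made up for this example; real search engines use far more elaborate query syntax and indexes.

```python
def build_index(documents):
    """Map each lowercased word to the set of ids of documents containing it."""
    index = {}
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(doc_id)
    return index


def match(index, word):
    """Return the ids of documents containing word (empty set if none)."""
    return index.get(word.lower(), set())


# Made-up sample documents, keyed by document id.
DOCUMENTS = {
    1: "new homes for sale",
    2: "homes to rent",
    3: "cars for sale",
}
INDEX = build_index(DOCUMENTS)

homes_and_sale = match(INDEX, "homes") & match(INDEX, "sale")  # homes AND sale
homes_or_cars = match(INDEX, "homes") | match(INDEX, "cars")   # homes OR cars
sale_not_cars = match(INDEX, "sale") - match(INDEX, "cars")    # sale NOT cars
```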
Bridge PageSee Gateway Page.
CGICommon Gateway Interface - a standard interface between Web server software and other programs running on the same machine.
CGI ProgramStrictly, any program which handles its input and output data according to the CGI standard. In practice, CGI programs are used to handle forms and database queries on Web pages, and to produce non-static Web page content.
Channels, Channel listingsLists of links to selected (and usually popular) Web sites. The links are maintained by search engines and directories and are sorted into categories or channels. Sites are picked by a channel editor, often because of the site's already high-ranking status with the search engines. Some search engines and directories allow visitors to nominate sites for inclusion in their channels.
ClientA computer, program or process that makes requests for information from another computer, program or process. Web browsers are client programs. Search engine spiders are (or can be said to behave as) clients.
Click throughThe process of clicking on a link in a search engine output page to visit an indexed site. This is an important link in the process of receiving visitors to a site via search engines. Good ranking may be useless if visitors do not click on the link that leads to the indexed site. The secret here is to provide a good descriptive title and an accurate and interesting description.
CloakingThe hiding of page content. Normally implemented to prevent page-thieves from copying (stealing) optimized pages. See also Bait-and-Switch.
ClusteringThe listing of only one page from each Web site in a search engine or directory's list of search results. This avoids occupation of all the top results by a small number of Web sites and makes the list of results clearer and more useful to the user.
CommentThe HTML <!-- and --> markers are used to hide text from browsers. Some search engines ignore text between these markers, but others may index such text as if the comment markers were not there. Comments are often used to hide JavaScript code from non-compliant browsers, and sometimes (notably on Excite) to provide invisible keywords to some search engines.
Concept searchA search for documents related conceptually to a word, rather than specifically containing the word itself.
CrawlerSee Spider.
Dead LinkAn Internet link that does not lead to a page or site, typically because the server is down or because the page has moved or no longer exists. Most search engines have techniques for removing such pages from their listings automatically, but as the Internet continues to increase in size, it becomes more and more difficult for a search engine to check all the pages in the index regularly. Reporting of dead links helps to keep the indexes clean and accurate, and this can usually be done by submitting the dead link to the search engine.
De-listingThe removal of pages from a search engine's index. Removal can occur for various reasons, including unreliability of the machine that hosts a site, perceived attempts at spamdexing, or content that is irrelevant to the keywords under which the page was submitted.
DescriptionDescriptive text associated with a Web page and displayed, usually with the page title and URL, when the page appears in a list of pages generated by a search engine or directory as a result of a query. Some search engines take this description from the DESCRIPTION Meta tag - others generate their own from the text in the page. Directories often use text provided at registration.
Direct HitA system that monitors the search engine users' selections from search engine results, counting which results are clicked on most, and how long visitors spend at that site, so as to improve relevancy. Used by HotBot and as a plug-in to Apple's innovative Sherlock search system. See www.directhit.com.
DirectoryA server or a collection of servers dedicated to indexing Internet Web pages and returning lists of pages that match particular queries. Directories (also known as Indexes) are normally compiled manually, by user submission (such as at whatsnew.com), and often involve an editorial selection and/or categorization process (such as at LookSmart and Yahoo).
DogpileA Meta search engine. Found at http://www.dogpile.com.
DomainA sub-set of Internet addresses. Domains are hierarchical, and lower-level domains often refer to particular Web sites within a top-level domain. The most significant part of the address comes at the end - typical top-level domains are .com, .edu, .gov, .org (which sub-divide addresses into areas of use). There are also various geographic top-level domains (e.g. .ar, .ca, .fr, .ro etc.) referring to particular countries.
The relevance to search engine terminology is that Web sites which have their own domain name (e.g. http://www.thinkbuilddeploy.com) will often achieve better positioning than Web sites that exist as a sub-directory of another organization's domain (e.g. http://subdomain.earthlink.com/homepages/thinkbuilddeploy/).
Doorway PageSee Gateway Page.
Dynamic contentInformation on Web pages that changes or is changed automatically, e.g. based on database content or user information. Sometimes it's possible to spot that this technique is being used, e.g. if the URL ends with .asp, .cfm, .cgi or .shtml, although dynamic content can also be served from standard (normally static) .htm or .html pages. Search engines will currently index dynamic content in a similar fashion to static content, although they will not usually index URLs which contain the "?" character, found in dynamic server pages.
Entry PageSee Gateway Page.
EuroseekA search engine that concentrates on information relating to Europe. The URL is http://www.euroseek.com.
ExciteRegarded as one of the highest quality search engines, with an index of 55 million pages. It can be slow to index new sites. The URL is http://www.excite.com. Sites using frames must have a NOFRAMES section in order to be listed. Some spamming has been noticed. Excite previously ignored the DESCRIPTION Meta tag, but is now using this in its listings (although the contents do not affect relevancy, which is based mainly on the title and body text). The use of gateway pages and hidden text is allowed. Excite has an audio/video search facility which is a branded component of RealNetworks' RealPlayer.
Fake Copy ListingsSometimes a malicious company will steal a Web page or the entire contents of a Web site, re-publish at a different URL and register with one or more search engines. This can cause a loss of traffic from the original site if the search engines position the copy higher in the listings. If you find that someone has stolen your site in this way, write to the company concerned and ask them to remove the stolen content. Also contact the hosting service used by the company, any company that benefits from the theft and any search engine(s) concerned. If the thieves refuse to remove the material or ignore you, obtain legal advice. It is also well worth having printed evidence to support your claim that your copy of the material was there first, and that you have the copyright! See also Mirror Sites.
False DropA Web page retrieved from a search engine or directory that is not relevant to the query used. This could be for one of the following reasons: (1) The Web page contained the keywords entered, but used in the wrong context, with a different meaning or with a different inter-relationship than expected, (2) the Web page is an attempt at spamdexing, or (3) the search engine has a fault in its database or a bug in its query program.
Flash PageSee Splash Page.
Font and Background SpoofsVarious techniques used to place invisible text in a Web page, to improve positioning without affecting the appearance of the page. These are mostly based on setting the font and background colors to the same value (e.g. white). Most search engines now detect these tricks. A less common tactic is to create a table with a background color of, say, red, that also references a white ".GIF" background image, and to place white text on top of it. A search engine that rejects text in the same color as its background compares the white text with the red table color, finds no match and spiders the text accordingly, while browsers draw the white image behind the text. In this example, users would not see the text unless they selected the whole page ("ctrl + A").
FramesAn HTML technique for combining two or more separate HTML documents within a single Web browser screen. Compound interacting documents can be created to present Web pages in multiple windows or sub-windows.
A framed Web site often causes great problems for search engines, and may not be indexed correctly. Search engines will often index only the part of a framed site within the <NOFRAMES> section, so make sure that the <NOFRAMES> section includes relevant text that can be indexed by the spiders. If your site uses frames, consider providing a gateway page or adding navigational links within the framed pages. Submit the main page (the one containing the <FRAMESET> tag) to the search engines. If you use a gateway page, submit this separately.
Full-text indexAn index containing every word of every document cataloged, including stop words (defined below).
Fuzzy searchA search that will find matches even when words are only partially spelled or misspelled.
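A minimal sketch of fuzzy matching, assuming a small fixed vocabulary: Python's standard `difflib.get_close_matches` ranks candidate words by similarity to the query and keeps those above a cutoff. The vocabulary and the `fuzzy_lookup` wrapper are illustrative only, and search engines that offer fuzzy search may use quite different techniques.

```python
import difflib

# Made-up vocabulary standing in for the words held in an index.
VOCABULARY = ["search", "engine", "directory"]


def fuzzy_lookup(word, vocabulary=VOCABULARY, cutoff=0.6):
    """Return vocabulary words similar to word, best match first."""
    return difflib.get_close_matches(word.lower(), vocabulary, n=3, cutoff=cutoff)
```

Here `fuzzy_lookup("serch")` finds "search" despite the misspelling, and `fuzzy_lookup("direct")` finds "directory" from a partial spelling.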
Gateway PageA Web page submitted to a search engine (spider) to give the relevance-algorithm of that particular spider the data it needs, in the format that it needs it, in order to place a site at the proper level of relevance for the topic(s) in question. (This determination of topical relevance is called "placement".)
A gateway page may present information to the spider, but obscure it from a casual human viewer. The gateway page exists so as to allow a Web site to present one face to the spider, and another to human viewers. There are several reasons why one might want to do this. One is that the author may not want to publicly disclose placement tactics. Another is that the format easiest for a given spider to understand may not be the format that the author wishes to present to viewers for aesthetic purposes. Yet another is that the format best suited to one spider may differ from that which is best for another. By using gateway pages, you can present your site to each spider in the way that is known or thought to be best for that particular spider.
Also known as bridge pages, doorway pages, entry pages, portals or portal pages.
Go.comA portal partnership between Infoseek and Disney, with search capabilities based on the Infoseek index, at http://go.com/.
GooglePresently one of the most highly leveraged search databases available, feeding a majority of smaller search sites and directories such as Yahoo.
GoToA search engine, powered by Inktomi, which only returns one URL per domain in its search results. It operates on a "pay per click" model in which Web sites can pay to increase their relevancy. The URL is http://www.goto.com.
GulliverThe name of the Northern Light Search Engine's spider.
HeadingMany search engines give extra weight and importance to the text found inside HTML heading sections. It is generally considered good advice to use headings when designing Web pages and to place keywords inside headings.
Hidden TextText on a Web page that is visible to search engine spiders but not visible to human visitors. This is sometimes because the text has been set the same color as the background, because multiple TITLE tags have been used or because the text is an HTML comment. Hidden text is often used for spamdexing. Many search engines can now detect the use of hidden text, and often remove offending pages from their database or lower such pages' positioning.
Text can also be hidden using agent name delivery or IP delivery either to present different text to different search engine spiders or to hide the real HTML source from competitors.
HitIn the context of visitors to Web pages, a hit (or site hit) is a single access request made to the server for either a text file or a graphic. If, for example, a Web page contains ten buttons constructed from separate images, a single visit from someone using a Web browser with graphics switched on (a "page view") will involve eleven hits on the server: one for the page itself and ten for the images. Often the accesses will not get as far as one's server, because a local Internet service provider will have cached the page.
In the context of a search engine query, a hit is a measure of the number of Web pages matching a query returned by a search engine or directory.
HotbotOne of the largest search engines, indexing 110 million pages. It is powered by Inktomi. New submissions currently seem to take two weeks or longer to appear. The URL is http://www.hotbot.com.
HTMLHyperText Markup Language. The (main) language used to write Web pages. The browser translates the source code into a visual layout experience for the user.
HTTPHyperText Transfer Protocol. The (main) protocol used to communicate between Web servers and Web browsers (clients).
Image MapA set of hyperlinks attached to areas of an image. This may be defined within a Web page, or as an external file. If the image map is defined as an external file, search engines may have problems indexing your other pages, unless you duplicate the links as conventional text hyperlinks. If the image map is included within the Web page, the search engines should have no problem following the links, although it's good practice to provide text links too, to aid the visually impaired and those accessing the Web with graphics switched off or using text-only browsers.
Inbound LinkA hypertext link to a particular page from elsewhere, bringing traffic to that page. Inbound links are counted to produce a measure of the page popularity. Searches for the inbound links to a page can be made on AltaVista, Infoseek and Hotbot.
IndexThe searchable catalog of documents created by search engine software. Also called "catalog." Index is often used as a synonym for search engine. Index is commonly pluralized as "indices." However, Search Engine Watch instead uses the alternative plural form "indexes." See Directory.
InfindA Meta search engine. Found at http://www.infind.com.
InfoseekOne of the largest search engines. New sites are normally added very quickly, within one or two business days. The URL is http://www.infoseek.com. Infoseek is one of the few search engines to treat singular and plural forms as the same word. Very sensitive to page popularity in its positioning algorithm.
InktomiThe database used by some of the largest search engines, including Hotbot. Inktomi is also used by Yahoo when no matches are found in Yahoo's own database.
IP DeliverySimilar to agent name delivery, this technique presents different content depending on the IP address of the client. It is very difficult to view pages hidden using this technique, because the real page is only visible if your IP address is the same as (for example) a search engine's spider.
JavaA computer programming language whose programs can run on a number of different types of computers and/or operating systems. Used extensively to produce applets for Web pages.
JavaScriptAn interpreted computer language used for functional programming tasks within HTML Web pages. The scripts are normally interpreted (or run) on the client computer by the Web browser. Some search engines have been known to index these scripts, presumably erroneously.
KeywordA word that forms (part of) a search engine query.
Keyword DensityA percentage measure of how many times a keyword is repeated within the text of a page. For example, if a page contains 100 words and ten of those words are "house", then "house" is said to have a 10% keyword density. There are programs that will rate keyword density by single words or by groups of words, such as "new home for sale".
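The arithmetic can be sketched in a few lines. The single-word case follows the definition above directly. For phrases, the glossary gives no formula, so this sketch assumes that each non-overlapping occurrence of the phrase contributes all of its words to the count. The function name `keyword_density` and the tokenizing rule are likewise assumptions.

```python
import re


def keyword_density(text, phrase):
    """Return the percentage of words in text accounted for by phrase."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    target = phrase.lower().split()
    if not words or not target:
        return 0.0
    n = len(target)
    matches = 0
    i = 0
    # Count non-overlapping occurrences of the phrase, scanning left to right.
    while i <= len(words) - n:
        if words[i:i + n] == target:
            matches += 1
            i += n
        else:
            i += 1
    return 100.0 * matches * n / len(words)
```

For a 100-word page in which "house" appears ten times, `keyword_density(page, "house")` gives 10.0, matching the example in the definition.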
Keyword Domain NameThe use of keywords as part of the URL to a Web site. Positioning is improved on some search engines when keywords are reinforced in the URL.
Keyword PhraseA phrase that forms (part of) a search engine query.
Keyword PurchasingThe buying of search keywords from search engines, usually to control banner ad placement. Most major search engines insist that keyword purchasing is only used for banner ad placement, and doesn't influence search results.
Keyword searchA search for documents containing one or more words that are specified by a user.
Keyword StuffingThe repeating of keywords and keyword phrases in META tags or elsewhere.
Link PopularitySee page popularity.
Log FileA file maintained on a server in which details of all file accesses are stored. Analyzing log files can be a powerful way to find out about a Web site's visitors, where they come from and which queries are used to access a site.
LookSmartA medium-sized search directory. The URL is http://www.looksmart.com.
LycosOne of the largest search engines, Lycos appears to be moving towards becoming a directory and is using the Open Directory Project for some search results. It can be slow to index new sites. The Lycos spider ignores Meta tags in pages. Lycos can be found at http://www.lycos.com.
MetacrawlerA Meta search engine found at http://www.Metacrawler.com. Results from various search engines are summarized in an easy to read form.
MetafindA Meta search engine found at http://www.Metafind.com.
Meta SearchA "search of searches". A query is submitted to more than one search engine or directory, and results are reported from all the engines, possibly after removal of duplicates and sorting. Also the Meta search engine of the same name, found at http://www.Metasearch.com.
Meta Search EngineA server that passes queries on to many search engines and/or directories and then summarizes all the results. Ask Jeeves, Dogpile, Infind, Metacrawler, Metafind and Metasearch are examples of Meta search engines.
Meta tagA construct placed in the HTML header of a Web page, providing information that is not visible to browsers. The most common Meta tags (and those most relevant to search engines) are KEYWORDS and DESCRIPTION.
The KEYWORDS tag allows the author to emphasize the importance of certain words and phrases used within the page. Some search engines will respond to this information - others will ignore it.
The DESCRIPTION tag allows the author to control the text of the summary displayed when the page appears in the results of a search. Again, some search engines will ignore this information.
The HTTP-EQUIV Meta tag is used to issue HTTP commands, and is frequently used with the REFRESH value to reload the page, or load a different one, after a given number of seconds. Gateway pages sometimes use this technique to force browsers to a different page or site. Most search engines are wise to this, and will index the final page and/or reduce the ranking. Infoseek has a strong policy against this technique, and may penalize your site or even ban it altogether.
Other common Meta tags are GENERATOR (usually advertising the software used to generate the page) and AUTHOR (used to credit the author of the page, and often containing an e-mail address, home page URL and other information).
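To make the Meta tag forms above concrete, the sketch below reads them out of a page using Python's standard `html.parser` module. The sample page, its tag contents and the names `MetaTagReader` and `read_meta_tags` are invented for illustration; this is not a description of how any particular search engine processes Meta tags.

```python
from html.parser import HTMLParser

# Made-up sample page showing KEYWORDS, DESCRIPTION and HTTP-EQUIV REFRESH.
SAMPLE_PAGE = """<html><head>
<title>Example</title>
<META NAME="KEYWORDS" CONTENT="houses, new homes">
<META NAME="DESCRIPTION" CONTENT="New homes for sale.">
<META HTTP-EQUIV="REFRESH" CONTENT="5; URL=http://www.example.com/">
</head><body></body></html>"""


class MetaTagReader(HTMLParser):
    """Collect NAME and HTTP-EQUIV Meta tags found in a page."""

    def __init__(self):
        super().__init__()
        self.named = {}
        self.http_equiv = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lower case.
        if tag != "meta":
            return
        attributes = {key: (value or "") for key, value in attrs}
        content = attributes.get("content", "")
        if "name" in attributes:
            self.named[attributes["name"].lower()] = content
        elif "http-equiv" in attributes:
            self.http_equiv[attributes["http-equiv"].lower()] = content


def read_meta_tags(html):
    """Return (named tags, http-equiv tags) as two dictionaries."""
    reader = MetaTagReader()
    reader.feed(html)
    return reader.named, reader.http_equiv
```

With the sample page, `read_meta_tags(SAMPLE_PAGE)` yields the keywords and description under `"keywords"` and `"description"`, and the five-second redirect under `"refresh"`.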
Mining CompanyA large directory spread over many different URLs. Former name of About.com.
Mirror SitesMultiple copies of Web sites or Web pages, often on different servers. The process of registering these multiple copies with search engines is often treated as spamdexing, because it artificially increases the relevancy of the pages. Filters such as the Infoseek Sniffer now remove multiple mirrors from the indexes.
MisspellingsPeople quite often spell words incorrectly when using search engines. Pages that use common misspellings will quite often receive extra hits, so it is a useful technique to include common misspellings of words in alt tags, keywords, page names and titles. A similar effect occurs when spaces are missed out and words are accidentally joined together.
MultiCrawlA parallel search engine that offers users their own branded versions. http://www.multicrawl.com.
Multiple Domain NamesThe use of several extra domains to provide gateway pages or gateway sites to the main site.
Multiple Keyword TagsThe use of more than one Keywords META tag in order to try to increase the relevancy of the best keywords on a page. This is not recommended. It may be detected as a spamming technique, or all but one of the tags may simply be ignored.
Multiple TitlesIt used to be possible to repeat the HTML title tag in the header section of a page several times to improve search engine positioning. Most search engines now detect this trick and treat it as spamdexing.
NetfindSee AOL Netfind.
NewHooSee the Open Directory Project.
Northern LightA search engine with an additional "pay to access" special collection of business, health and consumer publication articles. The first search engine to ban Meta search engines from its database. The URL is http://www.northernlight.com.
Open Directory ProjectThe Open Directory Project (ODP) is a site directory run by volunteer editors. This is one of the great Internet success stories of 1999. The ODP is used by Lycos, Hotbot, AOL-Netfind, Netscape Netcenter, and the home base www.dmoz.org itself. Currently there are around 700,000 hand picked and selected sites in the directory. The first edition of the ODP was known as NewHoo (a play on Yahoo). Netscape provided server space for the NewHoo directory and it was collectively renamed The ODP. The URL is http://directory.mozilla.org.
Open TextA large business-only directory. The URL is http://www.opentext.com.
OptimizationChanges made to a Web page to improve the positioning of that page with one or more search engines. A means of helping potential customers or visitors to find a Web site. Optimization may involve design/layout changes, new text for the title-tags, Meta-tags, alt-attributes, headings, page content such as keyword density, and more.
Page PopularityA measure of the number and quality of links to a particular page (inbound links). Many search engines (and most noticeably Infoseek) are increasingly using this number as part of the positioning process. The number and quality of inbound links are as important as the optimization of page content.
Page ViewUsed in site statistics as a measure of pages viewed rather than server hits. Many server hits may be made to access a single page, causing many separate log file entries. Analysis software can determine that these server hits were generated when a visitor viewed a single page, and group them together to provide this more useful method of counting visitors. See also Hit and Unique Visitor.
Phrase searchA search for documents containing an exact sentence or phrase specified by a user.
PlacementSee Positioning.
Politeness WindowIn order not to overburden any particular server, most search engine spiders limit their access to each server. If your page is hosted on the same server as thousands of other pages, the spider may never get the time to reach (and index) your page. This can be a powerful argument for having your own dedicated server.
PortalSee Gateway page. Can also mean Portal Site.
Portal PageSee Gateway page.
Portal SiteA generic term for any site that provides an entry point to the Internet for a significant number of users. Examples are search engines, directories, built-in default browser or service provider home pages, sites hardwired to browser buttons, sites offering free home pages, e-mail or personalized news and any popular (or heavily advertised) sites that a significant number of people may bookmark or set as default pages.
PositioningThe process of ordering Web sites or Web pages by a search engine or a directory so that the most relevant sites appear first in the search results for a particular query.
Positioning TechniqueA method of modifying a Web page so that search engines (or a particular search engine) treat the page as more relevant to a particular query (or a set of queries).
PrecisionThe degree to which the documents listed by a search engine actually match the query. The greater the proportion of listed documents that match, the higher the precision. For example, if a search engine lists 80 documents for a query but only 20 of them contain the search words, then the precision is 25%.
Proximity searchA search where users specify that documents returned should have the keywords near each other.
QueryA word, a phrase or a group of words, possibly combined with other syntax used to pass instructions to a search engine or a directory in order to locate Web pages.
Query-By-ExampleA search where a user instructs an engine to find more documents that are similar to a particular document. Also called "find similar."
RankingSee Positioning.
RealNamesAn alternate Web site address system in operation at AltaVista. Brand names used in searches are mapped directly to the appropriate Web site, usually because the company owning the brand-name has paid a fee to RealNames. http://www.realnames.com
RecallRelated to precision, this is the degree to which a search engine returns all the matching documents in a collection. For example, if a collection holds 100 matching documents but a search engine finds and lists only 80 of them, it has a recall of 80%.
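The two measures, precision (see the Precision entry) and recall, can be written out as simple ratios. In this sketch documents are represented by made-up integer ids, and the function names are illustrative. Precision divides by the number of documents listed, recall by the number of matching documents that exist.

```python
def precision(listed, matching):
    """Fraction of the listed documents that actually match the query."""
    listed, matching = set(listed), set(matching)
    return len(listed & matching) / len(listed) if listed else 0.0


def recall(listed, matching):
    """Fraction of all matching documents that the engine listed."""
    listed, matching = set(listed), set(matching)
    return len(listed & matching) / len(matching) if matching else 0.0


# Precision example from the glossary: 80 listed, 20 of them match.
precision_example = precision(range(80), range(20))

# Recall example from the glossary: 100 matching documents, 80 of them listed.
recall_example = recall(range(80), range(100))
```

The two examples evaluate to 0.25 and 0.8, i.e. the 25% and 80% quoted in the definitions.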
ReferrerThe URL of the Web page from which a visitor came. The server's referrer log file will indicate this. If a visitor came directly from a search engine listing, the query used to find the page will usually be encoded in the referrer URL, making it easy to see which keywords are bringing visitors. The referrer information can also be accessed as document.referrer within JavaScript or via the HTTP_REFERER environment variable (accessible from scripting languages).
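As a sketch of how the query can be recovered from a referrer URL, the code below uses Python's standard `urllib.parse` module. The parameter names in `QUERY_PARAMETERS` and the example referrer URL are assumptions made for illustration; each search engine chooses its own URL layout, so a real log analyzer would need a list per engine.

```python
from urllib.parse import parse_qs, urlsplit

# Assumed names of the URL parameter that carries the query.
QUERY_PARAMETERS = ("q", "query", "qt")


def search_terms(referrer):
    """Return the search query encoded in a referrer URL, or None."""
    parameters = parse_qs(urlsplit(referrer).query)
    for name in QUERY_PARAMETERS:
        if name in parameters:
            return parameters[name][0]
    return None
```

For a hypothetical referrer such as `http://www.altavista.com/cgi-bin/query?pg=q&q=new+homes+for+sale`, `search_terms` returns the phrase "new homes for sale". A referrer with no query string gives `None`.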
Refresh TagSee the paragraph about HTTP-EQUIV under Meta tag.
RegistrationThe process of informing a search engine or directory that a new Web page or Web site should be indexed.
Relevancy AlgorithmThe method a search engine or directory uses to match the keywords in a query with the content of each Web page, so that the Web pages found can be ordered suitably in the query results. Each search engine or directory is likely to use a different algorithm, and to change or improve its algorithm from time to time.
Re-submissionRepeating the search engine registration process one or more times for the same page or site. Under certain circumstances, this is regarded with suspicion by the search engines, as it could indicate that someone is experimenting with spamming techniques.
The Infoseek and AltaVista search engines are particularly vulnerable to spamming because they list sites very quickly, and are thus easy to experiment with. Both engines de-list sites for repeated re-submission and Infoseek, for example, does not allow more than one submission of the same page in a 24 hour period. Occasional re-submission of changed pages is not normally a problem.
RobotAny browser program which follows hypertext links and accesses Web pages but is not directly under human control. Examples are the search engine spiders, the "harvesting" programs which extract e-mail addresses and other data from Web pages and various intelligent Web searching programs. A database of Web robots is maintained by Webcrawler.
robots.txtA text file stored in the top-level directory of a Web site to deny access by robots to certain pages or sub-directories of the site. Only robots that comply with the Robots Exclusion Standard will read and obey the commands in this file. Robots will read this file on each visit, so that pages or areas of sites can be made public or private at any time by changing the content of robots.txt before re-submitting to the search engines. The simple example below attempts to prevent all robots from visiting the /secret directory:
User-agent: *
Disallow: /secret
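The effect of these two directives can be checked with Python's standard `urllib.robotparser` module, which applies the same rules a compliant robot would. This is a sketch only: the host name, the page paths and the `allowed` helper are made up, and "Scooter" simply stands in for any robot name.

```python
from urllib.robotparser import RobotFileParser

# The example file from the glossary, one directive per line.
ROBOTS_TXT = """User-agent: *
Disallow: /secret
"""


def allowed(robots_txt, agent, url):
    """Report whether a compliant robot named agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)
```

Under these rules `allowed(ROBOTS_TXT, "Scooter", "http://www.example.com/secret/plan.html")` is false, while the same call for `http://www.example.com/index.html` is true. Robots that ignore the standard are not constrained by the file at all.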
ScooterThe name of the AltaVista search engine's spider.
Search EngineA server or a collection of servers dedicated to indexing Internet Web pages, storing the results and returning lists of pages that match particular queries. The indexes are normally generated using spiders. Some of the major search engines are Google, AltaVista, Excite, Hotbot, Infoseek, Lycos, Northern Light and Webcrawler. Note that Yahoo and Looksmart are directories, not search engines. However, the term Search Engine is often used to describe both directories and search engines.
SearchkingA smaller search engine that, unlike some of the major search engines, allows visitors to vote on the relevance of the pages returned by their queries, thus ranking sites based on the opinions of searchers. The URL is http://www.searchking.com.
Search TermSee Query.
ServerA computer, program or process that responds to requests for information from a client. On the Internet, all Web pages are held on servers. This includes those parts of the search engines and directories which are accessible from the Internet.
SidewinderThe name of the Infoseek search engine's spider.
SiphoningThe use of various means to steal another site's traffic. Techniques used include the wholesale copying of Web pages (with the copied page altered slightly to direct visitors to a different site, and then registered with the search engines) and the use of keywords or keyword phrases "belonging" to other organizations, companies or Web sites.
Site HitSee hit.
SkewingArtificially changing search engine results so that, for example, popular queries will return artificially created listings. Infoseek is currently experimenting with this technique, using a small group of reviewers to artificially force higher relevance for certain sites.
SlurpThe name of the spider used by Inktomi.
Snap!A large directory. The URL is http://www.snap.com.
SnifferThe name of the filter program used by the Infoseek search engine to prevent spamdexing. It detects multiple mirror pages, font and background spoofs, multiple title tags, keyword stuffing and possibly other types of spamdexing.
SpamdexingThe alteration or creation of a document with intent to deceive an electronic catalog or filing system. Any technique that improves the potential position of a site at the expense of the quality of the search engine's database, or in breach of its terms and conditions, can also be regarded as spamdexing. Also known as spamming or spoofing.
SpammingSee spamdexing. Spamming is also used more generally to refer to the sending of unsolicited bulk electronic mail, and the search engine use is derived from this term.
SpiderThe software that scans documents and adds them to an index by following links. Spider is often used as a synonym for search engine. Each search engine has its own unique spider algorithm(s), making it difficult to comply with one set of rules or standards for optimizing Web site pages and content. See also Robot.
SpideringThe automated process of surfing the Web, storing URLs and indexing keywords, links and text.
Typically, even the largest search engines cannot spider all of the pages on the net. This is due to the huge amount of data available, the speed at which the new data appears, the use of politeness windows and practical limits on the number of pages that can be visited in a given time. The search engines have to make compromises in order to visit as many sites as possible, and they do this in different ways. For example, some only index the home pages of each site, some only visit sites they're explicitly told about, and some make judgments about the importance of sites (from number and quality of inbound links) before "digging deeper" into the sub-pages of a site.
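The idea of following links under a limit on the number of pages visited can be sketched as below. This is a simplified illustration and not how any particular search engine works: the four-page site in PAGES is made up and held in memory in place of real HTTP requests, and the function name spider and its max_pages parameter are hypothetical.

```python
from collections import deque
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# A made-up site held in memory, standing in for real HTTP fetches.
PAGES = {
    "/": '<a href="/about">About</a> <a href="/products">Products</a>',
    "/about": '<a href="/">Home</a>',
    "/products": '<a href="/products/widget">Widget</a>',
    "/products/widget": "No links here.",
}


def spider(start, max_pages):
    """Visit pages breadth-first, stopping once max_pages have been visited."""
    visited = []
    seen = {start}
    queue = deque([start])
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.append(url)
        extractor = LinkExtractor()
        extractor.feed(PAGES.get(url, ""))
        for link in extractor.links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited


crawled = spider("/", max_pages=3)
print(crawled)
```

With a limit of three pages the deepest page, /products/widget, is discovered but never visited, which mirrors the compromise described above.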
Splash pageSimilar to a gateway page, but provides an initial display that must be viewed before a visitor reaches the main page. This usually acts as a kind of "opening title" sequence. It can be cumbersome for the visitor, and so can work against the overall purpose of the site.
SpoofingSee spamdexing.
SSIServer Side Includes. Used (for example) to add dynamically generated content to a Web page.
Stealth ScriptA CGI script that switches page content depending on who or what is accessing the page. See agent name delivery.
StemmingA function of some search engines and directories which allows results to be returned from some or all keywords based on the same stem as the keyword entered as a search term. For example, when stemming is switched on, a search for the word "dance" will return matches for any word whose stem is "danc-", matching the keywords dance, dancer and dancing.
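The "dance" example can be sketched with a deliberately naive suffix stripper. The suffix list and the function name naive_stem are assumptions made for illustration; real stemming algorithms (the Porter stemmer, for instance) use far more careful rules.

```python
# Suffixes are checked in this order; a deliberately tiny, hypothetical list.
SUFFIXES = ("ing", "er", "e")


def naive_stem(word):
    """Strip one known suffix, keeping at least three characters of stem."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


keywords = ["dance", "dancer", "dancing", "danger"]
query = "dance"

# A keyword matches when it shares the query's stem ("danc").
matches = [k for k in keywords if naive_stem(k) == naive_stem(query)]
print(matches)
```

Here "danger" reduces to "dang" and so does not match, while dance, dancer and dancing all share the stem "danc".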
Stop WordA word which is ignored in a query because the word is so commonly used that it makes no contribution to relevancy. Examples are common net words such as computer and Web, and general words like get, I, me, the, and you.
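As a rough sketch, stop word removal amounts to filtering the query against a fixed list. The list below simply reuses the examples given above, and the function name strip_stop_words is hypothetical; each search engine maintains its own list.

```python
# Stop words taken from the examples above; real lists are much longer.
STOP_WORDS = {"computer", "web", "get", "i", "me", "the", "you"}


def strip_stop_words(query):
    """Return the query's words, lower-cased, with stop words removed."""
    return [word for word in query.lower().split() if word not in STOP_WORDS]


terms = strip_stop_words("Get me the best Web hosting")
print(terms)
```

Only "best" and "hosting" survive, so only those two words would contribute to relevancy.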
TitleThe text contained between the start and end HTML tags of the same name. This text is associated with (but not displayed in) the Web page containing these tags, and is displayed in a special position (usually at the top of the window) by the Web browser.
Title text is important because it normally forms the link to the page from the search engine listings, and because the search engines pay special attention to the title text when indexing the page.
Don't confuse title text with heading text within the Web page, which often looks like the title. Usually this will be rendered either using the HTML heading tags or with the use of a large font size.
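The difference between title text and heading text can be seen by pulling both out of a page. The sketch below uses the HTML parser in Python's standard library; the sample page and the class name TitleAndHeading are made up for illustration.

```python
from html.parser import HTMLParser


class TitleAndHeading(HTMLParser):
    """Collects the text inside the <title> tag and inside <h1> tags."""

    def __init__(self):
        super().__init__()
        self.current = None
        self.text = {"title": "", "h1": ""}

    def handle_starttag(self, tag, attrs):
        if tag in self.text:
            self.current = tag

    def handle_endtag(self, tag):
        if tag == self.current:
            self.current = None

    def handle_data(self, data):
        if self.current:
            self.text[self.current] += data


# A made-up page whose heading looks like a title but is not one.
page = (
    "<html><head><title>Acme Widgets - Home</title></head>"
    "<body><h1>Welcome to Acme</h1><p>Our products.</p></body></html>"
)

parser = TitleAndHeading()
parser.feed(page)
print(parser.text["title"])
print(parser.text["h1"])
```

The first value is what would normally appear as the link in a search engine listing; the second is only visible within the page itself.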
ThesaurusA list of synonyms a search engine can use to find matches for particular words if the words themselves don't appear in documents.
TrafficThe visitors to a Web page or Web site. Also refers to the number of visitors, hits, accesses etc. over a given period.
Unique VisitorA real visitor to a Web site. Web servers record the IP address of each visitor, and this is used to determine the number of real people who have visited a Web site. If, for example, someone visits twenty pages within a Web site, the server will count only one unique visitor because the page accesses are all associated with the same IP address.
See also hit and page view.
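The counting described above can be sketched as follows. The log entries are made up (the IP addresses come from ranges reserved for documentation), and a real server log would need parsing first.

```python
# Made-up access log entries as (IP address, requested page).
log = [
    ("192.0.2.10", "/"),
    ("192.0.2.10", "/about"),
    ("192.0.2.10", "/products"),
    ("198.51.100.7", "/"),
]

# Every entry is a page access, but each distinct IP address counts once.
page_accesses = len(log)
unique_visitors = len({ip for ip, _page in log})

print(page_accesses)
print(unique_visitors)
```

Four page accesses come from only two distinct IP addresses, so two unique visitors are counted.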
URLUniform Resource Locator (sometimes expanded as Universal Resource Locator). An address that can specify any Internet resource uniquely. The beginning of the address indicates the type of resource, e.g. http: for Web pages, ftp: for file transfers, telnet: for computer login sessions or mailto: for e-mail addresses.
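Reading the resource type from the beginning of an address can be sketched with the URL parser in Python's standard library. The example.com addresses below are made up for illustration.

```python
from urllib.parse import urlsplit

# Hypothetical addresses, one for each resource type mentioned above.
urls = [
    "http://www.example.com/index.html",
    "ftp://ftp.example.com/pub/file.txt",
    "telnet://host.example.com",
    "mailto:someone@example.com",
]

# The scheme is the part of the address before the first colon.
schemes = [urlsplit(url).scheme for url in urls]
print(schemes)
```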
URL SubmissionSee Registration.
Virtual DomainA domain hosted by a virtual server account.
Virtual ServerAn account on a hosting company's server usually linked to its own domain. This provides an inexpensive way to run a Web site with its own top level domain, and is usually indistinguishable from having a separate physical server, except that the virtual server may share an IP address with other virtual servers on the same machine. A virtual server account is fine for most uses, but will often be slower to respond than a physically separate server, and physical access to the machine will seldom be allowed. The cost of a virtual server account is a small fraction of that needed to run a real server, mainly because of the expense of the dedicated line needed to connect the server continuously to the rest of the net.
Web CopywritingEffective content writing is an essential asset to search engine optimization. There are many factors in writing for the Web, including keyword density: repeating a keyword too many times can trigger a spam filter, but insufficient keyword density can cause your positioning to suffer.
WebCrawlerOne of the largest search engines. The URL is http://www.Webcrawler.com.
XMLExtensible Markup Language. A language that supports more efficient data delivery over the Web. NOTE: XML does nothing by itself; it must be processed using 'parser' software or XSL.
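As a small illustration of XML needing a parser before it is useful, the sketch below reads a made-up XML document with the parser in Python's standard library. The element and attribute names (glossary, term, name) are hypothetical.

```python
import xml.etree.ElementTree as ET

# A made-up XML document; on its own it is just text.
document = (
    "<glossary>"
    "<term name='Spider'>Software that follows links.</term>"
    "<term name='Slurp'>The spider used by Inktomi.</term>"
    "</glossary>"
)

# The parser turns the text into a tree that a program can query.
root = ET.fromstring(document)
names = [term.get("name") for term in root.findall("term")]
print(names)
```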
XSLExtensible Stylesheet Language. An XML style sheet language supported by the latest Web browsers.