
Google supplemental index

August 21, 2009 by  
Filed under Google

The supplemental index is a secondary index for lower-ranking pages, and it contributes nothing to your site’s visibility on Google. Pages found in the supplemental index tend to be crawled less often and will never be assigned PageRank. As a result, these pages tend to appear lower in organic search results. There are many reasons why pages lose rank and fall into the supplemental index. Here are the most common:

  • Low quality content or mainly images (1 line posts).
  • Duplicate content from other sites. Beware of blogs as they create duplicate pages for archives etc.
  • Identical page title or description.
  • Orphaned pages – no inbound links, or only poorly ranked ones.
  • Lack of external links.
  • Old and stale content.
  • Excessive reciprocal links, linking to spammy neighbourhoods.
  • The number of query string parameters exceeds what Google’s algorithm tolerates.
  • Too many pages; publish fewer but meatier pages.
  • Canonical issues causing PageRank to split (see below).

A Google search gives results from its “normal” index first, and then results from its supplemental index near the end. Sometimes, if not enough pages are available from the normal index, Google will include results from the supplemental index.

 

Calculating your supplemental index ratio

The best way to find out how many of your web pages are in the supplemental index is to work out the ratio as follows:

Total Pages Indexed = site:www.yoursite.com
Pages in the Main Index = site:www.yoursite.com -inallurl:www.yoursite.com
Pages in Supplemental Index = Total Pages Indexed – Pages in the Main Index
Supplemental Ratio = (Pages in Supplemental Index / Total Pages Indexed) x 100
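
The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name and the sample counts are made up, and the two page counts are assumed to have been read by hand from the two site: searches.

```python
def supplemental_ratio(total_indexed: int, main_index: int) -> float:
    """Return the percentage of indexed pages that sit in the supplemental index.

    total_indexed: count reported by the plain site: query
    main_index:    count reported by the main-index query
    """
    if total_indexed <= 0:
        raise ValueError("total_indexed must be greater than zero")
    if not 0 <= main_index <= total_indexed:
        raise ValueError("main_index must be between 0 and total_indexed")

    # Pages in Supplemental Index = Total Pages Indexed - Pages in the Main Index
    supplemental = total_indexed - main_index
    return supplemental / total_indexed * 100


if __name__ == "__main__":
    # Hypothetical counts: 200 pages indexed, 150 of them in the main index.
    print(f"{supplemental_ratio(200, 150):.1f}% supplemental")
```

A lower percentage means more of your pages are in the main index.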

 

How to improve your supplemental index ratio

  • Good quality relevant inbound links.
  • Use a robots.txt file to stop search engines spidering irrelevant areas of your site.
  • Remove or rewrite duplicate content, whether duplicated within your own site or cut and pasted from other sites.
  • Create a Google sitemap.
  • Lean and focused pages with rich content.
  • Do not over-seed your site with keywords.
  • Unique quality meta description tags on each page. Otherwise Google will use the top text of your page’s content, and that may be navigation.
  • Create original, compelling content, market it, and earn the links you deserve.

 

Canonical issues causing PageRank to split

The sum of all PageRank from inbound links to your site is split between the total pages in your site. Thus a bigger page count means a lower average PageRank per page (depending on the site structure). A page whose PageRank falls below a minimum threshold ends up in the supplemental index. By reducing the number of pages you increase the average PageRank per URL, which results in supplemental pages going back into the main index.
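
As a rough illustration of that averaging argument, here is a minimal Python sketch. It follows the simplified model described in this post, not Google’s actual algorithm, and the function name and all numbers are hypothetical.

```python
def average_pagerank_per_page(total_inbound_pagerank: float, page_count: int) -> float:
    """Simplified model from this post: inbound PageRank shared evenly across pages."""
    if page_count <= 0:
        raise ValueError("page_count must be greater than zero")
    return total_inbound_pagerank / page_count


if __name__ == "__main__":
    inbound = 100.0  # hypothetical total, in arbitrary units
    for pages in (50, 25):
        average = average_pagerank_per_page(inbound, pages)
        print(f"{pages} pages: average of {average} per page")
```

Halving the page count doubles the average in this model. Real sites do not share PageRank evenly, which is why the site structure matters.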

Google search tips

August 21, 2009 by  
Filed under Google

A bunch of useful Google tips to improve your searches and let you do a lot more than just basic searching with Google.
Reference: The Google Story by David A. Vise

(1) Google can be your phone book – Type a business name, city and state to get a number, or type the number to reverse search it.

(2) A calculator – Type in a maths problem, in words or numbers, to compute it.

(3) The more search words the better – Extra words narrow your search.

(4) Use quotation marks for precision searches – Ideal for finding exact occurrences of a phrase.

(5) Google dictionary – Type define followed by the English word.

(6) Capitalization doesn’t matter – Google also ignores common words like the, and, is, of.

(7) Forget pluralism – Typing dance, dances or dancing returns the same results.

(8) Get the picture – To search for images, click on the images link first.

(9) Browse bookshelves online – Search for a topic at print.google.com and you will see information from actual books scanned and indexed by Google.

(10) Dial googl when you’re on the go – Get phone numbers, directions, movie times, stock quotes and more delivered to your mobile. Send a text message with the query to 46645 and the answers will be texted back to you.

(11) Weatherman – Type weather followed by your zip code or a city name to get a weather forecast.

(12) Get a stock quote – Type a stock ticker symbol into the search box

(13) Translate into other languages – Hit the language tools link on the right side of the search box

(14) Take a magic ~ ride – Type a tilde before a word and Google will search for both the term and its synonyms.

(15) PG-rated results – Select preferences and change the SafeSearch settings.

(16) More results per page – Adjusted under the preferences option

(17) Google search guide – google.com/help/cheatsheet.html

(18) Newscaster – Reachable via the news link or news.google.com

(19) Become a scholar – Tap into thousands of academic journals at scholar.google.com

(20) Focus your search – Restrict searches to a specific domain, site or page by putting site:domain after your search phrase, e.g. pirates site:disney.com only returns “pirates” results from Disney sites. To eliminate specific sites from a search, add a minus sign before site, e.g. vista -site:microsoft.com filters out results from the Microsoft sites.

(21) Search Google cache – To find sites which aren’t responding or are no longer available, you can search Google’s cache (its copy of the internet). Type cache: in front of your search string or select the cache link at the bottom of the results page.
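
The site restriction in tip (20) can also be put together in code. The Python sketch below is only an illustration: the function names are made up, the operator is written with a colon (site:), and the URL it builds assumes the usual www.google.com/search?q= form.

```python
from typing import Optional
from urllib.parse import urlencode


def build_query(phrase: str,
                include_site: Optional[str] = None,
                exclude_site: Optional[str] = None) -> str:
    """Append site: and -site: operators to a search phrase."""
    parts = [phrase]
    if include_site:
        parts.append(f"site:{include_site}")
    if exclude_site:
        parts.append(f"-site:{exclude_site}")
    return " ".join(parts)


def search_url(query: str) -> str:
    """Turn a query string into a Google search URL (assumed URL form)."""
    return "https://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    print(build_query("pirates", include_site="disney.com"))
    print(build_query("vista", exclude_site="microsoft.com"))
    print(search_url(build_query("pirates", include_site="disney.com")))
```

The first two lines printed are the same strings you would type into the search box by hand.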

Designing a robots.txt file

August 9, 2009 by  
Filed under Google

Search engines build an image of the internet by crawling it with a special program called a spider. This spider, sometimes called a bot, retrieves a map of the internet and all its web pages and files. This map is then used as data to compile results for the queries we type into search engines like Google and Yahoo.

The robots.txt file sits in the root of your web site and tells these search engine bots what NOT to spider. Areas you may not want them to spider, and hence not show up in search queries, are sensitive pages, areas with no suitable content, images and pages of duplicate content. Indexing the same content twice (for example in monthly archives, category folders and on your front page) risks the bots marking it as duplicate content. Duplicate content usually ends up in a search engine’s supplemental index as opposed to its main index.

A robots.txt file can be created in Notepad and contains statements like the ones below:

To stop all bots indexing your site (indicated by “/”)
User-agent: *
Disallow: /

To block Google’s image bot from scanning the site
User-agent: Googlebot-Image
Disallow: /

To prevent all bots from indexing certain directories
User-agent: *
Disallow: /cgi-bin/
Disallow: /privatedir/
Disallow: /tutorials/blank.htm
Disallow: /file.html

Disallow all bots from indexing except Alexa
User-agent: *
Disallow: /

User-agent: ia_archiver
Disallow:

Disallow all bots except Googlebot. This uses the Allow directive, an extension that Googlebot understands but some other bots may not
User-agent: *
Disallow: /

User-agent: Googlebot
Allow: /

This statement tells the bots where your sitemap is
Sitemap: http://www.yoursite.com/sitemap.xml

Do not leave a robots.txt file empty as some bots will not index your site.
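
Before uploading a robots.txt file it can help to check how a bot would read it. The sketch below uses Python’s standard urllib.robotparser module against two of the rule sets shown above. The is_allowed helper and the bot name SomeOtherBot are made up for illustration, and real search engine bots may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Rules copied from the "certain directories" example above.
DIRECTORY_RULES = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /privatedir/
Disallow: /tutorials/blank.htm
Disallow: /file.html
"""

# Rules copied from the "all bots except Googlebot" example above.
GOOGLEBOT_ONLY_RULES = """\
User-agent: *
Disallow: /

User-agent: Googlebot
Allow: /
"""


def is_allowed(rules: str, user_agent: str, url: str) -> bool:
    """Parse robots.txt text and report whether user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    site = "http://www.yoursite.com"
    print(is_allowed(DIRECTORY_RULES, "SomeOtherBot", site + "/cgi-bin/script.pl"))
    print(is_allowed(DIRECTORY_RULES, "SomeOtherBot", site + "/index.html"))
    print(is_allowed(GOOGLEBOT_ONLY_RULES, "Googlebot", site + "/index.html"))
    print(is_allowed(GOOGLEBOT_ONLY_RULES, "SomeOtherBot", site + "/index.html"))
```

Checking the rules as text like this avoids having to publish the file first.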

High Google ranking

July 20, 2009 by  
Filed under Google

The German company Sistrix analyzed the web page elements of top ranked pages in Google to find out which elements lead to high Google rankings. They analyzed 10,000 random keywords, and for every keyword, they analyzed the top 100 Google search results.

Sistrix analyzed the influence of the following web page elements: web page title, web page body, headline tags, bold and strong tags, image file names, image alt text, domain name, path, parameters, file size, inbound links and PageRank. Keywords in the title tag seem to be important for high rankings on Google. It is also important that the targeted keywords are mentioned in the body tag, although the title tag seems to be more important.

Keywords in H2-H6 headline tags seem to have an influence on the rankings, while keywords in H1 headline tags don’t seem to have an effect.

Using keywords in bold or strong tags seems to have a slight effect on the top rankings. Web pages that used the keywords in image file names often had higher rankings. The same seems to be true for keywords in image alt attributes.

Websites that use the targeted keyword in the domain name often had high rankings. It might be that these sites get many inbound links with the domain name as the link text.

Keywords in the file path don’t seem to have a positive effect on the Google rankings of the analyzed web sites. Web pages that use very few parameters in the URL (?id=123, etc.) or no parameters at all tend to get higher rankings than URLs that contain many parameters.

The file size doesn’t seem to influence the ranking of a web page on Google, although smaller pages tend to have slightly higher rankings.

It’s no surprise that the number of inbound links and the PageRank had a large influence on the page rankings on Google. The top result on Google usually has about four times as many links as result number 11.