Google does not index all of my pages. Why?

Although Google indexes more than 8 billion web pages, it cannot guarantee that it will crawl every page on a particular site. However, Google is always working to increase the number of pages crawled and hopes to include more pages in the index soon. For more information about how Google finds and includes pages in the index, please read Google’s Technology Overview.

If your site’s internal link structure does not provide a path to all your pages, the Google robot may not see all the pages on your site. Google follows links from one page to the next, so pages that are not linked to by others may be missed.
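One way to check for pages that a link-following crawler could miss is to walk your own link structure from the home page and list anything that is never reached. The sketch below is only an illustration: the site_links map and the function names are hypothetical, and it says nothing about how Googlebot itself is implemented.

```python
from collections import deque


def reachable_pages(links, start):
    """Return every page reachable from `start` by following links."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


def orphan_pages(links, start):
    """Return pages that exist on the site but cannot be reached from `start`."""
    all_pages = set(links)
    for targets in links.values():
        all_pages.update(targets)
    return sorted(all_pages - reachable_pages(links, start))


# Hypothetical site: each key is a page, each value lists the pages it links to.
site_links = {
    "/": ["/about", "/products"],
    "/about": ["/"],
    "/products": ["/products/widget"],
    "/products/widget": [],
    "/old-promo": ["/"],  # links out, but nothing links to it
}

print(orphan_pages(site_links, "/"))
```

Any page this reports has no inbound path from the start page, so adding a link to it from a page that is already reachable gives a crawler a route to it.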

Basically, you can’t buy your way into the actual search results. You can, however, purchase advertising adjacent to Google results.

Excerpt taken from Google Webmaster Info

My web pages have never been included in the Google index

Google is a mechanized search engine, which employs robots known as ‘spiders’ to crawl the web on a monthly basis and find sites for inclusion in the Google index.

Reasons your site may not be included:

1) Your pages are dynamically generated. Google is able to index dynamically generated pages. However, because the web crawler can easily overwhelm and crash sites serving dynamic content, Google limits the number of dynamic pages it indexes.

2) You employ doorway pages. Google does not encourage the use of doorway pages. Google wants to point users to content pages, not to doorways or splash screens.

3) Your page uses frames. Google supports frames to the extent that it can. Frames tend to cause problems with search engines, bookmarks, emailing links and so on, because frames don’t fit the conceptual model of the web (every page corresponds to a single URL). If a user’s query matches the site as a whole, Google returns the frame set. If a user’s query matches an individual page on the site, Google returns that page. That individual page is not displayed in a frame — because there may be no frame set corresponding to that page.

If you are concerned with the description of your site as seen by search engines, please read “Search Engines and Frames”. It describes the use of the ‘NoFrames’ tag, which is used to provide alternative content. If, instead of providing alternative content, you use wording such as “This site requires the use of frames” or “Upgrade your browser”, then you are excluding both search engines and people who use browsers with frames turned off. (For example, audio web browsers, such as those used in automobiles and by the visually impaired, typically do not deal with frames, which are a visual mechanism.)
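As a rough illustration of the point about alternative content, the following sketch pulls the text out of a page’s NoFrames section and flags the unhelpful wording mentioned above. The function names, the phrase list and the sample pages are all hypothetical, and the regular expressions are a simplification rather than a full HTML parser.

```python
import re

# Wording that tells visitors nothing about the site (taken from the text above).
UNHELPFUL_PHRASES = ("requires the use of frames", "upgrade your browser")


def noframes_text(html):
    """Return the visible text inside <noframes>, or None if there is none."""
    match = re.search(r"<noframes\b[^>]*>(.*?)</noframes>", html, re.I | re.S)
    if match is None:
        return None
    text = re.sub(r"<[^>]+>", " ", match.group(1))  # drop nested tags
    return " ".join(text.split())  # collapse whitespace


def has_useful_noframes(html):
    """True if the page offers real alternative content for non-frame clients."""
    text = noframes_text(html)
    if not text:
        return False
    lowered = text.lower()
    return not any(phrase in lowered for phrase in UNHELPFUL_PHRASES)


# Hypothetical frameset pages.
good_page = """<frameset cols="20%,80%">
<frame src="nav.html"><frame src="main.html">
<noframes><body><p>Acme Widgets sells hand-made widgets.
<a href="main.html">Browse the catalogue</a>.</p></body></noframes>
</frameset>"""

bad_page = """<frameset cols="20%,80%">
<frame src="nav.html"><frame src="main.html">
<noframes>This site requires the use of frames.</noframes>
</frameset>"""

print(has_useful_noframes(good_page), has_useful_noframes(bad_page))
```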

Excerpt taken from Google Webmaster Info

Why does my Google PageRank keep changing?

Google updates its index about once a month. Each time Google updates the database of web pages, the index invariably shifts: new sites are found, some sites are lost, and sites’ rankings may change.

Your rank will naturally be affected by changes in the ranking of other sites. You can be assured that no one at Google has hand-adjusted the results to boost the ranking of a site. Google’s order of results is automatically determined by several factors, including Google’s PageRank algorithm. Please check out Google’s Technology Overview page for more information on how this works.

You may want to check whether the number of other sites linking to your URL has changed. This is the single biggest factor in determining what sites are indexed by Google, as Google finds most pages when its robots crawl the web and jump from page to page via hyperlinks.

Excerpt taken from Google Webmaster Info

How does Google rank pages?

Google’s order of results is automatically determined by more than 100 factors, including Google’s PageRank algorithm.

Please check out Google’s Technology Overview page for more details. Due to the nature of its business and its interest in protecting the integrity of its search results, this is the only information Google makes available to the public about the ranking system.

Excerpt taken from Google Webmaster Info

I don’t want Google to crawl part or all of my site

There is a standard method involving a “robots.txt” file for excluding robot crawlers. This can prevent Googlebot or other crawlers from visiting part or all of your site. Googlebot has a user-agent of “Googlebot”. In addition, Googlebot understands some extensions to the robots.txt standard: Disallow patterns may include * to match any sequence of characters, and patterns may end in $ to indicate that the pattern must match the end of a name. For example, to prevent Googlebot from crawling files that end in .gif, you may use the following robots.txt entry:

User-agent: Googlebot
Disallow: /*.gif$
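To see which paths a pattern such as the one above would cover, the two extensions can be approximated with a small hand-rolled matcher. This is a minimal sketch under the rules as described here (* matches any sequence of characters, a trailing $ anchors the end of the name, and anything else is a prefix match); the function names are made up for illustration and this is not Googlebot’s actual matching code.

```python
import re


def pattern_to_regex(pattern):
    """Translate a Disallow pattern with * and trailing $ into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape everything literally, then let each * stand for "any characters".
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def is_disallowed(path, patterns):
    """True if `path` matches any of the Disallow patterns (prefix match)."""
    return any(pattern_to_regex(p).match(path) for p in patterns)


googlebot_rules = ["/*.gif$"]

print(is_disallowed("/images/logo.gif", googlebot_rules))      # True
print(is_disallowed("/images/logo.gif?x=1", googlebot_rules))  # False: does not end in .gif
print(is_disallowed("/index.html", googlebot_rules))           # False
```

Without the trailing $, the same pattern would also match the second path, because an unanchored pattern only has to match the beginning of the name.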

There is another standard for telling robots not to index a particular web page or follow links on it, which may be more helpful, since it can be used on a page-by-page basis. This method involves placing a “META” element into a page of HTML.
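As an illustration, the sample page in the sketch below carries a robots “META” element using the conventional noindex and nofollow values, and a small parser reads it back the way a well-behaved robot might. The class and function names and the sample page are hypothetical; only Python’s standard html.parser module is used.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives from <meta name="robots" content="..."> elements."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            for token in attr_map.get("content", "").split(","):
                token = token.strip().lower()
                if token:
                    self.directives.add(token)


def robots_directives(html):
    """Return the set of robots directives declared in a page's META elements."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


# Hypothetical page asking robots not to index it or follow its links.
page = """<html><head>
<meta name="robots" content="noindex, nofollow">
<title>Members only</title>
</head><body><p>Private content.</p></body></html>"""

print(sorted(robots_directives(page)))
```

Because the element lives in each page’s own HTML, one page can be excluded without touching the site-wide robots.txt file.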

Remember, changing your server’s robots.txt file or changing the “META” elements on its pages will not cause an immediate change in what results Google returns. It is likely that it will take a while for any changes you make to propagate to Google’s next index of the web.

Excerpt taken from Google Webmaster Info