UKC, Author at UK-Cheapest.co.uk

I don’t want Google to crawl part or all of my site

November 15, 2011 / November 15, 2010

There is a standard method for excluding robot crawlers, based on a “robots.txt” file placed at the root of your site. It can keep Googlebot or other crawlers away from part or all of your site. Googlebot identifies itself with the user-agent “Googlebot”. In addition, Googlebot understands some extensions to the robots.txt standard: Disallow patterns may include * to match any sequence of characters, and a pattern may end in $ to indicate that it must match the end of a name. For example, to prevent Googlebot from crawling files that end in .gif, you may use the following robots.txt entry:

User-agent: Googlebot
Disallow: /*.gif$
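
To make the two extensions concrete, here is a minimal Python sketch of how such a pattern could be matched against a URL path. It is only an illustration of the rules described above, not Googlebot's actual implementation, and the function names `pattern_to_regex` and `is_disallowed` are made up for this example.

```python
import re


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a Disallow pattern using * and a trailing $ into a regex."""
    # A trailing $ means the pattern must match up to the end of the path.
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Each * matches any sequence of characters; everything else is literal.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_disallowed(path: str, pattern: str) -> bool:
    """Return True if the path is covered by the Disallow pattern."""
    return pattern_to_regex(pattern).match(path) is not None


if __name__ == "__main__":
    for path in ("/images/logo.gif", "/images/logo.gif?size=2", "/about.html"):
        print(path, is_disallowed(path, "/*.gif$"))
```

Without the trailing $, the same pattern behaves as a prefix match, so `/*.gif` would also cover a path such as `/logo.gift`.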

There is another standard for telling robots not to index a particular web page or follow links on it, which may be more helpful, since it can be used on a page-by-page basis. This method involves placing a “META” element into a page of HTML.
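
As a rough illustration of what such a “META” element looks like, the Python sketch below builds the tag text from two flags. The helper name `robots_meta_tag` is hypothetical, and the tag form shown (a meta element named "robots" with "noindex" and "nofollow" values) is the commonly documented one rather than something spelled out in the excerpt above.

```python
def robots_meta_tag(index: bool = True, follow: bool = True) -> str:
    """Build a robots META element for a page's <head> section.

    Returns an empty string when nothing is restricted, since indexing
    and following links is the default behaviour.
    """
    directives = []
    if not index:
        directives.append("noindex")   # ask robots not to index this page
    if not follow:
        directives.append("nofollow")  # ask robots not to follow its links
    if not directives:
        return ""
    return '<meta name="robots" content="{}">'.format(", ".join(directives))


if __name__ == "__main__":
    print(robots_meta_tag(index=False, follow=False))
```

The resulting tag goes in the head of each page you want to restrict, which is what makes this method usable page by page.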

Remember, changing your server’s robots.txt file or changing the “META” elements on its pages will not cause an immediate change in what results Google returns. It is likely that it will take a while for any changes you make to propagate to Google’s next index of the web.

Excerpt taken from Google Webmaster Info

I don’t want Google to cache my site

November 15, 2011 / November 15, 2010

Google automatically takes a “snapshot” of each page it crawls and caches it. This enables us to show the search terms highlighted on text-heavy pages so users can find relevant information quickly, and to retrieve pages for users if the site’s server temporarily fails. Users can access the cached version by choosing the “Cached” link on the search results page. If you do not want your content to be accessible through Google’s cache, you can use the NOARCHIVE meta-tag. Place this in the <HEAD> section of your documents:

<META NAME="ROBOTS" CONTENT="NOARCHIVE">

This tag will tell robots not to archive the page. Google will continue to index and follow links from the page, but will not present cached material to users.

If you want to allow other robots to archive your content, but prevent Google’s robots from caching, you can use the following tag:

<META NAME="GOOGLEBOT" CONTENT="NOARCHIVE">

Note that the change will occur the next time Google crawls the page containing the NOARCHIVE tag (typically about once a month). If you want the change to take effect sooner than this, the site owner must contact us and request immediate removal of archived content. Also, the NOARCHIVE directive only controls whether the cached page is shown. To control whether the page is indexed, use the NOINDEX tag; to control whether links are followed, use the NOFOLLOW tag.
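
To check which tags a page actually carries, something like the following Python sketch can help. It assumes the tag forms commonly documented for this purpose: a meta element named ROBOTS applies to all robots, and one named GOOGLEBOT applies to Google only. The class and function names are invented for the example, and it only reads the page; it does not reflect how Google itself processes the tags.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives found in <meta name=... content=...> tags."""

    def __init__(self):
        super().__init__()
        self.directives = {}  # lowercased meta name -> set of directives

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key.lower(): (value or "") for key, value in attrs}
        name = attr.get("name", "").lower()
        if not name:
            return
        tokens = {t.strip().lower() for t in attr.get("content", "").split(",")}
        tokens.discard("")
        self.directives.setdefault(name, set()).update(tokens)


def may_show_cache(html: str, crawler: str = "googlebot") -> bool:
    """Return False if the page asks this crawler not to archive it."""
    parser = RobotsMetaParser()
    parser.feed(html)
    # Directives addressed to all robots plus those addressed to this crawler.
    applicable = parser.directives.get("robots", set()) | parser.directives.get(
        crawler.lower(), set()
    )
    return "noarchive" not in applicable


if __name__ == "__main__":
    page = '<html><head><META NAME="GOOGLEBOT" CONTENT="NOARCHIVE"></head></html>'
    print(may_show_cache(page, "googlebot"))
    print(may_show_cache(page, "otherbot"))
```

The crawler name "otherbot" is just a placeholder for any robot other than Googlebot.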

Excerpt taken from Google Webmaster Info

How do I remove my site from Google?

November 15, 2011 / November 15, 2010

Except in instances involving legal issues or spam, Google’s policy for removing a page from our index requires that we obtain the permission of that page’s webmaster.

This prevents competitors from sabotaging each other’s listings. Please have the webmaster for the page in question contact us with proof that he/she is indeed the webmaster.

This proof must be in the form of a root level page on the site in question, requesting removal from Google. Once we receive the URL that corresponds with this root level page, we will remove the offending page from our index.

For more information on this process, please see http://www.google.com/remove.html

Excerpt taken from Google Webmaster Info

I’m listed in Google but not for my keywords

November 15, 2011 / November 15, 2010

Google does not manually assign keywords to your site, nor does Google manually “boost” the rankings of any site. The ranking process is completely automated and depends on the relative PageRank of each result found.

The best way to improve your position in results is to have relevant content and multiple links from other web sites. If there are certain keywords you feel are essential to your site’s success, you may want to consider a targeted keyword advertising program.

Google does not sell placement in the results, but Google does have advertising positions available adjacent to them.

Excerpt taken from Google Webmaster Info

Google is not showing a description of my site

November 15, 2011 / November 15, 2010

Site descriptions in Google results are actually quoted from the web page in question. Google automatically generates different descriptions based on the search terms used to find the site (these “snippets” display the search term(s) in the context of the page on which they appear).

For example, if there is a pet site that deals with cats and dogs, and someone enters a search for the word ‘dog,’ the site description on Google will only talk about ‘dogs.’ If a person searches Google for ‘cats’ and the same site is delivered as a result, the description will be different – it will contain references to the word ‘cat’ as it appears on the website.
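
The idea of a description that changes with the search term can be sketched in a few lines of Python. This is a toy illustration only, not Google's snippet algorithm; the function name `snippet`, the `width` parameter and the sample page text are all assumptions made for the example.

```python
import re


def snippet(text: str, term: str, width: int = 30) -> str:
    """Return the first occurrence of term with some surrounding context."""
    match = re.search(re.escape(term), text, flags=re.IGNORECASE)
    if match is None:
        return ""  # the term does not appear on the page
    start = max(0, match.start() - width)
    end = min(len(text), match.end() + width)
    # Mark the places where the page text was cut off.
    prefix = "..." if start > 0 else ""
    suffix = "..." if end < len(text) else ""
    return prefix + text[start:end].strip() + suffix


if __name__ == "__main__":
    page = (
        "Welcome to our pet shop. We stock leads and beds for every dog. "
        "There is also a large range of food, toys and grooming supplies. "
        "Our cat section has scratching posts and climbing trees."
    )
    print(snippet(page, "dog"))
    print(snippet(page, "cat"))
```

The same page yields a different piece of text for each search term, which is why there is no single fixed description to edit.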

Google does not display a standard description. We look for the search terms specified (and in some cases, variations of those terms) and show snippets of where those terms appear. This is a completely automated process and editing is not an option. If you alter the relevant text on the page itself, Google will pick up those changes during our next crawl in a few weeks.

Excerpt taken from Google Webmaster Info