Using Robots.txt To Help Google Images Index Your Website

There are a lot of webmasters out there who utilize the power of the robots.txt file, using it to block search engine spiders from crawling their CMS folders. At JVF we use this method so that our .js, .css, and custom code are not revealed to the search engines. One warning when implementing the Disallow rule within robots.txt: any images inside a disallowed folder will not be indexed properly within Google Images. To ensure that the photos and images on your website are crawled and indexed properly, be sure to pair Disallow with the Allow rule. You can see an example of this when you take a look at what Google is doing in their own robots.txt file.

They first Disallow everything within their safebrowsing folder, then Allow the specific subfolders they want crawled and indexed.

[Screenshot: Google’s robots.txt, showing a Disallow rule for the safebrowsing folder followed by Allow rules for specific subfolders]
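The pattern in Google’s file reads roughly like this (the subfolder names below are an illustrative reconstruction, not a verbatim copy; check google.com/robots.txt for the live file):

    User-agent: *
    Disallow: /safebrowsing
    Allow: /safebrowsing/diagnostic
    Allow: /safebrowsing/report_error/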
If you were to use this same scenario for your own website using a content management system, it should look something like this:

[Screenshot: an example robots.txt for a CMS-based website using the same Disallow/Allow pattern]
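For instance, on a WordPress-style install the same idea might be written as follows (the folder names here are illustrative assumptions, not a prescription for every CMS):

    User-agent: *
    Disallow: /wp-content/
    Allow: /wp-content/uploads/

This keeps the theme, plugin, .js, and .css files inside /wp-content/ out of the crawl while still letting Google Images reach the photos uploaded to /wp-content/uploads/.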

  • Does anyone know why my site is not indexing properly? I use meta name=”robots” content=”index,follow” on every page, but there are still many pages that are not indexed yet.

  • How do I edit the robots.txt file on my website so that I can allow robots to index part of my web server? Please help me with possible steps. Urgent!

  • This fix totally saved my website’s organic search results in the Google Images section. Everyone who has lots of images on their website should implement it immediately.

  • Let’s say I have a folder called FA, and in this folder there are 100 files. One file I want to allow and the rest not. Can I type:
    Disallow: /FA
    Allow: /FA/file.aspx

    Thank you

  • Hi Vilen. Indeed, but it’s generally advised to use Disallow: instead.

    The method displayed in this post is to disallow access to a folder *except* image subfolders.
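    Applied to your FA example, that pattern would look something like this (the /FA/images/ subfolder is purely illustrative):

    Disallow: /FA/
    Allow: /FA/images/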

    Hope this helps…

  • A robots.txt file is placed on your server to tell the various search engine spiders not to crawl or index certain sections or pages of your site. It is a regular text file that, through its name, has special meaning to the majority of “honorable” robots on the web. Among the most important things you can do is check which of your pages are in Google’s supplemental index. This is where you’ll find lots of your low-quality pages, ripe for removal via robots.txt. If the pages don’t contain useful information, dump them.

  • meta name=”robots” content=”index,follow”
    User-agent: *
    Allow: /

    Doesn’t that mean everything is considered for search engine crawling? Doesn’t that include images too?
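A closing note on that last question: under standard robots.txt behavior, Allow: / opens every URL on the site to crawling, images included. It is equivalent to the empty Disallow: directive from the original robots.txt standard, which is the more widely supported form:

    User-agent: *
    Disallow:

The meta robots tag then controls indexing on a per-page basis once a page has been crawled.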