Free Robots.txt Generator

Control how search engines crawl your website with a properly configured `robots.txt` file.

1. Default Policy for All Bots (`User-agent: *`)
2. Sitemap URL (Optional, but Recommended)
3. Specific Rules for Bots
Generated robots.txt

                
About the Robots.txt File

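A `robots.txt` file is a plain text file placed in the root directory of your website (for example, `https://example.com/robots.txt`). It tells search engine crawlers which parts of your site they may or may not access. Well-behaved crawlers follow these instructions voluntarily, so the file guides crawling but does not restrict access.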

Common Directives Explained

User-agent:
Specifies the crawler the rule applies to. `*` is a wildcard for all crawlers. Examples: `Googlebot`, `Bingbot`.
Disallow:
The path of a file or directory you want to block. Paths are matched from the start of the URL path, so `Disallow: /admin/` blocks every URL that begins with `/admin/`.
Allow:
Used to specify an exception within a disallowed directory. For example, you might disallow `/media/` but allow `/media/public/`.
Sitemap:
Provides the absolute URL to your XML sitemap, helping crawlers discover all your important pages.
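
Put together, the four directives above might look like the following. This is a minimal sketch, not a recommended configuration: the domain `example.com`, the paths, the `/search/` rule, and the sitemap location are placeholders chosen for illustration.

```text
# Rules for every crawler that has no more specific group
User-agent: *
Disallow: /admin/
Disallow: /media/
Allow: /media/public/

# A separate group for one named crawler (Bingbot is only an example)
User-agent: Bingbot
Disallow: /search/

# The sitemap line stands outside any group and must be an absolute URL
Sitemap: https://example.com/sitemap.xml
```

A crawler obeys only the group that matches it most specifically. In this sketch Bingbot would follow its own group and ignore the `*` rules, so any rule it should also respect has to be repeated inside its group.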

Frequently Asked Questions (FAQ)

1. What's the difference between `robots.txt` and a `noindex` meta tag?

A `robots.txt` file **prevents crawling**, while a `noindex` tag **prevents indexing**. If you block a page with `robots.txt`, Google cannot fetch the page, so it never sees the `noindex` tag and might still index the URL if it finds links to it from other sites. To reliably keep a page out of search results, remove the `Disallow` rule that covers it from `robots.txt` so the page can be crawled, and add a `noindex` meta tag (`<meta name="robots" content="noindex">`) to the page instead.

2. How do I block a specific file?

You can block a specific file by listing its full path. For example, to block a PDF file, you would add: `Disallow: /path/to/your/file.pdf`.
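
A `Disallow` line only takes effect inside a `User-agent` group, so a complete rule for one file could be sketched as follows. The path is the same placeholder used above, not a real file.

```text
User-agent: *
Disallow: /path/to/your/file.pdf
```

Paths are case-sensitive and matched as prefixes, so this rule would also cover a URL such as `/path/to/your/file.pdf?download=1`.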

3. Can I use wildcards like `*`?

Yes, `*` can be used as a wildcard to match any sequence of characters, and `$` can be used to mark the end of a URL. For example, `Disallow: /*.pdf$` would block all URLs that end with `.pdf`.
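
A few wildcard patterns, sketched under the assumption that the crawler supports `*` and `$` (major crawlers such as Googlebot and Bingbot do). The `sessionid` parameter and the `/drafts` path are hypothetical names used only for illustration.

```text
User-agent: *
# Block every URL that ends in .pdf
Disallow: /*.pdf$
# Block URLs whose query string starts with "sessionid="
Disallow: /*?sessionid=
# Without "$" the rule is a prefix match, so this also blocks /drafts-old/page.html
Disallow: /drafts
```

Because a rule without `$` matches any URL that starts with the pattern, adding `$` is the way to limit a rule to an exact ending.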