Robots.txt

Robots.txt is a plain text file placed at the root of a website that tells search engine crawlers which pages or directories they are allowed or not allowed to access. It serves as a set of instructions for web robots, helping website owners control how their content is crawled and indexed. Compliance is voluntary, but most major search engines, including Google and Bing, respect these directives.

The file can be used to block access to sensitive or duplicate content, conserve crawl budget, or guide bots to specific sitemaps. For example, a robots.txt file might keep crawlers out of admin pages, internal search results, or development sections of a site. A typical directive might look like this:

User-agent: *
Disallow: /private-folder/

This tells all bots not to crawl the specified folder. Additional rules can be added to allow or disallow specific bots or directories, and to link to an XML sitemap for easier discovery of content; a fuller example follows at the end of this entry.

For B2B and SaaS companies, robots.txt helps manage indexing of support pages, customer portals, or gated content. For nonprofits, it ensures that confidential pages or staff-only resources are not exposed in search results. A misconfigured file, however, can unintentionally block important pages, leading to SEO issues. It is important to regularly audit robots.txt settings and test them using tools like Google Search Console to ensure they align with your site’s indexing goals.
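As an illustration of those additional rules, here is a hypothetical robots.txt that combines per-bot directives with a sitemap reference. The paths, the bot name ExampleBot, and the example.com domain are placeholders chosen for this sketch, not values any particular crawler requires.

# Keep all crawlers out of admin pages and internal search results
User-agent: *
Disallow: /admin/
Disallow: /internal-search/

# Block one specific (hypothetical) bot from the entire site
User-agent: ExampleBot
Disallow: /

# Point crawlers at the XML sitemap
Sitemap: https://www.example.com/sitemap.xml

Rules are grouped by User-agent, and a crawler follows only the most specific group that matches it, so the blanket rules under User-agent: * do not apply to a bot that has its own group.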
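Alongside Google Search Console, rules can also be sanity-checked locally before deployment. The short Python sketch below uses the standard library's urllib.robotparser module; the directive lines and example.com URLs mirror the example above and are purely illustrative.

from urllib import robotparser

# Hypothetical rules matching the earlier example; against a live site they
# could instead be loaded with set_url("https://www.example.com/robots.txt")
# followed by read().
rules = [
    "User-agent: *",
    "Disallow: /private-folder/",
]

parser = robotparser.RobotFileParser()
parser.parse(rules)  # parse the directive lines directly

# can_fetch() reports whether a given user agent may crawl a given URL
print(parser.can_fetch("*", "https://www.example.com/private-folder/report.html"))  # False
print(parser.can_fetch("*", "https://www.example.com/blog/post.html"))              # True

A check like this catches an accidental Disallow on an important path before it reaches production, though urllib.robotparser implements the original exclusion standard and does not replicate every extension, such as Google-style wildcards in paths.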