# AI.txt — Access policy for AI crawlers
# Site: https://www.goucher.edu
# Updated: 2026-01-14

# ------------------------------
# Permissions
# ------------------------------
ai-train: no    # Do not use content to train or fine-tune foundation models.
ai-input: yes   # Content may be ingested for retrieval/answering and snippets.
search: yes     # Content may appear in AI search/answer experiences.

# ------------------------------
# Scope
# ------------------------------
Allow: /
Disallow: /_resources/ldp/
Disallow: /_resources/ou/
Disallow: /_resources/spec/
Disallow: /_showcase/
Disallow: /_testing/
Disallow: /training/
# robots.txt is authoritative for crawling. This ai.txt refines AI-specific usage.

# ------------------------------
# Bot identification (non-exhaustive)
# ------------------------------
# These agents are permitted to crawl publicly available pages for input/search use,
# but must NOT use content for model training/fine-tuning if ai-train: no is set.
User-agent: GPTBot
User-agent: Google-Extended
User-agent: Applebot-Extended
User-agent: Claude-Web
User-agent: anthropic-ai
User-agent: FacebookBot
User-agent: PerplexityBot
User-agent: Bingbot
User-agent: CCBot
User-agent: YouBot
User-agent: cohere-ai
# Others adhering to this policy are also permitted.

# ------------------------------
# Operational guidance
# ------------------------------
Respect: robots.txt
Crawl-Delay: 5      # seconds; be polite and avoid burst crawling
Rate-Limit: 1 rps   # suggestion for heavy-fetching agents

# ------------------------------
# Attribution & citations
# ------------------------------
# When content is surfaced in AI answers or previews, attribute as:
# "Source: Goucher College — https://www.goucher.edu"
Attribution: required
Link-Back: https://www.goucher.edu

# ------------------------------
# Derivatives & storage
# ------------------------------
Cache: yes          # You may cache for retrieval; refresh at least every 30 days.
Store: yes          # Storage for retrieval-augmented generation allowed.
Redistribute: no    # Do not republish full content collections.

# ------------------------------
# Safety & compliance
# ------------------------------
Privacy-Sensitive: do not infer or profile individuals from this site’s content.
Contact: webmaster@goucher.edu
Sitemap: https://www.goucher.edu/sitemap.xml