mirror of
https://github.com/Sped0n/bridget.git
synced 2026-04-14 10:09:31 -07:00
A robots.txt template is added to the layouts directory. It blocks specific bots from crawling the site: MJ12bot, AhrefsBot, BLEXBot, SISTRIX Crawler, sistrix, 007ac9, 007ac9 Crawler, UptimeRobot/2.0, Ezooms Robot, Perl LWP, netEstate NE Crawler (+http://www.website-datenbank.de/), WiseGuys Robot, Turnitin Robot, Heritrix, pricepi (whose crawler identifies itself as pimonster), SurdotlyBot, and ZoominfoBot. All other bots are allowed to crawl the site. The file also includes a Sitemap directive pointing to the sitemap.xml file.
62 lines
944 B
Plaintext
User-agent: MJ12bot
Disallow: /

User-agent: AhrefsBot
Disallow: /

User-agent: BLEXBot
Disallow: /

# Block SISTRIX
User-agent: SISTRIX Crawler
Disallow: /

User-agent: sistrix
Disallow: /

User-agent: 007ac9
Disallow: /

User-agent: 007ac9 Crawler
Disallow: /

# Block Uptime robot
User-agent: UptimeRobot/2.0
Disallow: /

# Block Ezooms Robot
User-agent: Ezooms Robot
Disallow: /

# Block Perl LWP
User-agent: Perl LWP
Disallow: /

# Block netEstate NE Crawler (+http://www.website-datenbank.de/)
User-agent: netEstate NE Crawler (+http://www.website-datenbank.de/)
Disallow: /

# Block WiseGuys Robot
User-agent: WiseGuys Robot
Disallow: /

# Block Turnitin Robot
User-agent: Turnitin Robot
Disallow: /

# Block Heritrix
User-agent: Heritrix
Disallow: /

# Block pricepi
User-agent: pimonster
Disallow: /

User-agent: SurdotlyBot
Disallow: /

User-agent: ZoominfoBot
Disallow: /

User-agent: *
Allow: /

Sitemap: {{ "/sitemap.xml" | absURL }}
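The effect of these rules can be checked with Python's standard-library `urllib.robotparser`. A minimal sketch, assuming a representative subset of the rules above (the Hugo template expression in the Sitemap line would be rendered to an absolute URL at build time, so it is omitted here):

```python
from urllib.robotparser import RobotFileParser

# Representative subset of the robots.txt rules above.
rules = """\
User-agent: MJ12bot
Disallow: /

User-agent: *
Allow: /
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

# MJ12bot matches its own entry and is denied everywhere;
# any other agent falls through to the wildcard entry and is allowed.
print(rp.can_fetch("MJ12bot", "/posts/"))    # False
print(rp.can_fetch("Googlebot", "/posts/"))  # True
```

Agent matching in robots.txt is case-insensitive, so `mj12bot` would be blocked just the same.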