# As a condition of accessing this website, you agree to abide by the following
# content signals:
#
# (a) If a Content-Signal = yes, you may collect content for the corresponding
#     use.
# (b) If a Content-Signal = no, you may not collect content for the
#     corresponding use.
# (c) If the website operator does not include a Content-Signal for a
#     corresponding use, the website operator neither grants nor restricts
#     permission via Content-Signal with respect to the corresponding use.
#
# The content signals and their meanings are:
#
# search:   building a search index and providing search results (e.g., returning
#           hyperlinks and short excerpts from your website's contents). Search does not
#           include providing AI-generated search summaries.
# ai-input: inputting content into one or more AI models (e.g., retrieval
#           augmented generation, grounding, or other real-time taking of content for
#           generative AI search answers).
# ai-train: training or fine-tuning AI models.
#
# ANY RESTRICTIONS EXPRESSED VIA CONTENT SIGNALS ARE EXPRESS RESERVATIONS OF
# RIGHTS UNDER ARTICLE 4 OF THE EUROPEAN UNION DIRECTIVE 2019/790 ON COPYRIGHT
# AND RELATED RIGHTS IN THE DIGITAL SINGLE MARKET.

# BEGIN Cloudflare Managed content

User-agent: *
Content-Signal: search=yes,ai-train=no
Allow: /

User-agent: Amazonbot
Disallow: /

User-agent: Applebot-Extended
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: GPTBot
Disallow: /

User-agent: meta-externalagent
Disallow: /

# END Cloudflare Managed Content

# ========================================================
# AJTraining website crawler access control policy
# Website: https://prawnikpabianice.pl/
# Last updated: 2026-01-08 (consolidated version)
# ========================================================

# ==================== Basic access rules ====================
# Block resource files and system paths to avoid exposing the directory structure
User-agent: *
Disallow: /css/
Disallow: /cookies/
Disallow: /files/
Disallow: /video/
Disallow: /images/
Disallow: /js/
Disallow: /assets/
Disallow: /static/
Disallow: /uploads/
Disallow: /fonts/
Disallow: /media/
Disallow: /vendor/
Disallow: /plugins/
Disallow: /temp/
Disallow: /cache/
Disallow: /sors.txt
Disallow: /admin/
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-content/plugins/
Disallow: /wp-content/themes/
Disallow: /config/
Disallow: /backend/
Disallow: /controlpanel/
Disallow: /dashboard/
Disallow: /database/
Disallow: /sql/
Disallow: /private/
Disallow: /secure/
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /logs/
Disallow: /backup/
Disallow: /old/
Disallow: /test/
Disallow: /dev/
Disallow: /staging/
Disallow: /api/
# ==================== Security monitoring trap paths ====================
# Decoy paths used to identify malicious scanners
Disallow: /honeypot-43-173/
Disallow: /blocked-ip-monitor/
Disallow: /suspicious-access-log/
Disallow: /crawler-trap-2024/
Disallow: /security-monitoring/
Disallow: /bad-bot-detector/
Disallow: /ip-monitoring-log/
Disallow: /malicious-crawler-log/
Disallow: /spider-trap-zone/
Disallow: /honeypot-admin/
Disallow: /fake-wp-login/
Disallow: /dummy-config/
Disallow: /fake-backup-zip/
Disallow: /phpmyadmin/
Disallow: /mysql/
Disallow: /administrator/
Disallow: /joomla/
Disallow: /drupal/
Disallow: /wordpress/
Disallow: /wp-login.php
Disallow: /xmlrpc.php

# ==================== Search engine crawler settings ====================
# Google family
User-agent: Googlebot
Crawl-delay: 5
Request-rate: 1/5
Allow: /

User-agent: Googlebot-Image
Allow: /

# Other major search engines
User-agent: Bingbot
User-agent: YandexBot
User-agent: DuckDuckBot
User-agent: Applebot
Crawl-delay: 10
Allow: /

# Chinese search engines
User-agent: Baiduspider
User-agent: 360Spider
User-agent: Sogou
User-agent: Sogou Spider
User-agent: Sogou web spider
User-agent: YisouSpider
Crawl-delay: 10
Allow: /

# ==================== AI / large-model crawler policy ====================
# Allow selected well-known AI crawlers for indexing (adjust to privacy requirements)
User-agent: GPTBot
User-agent: ChatGPT-User
User-agent: Google-Extended
User-agent: AnthropicBot
User-agent: ClaudeBot
User-agent: CCBot
User-agent: PerplexityBot
Crawl-delay: 10
Allow: /

# ==================== Social media previews ====================
User-agent: facebookexternalhit
User-agent: Facebot
User-agent: Twitterbot
User-agent: Pinterestbot
User-agent: LinkedInBot
User-agent: WhatsApp
User-agent: TelegramBot
User-agent: Discordbot
Allow: /

# ========================================================
# Strict block zone (SEO tools / scrapers / malicious behavior)
# ========================================================

# Data analytics / SEO tools
User-agent: AhrefsBot
User-agent: SemrushBot
User-agent: MJ12bot
User-agent: DotBot
User-agent: BLEXBot
User-agent: DataForSeoBot
User-agent: SeznamBot
User-agent: ScreamingFrog
User-agent: SiteAuditBot
User-agent: XoviBot
User-agent: MegaIndex
User-agent: Spbot
User-agent: ZoominfoBot
User-agent: ProWebWalker
User-agent: Barkrowler
User-agent: KouContentAnalytics
Disallow: /

# E-commerce / scraping / developer tools (frequent UA spoofers)
User-agent: Amazonbot
User-agent: AlibabaBot
User-agent: AliyunSecBot
User-agent: ShopBot
User-agent: PriceSpider
User-agent: Crawlera
User-agent: Python-urllib
User-agent: curl
User-agent: wget
User-agent: Scrapy
User-agent: PycURL
Disallow: /

# Junk crawlers / resource-heavy
User-agent: PetalBot
User-agent: Bytespider
User-agent: Exabot
User-agent: MauiBot
User-agent: SentiBot
User-agent: Trendictionbot
User-agent: Gigabot
User-agent: AspiegelBot
User-agent: BUbiNG
User-agent: CensysInspect
User-agent: GrapeshotCrawler
User-agent: NetcraftSurveyAgent
User-agent: LinkpadBot
User-agent: DomainCrawler
User-agent: Linguee
User-agent: Lipperhey
Disallow: /

# Sitemap
Sitemap: https://prawnikpabianice.pl/sitemap.xml