Last active
March 9, 2026 10:00
WordPress MU plugin — unified llms.txt + per-page .md endpoint for LLM content serving. Lazy-generated Markdown with HTML→MD converter, shared exclusions, YAML frontmatter, Polylang support, rate limiting, and a single settings page. Built for Classic Editor + ACF sites using the Less Rain framework.
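As a rough illustration of the per-page `.md` endpoint described above, the sketch below shows how a permalink could map to its Markdown URL: homepages (including language-prefixed ones) become `index.md`, and every other page gets `.md` appended in place of the trailing slash. The function name `example_md_url()` and the `example.com` URLs are hypothetical and not part of the plugin; the plugin itself derives the homepage list from `home_url()` and Polylang.

```php
<?php
/**
 * Hypothetical helper (not part of the plugin): map a permalink to its .md URL.
 *
 * @param string   $permalink Canonical page URL, with or without trailing slash.
 * @param string[] $home_urls Known homepage URLs (root plus language homes) - assumed input.
 */
function example_md_url( string $permalink, array $home_urls ): string {
    // Normalise to exactly one trailing slash before comparing.
    $normalized = rtrim( $permalink, '/' ) . '/';

    foreach ( $home_urls as $home ) {
        if ( $normalized === rtrim( $home, '/' ) . '/' ) {
            // Homepages have no slug to suffix, so they are served as index.md.
            return $normalized . 'index.md';
        }
    }

    // Regular pages: drop the trailing slash and append .md.
    return rtrim( $permalink, '/' ) . '.md';
}

$homes = [ 'https://example.com/', 'https://example.com/en/' ];

echo example_md_url( 'https://example.com/', $homes ), "\n";       // prints https://example.com/index.md
echo example_md_url( 'https://example.com/en/', $homes ), "\n";    // prints https://example.com/en/index.md
echo example_md_url( 'https://example.com/about/', $homes ), "\n"; // prints https://example.com/about.md
```

Passing the homepage list in as a parameter keeps the sketch free of WordPress calls, so it can be run on its own.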
| <?php | |
| /** | |
| * Plugin Name: LR LLMs Generator | |
| * Description: Generates a multilingual llms.txt for LLM indexing and serves clean .md endpoints for every public post/page. Unified settings under Tools. | |
| * Version: 3.4.2 | |
| * Author: Luis Martinez | |
| * Author URI: https://www.lessrain.com | |
| * Requires at least: 5.6 | |
| * Tested up to: 6.5 | |
| * Requires PHP: 7.4 | |
| * | |
| * Changelog: | |
| * 3.4.2 — 2026-03-09 | |
| * - Homepage URLs in llms.txt now correctly output as index.md / en/index.md | |
| * instead of site.com.md — detected by comparing against home_url() and | |
| * Polylang language home URLs | |
| * - get_option() comparisons hardened with (string) cast throughout — fixes | |
| * strict === '1' check failing when WP stores option as integer | |
| * 3.4 — 2026-03-09 | |
| * - llms.txt entry URLs now point to .md endpoints when the .md feature is | |
| * enabled — AIs follow links directly to clean Markdown instead of HTML. | |
| * Falls back to canonical HTML URLs when .md is disabled. | |
| * 3.3 — 2026-03-09 | |
| * - Self-closing block elements (<div />, <section />, etc.) now normalized to explicit | |
| * open+close pairs before DOMDocument parsing — root cause of content duplication and | |
| * missing sections on pages with swiper/page-builder markup | |
| * - Include-class whitelist check moved from parent-node lookup to per-text-node ancestor | |
| * walk — text inside deeply nested or structurally irregular markup now correctly | |
| * included if any ancestor carries the include class (e.g. text-container) | |
| * - $is_included closure removed; logic inlined in DOMText handler | |
| * - form removed from $skip_tags — form containers often wrap meaningful text; form | |
| * controls (input, textarea, select, button, etc.) suppressed via separate $skip_form_tags | |
| * - lr_llms_md_skip_tags filter added — skip tags list is now overridable per-site | |
| * - Title encoding fixed: get_post_field('post_title', $id, 'raw') replaces get_the_title() | |
| * to bypass Polylang filter chain re-encoding (& → &amp;) | |
| * - lr_llms_clean_text() helper: html_entity_decode with ENT_QUOTES|ENT_HTML5 + soft | |
| * hyphen (U+00AD) stripping | |
| * - Multilingual .md canonical URL: lr_llms_md_get_requested_content_url() builds frontend | |
| * URL from request path, preserving language prefixes (/en/, /de/, etc.) | |
| * - index.md resolution extended to language-prefixed paths (/en/index.md, /de/index.md) | |
| * 3.2 — 2026-03-08 | |
| * - Fixed broken cron purge LIKE query — was searching wrong prefix, silently deleting | |
| * nothing; now correctly targets _transient_ and _transient_timeout_ rows | |
| * - sanitize_url() → esc_url_raw() for canonical WP core alignment | |
| * - apply_filters() wired up in lr_llms_md_get_skip_classes() and | |
| * lr_llms_md_get_include_classes() — documented filters were never actually called | |
| * - Added status_header(200) before Content-Type in lr_llms_md_serve() for correct | |
| * behaviour under reverse proxies | |
| * - Rate limit message updated to match actual 30s timeout | |
| * 3.1 — 2026-03-08 | |
| * - Headings (h1–h6) now extracted regardless of include-class whitelist — picks up | |
| * visually-hidden landmark headings outside text-container | |
| * - Removed explicit # Title prepend — h1 is sourced from the DOM instead | |
| * - YAML frontmatter title and description values are now properly quoted to handle | |
| * colons and special characters in YAML parsers | |
| * - Polylang null guard on cache key — pll_current_language('slug') falls back to | |
| * get_locale() when called during early bootstrap | |
| * - Added lr-no-extract-md CSS class as a dedicated .md-only skip signal (editor-side) | |
| * - Added media-overlay-stack to default skip classes (hero overlay text) | |
| * - SVG images skipped in converter (always decorative icons in LR sites) | |
| * - Excluded/missing .md requests now return to WordPress for proper 404 template rendering | |
| * | |
| * 3.0 — Initial unified release | |
| * - Merged lr-llms-txt-generator.php and lr-llms-md-generator.php into single plugin | |
| * - Shared exclusions (post types, IDs, page templates) across both endpoints | |
| * - Lazy .md generation with transient cache keyed by post_id + post_modified hash | |
| * - YAML frontmatter on .md output (title, url, last_modified, type, description) | |
| * - Language-aware cache keys for Polylang / WPML | |
| * - Rate limiting with 30s timeout for paginated crawlers | |
| * - Custom user-agent on internal wp_remote_get self-fetch | |
| * - lr_llms_md_html_source filter for site-specific HTML pre-processing | |
| * - lr_llms_md_output filter for final Markdown post-processing | |
| * - Single admin settings page under Tools > LR LLMs Settings | |
| */ | |
| if ( ! defined( 'ABSPATH' ) ) { | |
| exit; | |
| } | |
| // ============================================================================= | |
| // A. LLMS.TXT — CONFIGURATION | |
| // ============================================================================= | |
| global $lr_llms_config; | |
| $lr_llms_config = [ | |
| 'prefix' => 'lr_llms', | |
| // Shared option keys (also used by .md) | |
| 'setting_key_exclude_types' => 'lr_llms_exclude_post_types', | |
| 'setting_key_exclude_ids' => 'lr_llms_exclude_ids', | |
| 'setting_key_exclude_templates' => 'lr_llms_exclude_templates', | |
| // llms.txt display options | |
| 'setting_key_show_headings' => 'lr_llms_show_headings', | |
| 'setting_key_show_descriptions' => 'lr_llms_show_descriptions', | |
| // Caching | |
| 'cache_prefix' => 'lr_llms_cache_', | |
| 'cache_timeout' => HOUR_IN_SECONDS, | |
| 'transient_flag_flush' => 'lr_llms_flush_needed', | |
| // Rate limiting | |
| 'rate_limit_timeout' => 30, // seconds — short enough to not block paginated crawlers, long enough to deter abuse | |
| 'rate_limit_http_status' => 429, | |
| // Output control | |
| 'post_type_priority_order' => [ 'page', 'post' ], | |
| 'max_items' => 1000, | |
| 'max_length_chars' => 3000, | |
| 'default_page_size' => 200, | |
| // Query parameters | |
| 'query_param_flush' => 'flush', | |
| // REST API | |
| 'rest_namespace' => 'lr-llms/v1', | |
| 'rest_txt_route' => 'txt', | |
| // Cron | |
| 'cron_event' => 'lr_llms_purge_transients_hook', | |
| 'cron_frequency' => 'daily', | |
| // Dev flags | |
| 'dev_purge_enabled' => defined( 'LR_LLMS_DEV_PURGE' ) && LR_LLMS_DEV_PURGE, | |
| 'disable_rate_limit' => defined( 'LR_LLMS_DISABLE_RATE_LIMIT' ) && LR_LLMS_DISABLE_RATE_LIMIT, | |
| // HTTP headers for llms.txt response | |
| 'headers_txt_response' => [ | |
| 'Content-Type' => 'text/plain; charset=utf-8', | |
| 'X-Robots-Tag' => 'index, follow', | |
| 'Cache-Control' => 'no-store, must-revalidate', | |
| 'Pragma' => 'no-cache', | |
| 'Expires' => '0', | |
| ], | |
| // Filter hooks | |
| 'filter_included_post_ids' => 'lr_llms_included_post_ids', | |
| 'filter_post_type_priority' => 'lr_llms_post_type_priority_order', | |
| 'filter_contact_details' => 'lr_llms_contact_details', | |
| 'filter_post_type_label' => 'lr_llms_post_type_label', | |
| 'filter_post_title' => 'lr_llms_post_title', | |
| 'filter_post_url' => 'lr_llms_post_url', | |
| 'filter_post_description' => 'lr_llms_post_description', | |
| ]; | |
| // ============================================================================= | |
| // B. MD ENDPOINT — CONFIGURATION | |
| // ============================================================================= | |
| define( 'LR_LLMS_MD_OPT_ENABLED', 'lr_llms_md_enabled' ); | |
| define( 'LR_LLMS_MD_OPT_SKIP_CLASSES', 'lr_llms_md_skip_classes' ); | |
| define( 'LR_LLMS_MD_OPT_INC_CLASSES', 'lr_llms_md_include_classes' ); | |
| define( 'LR_LLMS_MD_OPT_LAZY_LOAD', 'lr_llms_md_lazy_load' ); | |
| define( 'LR_LLMS_MD_OPT_DEDUP_LINKS', 'lr_llms_md_dedup_links' ); | |
| // Default skip classes — tuned for LR framework sites | |
| define( 'LR_LLMS_MD_DEFAULT_SKIP', implode( "\n", [ | |
| 'lr-no-extract-text', | |
| 'lr-no-extract-md', | |
| 'navigation-bar', | |
| 'main-footer', | |
| 'offcanvas-footer', | |
| 'site-header', | |
| 'site-footer', | |
| 'breadcrumb', | |
| 'pagination', | |
| 'cookie-banner', | |
| 'button-container', | |
| 'call-to-action', | |
| 'img-link', | |
| 'meta', | |
| 'terms', | |
| 'subtitle', | |
| 'caption', | |
| 'media-overlay-stack', | |
| ] ) ); | |
| // Default include classes — text-container is the standard LR content wrapper | |
| define( 'LR_LLMS_MD_DEFAULT_INC', 'text-container' ); | |
| // ============================================================================= | |
| // C. SHARED HELPERS | |
| // ============================================================================= | |
| /** | |
| * Get shared excluded post types (applies to both llms.txt and .md). | |
| */ | |
| function lr_llms_get_excluded_types(): array { | |
| global $lr_llms_config; | |
| return (array) get_option( $lr_llms_config['setting_key_exclude_types'], [] ); | |
| } | |
| /** | |
| * Get shared excluded post IDs (applies to both llms.txt and .md). | |
| */ | |
| function lr_llms_get_excluded_ids(): array { | |
| global $lr_llms_config; | |
| $raw = get_option( $lr_llms_config['setting_key_exclude_ids'], '' ); | |
| return array_map( 'intval', array_filter( array_map( 'trim', explode( ',', $raw ) ) ) ); | |
| } | |
| /** | |
| * Get shared excluded page templates (applies to both llms.txt and .md). | |
| */ | |
| function lr_llms_get_excluded_templates(): array { | |
| global $lr_llms_config; | |
| return (array) get_option( $lr_llms_config['setting_key_exclude_templates'], [] ); | |
| } | |
| /** | |
| * Get all registered page templates across all public post types. | |
| * Returns [ 'template-file.php' => 'Template Name' ] | |
| */ | |
| function lr_llms_get_all_page_templates(): array { | |
| $templates = []; | |
| $post_types = get_post_types( [ 'public' => true ], 'names' ); | |
| $theme = wp_get_theme(); | |
| foreach ( $post_types as $pt ) { | |
| $pt_templates = $theme->get_page_templates( null, $pt ); | |
| foreach ( $pt_templates as $file => $name ) { | |
| if ( ! isset( $templates[ $file ] ) ) { | |
| $templates[ $file ] = $name; | |
| } | |
| } | |
| } | |
| asort( $templates ); | |
| return $templates; | |
| } | |
| /** | |
| * Parse a newline-separated textarea value into a clean array. | |
| */ | |
| function lr_llms_parse_lines( string $raw ): array { | |
| return array_values( array_filter( array_map( 'trim', explode( "\n", $raw ) ) ) ); | |
| } | |
| /** | |
| * Decode HTML entities and strip soft hyphens from WP text fields. | |
| * get_the_title() and get_the_excerpt() return HTML-encoded strings intended | |
| * for HTML output — this produces clean plain text for llms.txt and .md. | |
| */ | |
| function lr_llms_clean_text( string $text ): string { | |
| $decoded = html_entity_decode( $text, ENT_QUOTES | ENT_HTML5, 'UTF-8' ); | |
| return str_replace( "\u{00AD}", '', $decoded ); // strip soft hyphens | |
| } | |
| // ============================================================================= | |
| // D. SHARED CACHE | |
| // ============================================================================= | |
| /** | |
| * Flush all llms.txt transient caches. | |
| */ | |
| function lr_llms_flush_txt_cache(): void { | |
| global $wpdb, $lr_llms_config; | |
| $val_like = $wpdb->esc_like( '_transient_' . $lr_llms_config['cache_prefix'] ) . '%'; | |
| $time_like = $wpdb->esc_like( '_transient_timeout_' . $lr_llms_config['cache_prefix'] ) . '%'; | |
| $wpdb->query( $wpdb->prepare( "DELETE FROM {$wpdb->options} WHERE option_name LIKE %s", $val_like ) ); | |
| $wpdb->query( $wpdb->prepare( "DELETE FROM {$wpdb->options} WHERE option_name LIKE %s", $time_like ) ); | |
| delete_transient( $lr_llms_config['transient_flag_flush'] ); | |
| } | |
| /** | |
| * Flush all .md transient caches. | |
| */ | |
| function lr_llms_flush_md_cache(): void { | |
| global $wpdb; | |
| $val_like = $wpdb->esc_like( '_transient_lr_llms_md_' ) . '%'; | |
| $time_like = $wpdb->esc_like( '_transient_timeout_lr_llms_md_' ) . '%'; | |
| $wpdb->query( $wpdb->prepare( "DELETE FROM {$wpdb->options} WHERE option_name LIKE %s", $val_like ) ); | |
| $wpdb->query( $wpdb->prepare( "DELETE FROM {$wpdb->options} WHERE option_name LIKE %s", $time_like ) ); | |
| } | |
| /** | |
| * Flush all caches (both llms.txt and .md). | |
| */ | |
| function lr_llms_flush_all_caches(): void { | |
| lr_llms_flush_txt_cache(); | |
| lr_llms_flush_md_cache(); | |
| } | |
| /** | |
| * Flag llms.txt cache for flush after post save. | |
| */ | |
| function lr_llms_flag_cache_flush_on_save( int $post_id ): void { | |
| if ( ( defined( 'DOING_AUTOSAVE' ) && DOING_AUTOSAVE ) || wp_is_post_revision( $post_id ) ) { | |
| return; | |
| } | |
| global $lr_llms_config; | |
| set_transient( $lr_llms_config['transient_flag_flush'], true, 5 * MINUTE_IN_SECONDS ); | |
| } | |
| add_action( 'save_post', 'lr_llms_flag_cache_flush_on_save' ); | |
| /** | |
| * Check flush flag on init and clear if set. | |
| */ | |
| add_action( 'init', function () { | |
| global $lr_llms_config; | |
| if ( get_transient( $lr_llms_config['transient_flag_flush'] ) ) { | |
| lr_llms_flush_txt_cache(); | |
| } | |
| } ); | |
| // ============================================================================= | |
| // E. LLMS.TXT — CORE | |
| // ============================================================================= | |
| /** | |
| * Generate a cache key from the current query string and active language. | |
| * | |
| * Language context is injected explicitly to prevent Polylang/WPML cookie- | |
| * switched requests from sharing cache across languages when $_GET is identical. | |
| */ | |
| function lr_llms_get_cache_key( ?array $get = null ): string { | |
| global $lr_llms_config; | |
| if ( $get === null ) { | |
| $get = $_GET; | |
| } | |
| unset( | |
| $get[ $lr_llms_config['query_param_flush'] ], | |
| $get['fbclid'], | |
| $get['utm_source'], | |
| $get['utm_medium'], | |
| $get['utm_campaign'] | |
| ); | |
| // Include active language so cookie-switched Polylang/WPML requests | |
| // don't collide with each other when query params are otherwise identical. | |
| // pll_current_language() can return null early in bootstrap — fall back to get_locale(). | |
| $lang = function_exists( 'pll_current_language' ) ? pll_current_language( 'slug' ) : null; | |
| $get['_lang'] = $lang ?: get_locale(); | |
| ksort( $get ); | |
| return md5( http_build_query( $get ) ); | |
| } | |
| /** | |
| * Check if the current request is for /llms.txt. | |
| */ | |
| function lr_llms_is_llms_txt_request(): bool { | |
| // REQUEST_URI may be unset outside a web request, and parse_url() can return null or false. | |
| $path = parse_url( $_SERVER['REQUEST_URI'] ?? '', PHP_URL_PATH ); | |
| if ( ! is_string( $path ) ) { | |
| return false; | |
| } | |
| $path = untrailingslashit( strtolower( $path ) ); | |
| if ( defined( 'LR_LLMS_LOCAL_PATH_SUFFIX' ) ) { | |
| return str_ends_with( $path, '/' . ltrim( LR_LLMS_LOCAL_PATH_SUFFIX, '/' ) ); | |
| } | |
| return ( $path === '/llms.txt' ); | |
| } | |
| /** | |
| * Build the base URL for the Next Page hint. | |
| */ | |
| if ( ! function_exists( 'lr_llms_build_base_url' ) ) { | |
| function lr_llms_build_base_url( array $cfg ): string { | |
| $path = parse_url( $_SERVER['REQUEST_URI'] ?? '', PHP_URL_PATH ); | |
| if ( is_string( $path ) && substr( $path, -9 ) === '/llms.txt' ) { | |
| return home_url( '/llms.txt' ); | |
| } | |
| return rest_url( rtrim( $cfg['rest_namespace'] ?? '', '/' ) . '/txt' ); | |
| } | |
| } | |
| /** | |
| * Intercept /llms.txt requests. | |
| */ | |
| function lr_llms_handle_llms_txt_request(): void { | |
| if ( ! lr_llms_is_llms_txt_request() || is_admin() ) { | |
| return; | |
| } | |
| lr_llms_serve_txt_output(); | |
| exit; | |
| } | |
| add_action( 'init', 'lr_llms_handle_llms_txt_request', 99 ); | |
| /** | |
| * Serve llms.txt output with caching and rate limiting. | |
| */ | |
| function lr_llms_serve_txt_output(): void { | |
| global $lr_llms_config; | |
| foreach ( $lr_llms_config['headers_txt_response'] as $key => $value ) { | |
| header( "{$key}: {$value}" ); | |
| } | |
| // Rate limiting | |
| if ( ! $lr_llms_config['disable_rate_limit'] ) { | |
| $ip = $_SERVER['REMOTE_ADDR'] ?? 'unknown'; | |
| $key = $lr_llms_config['prefix'] . '_rl_' . md5( $ip ); | |
| if ( get_transient( $key ) ) { | |
| status_header( $lr_llms_config['rate_limit_http_status'] ); | |
| echo "Too many requests – please retry shortly."; | |
| exit; | |
| } | |
| set_transient( $key, 1, $lr_llms_config['rate_limit_timeout'] ); | |
| } | |
| $max_items = $lr_llms_config['max_items']; | |
| $limit = isset( $_GET['limit'] ) ? min( max( 1, intval( $_GET['limit'] ) ), $max_items ) : $lr_llms_config['default_page_size']; | |
| $page = isset( $_GET['page'] ) ? max( 1, intval( $_GET['page'] ) ) : 1; | |
| $since = isset( $_GET['since'] ) ? sanitize_text_field( $_GET['since'] ) : null; | |
| $tag = isset( $_GET['tag'] ) ? sanitize_text_field( $_GET['tag'] ) : null; | |
| $lang = isset( $_GET['lang'] ) ? sanitize_text_field( $_GET['lang'] ) : null; | |
| $offset = ( $page - 1 ) * $limit; | |
| if ( ! empty( $_GET[ $lr_llms_config['query_param_flush'] ] ) ) { | |
| lr_llms_flush_txt_cache(); | |
| } | |
| $cache_key = $lr_llms_config['cache_prefix'] . lr_llms_get_cache_key(); | |
| $cached_output = get_transient( $cache_key ); | |
| if ( $cached_output ) { | |
| echo $cached_output; | |
| return; | |
| } | |
| lr_llms_generate_output_body( $limit, $page, $offset, $since, $tag, $lang, $cache_key ); | |
| } | |
| /** | |
| * REST API fallback for llms.txt output. | |
| */ | |
| function lr_llms_rest_output(): WP_REST_Response { | |
| ob_start(); | |
| lr_llms_serve_txt_output(); | |
| $output = ob_get_clean(); | |
| global $lr_llms_config; | |
| if ( mb_strlen( $output ) > $lr_llms_config['max_length_chars'] ) { | |
| $output = mb_substr( $output, 0, $lr_llms_config['max_length_chars'] ); | |
| } | |
| return new WP_REST_Response( $output, 200, [ | |
| 'Content-Type' => 'text/plain; charset=utf-8', | |
| 'X-Robots-Tag' => 'index, follow', | |
| ] ); | |
| } | |
| add_action( 'rest_api_init', function () { | |
| global $lr_llms_config; | |
| register_rest_route( $lr_llms_config['rest_namespace'], $lr_llms_config['rest_txt_route'], [ | |
| 'methods' => 'GET', | |
| 'callback' => 'lr_llms_rest_output', | |
| 'permission_callback' => '__return_true', | |
| ] ); | |
| } ); | |
| /** | |
| * Get post types ordered by priority config. | |
| */ | |
| function lr_llms_get_ordered_post_types(): array { | |
| global $lr_llms_config; | |
| $priority_order = apply_filters( | |
| $lr_llms_config['filter_post_type_priority'], | |
| $lr_llms_config['post_type_priority_order'] ?? [ 'page', 'post' ] | |
| ); | |
| $all = get_post_types( [ 'public' => true ], 'names' ); | |
| $ordered = array_unique( array_merge( $priority_order, $all ) ); | |
| usort( $ordered, function ( $a, $b ) use ( $priority_order ) { | |
| $ai = array_search( $a, $priority_order ); | |
| $bi = array_search( $b, $priority_order ); | |
| return ( $ai !== false ? $ai : PHP_INT_MAX ) - ( $bi !== false ? $bi : PHP_INT_MAX ); | |
| } ); | |
| return $ordered; | |
| } | |
| /** | |
| * Generate and output the llms.txt body. | |
| */ | |
| function lr_llms_generate_output_body( | |
| int $limit, | |
| int $page, | |
| int $offset, | |
| ?string $since, | |
| ?string $tag, | |
| ?string $lang, | |
| string $cache_key | |
| ): void { | |
| global $lr_llms_config; | |
| $has_more = false; | |
| // Shared exclusions | |
| $excluded_types = lr_llms_get_excluded_types(); | |
| $excluded_ids = lr_llms_get_excluded_ids(); | |
| $excluded_templates = lr_llms_get_excluded_templates(); | |
| $show_headings = (string) get_option( $lr_llms_config['setting_key_show_headings'], '0' ) === '1'; | |
| $show_descriptions = (string) get_option( $lr_llms_config['setting_key_show_descriptions'], '0' ) === '1'; | |
| $post_types = array_diff( lr_llms_get_ordered_post_types(), $excluded_types ); | |
| // Language detection | |
| if ( function_exists( 'pll_languages_list' ) ) { | |
| $languages = pll_languages_list(); | |
| $lang_type = 'polylang'; | |
| } elseif ( function_exists( 'icl_get_languages' ) ) { | |
| $wpml_langs = apply_filters( 'wpml_active_languages', null, [ 'skip_missing' => 0 ] ); | |
| $languages = array_keys( $wpml_langs ?: [] ); | |
| $lang_type = 'wpml'; | |
| } else { | |
| $languages = [ null ]; | |
| $lang_type = null; | |
| } | |
| if ( $lang && in_array( $lang, $languages, true ) ) { | |
| $languages = [ $lang ]; | |
| } | |
| if ( $lang_type === null ) { | |
| $locale = get_locale(); | |
| $default_lang = strstr( $locale, '_', true ) ?: $locale; | |
| } else { | |
| $default_lang = function_exists( 'pll_default_language' ) | |
| ? pll_default_language() | |
| : ( defined( 'LR_DEFAULT_LANGUAGE' ) ? LR_DEFAULT_LANGUAGE : 'en' ); | |
| } | |
| ob_start(); | |
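| // Output is text/plain: decode entities and use raw URLs instead of HTML-escaping. | |
| echo '# LLMs.txt — Generated by ' . lr_llms_clean_text( get_bloginfo( 'name' ) ) . "\n"; | |
| echo '# Site: ' . esc_url_raw( home_url() ) . "\n"; | |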
| echo '# Updated: ' . esc_html( current_time( 'c' ) ) . "\n"; | |
| echo '# Page: ' . esc_html( "{$page} / Per-Type Limit: {$limit}" ) . "\n"; | |
| echo '# Purpose: Lists public, indexable content for LLM indexing and retrieval.' . "\n"; | |
| echo '# Customize: WP Admin > Tools > LR LLMs Settings' . "\n\n"; | |
| $contact = apply_filters( $lr_llms_config['filter_contact_details'], '' ); | |
| if ( ! empty( $contact ) ) { | |
| echo $contact; | |
| } | |
| echo "## Sitemap\n\n"; | |
| echo '- XML: ' . esc_url( home_url( '/sitemap.xml' ) ) . "\n"; | |
| foreach ( $languages as $lang ) { | |
| $display_lang = $lang ?: $default_lang; | |
| echo "\n## Language: " . strtoupper( $display_lang ) . "\n"; | |
| foreach ( $post_types as $type ) { | |
| $type_obj = get_post_type_object( $type ); | |
| if ( ! $type_obj ) { | |
| continue; | |
| } | |
| if ( $show_headings ) { | |
| $label = apply_filters( $lr_llms_config['filter_post_type_label'], $type_obj->labels->name, $type, $lang ); | |
| echo "\n### {$label}\n\n"; | |
| } | |
| $args = [ | |
| 'post_type' => $type, | |
| 'post_status' => 'publish', | |
| 'has_password' => false, | |
| 'posts_per_page' => $limit, | |
| 'offset' => $offset, | |
| 'orderby' => 'date', | |
| 'order' => 'DESC', | |
| 'no_found_rows' => true, | |
| 'fields' => 'ids', | |
| 'update_post_meta_cache' => false, | |
| 'update_post_term_cache' => false, | |
| 'ignore_sticky_posts' => ( $type === 'post' ), | |
| 'suppress_filters' => false, | |
| ]; | |
| if ( $since ) { | |
| $args['date_query'] = [ [ | |
| 'after' => $since, | |
| 'inclusive' => true, | |
| 'column' => 'post_date_gmt', | |
| ] ]; | |
| } | |
| if ( $tag ) { | |
| $args['tag'] = $tag; | |
| } | |
| if ( $lang_type && $lang ) { | |
| $args['lang'] = $lang; | |
| } | |
| $query = new WP_Query( $args ); | |
| // Probe for next page | |
| $probe_args = $args; | |
| $probe_args['posts_per_page'] = 1; | |
| $probe_args['offset'] = $offset + $limit; | |
| $probe = new WP_Query( $probe_args ); | |
| if ( ! empty( $probe->posts ) ) { | |
| $has_more = true; | |
| } | |
| $ids = array_diff( $query->posts, $excluded_ids ); | |
| // Filter by excluded page templates | |
| if ( ! empty( $excluded_templates ) ) { | |
| $ids = array_filter( $ids, function ( $post_id ) use ( $excluded_templates ) { | |
| $tpl = get_page_template_slug( $post_id ); | |
| return ! in_array( $tpl, $excluded_templates, true ); | |
| } ); | |
| } | |
| $ids = apply_filters( $lr_llms_config['filter_included_post_ids'], $ids, $args, $lang ); | |
| $md_active = lr_llms_md_is_enabled(); | |
| foreach ( $ids as $post_id ) { | |
| $title = apply_filters( $lr_llms_config['filter_post_title'], lr_llms_clean_text( get_post_field( 'post_title', $post_id, 'raw' ) ?: '(untitled)' ), $post_id, $lang ); | |
| $permalink = get_permalink( $post_id ); | |
| // Homepages must become index.md — detect by matching home_url() exactly | |
| // e.g. https://site.com/ → index.md, https://site.com/en/ → en/index.md | |
| $home_url = trailingslashit( home_url() ); | |
| $is_home = trailingslashit( $permalink ) === $home_url; | |
| // Language-prefixed homepages: https://site.com/en/ | |
| if ( ! $is_home && function_exists( 'pll_languages_list' ) ) { | |
| foreach ( pll_languages_list() as $l ) { | |
| if ( trailingslashit( $permalink ) === trailingslashit( home_url( '/' . $l ) ) ) { | |
| $is_home = true; | |
| break; | |
| } | |
| } | |
| } | |
| $url = $md_active | |
| ? ( $is_home | |
| ? trailingslashit( $permalink ) . 'index.md' | |
| : rtrim( $permalink, '/' ) . '.md' ) | |
| : $permalink; | |
| $url = esc_url_raw( apply_filters( $lr_llms_config['filter_post_url'], $url, $post_id, $lang ) ); | |
| $last_modified = get_the_modified_time( 'c', $post_id ); | |
| echo '- [' . $title . '](' . $url . ')' . "\n"; | |
| echo 'Last-Modified: ' . $last_modified . "\n"; | |
| if ( $show_descriptions ) { | |
| $description = apply_filters( $lr_llms_config['filter_post_description'], lr_llms_clean_text( get_the_excerpt( $post_id ) ), $post_id ); | |
| if ( ! empty( $description ) ) { | |
| echo 'Description: ' . $description . "\n"; | |
| } | |
| } | |
| } | |
| wp_reset_postdata(); | |
| } | |
| } | |
| $output = ob_get_clean(); | |
| if ( $has_more ) { | |
| $base = lr_llms_build_base_url( $lr_llms_config ); | |
| $get_args = $_GET ?? []; | |
| unset( $get_args['page'], $get_args['paged'] ); | |
| $output .= "\nNext Page: " . add_query_arg( array_merge( $get_args, [ 'page' => $page + 1 ] ), $base ) . "\n"; | |
| } | |
| set_transient( $cache_key, $output, $lr_llms_config['cache_timeout'] ); | |
| echo $output; | |
| } | |
| // Cron purge | |
| add_action( 'init', function () { | |
| global $lr_llms_config; | |
| // Reuse the shared flush routine instead of duplicating its DELETE queries. | |
| add_action( $lr_llms_config['cron_event'], 'lr_llms_flush_txt_cache' ); | |
| $timestamp = wp_next_scheduled( $lr_llms_config['cron_event'] ); | |
| if ( $lr_llms_config['dev_purge_enabled'] ) { | |
| // Dev flag: remove the scheduled event once and do not re-create it, | |
| // so the event is not scheduled and unscheduled again on every request. | |
| if ( $timestamp ) { | |
| wp_unschedule_event( $timestamp, $lr_llms_config['cron_event'] ); | |
| error_log( '[LR LLMs] Purged scheduled event: ' . $lr_llms_config['cron_event'] ); | |
| } | |
| return; | |
| } | |
| if ( ! $timestamp ) { | |
| wp_schedule_event( time(), $lr_llms_config['cron_frequency'], $lr_llms_config['cron_event'] ); | |
| } | |
| } ); | |
| // ============================================================================= | |
| // F. MD ENDPOINT — CORE | |
| // ============================================================================= | |
| function lr_llms_md_is_enabled(): bool { | |
| return (string) get_option( LR_LLMS_MD_OPT_ENABLED, '1' ) === '1'; | |
| } | |
| function lr_llms_md_get_skip_classes(): array { | |
| return apply_filters( 'lr_llms_md_skip_classes', | |
| lr_llms_parse_lines( get_option( LR_LLMS_MD_OPT_SKIP_CLASSES, LR_LLMS_MD_DEFAULT_SKIP ) ) | |
| ); | |
| } | |
| function lr_llms_md_get_include_classes(): array { | |
| return apply_filters( 'lr_llms_md_include_classes', | |
| lr_llms_parse_lines( get_option( LR_LLMS_MD_OPT_INC_CLASSES, LR_LLMS_MD_DEFAULT_INC ) ) | |
| ); | |
| } | |
| /** | |
| * Resolve a .md request URI to a post ID. | |
| * Handles trailing slashes and subdirectory installs. | |
| */ | |
| function lr_llms_md_resolve_post_id(): int { | |
| $path = parse_url( $_SERVER['REQUEST_URI'] ?? '', PHP_URL_PATH ); | |
| if ( ! is_string( $path ) ) { | |
| return 0; | |
| } | |
| $path = rtrim( $path, '/' ); | |
| // Strip subdirectory prefix | |
| $home_path = rtrim( parse_url( home_url(), PHP_URL_PATH ) ?: '', '/' ); | |
| $relative = ( $home_path !== '' && str_starts_with( $path, $home_path ) ) | |
| ? substr( $path, strlen( $home_path ) ) | |
| : $path; | |
| if ( substr( $relative, -3 ) !== '.md' ) { | |
| return 0; | |
| } | |
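| // Homepage alias: handles /index.md and /{lang}/index.md | |
| if ( preg_match( '#^(?:([a-z]{2})/)?index\.md$#i', ltrim( $relative, '/' ), $m ) ) { | |
| $page_on_front = (int) get_option( 'page_on_front' ); | |
| // With Polylang, map to the translated front page so each language resolves to its own post ID and cache entry. | |
| if ( $page_on_front > 0 && ! empty( $m[1] ) && function_exists( 'pll_get_post' ) ) { | |
| $page_on_front = (int) pll_get_post( $page_on_front, strtolower( $m[1] ) ) ?: $page_on_front; | |
| } | |
| return $page_on_front > 0 ? $page_on_front : 0; | |
| } | |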
| $real_relative = trailingslashit( substr( $relative, 0, -3 ) ); | |
| return (int) url_to_postid( home_url( $real_relative ) ); | |
| } | |
| /** | |
| * Build the frontend URL from the requested .md path, | |
| * preserving language prefixes like /en/. | |
| */ | |
| function lr_llms_md_get_requested_content_url(): string { | |
| $path = parse_url( $_SERVER['REQUEST_URI'] ?? '', PHP_URL_PATH ); | |
| if ( ! is_string( $path ) ) { | |
| return home_url( '/' ); | |
| } | |
| $path = rtrim( $path, '/' ); | |
| // Strip WP subdirectory if installed in one | |
| $home_path = rtrim( parse_url( home_url(), PHP_URL_PATH ) ?: '', '/' ); | |
| if ( $home_path && str_starts_with( $path, $home_path ) ) { | |
| $path = substr( $path, strlen( $home_path ) ); | |
| } | |
| // Root homepage alias: /index.md | |
| if ( $path === '/index.md' || $path === 'index.md' ) { | |
| return home_url( '/' ); | |
| } | |
| // Language-prefixed homepage alias: /en/index.md → /en/ | |
| if ( str_ends_with( $path, '/index.md' ) ) { | |
| return home_url( substr( $path, 0, -8 ) ); // strip 'index.md', dir already has leading slash | |
| } | |
| // Strip .md and append trailing slash | |
| $path = preg_replace( '/\.md$/', '', $path ); | |
| return home_url( trailingslashit( $path ) ); | |
| } | |
| /** | |
| * Intercept .md requests. | |
| */ | |
| function lr_llms_md_handle_request(): void { | |
| if ( is_admin() || ! lr_llms_md_is_enabled() ) { | |
| return; | |
| } | |
| $post_id = lr_llms_md_resolve_post_id(); | |
| if ( $post_id === 0 ) { | |
| return; | |
| } | |
| $post = get_post( $post_id ); | |
| if ( ! $post || $post->post_status !== 'publish' || ! empty( $post->post_password ) ) { | |
| return; // Let WordPress render its own 404 template | |
| } | |
| // Shared exclusions | |
| if ( in_array( $post->post_type, lr_llms_get_excluded_types(), true ) ) { | |
| return; | |
| } | |
| if ( in_array( $post_id, lr_llms_get_excluded_ids(), true ) ) { | |
| return; | |
| } | |
| // Template exclusions | |
| $excluded_templates = lr_llms_get_excluded_templates(); | |
| if ( ! empty( $excluded_templates ) && in_array( get_page_template_slug( $post_id ), $excluded_templates, true ) ) { | |
| return; | |
| } | |
| lr_llms_md_serve( $post ); | |
| exit; | |
| } | |
| add_action( 'init', 'lr_llms_md_handle_request', 98 ); | |
| /** | |
| * Serve the .md response. | |
| */ | |
| function lr_llms_md_serve( WP_Post $post ): void { | |
| $post_id = $post->ID; | |
| $canonical_url = lr_llms_md_get_requested_content_url(); | |
| // Cache key auto-invalidates when post_modified changes | |
| $cache_key = 'lr_llms_md_' . $post_id . '_' . md5( $post->post_modified ); | |
| $md = get_transient( $cache_key ); | |
| if ( $md === false ) { | |
| $md = lr_llms_md_generate( $post_id, $canonical_url ); | |
| set_transient( $cache_key, $md, HOUR_IN_SECONDS * 24 ); | |
| } | |
| $md = apply_filters( 'lr_llms_md_output', $md, $post_id ); | |
| status_header( 200 ); | |
| header( 'Content-Type: text/plain; charset=utf-8' ); | |
| header( 'X-Robots-Tag: noindex' ); | |
| header( 'Cache-Control: no-store, must-revalidate' ); | |
| header( 'Link: <' . esc_url_raw( $canonical_url ) . '>; rel="canonical"' ); | |
| echo $md; | |
| } | |
| /** | |
| * Fetch rendered HTML and generate Markdown for a post. | |
| */ | |
| function lr_llms_md_generate( int $post_id, string $url ): string { | |
| $post = get_post( $post_id ); | |
| $args = [ | |
| 'timeout' => 15, | |
| // Identify the internal self-request so WAFs and access logs don't | |
| // misinterpret it as an external scraper or trigger Cloudflare rules. | |
| 'user-agent' => 'LR-LLMs-Markdown-Extractor/1.0 (WordPress; +' . home_url() . ')', | |
| ]; | |
| if ( function_exists( 'lr_is_localhost' ) && lr_is_localhost() ) { | |
| $args['sslverify'] = false; | |
| } | |
| $response = wp_remote_get( $url, $args ); | |
| if ( is_wp_error( $response ) ) { | |
| error_log( '[lr-llms-md] HTTP fetch failed for post ' . $post_id . ': ' . $response->get_error_message() ); | |
| $html = '<main>' . apply_filters( 'the_content', $post->post_content ) . '</main>'; | |
| } else { | |
| $html = wp_remote_retrieve_body( $response ); | |
| } | |
| /** | |
| * Filter the raw HTML before it is passed to the Markdown converter. | |
| * | |
| * Use this to strip site-specific elements (hero blocks, ad slots, widgets) | |
| * that are not covered by the skip-classes list, or to pre-process markup | |
| * before DOM parsing begins. | |
| * | |
| * @param string $html Full rendered HTML of the page. | |
| * @param int $post_id | |
| */ | |
| $html = apply_filters( 'lr_llms_md_html_source', $html, $post_id ); | |
| // Build YAML-style frontmatter — recognised by Jekyll, Hugo, and many LLM tools. | |
| $title = lr_llms_clean_text( get_post_field( 'post_title', $post_id, 'raw' ) ); | |
| $last_modified = get_the_modified_time( 'c', $post_id ); | |
| $post_type = get_post_type( $post_id ); | |
| $excerpt = lr_llms_clean_text( wp_strip_all_tags( get_the_excerpt( $post_id ) ) ); | |
| $yaml_str = function ( string $val ): string { | |
| return '"' . str_replace( [ '\\', '"' ], [ '\\\\', '\\"' ], $val ) . '"'; | |
| }; | |
| $frontmatter = [ '---' ]; | |
| $frontmatter[] = 'title: ' . $yaml_str( $title ); | |
| $frontmatter[] = 'url: ' . $url; | |
| $frontmatter[] = 'last_modified: ' . $last_modified; | |
| $frontmatter[] = 'type: ' . ( $post_type ?: 'page' ); | |
| if ( $excerpt ) { | |
| $frontmatter[] = 'description: ' . $yaml_str( $excerpt ); | |
| } | |
| $frontmatter[] = '---'; | |
| $lines = []; | |
| $lines[] = implode( "\n", $frontmatter ); | |
| $lines[] = ''; | |
| $lines[] = lr_llms_md_html_to_markdown( $html ); | |
| return implode( "\n", $lines ); | |
| } | |
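| // Illustrative shape of the string returned above for a hypothetical page. | |
| // All values are made up; the description line appears only when an excerpt exists: | |
| // | |
| // --- | |
| // title: "About Us" | |
| // url: https://example.com/about/ | |
| // last_modified: 2026-03-09T10:00:00+00:00 | |
| // type: page | |
| // description: "Who we are and what we do." | |
| // --- | |
| // | |
| // (converted Markdown body follows) | |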
| // ============================================================================= | |
| // G. HTML → MARKDOWN CONVERTER | |
| // ============================================================================= | |
| function lr_llms_md_html_to_markdown( string $html ): string { | |
| $skip_classes = lr_llms_md_get_skip_classes(); | |
| $include_classes = lr_llms_md_get_include_classes(); | |
| $use_lazy_load = (string) get_option( LR_LLMS_MD_OPT_LAZY_LOAD, '1' ) === '1'; | |
| $dedup_links = (string) get_option( LR_LLMS_MD_OPT_DEDUP_LINKS, '1' ) === '1'; | |
| // 'form' intentionally excluded — form containers often wrap meaningful text content | |
| // (e.g. event registration sections with headings and instructions). | |
| // Form controls are suppressed separately via $skip_form_tags. | |
| $skip_tags = apply_filters( 'lr_llms_md_skip_tags', [ | |
| 'script', 'style', 'noscript', 'nav', 'header', 'footer', 'iframe', 'svg', | |
| ] ); | |
| $skip_form_tags = [ 'input', 'textarea', 'select', 'option', 'button', 'label', 'fieldset', 'datalist' ]; | |
| // Normalize self-closing block elements — e.g. <div class="foo" /> produced by some | |
| // page builders / ACF renderers. These are valid XHTML but break libxml's HTML parser: | |
| // it treats them as open tags and then mismatches subsequent closing tags, corrupting | |
| // the entire DOM tree. Convert to explicit open+close pairs before parsing. | |
| $html = preg_replace( | |
| '#<(div|section|article|aside|header|footer|nav|span|picture)([^>]*)/>#i', | |
| '<$1$2></$1>', | |
| $html | |
| ); | |
| $dom = new DOMDocument(); | |
| libxml_use_internal_errors( true ); | |
| $dom->loadHTML( '<?xml encoding="utf-8" ?>' . $html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD ); | |
| libxml_clear_errors(); | |
| $xpath = new DOMXPath( $dom ); | |
| $root = $xpath->query( '//main' )->item( 0 ) ?? $xpath->query( '//body' )->item( 0 ); | |
| if ( ! $root ) { | |
| return wp_strip_all_tags( $html ); | |
| } | |
| $get_classes = function ( DOMElement $el ): array { | |
| return array_filter( array_map( 'trim', preg_split( '/\s+/', $el->getAttribute( 'class' ) ) ) ); | |
| }; | |
| $should_skip = function ( DOMNode $node ) use ( $skip_classes, $skip_tags, $skip_form_tags, $get_classes ): bool { | |
| if ( ! $node instanceof DOMElement ) return false; | |
| if ( in_array( strtolower( $node->tagName ), $skip_tags, true ) ) return true; | |
| if ( in_array( strtolower( $node->tagName ), $skip_form_tags, true ) ) return true; | |
| $current = $node; | |
| while ( $current instanceof DOMElement ) { | |
| if ( array_intersect( $skip_classes, $get_classes( $current ) ) ) return true; | |
| $current = $current->parentNode instanceof DOMElement ? $current->parentNode : null; | |
| } | |
| return false; | |
| }; | |
| $seen_hrefs = []; | |
| $convert = null; | |
| $children_md = function ( DOMNode $node ) use ( &$convert ): string { | |
| $out = ''; | |
| foreach ( $node->childNodes as $child ) { | |
| $out .= $convert( $child ); | |
| } | |
| return $out; | |
| }; | |
| $convert = function ( DOMNode $node ) use ( | |
| &$convert, &$children_md, &$seen_hrefs, | |
| $should_skip, $include_classes, $get_classes, | |
| $use_lazy_load, $dedup_links | |
| ): string { | |
| if ( $node instanceof DOMText ) { | |
| $text = preg_replace( '/\s+/', ' ', $node->nodeValue ); | |
| if ( ! empty( $include_classes ) ) { | |
| $parent = $node->parentNode; | |
| $included = false; | |
| while ( $parent instanceof DOMElement ) { | |
| if ( array_intersect( $include_classes, $get_classes( $parent ) ) ) { | |
| $included = true; | |
| break; | |
| } | |
| $parent = $parent->parentNode; | |
| } | |
| if ( ! $included ) return ''; | |
| } | |
| return $text; | |
| } | |
| if ( ! $node instanceof DOMElement ) return ''; | |
| if ( $should_skip( $node ) ) return ''; | |
| $tag = strtolower( $node->tagName ); | |
| $inner = $children_md( $node ); | |
| $inner_t = trim( $inner ); | |
| switch ( $tag ) { | |
| case 'h1': $ht = trim( $node->textContent ); return $ht !== '' ? "\n\n# {$ht}\n\n" : ''; | |
| case 'h2': $ht = trim( $node->textContent ); return $ht !== '' ? "\n\n## {$ht}\n\n" : ''; | |
| case 'h3': $ht = trim( $node->textContent ); return $ht !== '' ? "\n\n### {$ht}\n\n" : ''; | |
| case 'h4': $ht = trim( $node->textContent ); return $ht !== '' ? "\n\n#### {$ht}\n\n" : ''; | |
| case 'h5': $ht = trim( $node->textContent ); return $ht !== '' ? "\n\n##### {$ht}\n\n" : ''; | |
| case 'h6': $ht = trim( $node->textContent ); return $ht !== '' ? "\n\n###### {$ht}\n\n" : ''; | |
| case 'p': | |
| return $inner_t !== '' ? "\n\n{$inner_t}\n\n" : ''; | |
| case 'blockquote': | |
| if ( $inner_t === '' ) return ''; | |
| return "\n\n" . implode( "\n", array_map( fn( $l ) => '> ' . $l, explode( "\n", trim( $inner ) ) ) ) . "\n\n"; | |
| case 'pre': | |
| return "\n\n```\n" . $node->textContent . "\n```\n\n"; | |
| case 'code': | |
| if ( $node->parentNode instanceof DOMElement && strtolower( $node->parentNode->tagName ) === 'pre' ) return $inner; | |
| return $inner_t !== '' ? "`{$inner_t}`" : ''; | |
| case 'strong': | |
| case 'b': | |
| return $inner_t !== '' ? "**{$inner_t}**" : ''; | |
| case 'em': | |
| case 'i': | |
| return $inner_t !== '' ? "_{$inner_t}_" : ''; | |
| case 'a': { | |
| $href = trim( $node->getAttribute( 'href' ) ); | |
| // Non-content hrefs — output text only | |
| if ( | |
| $href === '' || $href === '#' || | |
| str_starts_with( $href, 'javascript:' ) || | |
| str_starts_with( $href, 'mailto:' ) || | |
| str_starts_with( $href, 'tel:' ) | |
| ) { | |
| return $inner_t; | |
| } | |
| if ( $dedup_links ) { | |
| if ( isset( $seen_hrefs[ $href ] ) ) return $inner_t; | |
| $seen_hrefs[ $href ] = true; | |
| } | |
| if ( $inner_t === '' || $inner_t === $href ) return $href; | |
| return "[{$inner_t}]({$href})"; | |
| } | |
| case 'img': { | |
| $alt = trim( $node->getAttribute( 'alt' ) ); | |
| if ( $alt === '' ) return ''; | |
| $src = trim( $node->getAttribute( 'src' ) ); | |
| if ( $use_lazy_load ) { | |
| foreach ( [ 'data-src', 'data-lazy-src', 'data-lazy', 'data-original' ] as $attr ) { | |
| $lazy = trim( $node->getAttribute( $attr ) ); | |
| if ( $lazy !== '' && ! str_starts_with( $lazy, 'data:' ) ) { | |
| $src = $lazy; | |
| break; | |
| } | |
| } | |
| } | |
| if ( $src === '' || str_starts_with( $src, 'data:' ) ) return ''; | |
| // SVGs are always decorative icons in LR sites — never content images | |
| if ( strtolower( pathinfo( parse_url( $src, PHP_URL_PATH ), PATHINFO_EXTENSION ) ) === 'svg' ) return ''; | |
| return "\n\n![{$alt}]({$src})\n\n"; | |
| } | |
| case 'ul': { | |
| $items = ''; | |
| foreach ( $node->childNodes as $child ) { | |
| if ( $child instanceof DOMElement && strtolower( $child->tagName ) === 'li' ) { | |
| $li = trim( $children_md( $child ) ); | |
| if ( $li !== '' ) $items .= "- {$li}\n"; | |
| } | |
| } | |
| return $items !== '' ? "\n\n{$items}\n" : ''; | |
| } | |
| case 'ol': { | |
| $items = ''; | |
| $i = 1; | |
| foreach ( $node->childNodes as $child ) { | |
| if ( $child instanceof DOMElement && strtolower( $child->tagName ) === 'li' ) { | |
| $li = trim( $children_md( $child ) ); | |
| if ( $li !== '' ) { $items .= "{$i}. {$li}\n"; $i++; } | |
| } | |
| } | |
| return $items !== '' ? "\n\n{$items}\n" : ''; | |
| } | |
| case 'li': return $children_md( $node ); | |
| case 'hr': return "\n\n---\n\n"; | |
| case 'br': return "\n\n"; // Promote <br> to a paragraph break: a Markdown hard break (two trailing spaces) would be stripped by the per-line trim() in post-processing | |
| case 'table': return "\n\n" . $inner . "\n\n"; | |
| case 'tr': return $inner . "\n"; | |
| case 'td': | |
| case 'th': return $inner . ' '; | |
| default: return $inner; | |
| } | |
| }; | |
| $raw = $convert( $root ); | |
| $lines = explode( "\n", $raw ); | |
| $out = []; | |
| $in_code = false; | |
| $prev_blank = false; | |
| foreach ( $lines as $line ) { | |
| if ( str_starts_with( trim( $line ), '```' ) ) { | |
| $in_code = ! $in_code; | |
| } | |
| if ( $in_code ) { | |
| $out[] = $line; | |
| $prev_blank = false; | |
| continue; | |
| } | |
| $trimmed = trim( $line ); | |
| $is_blank = ( $trimmed === '' ); | |
| if ( $is_blank && $prev_blank ) continue; | |
| $out[] = $trimmed; | |
| $prev_blank = $is_blank; | |
| } | |
| return trim( implode( "\n", $out ) ); | |
| } | |
| // ============================================================================= | |
| // H. ADMIN — UNIFIED SETTINGS PAGE | |
| // ============================================================================= | |
| /** | |
| * Register single settings page under Tools. | |
| */ | |
| add_action( 'admin_menu', function () { | |
| $basename = basename( $_SERVER['PHP_SELF'] ?? '' ); | |
| if ( in_array( $basename, [ 'post.php', 'post-new.php', 'edit.php' ], true ) ) { | |
| return; | |
| } | |
| add_management_page( | |
| 'LR LLMs Settings', | |
| 'LR LLMs Settings', | |
| 'manage_options', | |
| 'lr-llms-settings', | |
| 'lr_llms_render_settings_page' | |
| ); | |
| } ); | |
| /** | |
| * Save all settings. | |
| */ | |
| add_action( 'admin_init', function () { | |
| if ( | |
| ! current_user_can( 'manage_options' ) || | |
| ! isset( $_POST['lr_llms_save'] ) || | |
| ! check_admin_referer( 'lr_llms_action', 'lr_llms_nonce' ) | |
| ) { | |
| return; | |
| } | |
| global $lr_llms_config; | |
| // Shared exclusions | |
| // Unslash before sanitizing; cast to array so a scalar POST value cannot break array_map() | |
| update_option( $lr_llms_config['setting_key_exclude_types'], | |
| array_map( 'sanitize_key', (array) wp_unslash( $_POST['lr_llms_exclude_post_types'] ?? [] ) ) | |
| ); | |
| update_option( $lr_llms_config['setting_key_exclude_ids'], | |
| sanitize_text_field( wp_unslash( $_POST['lr_llms_exclude_ids'] ?? '' ) ) | |
| ); | |
| update_option( $lr_llms_config['setting_key_exclude_templates'], | |
| array_map( 'sanitize_text_field', (array) wp_unslash( $_POST['lr_llms_exclude_templates'] ?? [] ) ) | |
| ); | |
| // llms.txt display | |
| update_option( $lr_llms_config['setting_key_show_headings'], isset( $_POST['lr_llms_show_headings'] ) ? '1' : '0' ); | |
| update_option( $lr_llms_config['setting_key_show_descriptions'], isset( $_POST['lr_llms_show_descriptions'] ) ? '1' : '0' ); | |
| // .md settings | |
| update_option( LR_LLMS_MD_OPT_ENABLED, isset( $_POST['lr_llms_md_enabled'] ) ? '1' : '0' ); | |
| update_option( LR_LLMS_MD_OPT_LAZY_LOAD, isset( $_POST['lr_llms_md_lazy_load'] ) ? '1' : '0' ); | |
| update_option( LR_LLMS_MD_OPT_DEDUP_LINKS, isset( $_POST['lr_llms_md_dedup_links'] ) ? '1' : '0' ); | |
| update_option( LR_LLMS_MD_OPT_SKIP_CLASSES, sanitize_textarea_field( $_POST['lr_llms_md_skip_classes'] ?? '' ) ); | |
| update_option( LR_LLMS_MD_OPT_INC_CLASSES, sanitize_textarea_field( $_POST['lr_llms_md_include_classes'] ?? '' ) ); | |
| lr_llms_flush_all_caches(); | |
| wp_safe_redirect( add_query_arg( 'lr_llms_saved', '1', admin_url( 'tools.php?page=lr-llms-settings' ) ) ); | |
| exit; | |
| } ); | |
| /** | |
| * Flush all caches handler. | |
| */ | |
| add_action( 'admin_init', function () { | |
| if ( | |
| ! current_user_can( 'manage_options' ) || | |
| ! isset( $_POST['lr_llms_flush'] ) || | |
| ! check_admin_referer( 'lr_llms_action', 'lr_llms_nonce' ) | |
| ) { | |
| return; | |
| } | |
| lr_llms_flush_all_caches(); | |
| wp_safe_redirect( add_query_arg( 'lr_llms_flushed', '1', admin_url( 'tools.php?page=lr-llms-settings' ) ) ); | |
| exit; | |
| } ); | |
| /** | |
| * Render the unified settings page. | |
| */ | |
| function lr_llms_render_settings_page(): void { | |
| if ( ! current_user_can( 'manage_options' ) ) { | |
| return; | |
| } | |
| global $lr_llms_config; | |
| // Load all current values | |
| $excluded_types = lr_llms_get_excluded_types(); | |
| $excluded_ids = get_option( $lr_llms_config['setting_key_exclude_ids'], '' ); | |
| $excluded_templates = lr_llms_get_excluded_templates(); | |
| $show_headings = (string) get_option( $lr_llms_config['setting_key_show_headings'], '0' ) === '1'; | |
| $show_descriptions = (string) get_option( $lr_llms_config['setting_key_show_descriptions'], '0' ) === '1'; | |
| $md_enabled = (string) get_option( LR_LLMS_MD_OPT_ENABLED, '1' ) === '1'; | |
| $md_lazy_load = (string) get_option( LR_LLMS_MD_OPT_LAZY_LOAD, '1' ) === '1'; | |
| $md_dedup_links = (string) get_option( LR_LLMS_MD_OPT_DEDUP_LINKS, '1' ) === '1'; | |
| $md_skip_classes = get_option( LR_LLMS_MD_OPT_SKIP_CLASSES, LR_LLMS_MD_DEFAULT_SKIP ); | |
| $md_inc_classes = get_option( LR_LLMS_MD_OPT_INC_CLASSES, LR_LLMS_MD_DEFAULT_INC ); | |
| $all_post_types = get_post_types( [ 'public' => true ], 'objects' ); | |
| $all_templates = lr_llms_get_all_page_templates(); | |
| $saved = isset( $_GET['lr_llms_saved'] ); | |
| $flushed = isset( $_GET['lr_llms_flushed'] ); | |
| ?> | |
| <div class="wrap"> | |
| <h1>LR LLMs Settings</h1> | |
| <?php if ( $saved ) : ?> | |
| <div class="notice notice-success is-dismissible"><p>Settings saved and cache flushed.</p></div> | |
| <?php endif; ?> | |
| <?php if ( $flushed ) : ?> | |
| <div class="notice notice-success is-dismissible"><p>All caches flushed.</p></div> | |
| <?php endif; ?> | |
| <?php /* ============================================================ | |
| SECTION 1 — LLMs.txt | |
| ============================================================ */ ?> | |
| <h2 class="title" style="margin-top:1.5em;">LLMs.txt</h2> | |
| <p class="description" style="max-width:740px;"> | |
| Generates a dynamic <code>llms.txt</code> index for LLM crawlers. | |
| Paginated, multilingual (Polylang &amp; WPML), and customizable via filters. | |
| </p> | |
| <table class="widefat" style="max-width:740px; margin:1em 0; border-collapse:collapse;"> | |
| <tbody> | |
| <tr> | |
| <td style="padding:8px 12px; width:160px; font-weight:600; color:#3c434a;">Public URL</td> | |
| <td style="padding:8px 12px;"> | |
| <a href="<?php echo esc_url( home_url( '/llms.txt' ) ); ?>" target="_blank" rel="noopener"> | |
| <code><?php echo esc_html( home_url( '/llms.txt' ) ); ?></code> | |
| </a> | |
| </td> | |
| </tr> | |
| <tr style="background:#f9f9f9;"> | |
| <td style="padding:8px 12px; font-weight:600; color:#3c434a;">Force refresh</td> | |
| <td style="padding:8px 12px;"> | |
| <code><?php echo esc_html( home_url( '/llms.txt?flush=1' ) ); ?></code> | |
| </td> | |
| </tr> | |
| <tr> | |
| <td style="padding:8px 12px; font-weight:600; color:#3c434a;">robots.txt tip</td> | |
| <td style="padding:8px 12px;"> | |
| <code>LLMs: <?php echo esc_html( home_url( '/llms.txt' ) ); ?></code> | |
| </td> | |
| </tr> | |
| </tbody> | |
| </table> | |
| <form method="post"> | |
| <?php wp_nonce_field( 'lr_llms_action', 'lr_llms_nonce' ); ?> | |
| <table class="form-table" role="presentation"> | |
| <tr> | |
| <th scope="row">Display options</th> | |
| <td> | |
| <label style="display:block; margin-bottom:6px;"> | |
| <input type="checkbox" name="lr_llms_show_headings" value="1" <?php checked( $show_headings ); ?>> | |
| Show section heading for each content type | |
| </label> | |
| <label style="display:block;"> | |
| <input type="checkbox" name="lr_llms_show_descriptions" value="1" <?php checked( $show_descriptions ); ?>> | |
| Show description (excerpt) for each entry | |
| </label> | |
| </td> | |
| </tr> | |
| </table> | |
| <?php /* ============================================================ | |
| SECTION 2 — .md Endpoint | |
| ============================================================ */ ?> | |
| <hr style="margin:2em 0; border:none; border-top:1px solid #dcdcde;"> | |
| <h2 class="title" style="display:flex; align-items:center; gap:10px;"> | |
| .md Endpoint | |
| <span style="font-size:12px; font-weight:400; padding:2px 8px; border-radius:3px; | |
| background:<?php echo $md_enabled ? '#d7f7c2' : '#f0f0f1'; ?>; | |
| color:<?php echo $md_enabled ? '#1a6b1a' : '#757575'; ?>;"> | |
| <?php echo $md_enabled ? 'Active' : 'Inactive'; ?> | |
| </span> | |
| </h2> | |
| <p class="description" style="max-width:740px;"> | |
| Appends a clean <code>.md</code> endpoint to every public post and page. | |
| AI crawlers can request the Markdown version directly — navigation, scripts, and layout markup stripped. | |
| Cache auto-invalidates on post save. | |
| </p> | |
| <table class="widefat" style="max-width:740px; margin:1em 0; border-collapse:collapse;"> | |
| <tbody> | |
| <tr> | |
| <td style="padding:8px 12px; width:160px; font-weight:600; color:#3c434a;">Example</td> | |
| <td style="padding:8px 12px;"> | |
| <a href="<?php echo esc_url( home_url( '/your-post-slug.md' ) ); ?>" target="_blank" rel="noopener"> | |
| <code><?php echo esc_html( home_url( '/your-post-slug.md' ) ); ?></code> | |
| </a> | |
| </td> | |
| </tr> | |
| <tr style="background:#f9f9f9;"> | |
| <td style="padding:8px 12px; font-weight:600; color:#3c434a;">Homepage</td> | |
| <td style="padding:8px 12px;"> | |
| <a href="<?php echo esc_url( home_url( '/index.md' ) ); ?>" target="_blank" rel="noopener"> | |
| <code><?php echo esc_html( home_url( '/index.md' ) ); ?></code> | |
| </a> | |
| <span class="description"> — requires a static front page</span> | |
| </td> | |
| </tr> | |
| </tbody> | |
| </table> | |
| <table class="form-table" role="presentation"> | |
| <tr> | |
| <th scope="row">Status</th> | |
| <td> | |
| <label> | |
| <input type="checkbox" name="lr_llms_md_enabled" value="1" <?php checked( $md_enabled ); ?>> | |
| Enable <code>.md</code> endpoint | |
| </label> | |
| <p class="description">Uncheck to disable entirely. Cache is flushed on save.</p> | |
| </td> | |
| </tr> | |
| <tr> | |
| <th scope="row">Lazy-load images</th> | |
| <td> | |
| <label> | |
| <input type="checkbox" name="lr_llms_md_lazy_load" value="1" <?php checked( $md_lazy_load ); ?>> | |
| Resolve real image URL from <code>data-src</code> / <code>data-lazy-src</code> | |
| </label> | |
| <p class="description"> | |
| Enable if your theme lazy-loads images (lazySizes, etc.).<br> | |
| Checks <code>data-src</code>, <code>data-lazy-src</code>, <code>data-lazy</code>, <code>data-original</code> in that order. | |
| SVG placeholders and data URIs are always skipped. | |
| </p> | |
| </td> | |
| </tr> | |
| <tr> | |
| <th scope="row">Deduplicate links</th> | |
| <td> | |
| <label> | |
| <input type="checkbox" name="lr_llms_md_dedup_links" value="1" <?php checked( $md_dedup_links ); ?>> | |
| Output each unique URL only once per page | |
| </label> | |
| <p class="description"> | |
| Prevents repeated links when cards have multiple anchors pointing to the same URL | |
| (image + title + CTA all linking to the same post). | |
| </p> | |
| </td> | |
| </tr> | |
| <tr> | |
| <th scope="row" style="vertical-align:top; padding-top:14px;">Skip classes</th> | |
| <td> | |
| <textarea name="lr_llms_md_skip_classes" rows="10" class="large-text code" | |
| style="font-size:12px; line-height:1.7; max-width:500px;" | |
| ><?php echo esc_textarea( $md_skip_classes ); ?></textarea> | |
| <p class="description"> | |
| One CSS class per line. Elements with these classes — or nested inside them — are excluded from output.<br> | |
| Tip: add <code>lr-no-extract-md</code> to any ACF layout block in the editor to exclude it from <code>.md</code> output only. | |
| </p> | |
| </td> | |
| </tr> | |
| <tr> | |
| <th scope="row" style="vertical-align:top; padding-top:14px;"> | |
| Include classes | |
| <br><span style="font-weight:400; font-size:11px; color:#757575;">whitelist</span> | |
| </th> | |
| <td> | |
| <textarea name="lr_llms_md_include_classes" rows="5" class="large-text code" | |
| style="font-size:12px; line-height:1.7; max-width:500px;" | |
| placeholder="Leave empty to include all content inside &lt;main&gt;" | |
| ><?php echo esc_textarea( $md_inc_classes ); ?></textarea> | |
| <p class="description"> | |
| One CSS class per line. <strong>When set</strong>, only content inside these elements is extracted.<br> | |
| Default is <code>text-container</code> — the standard LR framework content wrapper.<br> | |
| Clear to extract everything inside <code>&lt;main&gt;</code>. | |
| </p> | |
| </td> | |
| </tr> | |
| </table> | |
| <?php /* ============================================================ | |
| SECTION 3 — Shared Exclusions | |
| ============================================================ */ ?> | |
| <hr style="margin:2em 0; border:none; border-top:1px solid #dcdcde;"> | |
| <h2 class="title">Exclusions</h2> | |
| <p class="description" style="max-width:740px;"> | |
| These settings apply to both <code>llms.txt</code> and the <code>.md</code> endpoint. | |
| </p> | |
| <table class="form-table" role="presentation"> | |
| <tr> | |
| <th scope="row" style="vertical-align:top; padding-top:8px;">Post types</th> | |
| <td> | |
| <?php foreach ( $all_post_types as $pt ) : ?> | |
| <label style="display:block; margin-bottom:4px;"> | |
| <input type="checkbox" | |
| name="lr_llms_exclude_post_types[]" | |
| value="<?php echo esc_attr( $pt->name ); ?>" | |
| <?php checked( in_array( $pt->name, $excluded_types, true ) ); ?>> | |
| <?php echo esc_html( sprintf( '%s (%s)', $pt->label, $pt->name ) ); ?> | |
| </label> | |
| <?php endforeach; ?> | |
| <p class="description" style="margin-top:8px;">Checked post types return 404 for <code>.md</code> and are omitted from <code>llms.txt</code>.</p> | |
| </td> | |
| </tr> | |
| <?php if ( ! empty( $all_templates ) ) : ?> | |
| <tr> | |
| <th scope="row" style="vertical-align:top; padding-top:8px;">Page templates</th> | |
| <td> | |
| <?php foreach ( $all_templates as $file => $name ) : ?> | |
| <label style="display:block; margin-bottom:4px;"> | |
| <input type="checkbox" | |
| name="lr_llms_exclude_templates[]" | |
| value="<?php echo esc_attr( $file ); ?>" | |
| <?php checked( in_array( $file, $excluded_templates, true ) ); ?>> | |
| <?php echo esc_html( $name ); ?> | |
| <span class="description">— <code><?php echo esc_html( $file ); ?></code></span> | |
| </label> | |
| <?php endforeach; ?> | |
| <p class="description" style="margin-top:8px;">Pages using these templates are excluded from both <code>llms.txt</code> and <code>.md</code>.</p> | |
| </td> | |
| </tr> | |
| <?php endif; ?> | |
| <tr> | |
| <th scope="row">Specific post IDs</th> | |
| <td> | |
| <input type="text" | |
| name="lr_llms_exclude_ids" | |
| value="<?php echo esc_attr( $excluded_ids ); ?>" | |
| class="regular-text" | |
| placeholder="e.g. 12, 45, 103"> | |
| <p class="description">Comma-separated post IDs. Excluded from both <code>llms.txt</code> and <code>.md</code>.</p> | |
| </td> | |
| </tr> | |
| </table> | |
| <p class="submit"> | |
| <input type="submit" name="lr_llms_save" class="button-primary" value="Save Settings"> | |
| </p> | |
| </form> | |
| <?php /* ============================================================ | |
| SECTION 4 — Cache | |
| ============================================================ */ ?> | |
| <hr style="margin:1em 0; border:none; border-top:1px solid #dcdcde;"> | |
| <h2 class="title">Cache</h2> | |
| <p class="description" style="max-width:740px; margin-bottom:1em;"> | |
| <code>llms.txt</code> cache expires in 1 hour and flushes automatically on post save.<br> | |
| <code>.md</code> cache is per-post, keyed by <code>post_modified</code> — auto-invalidates on save.<br> | |
| Use the button below after changing settings, skip/include classes, or template exclusions. | |
| </p> | |
| <form method="post"> | |
| <?php wp_nonce_field( 'lr_llms_action', 'lr_llms_nonce' ); ?> | |
| <input type="submit" name="lr_llms_flush" class="button-secondary" value="Flush All Caches"> | |
| </form> | |
| </div> | |
| <?php | |
| } | |
| // ============================================================================= | |
| // I. DEVELOPER REFERENCE — FILTERS & CONSTANTS | |
| // ============================================================================= | |
| // | |
| // ─── LLMS.TXT FILTERS ──────────────────────────────────────────────────────── | |
| // | |
| // lr_llms_contact_details | |
| // Inject a contact/about block at the top of llms.txt, after the header. | |
| // Return a string (must include its own trailing newline). | |
| // @param string $details Empty string by default. | |
| // @return string | |
| // | |
| // add_filter( 'lr_llms_contact_details', function ( $details ) { | |
| // return "Contact: hello@example.com\nTwitter: @example\n\n"; | |
| // } ); | |
| // | |
| // ───────────────────────────────────────────────────────────────────────────── | |
| // | |
| // lr_llms_post_type_priority_order | |
| // Override the order in which post types appear in llms.txt. | |
| // Types not in the array are appended in registration order. | |
| // @param array $order Default: ['page', 'post'] | |
| // @return array | |
| // | |
| // add_filter( 'lr_llms_post_type_priority_order', function ( $order ) { | |
| // return [ 'post', 'page', 'product' ]; | |
| // } ); | |
| // | |
| // ───────────────────────────────────────────────────────────────────────────── | |
| // | |
| // lr_llms_included_post_ids | |
| // Final filter on the array of post IDs included in llms.txt for a given | |
| // post type and language. Useful for adding, removing, or reordering IDs. | |
| // @param array $ids Array of post IDs about to be output. | |
| // @param array $args The WP_Query args used to fetch them. | |
| // @param string|null $lang Active language code, or null for monolingual sites. | |
| // @return array | |
| // | |
| // add_filter( 'lr_llms_included_post_ids', function ( $ids, $args, $lang ) { | |
| // // Remove a specific ID regardless of other settings | |
| // return array_diff( $ids, [ 99 ] ); | |
| // }, 10, 3 ); | |
| // | |
| // ───────────────────────────────────────────────────────────────────────────── | |
| // | |
| // lr_llms_post_type_label | |
| // Customise the section heading label for a post type when headings are on. | |
| // @param string $label Default: post type plural label. | |
| // @param string $type Post type slug. | |
| // @param string $lang Active language code. | |
| // @return string | |
| // | |
| // add_filter( 'lr_llms_post_type_label', function ( $label, $type, $lang ) { | |
| // if ( $type === 'product' ) return 'Our Products'; | |
| // return $label; | |
| // }, 10, 3 ); | |
| // | |
| // ───────────────────────────────────────────────────────────────────────────── | |
| // | |
| // lr_llms_post_title | |
| // Override the title string for a specific post in llms.txt. | |
| // @param string $title Current title. | |
| // @param int $post_id | |
| // @param string $lang | |
| // @return string | |
| // | |
| // add_filter( 'lr_llms_post_title', function ( $title, $post_id, $lang ) { | |
| // if ( $post_id === 42 ) return 'Custom Title for Post 42'; | |
| // return $title; | |
| // }, 10, 3 ); | |
| // | |
| // ───────────────────────────────────────────────────────────────────────────── | |
| // | |
| // lr_llms_post_url | |
| // Override the URL for a specific post in llms.txt. | |
| // @param string $url | |
| // @param int $post_id | |
| // @param string $lang | |
| // @return string | |
| // | |
| // add_filter( 'lr_llms_post_url', function ( $url, $post_id, $lang ) { | |
| // return $url; // e.g. swap to a CDN or canonical override | |
| // }, 10, 3 ); | |
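| // | |
| // A more concrete sketch (not part of the plugin): force https on every entry | |
| // using WordPress core's set_url_scheme(). Adapt or drop as needed. | |
| // | |
| // add_filter( 'lr_llms_post_url', function ( $url, $post_id, $lang ) { | |
| // return set_url_scheme( $url, 'https' ); | |
| // }, 10, 3 ); | |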
| // | |
| // ───────────────────────────────────────────────────────────────────────────── | |
| // | |
| // lr_llms_post_description | |
| // Override the description string for a post when descriptions are enabled. | |
| // @param string $description Default: post excerpt. | |
| // @param int $post_id | |
| // @return string | |
| // | |
| // add_filter( 'lr_llms_post_description', function ( $description, $post_id ) { | |
| // return get_field( 'seo_summary', $post_id ) ?: $description; | |
| // }, 10, 2 ); | |
| // | |
| // | |
| // ─── MD ENDPOINT FILTERS ───────────────────────────────────────────────────── | |
| // | |
| // lr_llms_md_html_source | |
| // Filter the raw HTML before it is passed to the Markdown converter. | |
| // Runs after wp_remote_get / the_content fallback, before DOMDocument parsing. | |
| // Use this to strip site-specific elements not covered by skip-classes, or to | |
| // fix malformed markup before the DOM parser sees it. | |
| // @param string $html Full rendered HTML of the page. | |
| // @param int $post_id | |
| // @return string | |
| // | |
| // add_filter( 'lr_llms_md_html_source', function ( $html, $post_id ) { | |
| // // Strip a site-specific hero section by marker comment | |
| // $html = preg_replace( '/<!-- hero-start -->.*?<!-- hero-end -->/s', '', $html ); | |
| // return $html; | |
| // }, 10, 2 ); | |
| // | |
| // ───────────────────────────────────────────────────────────────────────────── | |
| // | |
| // lr_llms_md_output | |
| // Filter the final Markdown string before it is sent to the browser. | |
| // Runs after caching — result is NOT stored; fires on every request. | |
| // @param string $md Full Markdown output. | |
| // @param int $post_id | |
| // @return string | |
| // | |
| // add_filter( 'lr_llms_md_output', function ( $md, $post_id ) { | |
| // return $md . "\n\n---\nGenerated by Example Site\n"; | |
| // }, 10, 2 ); | |
| // | |
| // ───────────────────────────────────────────────────────────────────────────── | |
| // | |
| // lr_llms_md_skip_classes (array filter, NOT wired by default) | |
| // Programmatically extend the skip class list without touching the UI. | |
| // Once wired (see the note below), it receives the classes saved in settings. | |
| // @param array $classes | |
| // @return array | |
| // | |
| // add_filter( 'lr_llms_md_skip_classes', function ( $classes ) { | |
| // $classes[] = 'my-custom-nav'; | |
| // $classes[] = 'promo-banner'; | |
| // return $classes; | |
| // } ); | |
| // | |
| // Note: this filter only takes effect once it is wired into the getter. The current getter is: | |
| // function lr_llms_md_get_skip_classes(): array { | |
| // return lr_llms_parse_lines( get_option( LR_LLMS_MD_OPT_SKIP_CLASSES, LR_LLMS_MD_DEFAULT_SKIP ) ); | |
| // } | |
| // To enable the filter, pass the getter's result through apply_filters( 'lr_llms_md_skip_classes', $result ) before returning it. | |
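| // | |
| // A minimal sketch of the wired getter, assuming lr_llms_parse_lines() returns | |
| // an array of class names. The (array) cast is a defensive addition for callbacks | |
| // that return a non-array value; it is not part of the current getter: | |
| // | |
| // function lr_llms_md_get_skip_classes(): array { | |
| // $classes = lr_llms_parse_lines( get_option( LR_LLMS_MD_OPT_SKIP_CLASSES, LR_LLMS_MD_DEFAULT_SKIP ) ); | |
| // return (array) apply_filters( 'lr_llms_md_skip_classes', $classes ); | |
| // } | |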
| // | |
| // | |
| // lr_llms_md_skip_tags (array filter) | |
| // Override the hardcoded HTML tag skip list for the Markdown converter. | |
| // Default: script, style, noscript, nav, header, footer, iframe, svg | |
| // Note: 'form' is intentionally NOT in the default list, because form containers | |
| // often wrap meaningful text content (headings, instructions). Form control | |
| // elements (input, textarea, select, button, etc.) are always skipped | |
| // separately and are not affected by this filter. | |
| // @param array $tags Lowercase tag names. | |
| // @return array | |
| // | |
| // add_filter( 'lr_llms_md_skip_tags', function ( $tags ) { | |
| // $tags[] = 'aside'; // skip sidebar asides | |
| // return $tags; | |
| // } ); | |
| // | |
| // | |
| // ─── WP-CONFIG CONSTANTS ───────────────────────────────────────────────────── | |
| // | |
| // LR_LLMS_DEV_PURGE (bool) | |
| // Unschedules the daily cron purge event. Useful during local development | |
| // to prevent unexpected transient cleanup. | |
| // define( 'LR_LLMS_DEV_PURGE', true ); | |
| // | |
| // LR_LLMS_DISABLE_RATE_LIMIT (bool) | |
| // Disables IP-based rate limiting on /llms.txt. Use in dev or CI only. | |
| // define( 'LR_LLMS_DISABLE_RATE_LIMIT', true ); | |
| // | |
| // LR_LLMS_LOCAL_PATH_SUFFIX (string) | |
| // Enables suffix-based matching for /llms.txt on localhost subdirectory | |
| // installs where the path includes the project slug. | |
| // define( 'LR_LLMS_LOCAL_PATH_SUFFIX', 'llms.txt' ); | |
| // Do NOT use in production. | |
| // | |
| // LR_DEFAULT_LANGUAGE (string) | |
| // Fallback language code when Polylang/WPML is not active. | |
| // define( 'LR_DEFAULT_LANGUAGE', 'en' ); | |
| // | |
| // ============================================================================= |