How a Browser Script Can Flood a Blog — The Archive.today Allegations Explained
Long-form explanation, simulation, timeline, community reaction, and source hub. Allegations are presented as reported by the linked sources.
What was observed
Public reporting and community threads document that the archive.today CAPTCHA page included a short client-side JavaScript pattern that repeatedly constructs and issues requests to a blog's search endpoint at a fixed short interval. The original investigator published the exact code snippet they observed and documented the timeline with screenshots and correspondence.
Simulation of Repeated Request Attack (safe)
[Interactive visualization removed: a visual-only request counter (starting at 0) ticking at a 300 ms interval; it issues no network calls.]
Technical breakdown — how the pattern generates load
The reported pattern is conceptually simple: a repeating timer (the browser's `setInterval`) runs code that builds a URL with a randomized query string and calls `fetch()` (or an equivalent request). Because each query is unique, caches and CDNs cannot easily serve cached responses, so the target origin must process each request, often performing database work, search indexing, rendering or logging — all of which consume CPU, memory, and bandwidth.
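To make the reported shape concrete, here is a deliberately defanged sketch of that pattern. The host `victim.example` and the query parameter name are placeholders, the 300 ms interval is taken from the visual simulation above rather than confirmed from the original snippet, and the `fetch()` call is replaced with a `console.log`, so running it generates no traffic:

```js
// Defanged illustration of the reported pattern. The host and parameter
// name are placeholders (this is NOT the code from the archive.today page),
// and fetch() is replaced by console.log, so nothing leaves the browser.
const INTERVAL_MS = 300; // fixed short interval (value assumed from the demo above)
let ticks = 0;

const timer = setInterval(() => {
  // A fresh random token makes every URL unique, defeating caches.
  const token = Math.random().toString(36).slice(2);
  const url = `https://victim.example/search?q=${token}`;

  // The reported pattern would call fetch(url) here; we only log it.
  console.log('would request:', url);

  if (++ticks >= 10) clearInterval(timer); // stop the demo after 10 ticks
}, INTERVAL_MS);
```

Even defanged, the structure shows why the pattern is cheap for the client and expensive for the server: one timer, one line of URL construction, and every tick lands as uncacheable work on the origin.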
Why randomized queries matter
A single unchanging URL can be cached; a randomized query usually defeats caching. If many clients each send unique queries rapidly, the server receives a high rate of cache-miss requests and must compute each response, producing resource exhaustion akin to DDoS traffic.
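The cache-defeat effect can be seen in a toy model of a cache keyed on the request URL. This is illustrative logic only, not how any particular CDN is implemented; `victim.example` is again a placeholder:

```js
// Toy cache keyed on the request URL. Randomized query strings make the
// raw key miss every time; normalizing the key away restores hits.
const cache = new Map();

function lookup(url, { ignoreQuery = false } = {}) {
  const u = new URL(url);
  const key = ignoreQuery ? u.pathname : u.pathname + u.search;
  if (cache.has(key)) return 'HIT';
  cache.set(key, 'cached response');
  return 'MISS';
}

for (let i = 0; i < 3; i++) {
  const url = `https://victim.example/search?q=${Math.random()}`;
  console.log('raw key:       ', lookup(url));                       // MISS every time
  console.log('normalized key:', lookup(url, { ignoreQuery: true })); // HIT after the first
}
```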
Measured effect (orders of magnitude)
For intuition: a single open page issuing 3 requests per second produces about 10,800 requests per hour. Multiply that by tens or hundreds of simultaneous visitors and the aggregate rate scales to hundreds of thousands, even millions, of requests per hour, easily enough to overload a modestly provisioned site.
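The scaling is plain multiplication; the snippet below just makes the orders of magnitude explicit:

```js
// Back-of-envelope scaling using the figures above.
const perSecond = 3;
const perHour = perSecond * 60 * 60; // 10,800 requests/hour from one open page

for (const visitors of [1, 10, 100]) {
  console.log(`${visitors} visitors -> ${visitors * perHour} requests/hour`);
}
// 1 -> 10,800; 10 -> 108,000; 100 -> 1,080,000
```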
Timeline & community reaction
The initial first-person report documenting the code, screenshots, and timeline was published on Gyrovague; the post includes email excerpts, mention of a GDPR complaint, and a publicly posted paste of the correspondence. The report sparked rapid community discussion on Hacker News and Reddit.
Allegations about operator conduct (reported)
The original investigator published redacted correspondence and reports that the site operator's replies included threatening language and demands tied to removing or rewriting the post. The excerpts are publicly posted and linked from the investigator's article. These are serious allegations and are presented here as reported by the investigator.
Embedded evidence & walkthrough videos
Community and independent analysts recorded video walkthroughs showing the code pattern and the resulting network logs; the examples were publicly shared alongside the discussion threads.
Recommended mitigation (for site owners)
- Rate-limit expensive endpoints and return HTTP 429 (Too Many Requests) to abusive clients; a minimal sketch follows this list.
- Add short-term caching or cheap canned responses for search queries carrying random tokens, for example by normalizing the cache key so unrecognized parameters cannot force misses.
- Where feasible, block requests whose Referer or request pattern originates from the archive CAPTCHA page, or use WAF rules to drop the flood pattern.
- Monitor server logs for repeated requests with short intervals and random query patterns; collect timestamps and headers for abuse reports.
- Notify your host immediately; they can provide emergency mitigations (traffic shaping, CAPTCHAs, WAF).
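As referenced in the first item, here is a minimal rate-limiting sketch, assuming a Node.js/Express stack. The endpoint name `/search`, the window size, and the request budget are illustrative assumptions, not tuned production values:

```js
// Minimal per-IP rate limiter for an expensive endpoint (Node.js/Express).
// Window size, budget, and endpoint name are illustrative assumptions.
const express = require('express');
const app = express();

const WINDOW_MS = 10_000; // 10-second window (assumed)
const MAX_REQUESTS = 5;   // per-IP budget within the window (assumed)
const hits = new Map();   // ip -> timestamps of recent requests
                          // (a production version would evict stale IPs)

app.get('/search', (req, res) => {
  const now = Date.now();
  const recent = (hits.get(req.ip) || []).filter(t => now - t < WINDOW_MS);
  recent.push(now);
  hits.set(req.ip, recent);

  if (recent.length > MAX_REQUESTS) {
    // Tell well-behaved clients to back off and skip the expensive work.
    res.set('Retry-After', String(WINDOW_MS / 1000));
    return res.status(429).send('Too Many Requests');
  }

  // ... expensive search work would happen here ...
  res.send('results');
});

app.listen(3000);
```

The key design point is that the 429 is returned before any database or search work runs, so the flood pays only the cost of the map lookup rather than the full request pipeline.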
Sources — read the primary materials
- Gyrovague — "archive.today is directing a DDOS attack against my blog" (first-person investigation with code, screenshots, and correspondence).
- Hacker News — Ask HN / discussion thread.
- Reddit /r/DataHoarder — community thread.
- Lobsters thread — community conversation (link included for completeness).
- Public paste: redacted correspondence (as linked in the Gyrovague post).