5 worst bugs I've seen on production #2: the infinite crawler

This surfaced after a normal rollout to a new client. Our scraper had been stable for months, but suddenly machines were busy and the load average crept up, with no obvious errors. The crawler’s logic was simple: find the “next” button by selector, read its href, and follow that URL to get the next page.
What is it?
A web crawler is a program that automatically follows links to fetch pages. If a loop sends it back to the start and it keeps following links, it can get stuck in an infinite loop. See: Web crawler (Wikipedia).
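To make the failure mode concrete, here is a minimal sketch of a naive crawl loop (extractNextUrl is a hypothetical helper, not our actual code). With no memory of what it has already visited, a single link pointing back to an earlier page traps it forever:

// Naive crawler sketch: follow "next" links with no cycle detection.
// extractNextUrl is hypothetical; it pulls the next-page href out of the HTML.
declare function extractNextUrl(html: string, baseUrl: string): string | null

async function naiveCrawl(startUrl: string): Promise<void> {
  let url: string | null = startUrl
  while (url !== null) {
    const res = await fetch(url)
    const html = await res.text()
    // ...save the page somewhere...
    url = extractNextUrl(html, url) // blindly trust the "next" link
    // If any page links back to an earlier one, this loop never terminates.
  }
}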
Problem
We identified the next page by querying the “next” button and following its link, roughly like this:
// Advance by following whatever button.next points at; stop only when it disappears.
const next = document.querySelector('button.next')
if (!next) {
  done = true                        // no next button: assume we hit the last page
} else {
  url = next.getAttribute('href')    // otherwise follow its href blindly
}
On this site, the last page didn’t remove the button. Instead, the same selector matched a “back to start” button that pointed to page 1. So when we reached the end, we jumped back to the beginning and continued forever.
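To illustrate with hypothetical markup (not the client's actual HTML): on the last page the pager still contains a button.next element, it just points somewhere else.

// Hypothetical last-page markup: the "back to start" control reuses the .next class.
const lastPageHtml = `
  <nav class="pager">
    <a href="/article?page=49">Previous</a>
    <button class="next" href="/article?page=1">Back to start</button>
  </nav>
`
// In a browser (or headless browser) context, the same selector happily matches it:
const doc = new DOMParser().parseFromString(lastPageHtml, 'text/html')
const next = doc.querySelector('button.next')
console.log(next?.getAttribute('href')) // "/article?page=1", and the lap starts over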
Impact
Workers stayed busy. Each lap took about a second, so graphs looked healthy. The database told the truth: thousands of “pages” saved for an article that had ~50 real pages.
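A check along these lines is what made it obvious (the table, columns, and pg client here are assumptions for illustration; our actual names differed):

import { Client } from 'pg'

// Hypothetical schema: pages(article_id, url, content_hash, crawled_at).
// Articles with thousands of saved rows but only ~50 distinct hashes are the crawler lapping.
async function findSuspectArticles(db: Client, cap = 200): Promise<void> {
  const { rows } = await db.query(
    `SELECT article_id,
            COUNT(*)                     AS saved_pages,
            COUNT(DISTINCT content_hash) AS distinct_pages
       FROM pages
      GROUP BY article_id
     HAVING COUNT(*) > $1`,
    [cap],
  )
  for (const row of rows) {
    console.warn('suspect article', row)
  }
}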
Signals that gave it away:
- Repeated URL pattern (page query flickering between last and 1)
- Duplicate content hashes for consecutive pages
- Page count exceeding a sensible cap
- Load average trending up without matching throughput gain
The root cause, in short: a selector along the lines of $('button.next') matched the “back to start” button on the last page, which sent the crawler to page 1 and started the whole lap over.
Solution
We added multiple signals before advancing: URL patterns, page counters, and content hashes to detect repeats. We set a hard page cap and made writes idempotent so re‑processing wouldn’t create new rows. We also tightened the selector logic to require a forward page number in the URL.
A safer crawl loop looked like this (pseudocode):
const seenUrls = new Set<string>()
const seenHashes = new Set<string>()
let page = 1
const maxPages = 200

while (true) {
  const html = await (await fetch(url)).text()
  const hash = stableContentHash(html)

  // Stop on any loop signal: URL already seen, content already seen, or cap hit.
  if (seenUrls.has(url) || seenHashes.has(hash) || page > maxPages) {
    log.warn('stop: loop detected or cap hit', { url, page })
    break
  }

  seenUrls.add(url)
  seenHashes.add(hash)
  await upsertPage({ url, html }) // idempotent write: re-processing can't create new rows

  const nextEl = selectNextButton()
  const nextUrl = nextEl?.getAttribute('href')
  if (!nextUrl) break                      // no next button: genuine last page
  if (!isForwardLink(url, nextUrl)) break  // ensure the page number actually increases
  url = toAbsoluteUrl(nextUrl)
  page += 1
}
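The loop leans on a few helpers; here is one way two of them could look (sketches under our assumptions, not the exact production code). upsertPage itself just needs a unique key on url so that re-processing overwrites instead of inserting.

import { createHash } from 'node:crypto'

// Hash the content after stripping volatile bits so two fetches of the same
// page produce the same hash. The normalization rules here are illustrative.
function stableContentHash(html: string): string {
  const normalized = html
    .replace(/\s+/g, ' ')                 // collapse whitespace
    .replace(/\d{2}:\d{2}(:\d{2})?/g, '') // drop clock-like noise
  return createHash('sha256').update(normalized).digest('hex')
}

// Only advance when the next URL's page number is strictly greater than the
// current one. Assumes a ?page=N query parameter; adjust for other patterns.
function isForwardLink(currentUrl: string, nextUrl: string): boolean {
  const pageOf = (u: string): number | null => {
    const raw = new URL(u, 'https://example.com').searchParams.get('page')
    return raw === null ? null : Number(raw)
  }
  const current = pageOf(currentUrl)
  const next = pageOf(nextUrl)
  if (current === null || next === null || Number.isNaN(next)) {
    return false // unknown pattern: stop rather than risk a loop
  }
  return next > current
}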
Lesson learned: add clear info logs (which page you’re crawling and which loop iteration you’re on), keep database access handy to spot anomalies, and know your system’s flow. This bug was silent: the machines crawled forever while the load quietly climbed. Watching load-average trends helps too; an anomaly lets you pinpoint the day, and from there the commit, that changed behavior and narrow the search fast.
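On the logging point, even a one-line structured log per iteration would have exposed the pattern immediately (the exact field names are illustrative):

// One info line per crawl step: page index, current URL, and the decision taken.
// A log tail that reads "...page 50... page 1... page 50..." gives the loop away.
function logCrawlStep(page: number, url: string, decision: 'advance' | 'stop', reason?: string): void {
  console.info(JSON.stringify({ event: 'crawl_step', page, url, decision, reason }))
}

// e.g. inside the loop:
// logCrawlStep(page, url, 'advance')
// logCrawlStep(page, url, 'stop', 'page cap reached')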
Prevention checklist:
- Hard stop: enforce a strict page cap
- Loop detection: track seen URLs and content hashes
- Forward‑only: validate the next URL actually advances
- Idempotency: upsert writes to avoid duplication (see the sketch after this list)
- Telemetry: log page index, next URL, and decision reason
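For the idempotency item, the write can be a single upsert keyed on the page URL (a sketch assuming Postgres and the pg client; table and column names are illustrative):

import { Client } from 'pg'

declare const db: Client // assumed shared connection, configured elsewhere

// Requires a unique index on pages(url). Re-crawling the same URL then updates
// the existing row instead of inserting a new one, so even a runaway crawler
// cannot multiply rows.
async function upsertPage({ url, html }: { url: string; html: string }): Promise<void> {
  await db.query(
    `INSERT INTO pages (url, html, crawled_at)
     VALUES ($1, $2, NOW())
     ON CONFLICT (url) DO UPDATE
       SET html = EXCLUDED.html,
           crawled_at = EXCLUDED.crawled_at`,
    [url, html],
  )
}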