Overview · last 24h

Crawl throughput is holding steady at 22k pages/hour.

5 jobs are running across 26 workers. Two sites throttled us this morning; backoff policies handled it without manual intervention.
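
The dashboard does not show how the backoff policies are configured. As a rough sketch only, assuming capped exponential backoff that also respects a server-supplied `Retry-After` value, the delay calculation could look like the following. The function name `backoff_delay` and all parameter names and defaults are hypothetical, not taken from the crawler.

```python
import random
from typing import Optional


def backoff_delay(
    attempt: int,
    base: float = 1.0,
    cap: float = 60.0,
    jitter: float = 0.0,
    retry_after: Optional[float] = None,
) -> float:
    """Seconds to wait before retry number `attempt` (0-based).

    Hypothetical helper: names and defaults are illustrative assumptions.
    """
    # Exponential growth, clamped so long outages do not produce huge waits.
    delay = min(cap, base * (2 ** attempt))
    # Optional random spread so workers do not all retry at the same moment.
    delay += random.uniform(0.0, jitter * delay)
    # If the server sent Retry-After (in seconds), never wait less than that.
    if retry_after is not None:
        delay = max(delay, retry_after)
    return delay


if __name__ == "__main__":
    for attempt in range(5):
        print(attempt, backoff_delay(attempt))
```

With the default `jitter=0.0` the result is deterministic, which keeps the sketch easy to check; a real crawler would likely enable some jitter so that its workers do not retry in lockstep after a 429.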

Pages today: 24,118 (+12.4%)
Avg latency: 412 ms (−8 ms)
Active crawlers: 26 (live)
Storage used: 184 GB of 500 GB

Live crawls

4 active · 1 queued
browser · cj_8f12a4 · running
docs.python.org/3/library/asyncio-task.html
css:.section + .markdown
Progress: 142 / 412 (34%) · Bytes: 9.4 MB · Workers: 6 · Elapsed: 3m 38s

deep · cj_b39e21 · running
arxiv.org/list/cs.LG/2026-07
llm:abstract+authors
Progress: 1,843 / 5,200 (35%) · Bytes: 82.4 MB · Workers: 12 · Elapsed: 53m 00s

http · cj_2c1d09 · running
news.ycombinator.com/best
xpath://article
Progress: 17 / 30 (57%) · Bytes: 1.1 MB · Workers: 4 · Elapsed: 51s

browser · cj_a47f8c · queued
github.com/anthropics/anthropic-sdk-python/tree/main
css:article.markdown-body
Progress: 0 / 240 (0%) · Bytes: 0 B · Workers: 6 · Elapsed: 0s

browser · cj_77b201 · running
shop.example.com/products?category=ai-tools
schema:Product
Progress: 41 / 110 (37%) · Bytes: 3.8 MB · Workers: 4 · Elapsed: 5m 22s

Recent runs

URL · Job · Status
en.wikipedia.org/wiki/RAG_(information_retrieval) · cj_91a3f0 · completed
stackoverflow.com/questions/tagged/playwright · cj_40b8d2 · completed
reddit.com/r/LocalLLaMA/top?t=week · cj_3ae51c · failed
huggingface.co/datasets?sort=downloads · cj_5d8a99 · completed
twitter.com/search?q=crawl4ai · cj_e1f0aa · rate-limited
anthropic.com/research · cj_6b2c4d · completed

Activity

01:55:48 INFO  queued job cj_a47f8c → github.com/anthropics/anthropic-sdk-python
01:55:42 INFO  crawler-04 acquired URL batch (12 pages)
01:55:39 DEBUG extracted <article.markdown-body> from page 17/30
01:55:38 INFO  job cj_2c1d09 throughput 22 pages/min
01:55:30 WARN  rate limit warning: arxiv.org returned 429 × 3 (backing off 12s)
01:55:24 INFO  job cj_b39e21 worker pool scaled 8 → 12
01:55:18 DEBUG css:.section + .markdown → 41 matches (page 142/412)
01:55:12 ERROR job cj_3ae51c permanently failed: reddit auth wall
01:55:01 INFO  scheduler tick: 5 running · 1 queued · 0 failed