blue dragon

Start crawling.

A Chrome extension for web crawling and SEO analysis. Fast parallel fetching, rich metadata extraction, and an interactive results table — all in a browser tab.

⚡ Manifest V3 · 📊 Tabulator table · 🤖 robots.txt aware · 🗓 Scheduled crawls · v0.0.1

Installation

1. Add to Chrome

Visit the Chrome Web Store listing and click Add to Chrome, then confirm by clicking Add extension in the dialog that appears.

2. Open blue dragon

Click the dragon icon in the Chrome toolbar. If you don't see it, click the puzzle-piece extensions menu and pin blue dragon. The extension opens as a full tab.

Quick start: crawl a site

1. Build your URL list

Enter a start URL in the Spider URL field and click fetch. Blue dragon fetches the page and extracts all linked URLs into the list below. Alternatively, paste URLs directly (one per line) or import from CSV / sitemap via the URLs menu.

[Screenshot: Crawl tab with URL list populated]

2. Start the crawl

Click crawl. Pages are fetched concurrently (default: 5 connections). A progress bar tracks completion. The crawl runs in the background service worker — you can close this tab and come back at any time.

[Screenshot: Crawl in progress with progress bar]

3. Explore the results

When the queue empties, the results table renders automatically. Each row is one URL. Rows are colour-coded: red for 4xx/5xx, amber for 3xx. Use the search box to filter, or click any column header to sort. Click a row to open the full URL detail panel.

[Screenshot: Results table after crawl]

Tip: Use Crawls → save current to persist results. Saved crawls can be reloaded later from Crawls → load.
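
Step 1 above builds the URL list by fetching the start page and extracting its links. The following is a minimal sketch of that idea using only the Python standard library. It is not blue dragon's own code; the `LinkCollector` class, the `extract_links` helper, and the sample HTML are hypothetical, and dropping fragments and non-HTTP links is an assumption made for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags as absolute, de-duplicated URLs."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links against the page URL and drop #fragments.
        absolute, _fragment = urldefrag(urljoin(self.base_url, href))
        # Assumption: only http(s) links are worth crawling.
        if absolute.startswith(("http://", "https://")) and absolute not in self.links:
            self.links.append(absolute)


def extract_links(base_url: str, html: str) -> list[str]:
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


SAMPLE_HTML = """
<a href="/about">About</a>
<a href="https://example.org/">Partner</a>
<a href="/about#team">Team</a>
<a href="mailto:hi@example.com">Mail</a>
"""

links = extract_links("https://example.com/", SAMPLE_HTML)
print(links)
```

With the sample above, the two `/about` links collapse into one entry and the `mailto:` link is skipped, leaving two URLs in the list.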

Crawl modes

| Mode | What it does | Good for |
| --- | --- | --- |
| list | Crawls exactly the URLs in the list, with no link following. | Auditing a known set of pages (from a sitemap, CSV export, or manual list) |
| autonomous | Recursive spider: starts from a single URL, discovers and follows internal links, and stays on the same hostname. Capped by max pages. | Full-site crawls where you don't have a URL list upfront |

Switch between modes in Configuration. Autonomous mode auto-enables stay on hostname.
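
To make the autonomous mode more concrete, here is a rough sketch of a recursive crawl that stays on the start URL's hostname and stops at a page cap. It assumes a breadth-first visiting order, which the text above does not specify, and it uses a hypothetical in-memory link graph (`FAKE_SITE`) in place of real HTTP requests. The function name `autonomous_crawl` is also made up for the example.

```python
from collections import deque
from urllib.parse import urlsplit

# Hypothetical in-memory "site": URL -> links found on that page.
FAKE_SITE = {
    "https://example.com/": ["https://example.com/a", "https://other.test/x"],
    "https://example.com/a": ["https://example.com/b", "https://example.com/"],
    "https://example.com/b": ["https://example.com/c"],
    "https://example.com/c": [],
}


def autonomous_crawl(start_url, get_links, max_pages):
    """Breadth-first crawl that stays on the start URL's hostname."""
    hostname = urlsplit(start_url).hostname
    queue = deque([start_url])
    seen = {start_url}
    crawled = []
    while queue and len(crawled) < max_pages:
        url = queue.popleft()
        crawled.append(url)
        for link in get_links(url):
            if urlsplit(link).hostname != hostname:
                continue  # external link: not followed
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return crawled


pages = autonomous_crawl(
    "https://example.com/", lambda url: FAKE_SITE.get(url, []), max_pages=3
)
print(pages)
```

List mode corresponds to skipping the link-following loop entirely and visiting only the URLs supplied up front.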

Stats tab

Live aggregated metrics. The tab updates as each URL is fetched and is also populated when you load a saved crawl.

[Screenshot: Stats tab showing summary, status codes, content types]

| Section | Shows |
| --- | --- |
| Summary | Total encountered, crawled, blocked, internal, external, indexable, non-indexable |
| Status codes | Count per HTTP status; click any row to filter the results table |
| Content types | Count per content-type; click any row to filter the results table |

Tip: Clicking Indexable or Internal URLs in Summary filters the results table and switches to the Crawl tab instantly.
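
The aggregation behind these sections can be pictured as simple counting over result rows. The sketch below is an illustration only: the row fields (`status`, `content_type`, `internal`) and the `filter_by_status` helper are assumptions, not the extension's actual data model.

```python
from collections import Counter

# Hypothetical result rows; the field names are assumptions for illustration.
rows = [
    {"url": "https://example.com/", "status": 200, "content_type": "text/html", "internal": True},
    {"url": "https://example.com/old", "status": 301, "content_type": "text/html", "internal": True},
    {"url": "https://example.com/logo.png", "status": 200, "content_type": "image/png", "internal": True},
    {"url": "https://other.test/", "status": 404, "content_type": "text/html", "internal": False},
]

# One counter per Stats section.
status_counts = Counter(row["status"] for row in rows)
content_type_counts = Counter(row["content_type"] for row in rows)
summary = {
    "crawled": len(rows),
    "internal": sum(row["internal"] for row in rows),
    "external": sum(not row["internal"] for row in rows),
}


def filter_by_status(rows, status):
    """Mimics clicking a status-code row to filter the results table."""
    return [row["url"] for row in rows if row["status"] == status]


print(status_counts, content_type_counts, summary)
```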

Issues tab

After a crawl, the Issues tab runs SEO checks across all HTML pages and groups findings by severity.

[Screenshot: Issues tab with SEO findings]

| Check | Severity |
| --- | --- |
| Broken pages (4xx / 5xx) | Error |
| Blocked by robots.txt | Warning |
| Redirects | Warning |
| Missing title | Warning |
| Missing H1 | Warning |
| Missing meta description | Info |
| Missing canonical | Info |
| Duplicate titles | Warning |
| Broken outbound links | Error |

Clicking an issue row opens the URL detail view for that page.
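
A few of the checks in the table can be sketched as a single pass over crawled pages. This is a simplified illustration, not the extension's implementation: the page fields and the `find_issues` function are hypothetical, and skipping content checks for broken pages is an assumption made here.

```python
from collections import defaultdict

# Hypothetical crawled pages; field names are assumptions for illustration.
pages = [
    {"url": "https://example.com/", "status": 200, "title": "Home", "description": "Welcome"},
    {"url": "https://example.com/a", "status": 200, "title": "Home", "description": ""},
    {"url": "https://example.com/b", "status": 404, "title": "", "description": ""},
]


def find_issues(pages):
    """Returns (severity, check, url) tuples for a few of the checks above."""
    issues = []
    by_title = defaultdict(list)
    for page in pages:
        if page["status"] >= 400:
            issues.append(("Error", "Broken page", page["url"]))
            continue  # assumption: no content checks for pages that failed
        if not page["title"]:
            issues.append(("Warning", "Missing title", page["url"]))
        else:
            by_title[page["title"]].append(page["url"])
        if not page["description"]:
            issues.append(("Info", "Missing meta description", page["url"]))
    # A title shared by more than one URL is reported once per URL.
    for urls in by_title.values():
        if len(urls) > 1:
            issues.extend(("Warning", "Duplicate title", url) for url in urls)
    return issues


issues = find_issues(pages)
for severity, check, url in issues:
    print(severity, check, url)
```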

Key features

Parallel fetching

Configurable concurrent connections (default 5, up to 20). Crawl delay and max retries are adjustable.
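
One common way to cap concurrent connections is a semaphore around each request, with a retry loop and a delay between attempts. The sketch below shows that pattern with `asyncio`; it is not how blue dragon is necessarily built. `fake_fetch` stands in for a real HTTP request, and the `MAX_RETRIES` and `CRAWL_DELAY` values are hypothetical.

```python
import asyncio

MAX_CONNECTIONS = 5   # default named above
MAX_RETRIES = 2       # hypothetical value
CRAWL_DELAY = 0.01    # seconds between retries; hypothetical value


async def fake_fetch(url: str, attempts: dict) -> int:
    """Stand-in for an HTTP request; fails once for URLs containing 'flaky'."""
    attempts[url] = attempts.get(url, 0) + 1
    await asyncio.sleep(0.01)
    if "flaky" in url and attempts[url] == 1:
        raise ConnectionError("simulated network error")
    return 200


async def crawl(urls):
    semaphore = asyncio.Semaphore(MAX_CONNECTIONS)
    attempts: dict = {}
    in_flight = 0
    peak = 0

    async def worker(url):
        nonlocal in_flight, peak
        async with semaphore:  # at most MAX_CONNECTIONS workers run this body
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                for attempt in range(MAX_RETRIES + 1):
                    try:
                        return url, await fake_fetch(url, attempts)
                    except ConnectionError:
                        if attempt == MAX_RETRIES:
                            return url, None  # give up after the last retry
                        await asyncio.sleep(CRAWL_DELAY)
            finally:
                in_flight -= 1

    results = await asyncio.gather(*(worker(url) for url in urls))
    return dict(results), peak


urls = [f"https://example.com/page{i}" for i in range(11)] + ["https://example.com/flaky"]
results, peak = asyncio.run(crawl(urls))
print(len(results), "URLs fetched, peak concurrency", peak)
```

The `peak` counter exists only to show that the number of simultaneous fetches never exceeds the configured limit.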

🔍 Rich metadata

Extracts title, description, H1, canonical, robots, Open Graph tags, and Schema.org structured data (publisher, dates, authors, headlines).
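
As a rough illustration of this kind of extraction, the sketch below pulls the title, first H1, meta description, robots directive, canonical URL, and Open Graph tags out of an HTML string with the standard-library parser. The `MetadataParser` class and `extract_metadata` helper are hypothetical, and Schema.org structured data is left out to keep the example short.

```python
from html.parser import HTMLParser


class MetadataParser(HTMLParser):
    """Collects a handful of SEO fields from an HTML document."""

    def __init__(self) -> None:
        super().__init__()
        self.meta = {"title": "", "h1": "", "description": None,
                     "canonical": None, "robots": None, "og": {}}
        self._capture = None  # text field currently being read, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("title", "h1") and not self.meta[tag]:
            self._capture = tag  # only the first title / H1 is kept
        elif tag == "meta":
            name = (attrs.get("name") or "").lower()
            prop = attrs.get("property") or ""
            if name in ("description", "robots"):
                self.meta[name] = attrs.get("content")
            elif prop.startswith("og:"):
                self.meta["og"][prop] = attrs.get("content")
        elif tag == "link" and (attrs.get("rel") or "").lower() == "canonical":
            self.meta["canonical"] = attrs.get("href")

    def handle_endtag(self, tag):
        if tag == self._capture:
            self._capture = None

    def handle_data(self, data):
        if self._capture:
            self.meta[self._capture] += data


def extract_metadata(html: str) -> dict:
    parser = MetadataParser()
    parser.feed(html)
    meta = parser.meta
    meta["title"] = meta["title"].strip()
    meta["h1"] = meta["h1"].strip()
    return meta


SAMPLE = """<html><head>
<title>Blue widgets</title>
<meta name="description" content="All about blue widgets.">
<meta name="robots" content="noindex, follow">
<meta property="og:title" content="Blue widgets!">
<link rel="canonical" href="https://example.com/widgets">
</head><body><h1>Widgets</h1><h1>Second heading</h1></body></html>"""

meta = extract_metadata(SAMPLE)
print(meta)
```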

🤖 Robots.txt aware

Respects robots.txt per the configured User-Agent. Overrides can be set per hostname without touching the live server.
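
The interaction between live rules and a per-hostname override can be sketched with Python's `urllib.robotparser`. This is an illustration under assumptions: the User-Agent string, the `LIVE_ROBOTS` and `OVERRIDES` dictionaries, and the `is_allowed` helper are all hypothetical, and the rules are supplied as strings rather than fetched from a server.

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser

USER_AGENT = "blue-dragon-example"  # hypothetical User-Agent string

# Hypothetical robots.txt bodies, keyed by hostname.
LIVE_ROBOTS = {
    "example.com": "User-agent: *\nDisallow: /private/\n",
}
OVERRIDES = {
    "example.com": "User-agent: *\nDisallow: /drafts/\n",
}


def is_allowed(url, overrides=None):
    """Checks a URL against the override for its hostname, else the live rules."""
    host = urlsplit(url).hostname
    rules = (overrides or {}).get(host, LIVE_ROBOTS.get(host, ""))
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(USER_AGENT, url)


print(is_allowed("https://example.com/private/report"))
print(is_allowed("https://example.com/private/report", OVERRIDES))
```

In this sketch an override replaces the live rules for that hostname entirely, which is one possible reading of the feature; the live file on the server is never modified.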

🗓 Scheduled crawls

Set up recurring crawls (hourly to weekly) with URL sources and a filter. Runs in the background even when the tab is closed.

📤 CSV export

Export all rows and all columns (including hidden ones) as CSV. Import CSVs or sitemap XML files as URL lists.
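
Exporting every column, including ones a row does not populate, amounts to writing the union of all field names as the CSV header. The sketch below shows that idea with Python's `csv` module. The row fields and the `export_csv` function are hypothetical and do not reflect the extension's actual column set.

```python
import csv
import io

# Hypothetical result rows; not every row has every column.
rows = [
    {"url": "https://example.com/", "status": 200, "title": "Home"},
    {"url": "https://example.com/a", "status": 404, "canonical": "https://example.com/a"},
]


def export_csv(rows):
    """Writes all rows with the union of their columns, in first-seen order."""
    columns = list(dict.fromkeys(key for row in rows for key in row))
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = export_csv(rows)
print(csv_text)
```

Missing values are written as empty cells, so each row has the same number of columns regardless of what was extracted for it.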

💾 HTML archiving

Optionally save raw HTML responses to the browser's Origin Private File System. View, open, or download individual pages.

Links