Examples

CLI Examples

Initialize:

websweep init --headless

Crawl:

websweep crawl

Crawl and extract in one go (lower disk usage):

websweep crawl --extract

Extract:

websweep extract

Consolidate:

websweep consolidate

Recurring cycle example (e.g., monthly):

websweep crawl
websweep extract --start-date 2026-04-01 --end-date 2026-04-30
websweep consolidate
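
A minimal sketch of the monthly cycle above as a script, assuming a POSIX shell and GNU date (the -d option; BSD/macOS date uses different syntax). month_end and print_cycle are hypothetical helpers for illustration, not part of WebSweep. The script only prints the three commands for a given month so they can be reviewed before running:

```shell
#!/bin/sh
# Sketch: print the crawl/extract/consolidate cycle for one month (YYYY-MM).
# Assumption: GNU date is installed (needed for the -d option).
set -eu

# month_end (hypothetical helper): last day of the month given as YYYY-MM.
# -u keeps the date arithmetic in UTC so DST changes cannot shift the result.
month_end() {
  date -u -d "$1-01 +1 month -1 day" +%F
}

# print_cycle (hypothetical helper): echo the commands instead of running
# them; only flags shown in the examples above are used.
print_cycle() {
  echo "websweep crawl"
  echo "websweep extract --start-date $1-01 --end-date $(month_end "$1")"
  echo "websweep consolidate"
}

# Default to April 2026 when no month is passed as the first argument.
print_cycle "${1:-2026-04}"
```

For 2026-04 the second printed line is the same extract command as in the example above. Once the output looks right, it can be piped to sh to run the cycle.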

Filter by file extension (allow list or block list):

websweep crawl --allow-extensions pdf,png
websweep crawl --block-extensions pdf,png,zip

Backend note:

  • choose the SQL or TSV backend during websweep init; the choice is stored in settings.ini

  • in SQL mode, WebSweep automatically picks DuckDB (preferred) and falls back to SQLite otherwise