WebSweep
Installation
Requirements
Install from PyPI
Install from Source (Developers)
User Guide
Quickstart
What each component does
Library Quickstart and Workflow
Common library options (most used)
CLI Workflow (Detailed)
Backend setup (done during init)
How CLI configuration works
CLI commands and common options
How target_temp_folder_path works
Extractor date windows (how dates are used)
Recurring CLI pattern (every X months)
Custom Extraction Add-ons
URL Filtering Rules
Troubleshooting Statuses
Examples
CLI Examples
Featured Notebook (Parsed)
WebSweep Example for Researchers
websweep
websweep package
Subpackages
Submodules
websweep.config module
websweep.main module
Module contents
Contribute
How to Contribute
Developing WebSweep
Contact and Support
Contact Us
Overview: module code
All modules for which code is available
websweep.config
websweep.consolidator.consolidator
websweep.crawler.crawler
websweep.extractor.extractor
websweep.main
websweep.utils.backend
websweep.utils.json_io
websweep.utils.public_suffix
websweep.utils.source_urls
websweep.utils.utils