WebSweep

websweep

  • websweep package
    • Subpackages
      • websweep.consolidator package
        • Submodules
        • websweep.consolidator.consolidator module
        • Module contents
      • websweep.crawler package
        • Submodules
        • websweep.crawler.crawler module
        • Module contents
      • websweep.extractor package
        • Submodules
        • websweep.extractor.add_host module
        • websweep.extractor.extractor module
        • Module contents
      • websweep.utils package
        • Submodules
        • websweep.utils.backend module
        • websweep.utils.json_io module
        • websweep.utils.public_suffix module
        • websweep.utils.source_urls module
        • websweep.utils.utils module
        • Module contents
    • Submodules
    • websweep.config module
      • current_websweep_instance()
      • init_app()
      • restore_app()
      • get_target_folder_path()
      • get_source_file_path()
      • get_extractor_delete()
      • get_extractor_addon_file()
      • get_use_database()
    • websweep.main module
      • operate()
      • init()
      • main()
      • restore()
      • cli_config()
      • websweep_address()
      • crawl()
      • extract()
      • consolidate()
    • Module contents

© Copyright 2026, ODISSEI Social Data Science.
