lorien/grab

Grab Framework Project

Status of Project

I have not used Grab myself for many years, and I am not sure anybody is using it at present. Nonetheless, I decided to refactor the project, just for fun. I have annotated the whole code base with mypy type hints (in strict mode). The whole code base also complies with pylint and flake8 requirements, with a few exceptions: very large methods, and classes with too many local attributes and variables. I will refactor them eventually.

The current and only network backend is urllib3.

I have refactored a few components into external packages: proxylist, procstat, selection, unicodec, and user_agent.

Feel free to give feedback in Telegram groups: @grablab and @grablab_ru

Things to be done next

  • Refactor source code to remove all pylint disable comments like:
    • too-many-instance-attributes
    • too-many-arguments
    • too-many-locals
    • too-many-public-methods
  • Reach 100% test coverage; it is about 95% now
  • Release a new version to PyPI
  • Refactor more components into external packages
  • More abstract interfaces
  • More data structures and types
  • Decouple connections between internal components
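
To illustrate the first item above, here is a minimal sketch of what removing a pylint disable comment can look like. It is not taken from the Grab code base: `build_request_legacy`, `RequestOptions`, and `build_request` are hypothetical names invented for this example, and grouping parameters into a dataclass is only one possible refactoring, assuming pylint's default limit of five arguments.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, Optional


# Before: six arguments exceed pylint's default limit of five,
# so the check is silenced with a disable comment.
def build_request_legacy(  # pylint: disable=too-many-arguments
    url: str,
    method: str,
    headers: Dict[str, str],
    body: Optional[bytes],
    timeout: float,
    proxy: Optional[str],
) -> Dict[str, Any]:
    return {
        "url": url,
        "method": method,
        "headers": headers,
        "body": body,
        "timeout": timeout,
        "proxy": proxy,
    }


# After: the related parameters are grouped into one typed object
# (hypothetical, not part of Grab), so no disable comment is needed.
@dataclass
class RequestOptions:
    url: str
    method: str = "GET"
    headers: Dict[str, str] = field(default_factory=dict)
    body: Optional[bytes] = None
    timeout: float = 10.0
    proxy: Optional[str] = None


def build_request(options: RequestOptions) -> Dict[str, Any]:
    # asdict() converts the dataclass fields into a plain dictionary.
    return asdict(options)
```

Both functions build the same dictionary. The second takes a single argument, and the dataclass gives the defaults and types one documented home.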

Installation

The following command installs the old Grab released in 2018: pip install -U grab

The updated Grab available in the GitHub repository is not compatible with spiders and crawlers written for the Grab released in 2018.

Documentation

The updated documentation is available at https://grab.readthedocs.io/en/latest/ and most of the changes remove content related to features I have removed from Grab since 2018.

Documentation for the old Grab version 0.6.41 (released in 2018) is available at https://grab.readthedocs.io/en/v0.6.41-doc/