LabelCloud

Run the following command in the console to crawl labels from a label cloud, e.g. Etherscan's label cloud. Before doing so, make sure chromedriver is installed.

scrapy crawl labels.labelcloud \
-a out=/path/to/output/data \
-a labels=exchange \
-a categories=accounts 

The arguments for the label cloud spider are:

  • out: the output directory. The default is ./data.

  • site: the label cloud site. The default is etherscan; the other options are bscscan, polygonscan, and hecoinfo.

  • labels (optional): the label names to crawl, joined by commas; fuzzy matching is supported. If not set, all labels are crawled.

  • categories (optional): the categories to crawl, joined by commas; fuzzy matching is supported. If not set, only the accounts category is crawled. The other options are tokens, transactions, and blocks.
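
When crawling several sites or label sets in a row, it can help to assemble the command programmatically. The following is a minimal sketch, not part of the project itself: the helper name `build_labelcloud_command` is hypothetical, and it only uses the spider name and the four arguments listed above, with their documented defaults.

```python
import shlex
from typing import List, Optional, Sequence


def build_labelcloud_command(
    out: str = "./data",
    site: str = "etherscan",
    labels: Optional[Sequence[str]] = None,
    categories: Optional[Sequence[str]] = None,
) -> List[str]:
    """Assemble the argument list for `scrapy crawl labels.labelcloud`.

    Hypothetical helper: it only mirrors the documented arguments and
    does not validate the values against the spider.
    """
    cmd = [
        "scrapy", "crawl", "labels.labelcloud",
        "-a", f"out={out}",
        "-a", f"site={site}",
    ]
    # Optional arguments are comma-joined, as described in the list above.
    if labels:
        cmd += ["-a", "labels=" + ",".join(labels)]
    if categories:
        cmd += ["-a", "categories=" + ",".join(categories)]
    return cmd


if __name__ == "__main__":
    cmd = build_labelcloud_command(
        out="/path/to/output/data",
        site="bscscan",
        labels=["exchange"],
        categories=["accounts", "tokens"],
    )
    # Print the command in a form that can be pasted into a shell.
    print(shlex.join(cmd))
```

The returned list can be passed directly to `subprocess.run(cmd, check=True)` from the project directory; building it as a list rather than a string avoids shell-quoting problems with paths that contain spaces.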
