I needed to download all pages from a Wix-made website.
`wget -r` didn't work.
httrack and lynx didn't work, either.
I was able to download the site with CrawlSpider and its link-following rules from Scrapy: https://github.com/scrapy/scrapy
https://www.youtube.com/watch?v=o1g8prnkuiQ
I'll use Playwright later, too (instead of Selenium).
https://scrapeops.io/python-scrapy-playbook/scrapy-playwright/
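I haven't tried it yet, but as far as I understand from the scrapy-playwright documentation, the setup comes down to two Scrapy settings plus a `playwright` flag in each request's `meta`. A rough sketch of that configuration follows; the names `PLAYWRIGHT_SETTINGS` and `playwright_meta` are my own and not part of the library.

```python
# Download handler class provided by the scrapy-playwright package.
PLAYWRIGHT_HANDLER = "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler"

# Settings to merge into settings.py or a spider's custom_settings.
PLAYWRIGHT_SETTINGS = {
    "DOWNLOAD_HANDLERS": {
        "http": PLAYWRIGHT_HANDLER,
        "https": PLAYWRIGHT_HANDLER,
    },
    # scrapy-playwright needs the asyncio-based Twisted reactor.
    "TWISTED_REACTOR": "twisted.internet.asyncioreactor.AsyncioSelectorReactor",
}


def playwright_meta(extra=None):
    """Build request meta that asks for the page to be rendered in a browser.

    Hypothetical helper: only requests whose meta has "playwright": True
    go through Playwright; the rest use Scrapy's normal downloader.
    """
    meta = {"playwright": True}
    meta.update(extra or {})
    return meta
```

In a spider this would be used as `custom_settings = PLAYWRIGHT_SETTINGS` and `scrapy.Request(url, meta=playwright_meta())`, after installing the package and its browsers (`pip install scrapy-playwright` and `playwright install`).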
I also found some other candidates:
https://github.com/crawlab-team/crawlab