Webscraping / grabbing data
Python
urllib / urllib2 https://docs.python.org/3/library/urllib.html
requests http://docs.python-requests.org/
BeautifulSoup / Beautiful Soup http://www.crummy.com/software/BeautifulSoup/
Mechanize http://wwwsearch.sourceforge.net/mechanize/
Windmill http://www.getwindmill.com/
Webscraping python https://code.google.com/p/webscraping/
Scrapy http://scrapy.org/ (framework for implementing web spiders)
iMacros (Firefox extension)
Selenium http://www.seleniumhq.org/ (web browser automation)
ChromeDriver https://code.google.com/p/selenium/wiki/ChromeDriver
Book: Selenium WebDriver Practical Guide
Requestium https://github.com/tryolabs/requestium (Python library that merges Requests, Selenium, and Parsel into a single integrated tool for automating web actions)
PhantomJS http://phantomjs.org/
SlimerJS http://slimerjs.org/
TrifleJS http://triflejs.org/
CasperJS http://casperjs.org/
Spynner (Python) https://github.com/makinacorpus/spynner
python-spidermonkey (Python) https://code.google.com/p/python-spidermonkey/
Watir (Ruby) http://watir.com/
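Example (Python):
As a rough illustration of what the simpler Python tools above are used for, here is a minimal sketch that downloads a page and collects its links. It uses only the standard library: urllib from the list, plus html.parser as a stand-in for a parser such as BeautifulSoup. The names LinkCollector, extract_links and fetch, the User-Agent string, and the example.org URL are all assumptions made for this example, not part of any library listed here.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import Request, urlopen


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Turn relative links such as "/docs" into absolute URLs.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return the absolute link URLs found in an HTML string."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


def fetch(url, timeout=10):
    """Download a page and decode it (charset from headers, UTF-8 fallback)."""
    # Hypothetical User-Agent; use one that identifies your own scraper.
    request = Request(url, headers={"User-Agent": "example-scraper/0.1"})
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


if __name__ == "__main__":
    # Parsing works offline on any HTML string; example.org is a placeholder.
    sample = '<p><a href="/docs">Docs</a> <a href="http://scrapy.org/">Scrapy</a></p>'
    print(extract_links(sample, "http://example.org/"))
```

To scrape a live page, the two steps would be chained as extract_links(fetch(url), url). Pages that build their content with JavaScript generally need one of the browser-automation tools listed above (Selenium, for instance) instead of a plain HTTP fetch.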
Articles:
http://www.packtpub.com/article/web-scraping-with-python
http://www.packtpub.com/article/web-scraping-with-python-part-2
http://www.gregreda.com/2013/03/03/web-scraping-101-with-python/
http://arunrocks.com/easy-practical-web-scraping-in-python/
See also:
- Devel
- DataMining
- DataViz
- Webcrawling / Webcrawler
- HTTP
- HTML
- Javascript