Web



Webscraping / grabbing data


Python
urllib (urllib2 in Python 2) https://docs.python.org/3/library/urllib.html
requests http://docs.python-requests.org/
BeautifulSoup / Beautiful Soup http://www.crummy.com/software/BeautifulSoup/
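
As a rough illustration of what the libraries above are used for, here is a minimal sketch that downloads a page with urllib and collects its links. It assumes Python 3 and uses only the standard library, so html.parser stands in for BeautifulSoup here. The names LinkCollector, extract_links and fetch, and the User-Agent string, are made up for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import Request, urlopen


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return absolute URLs for all links found in an HTML string."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


def fetch(url, timeout=10):
    """Download a page and decode it to text (hypothetical helper)."""
    # The User-Agent value is an arbitrary placeholder for this sketch.
    request = Request(url, headers={"User-Agent": "example-scraper/0.1"})
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

Calling extract_links(fetch(url), url) on a page URL returns the list of absolute link targets. With requests and BeautifulSoup installed, the same steps would typically be written with requests.get() and a BeautifulSoup find_all("a") loop instead.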

Mechanize http://wwwsearch.sourceforge.net/mechanize/
Windmill http://www.getwindmill.com/
Webscraping python https://code.google.com/p/webscraping/

Scrapy http://scrapy.org/ (framework for implementing web spiders)

iMacros (Firefox extension)


Selenium http://www.seleniumhq.org/ (web browser automation)
ChromeDriver https://code.google.com/p/selenium/wiki/ChromeDriver
Book: Selenium WebDriver Practical Guide

Requestium https://github.com/tryolabs/requestium (Python library that merges Requests, Selenium, and Parsel into a single integrated tool for automating web actions)

PhantomJS http://phantomjs.org/
SlimerJS http://slimerjs.org/
TrifleJS http://triflejs.org/
CasperJS http://casperjs.org/

Spynner (Python) https://github.com/makinacorpus/spynner
python-spidermonkey (Python) https://code.google.com/p/python-spidermonkey/

Watir (Ruby) http://watir.com/

Articles:
http://www.packtpub.com/article/web-scraping-with-python
http://www.packtpub.com/article/web-scraping-with-python-part-2
http://www.gregreda.com/2013/03/03/web-scraping-101-with-python/
http://arunrocks.com/easy-practical-web-scraping-in-python/

See also:
- Devel
- DataMining
- DataViz
- Webcrawling / Webcrawler
- HTTP
- HTML
- Javascript
