Optimal Scraping Technique: CSS Selector, XPath, & RegEx

Web scraping deals with HTML almost exclusively. In nearly all cases, only a small piece of a very large file is needed (e.g., pricing information from an e-commerce page). An essential part of scraping is therefore searching through an HTML document to find the correct information.
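To make the idea of pulling one small value out of a larger document concrete, here is a minimal sketch that extracts a price in two of the ways named in the title: an XPath query and a regular expression. It uses only the Python standard library. The markup, the `price` class name, and both helper functions are hypothetical and exist only for this illustration. The snippet is deliberately well-formed so that `xml.etree.ElementTree` can parse it. Real pages are often not well-formed and would usually need a dedicated HTML parser, and CSS selectors are left out because the standard library has no CSS selector engine.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, well-formed product snippet (real pages are rarely this tidy).
PAGE = """
<html>
  <body>
    <div class="product">
      <h1>Example Widget</h1>
      <span class="price">$19.99</span>
    </div>
  </body>
</html>
"""


def price_via_xpath(markup):
    """Find the price with an XPath query (ElementTree supports a limited subset)."""
    root = ET.fromstring(markup)
    node = root.find(".//span[@class='price']")
    return node.text if node is not None else None


def price_via_regex(markup):
    """Find the price by matching the raw text, ignoring document structure."""
    match = re.search(r'<span class="price">([^<]+)</span>', markup)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(price_via_xpath(PAGE))
    print(price_via_regex(PAGE))
```

Both helpers return the same string for this snippet, but they fail differently: the XPath version depends on the document parsing cleanly, while the regular expression depends on the exact spelling of the tag and its attribute.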

How that should be done is a matter of some debate and depends on preference, experience, and the type of data involved. While all scraping and parsing methods are “correct”, some have benefits that may be vital when more optimization is required, and some are easier to use for specific types of data. However, outside of a lone scientific study, succinct information on the differences and practicality



