Extracting information from reports using Regular Expressions Library in Python

Introduction: It is often necessary to extract key information from reports, articles, papers, and similar documents: for example, company names and prices from financial reports, judges' names and jurisdictions from court judgments, or account numbers from customer complaints. These extractions are part of text mining and are essential for converting unstructured data into a structured form that can later be used for analytics or machine learning. Such entity extraction uses approaches like ‘lookup’, ‘rules’ and ‘statistical/machine learning’. In ‘lookup’ based approaches, […]
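
As a rough illustration of the ‘rules’ approach mentioned in this excerpt, the sketch below pulls account numbers out of a customer complaint with Python's built-in `re` module. The complaint text, the account-number format ("AC" followed by eight digits), and the helper name `extract_account_numbers` are all assumptions made up for this example; they do not come from the full post.

```python
import re

# Hypothetical complaint text; the account-number format is an assumption
# for illustration only: the letters "AC" followed by exactly 8 digits.
complaint = "My account AC12345678 was charged twice; please also check AC87654321."

# \b anchors the match at word boundaries so longer digit runs are not
# partially matched.
ACCOUNT_PATTERN = re.compile(r"\bAC\d{8}\b")


def extract_account_numbers(text):
    """Return every substring that matches the assumed account-number format."""
    return ACCOUNT_PATTERN.findall(text)


print(extract_account_numbers(complaint))
# ['AC12345678', 'AC87654321']
```

A real report would need a pattern tuned to its own identifiers; the point here is only that a single rule can turn free text into a structured list.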


Beginners Tutorial for Regular Expressions in Python

Importance of Regular Expressions: In the last few years, there has been a dramatic shift toward using general-purpose programming languages for data science and machine learning. This was not always the case; a decade ago the idea would have been met with considerable skepticism. As a result, more people and organizations are using tools like Python and JavaScript to meet their data needs. This is where regular expressions become especially useful. Regular expressions are normally the default way […]
