Extract city and country mentions from text like GeoText, without regex

flashgeotext

Extract and count countries and cities (plus their synonyms) from text, like GeoText on steroids, using FlashText, an Aho-Corasick implementation. flashgeotext is a fast, batteries-included (and BYOD) native Python library that extracts one or more sets of given city and country names (plus synonyms) from an input text.
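To make the "without regex" idea concrete, here is a minimal pure-Python sketch of dictionary-based extraction with a character trie. It is not flashgeotext's or FlashText's actual code; the names build_trie and extract_mentions and the sample lookup data are hypothetical, chosen only for illustration. Each synonym is stored in the trie and points back to its canonical name, and the text is scanned once, keeping the longest match that ends on a word boundary.

```python
from collections import defaultdict

END = "_end"  # marker key; never collides with single-character keys


def build_trie(lookup):
    """Build a character trie from {canonical_name: [synonyms, ...]}."""
    trie = {}
    for canonical, synonyms in lookup.items():
        for form in [canonical, *synonyms]:
            if not form:
                continue
            node = trie
            for ch in form.lower():
                node = node.setdefault(ch, {})
            node[END] = canonical  # every surface form maps to its canonical name
    return trie


def is_word_char(ch):
    return ch.isalnum() or ch == "_"


def extract_mentions(text, lookup):
    """Return {canonical_name: {'count': int, 'span_info': [(start, end), ...]}}."""
    trie = build_trie(lookup)
    found = defaultdict(lambda: {"count": 0, "span_info": []})
    i, n = 0, len(text)
    while i < n:
        # A match may only start at the beginning of a word.
        if i > 0 and is_word_char(text[i - 1]):
            i += 1
            continue
        node, j, best = trie, i, None
        # Lowercase one character at a time so offsets match the original text.
        while j < n and text[j].lower() in node:
            node = node[text[j].lower()]
            j += 1
            # Accept only matches that end on a word boundary; keep the longest.
            if END in node and (j == n or not is_word_char(text[j])):
                best = (node[END], j)
        if best:
            name, end = best
            found[name]["count"] += 1
            found[name]["span_info"].append((i, end))
            i = end
        else:
            i += 1
    return dict(found)


# Hypothetical sample data, not the data shipped with flashgeotext.
lookups = {
    "cities": {"Berlin": [], "New York": ["NYC", "Big Apple"]},
    "countries": {"Germany": ["Deutschland"]},
}
text = "Berlin is the capital of Germany. NYC and New York are the same city."
result = {category: extract_mentions(text, names) for category, names in lookups.items()}
print(result)
```

Because every synonym resolves to one canonical key, "NYC" and "New York" are counted together, and the recorded spans can be sliced straight out of the input text.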

Usage

from flashgeotext.geotext import GeoText

geotext = GeoText()

input_text = '''Shanghai. The Chinese Ministry of Finance in Shanghai said that China plans
to cut tariffs on $75 billion worth of goods that the country
imports from the US. Washington welcomes the decision.'''

geotext.extract(input_text=input_text, span_info=True)
>> {
    'cities': {
        'Shanghai': {
            'count': 2,
            'span_info': [(0, 8), (45, 53)]
        },
        'Washington, D.C.': {
            'count': 1,
            'span_info': [(175, 185)]
        }
    },
    'countries': {
        'China': {
            'count': 1,
            'span_info': [(64,

 

 

 
