lexnlp.extract.en.addresses package
Subpackages
Submodules
lexnlp.extract.en.addresses.address_features module
Feature extraction for the address-detection classifier.
lexnlp.extract.en.addresses.address_features.build_country_words()
lexnlp.extract.en.addresses.address_features.build_provinces_words()
lexnlp.extract.en.addresses.address_features.get_word_features(word: str, part_of_speech: str) → List[int]
lexnlp.extract.en.addresses.address_features.is_datetime(word: str) → bool
lexnlp.extract.en.addresses.address_features.is_email(word: str) → bool
lexnlp.extract.en.addresses.address_features.is_lowercase_char(word: str) → bool
lexnlp.extract.en.addresses.address_features.is_single_initial(word: str) → bool
lexnlp.extract.en.addresses.address_features.is_uppercase_char(word: str) → bool
lexnlp.extract.en.addresses.address_features.is_url(word: str) → bool
lexnlp.extract.en.addresses.address_features.is_zip_code(s: str) → bool
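The predicates in this module each test a single token, and get_word_features turns a token plus its part-of-speech tag into a list of integers. The following is a minimal sketch of that general pattern, not LexNLP's implementation: the regular expressions, the sketch_* helper names, and the choice and order of features are all assumptions made for illustration.

```python
import re
from typing import List

# Hypothetical patterns for illustration; LexNLP's actual rules may differ.
ZIP_RE = re.compile(r"\d{5}(-\d{4})?")             # US ZIP or ZIP+4
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")  # very loose e-mail shape
URL_RE = re.compile(r"(https?://|www\.)\S+", re.IGNORECASE)
INITIAL_RE = re.compile(r"[A-Z]\.?")                # "J" or "J."


def sketch_is_zip_code(s: str) -> bool:
    return ZIP_RE.fullmatch(s) is not None


def sketch_is_email(word: str) -> bool:
    return EMAIL_RE.fullmatch(word) is not None


def sketch_is_url(word: str) -> bool:
    return URL_RE.fullmatch(word) is not None


def sketch_is_single_initial(word: str) -> bool:
    return INITIAL_RE.fullmatch(word) is not None


def sketch_word_features(word: str, part_of_speech: str) -> List[int]:
    """Encode one token as integers (assumed feature set, assumed order)."""
    return [
        int(sketch_is_zip_code(word)),
        int(sketch_is_email(word)),
        int(sketch_is_url(word)),
        int(sketch_is_single_initial(word)),
        int(word[:1].isupper()),       # starts with an uppercase letter
        int(part_of_speech == "NNP"),  # Penn Treebank tag for proper noun
        len(word),
    ]
```

A classifier would consume one such vector per token; the real feature set is defined by get_word_features itself.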
lexnlp.extract.en.addresses.addresses module
Address extraction for the English language.
class lexnlp.extract.en.addresses.addresses.Address(zip_code: str, country: str, state: str, city: str, addr1: str, addr2: str)
Bases: object
    members()
class lexnlp.extract.en.addresses.addresses.NGramType
Bases: object
    ADDR_END = 3
    ADDR_MIDDLE = 2
    ADDR_START = 1
    OTHER = 0
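The four NGramType constants suggest a labelling scheme in which each position is tagged as the start, middle, or end of an address, or as anything else. How LexNLP assigns these labels is not stated on this page; the sketch below only illustrates one plausible scheme. The SketchNGramType class and the label_tokens helper are hypothetical names, and the half-open span convention is an assumption.

```python
from typing import List


class SketchNGramType:
    """Stand-in mirroring the documented constant values."""
    OTHER = 0
    ADDR_START = 1
    ADDR_MIDDLE = 2
    ADDR_END = 3


def label_tokens(num_tokens: int, addr_start: int, addr_end: int) -> List[int]:
    """Label token positions, treating [addr_start, addr_end) as the address."""
    labels = []
    for i in range(num_tokens):
        if i < addr_start or i >= addr_end:
            labels.append(SketchNGramType.OTHER)
        elif i == addr_start:
            labels.append(SketchNGramType.ADDR_START)
        elif i == addr_end - 1:
            labels.append(SketchNGramType.ADDR_END)
        else:
            labels.append(SketchNGramType.ADDR_MIDDLE)
    return labels
```

In this sketch a one-token address is labelled ADDR_START only, which is a design choice of the example rather than documented behaviour.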
lexnlp.extract.en.addresses.addresses.align_tokens(tokens, sentence)
Copy of the same function from NLTK, with a fix for the processing of double quotes.
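Token alignment maps each token back to its character span in the original sentence. Treebank-style tokenizers rewrite a double quote as `` or '', so a plain substring search for those tokens fails on the original text. The sketch below shows one way such a fix could work; it is an assumption about the nature of the fix, not LexNLP's code, and sketch_align_tokens is a hypothetical name.

```python
from typing import List, Tuple

# Treebank-style replacements for an opening and a closing double quote.
QUOTE_TOKENS = {"``", "''"}


def sketch_align_tokens(tokens: List[str], sentence: str) -> List[Tuple[int, int]]:
    """Return (start, end) character offsets for each token, end exclusive."""
    point = 0
    spans = []
    for token in tokens:
        candidates = [token]
        if token in QUOTE_TOKENS:
            candidates.append('"')  # the token may stand for a literal double quote
        start, matched = -1, token
        for cand in candidates:
            pos = sentence.find(cand, point)
            # Keep the earliest match among the candidate spellings.
            if pos != -1 and (start == -1 or pos < start):
                start, matched = pos, cand
        if start == -1:
            raise ValueError(f"token {token!r} not found in sentence")
        point = start + len(matched)
        spans.append((start, point))
    return spans
```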
lexnlp.extract.en.addresses.addresses.cleanup(address: str) → str
lexnlp.extract.en.addresses.addresses.get_address_spans(text: str) → Generator[Tuple[str, int, int], None, None]
lexnlp.extract.en.addresses.addresses.get_addresses(text: str) → Generator[str, None, None]
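Both extraction functions are generators: get_addresses yields strings and get_address_spans yields 3-tuples. The sketch below shows how span tuples of that shape might be consumed. It assumes each tuple is (address text, start offset, end offset) with an exclusive end, which this page does not confirm. The highlight_spans helper is hypothetical, and the sample span is hand-written rather than produced by LexNLP.

```python
from typing import Iterable, Tuple

Span = Tuple[str, int, int]


def highlight_spans(text: str, spans: Iterable[Span]) -> str:
    """Wrap each span in brackets, assuming (value, start, end) character offsets."""
    out = []
    cursor = 0
    for _value, start, end in sorted(spans, key=lambda s: s[1]):
        out.append(text[cursor:start])
        out.append("[" + text[start:end] + "]")
        cursor = end
    out.append(text[cursor:])
    return "".join(out)


# Hand-written stand-in for the generator's output.
text = "Send mail to 123 Main St, Springfield, IL 62704 by Friday."
addr = "123 Main St, Springfield, IL 62704"
start = text.index(addr)
spans = [(addr, start, start + len(addr))]
print(highlight_spans(text, spans))
```

With LexNLP installed, the spans argument could instead be the result of get_address_spans(text), provided the offset assumption above holds.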
lexnlp.extract.en.addresses.addresses.load_classifier()
lexnlp.extract.en.addresses.addresses.prepare_ngrams_in_text(text: str, window_half_width: int, window_step: int) → Generator[Tuple[List[int], List[str], int, int], None, None]
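The window_half_width and window_step parameters point to a sliding-window scan over the tokens of the text. The following is a generic sketch of such a scan, assuming a window centred on every window_step-th token and extending window_half_width tokens to each side. It does not reproduce the features LexNLP computes or the exact meaning of its returned integers; sketch_windows is a hypothetical name.

```python
from typing import Generator, List, Tuple


def sketch_windows(
    tokens: List[str], window_half_width: int, window_step: int
) -> Generator[Tuple[List[str], int, int], None, None]:
    """Yield (window_tokens, start_index, end_index), with end_index exclusive."""
    if window_half_width < 0 or window_step < 1:
        raise ValueError("window_half_width must be >= 0 and window_step >= 1")
    for center in range(0, len(tokens), window_step):
        # Clamp the window to the token list at both edges.
        start = max(0, center - window_half_width)
        end = min(len(tokens), center + window_half_width + 1)
        yield tokens[start:end], start, end
```

A larger window_step visits fewer centres and so produces fewer windows, while a larger window_half_width gives each window more context.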