lexnlp.nlp.en.transforms package¶
Submodules¶
lexnlp.nlp.en.transforms.characters module¶
Transforms related to characters for English
-
lexnlp.nlp.en.transforms.characters.get_character_distribution(text, lowercase=False, stopword=False)¶ Get character distribution of text, potentially lowercasing and stopwording first. N.B. This method does not include or count whitespace.
- Parameters
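text – input text to analyze
lowercase – whether to lowercase the text before counting (default False)
stopword – whether to remove stopwords before counting (default False)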
- Returns
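To illustrate the note about whitespace, here is a minimal standard-library sketch of a character count that skips whitespace. It is not LexNLP's implementation: the helper name `character_counts` is hypothetical, stopword removal is left out, and returning raw counts (rather than normalized frequencies) is an assumption.

```python
from collections import Counter


def character_counts(text, lowercase=False):
    """Count non-whitespace characters (hypothetical helper, not LexNLP code)."""
    if lowercase:
        text = text.lower()
    # Skip spaces, tabs, and newlines so that only visible characters are counted.
    return Counter(ch for ch in text if not ch.isspace())


counts = character_counts("The Lessee shall pay", lowercase=True)
print(counts.most_common(3))
```

With `lowercase=True`, "T" and "t" fall into the same bucket; without it they are counted separately.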
-
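lexnlp.nlp.en.transforms.characters.get_character_ngram_distribution(text, n, lowercase=False, stopword=False)¶ Get character n-gram distribution of text, potentially lowercasing and stopwording first. N.B. This method does not include or count whitespace.
- Parameters
text – input text to analyze
n – length of each character n-gram
lowercase – whether to lowercase the text before counting (default False)
stopword – whether to remove stopwords before counting (default False)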
- Returns
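The idea of a character n-gram count can be sketched with a sliding window over the text after whitespace has been removed. This is an illustration under stated assumptions, not the library's code: `character_ngram_counts` is a hypothetical name, stopword removal is omitted, and letting n-grams span former word boundaries is a choice made here for simplicity.

```python
from collections import Counter


def character_ngram_counts(text, n, lowercase=False):
    """Count character n-grams, ignoring whitespace (hypothetical helper)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    if lowercase:
        text = text.lower()
    # Assumption: remove all whitespace first, so windows may cross word boundaries.
    chars = "".join(text.split())
    # Slide a window of width n across the remaining characters.
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


print(character_ngram_counts("ab ab", 2))
```

Text shorter than `n` characters yields an empty result rather than an error.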
lexnlp.nlp.en.transforms.tokens module¶
Transforms related to tokens for English
-
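lexnlp.nlp.en.transforms.tokens.get_bigram_distribution(text: str, lowercase=False, stopword=False) → Dict[str, int]¶ Get bigram distribution from text.
- Parameters
text – input text to analyze
lowercase – whether to lowercase the text before counting (default False)
stopword – whether to remove stopwords before counting (default False)
- Returns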
-
lexnlp.nlp.en.transforms.tokens.get_ngram_distribution(text: str, n: int, lowercase=False, stopword=False) → Dict[str, int]¶ Get n-gram distribution of text, potentially lowercasing and stopwording first.
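As a rough illustration of a token n-gram count, the sketch below uses only the standard library. It makes several assumptions: whitespace splitting stands in for the library's tokenizer, n-gram keys are formed by joining tokens with a single space so that they are strings, and `ngram_counts` is a hypothetical name. Bigrams and trigrams are the cases n = 2 and n = 3.

```python
from collections import Counter


def ngram_counts(text, n, lowercase=False):
    """Count token n-grams (hypothetical helper, not LexNLP code)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    if lowercase:
        text = text.lower()
    # Assumption: whitespace splitting stands in for a real tokenizer.
    tokens = text.split()
    # zip over n staggered views of the token list to form consecutive n-grams.
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(" ".join(gram) for gram in grams)


print(ngram_counts("to be or not to be", 2))
```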
-
lexnlp.nlp.en.transforms.tokens.get_skipgram_distribution(text: str, n: int, k: int, lowercase=False, stopword=False) → Dict[str, int]¶ Get skipgram distribution from text.
- Parameters
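text – input text to analyze
n – number of tokens in each skipgram
k – maximum number of tokens that may be skipped within a skipgram
lowercase – whether to lowercase the text before counting (default False)
stopword – whether to remove stopwords before counting (default False)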
- Returns
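For readers unfamiliar with skipgrams, the following sketch shows one common definition: a k-skip-n-gram is an ordered selection of n tokens in which at most k tokens in total are skipped. Whether LexNLP follows exactly this definition is not stated here, so treat it as an assumption. The name `skipgram_counts`, the whitespace tokenization, and the space-joined string keys are likewise illustrative.

```python
from collections import Counter
from itertools import combinations


def skipgram_counts(text, n, k, lowercase=False):
    """Count k-skip-n-grams over whitespace tokens (hypothetical helper)."""
    if n < 1 or k < 0:
        raise ValueError("n must be >= 1 and k must be >= 0")
    if lowercase:
        text = text.lower()
    tokens = text.split()
    counts = Counter()
    for i, head in enumerate(tokens):
        # The remaining n - 1 tokens come from the next n - 1 + k positions,
        # which limits the total number of skipped tokens to k.
        window = tokens[i + 1:i + n + k]
        for tail in combinations(window, n - 1):
            counts[" ".join((head,) + tail)] += 1
    return counts


print(skipgram_counts("a b c d", n=2, k=1))
```

With `k=0` the window contains exactly the next `n - 1` tokens, so the result reduces to ordinary n-grams.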
-
lexnlp.nlp.en.transforms.tokens.get_stem_distribution(text: str, lowercase=False, stopword=False) → Dict[str, int]¶ Get stemmed token distribution of text, potentially lowercasing and stopwording first.
-
lexnlp.nlp.en.transforms.tokens.get_token_distribution(text: str, lowercase=False, stopword=False) → Dict[str, int]¶ Get token distribution of text, potentially lowercasing and stopwording first.
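The effect of the `lowercase` and `stopword` flags can be sketched as follows. This is a simplified illustration, not the library's implementation: `token_counts` and the tiny `STOPWORDS` set are hypothetical, and whitespace splitting stands in for a real tokenizer.

```python
from collections import Counter

# Tiny illustrative stopword list; a real list would be much longer.
STOPWORDS = frozenset({"a", "an", "and", "of", "the", "to"})


def token_counts(text, lowercase=False, stopword=False):
    """Count tokens, optionally lowercasing and dropping stopwords first."""
    if lowercase:
        text = text.lower()
    # Assumption: whitespace splitting stands in for a real tokenizer.
    tokens = text.split()
    if stopword:
        # Compare case-insensitively so "The" is dropped even without lowercasing.
        tokens = [tok for tok in tokens if tok.lower() not in STOPWORDS]
    return Counter(tokens)


print(token_counts("The tenant and the landlord", lowercase=True, stopword=True))
```

Lowercasing merges case variants into one key, while stopword removal drops high-frequency function words that would otherwise dominate the counts.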
-
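lexnlp.nlp.en.transforms.tokens.get_trigram_distribution(text: str, lowercase=False, stopword=False) → Dict[str, int]¶ Get trigram distribution from text.
- Parameters
text – input text to analyze
lowercase – whether to lowercase the text before counting (default False)
stopword – whether to remove stopwords before counting (default False)
- Returns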