Metadata-Version: 2.1
Name: kr_sentence
Version: 0.0.3
Summary: Light-weight sentence tokenizer for Korean.
Home-page: https://github.com/Rairye/kr-sentence
Author: Rairye
License: apache-2.0
Download-URL: https://github.com/Rairye/kr-sentence/archive/refs/tags/v0.0.3.tar.gz
Keywords: Korean,Sentence,Tokenizer
Platform: UNKNOWN
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: End Users/Desktop
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.1
Classifier: Programming Language :: Python :: 3.2
Classifier: Programming Language :: Python :: 3.3
Classifier: Programming Language :: Python :: 3.4
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Description-Content-Type: text/markdown
License-File: LICENSE.txt

A light-weight sentence tokenizer for Korean. 

Half-width punctuation is generally used in Korean, but this tokenizer also supports full-width punctuation. (For details about full-width punctuation in Korean, see https://www.w3.org/TR/klreq/.)
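To illustrate why both punctuation widths matter, here is a naive splitter in plain Python. This is only a sketch for comparison, not the library's algorithm: it splits on a hard-coded set of half-width (. ! ?) and full-width (。 ！ ？) sentence enders, and unlike a real tokenizer it will break on abbreviations, decimals, and quoted sentences.

```python
def naive_split(text):
    """Naively split text on half-width and full-width sentence enders.

    For illustration only; a real tokenizer handles many more cases.
    """
    enders = {".", "!", "?", "\u3002", "\uff01", "\uff1f"}  # . ! ? 。 ！ ？
    sentences = []
    current = []
    for ch in text:
        current.append(ch)
        if ch in enders:
            # End of a sentence candidate: flush the buffer.
            sentences.append("".join(current).strip())
            current = []
    tail = "".join(current).strip()
    if tail:
        sentences.append(tail)
    return sentences

# Half-width and full-width periods both terminate sentences.
print(naive_split("저는 미국인이에요. 만나서 반갑습니다."))
print(naive_split("저는 미국인이에요。만나서 반갑습니다。"))
```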

Sample Code:

```python
from kr_sentence.tokenizer import tokenize

paragraph_str = "저는 미국인이에요. 만나서 반갑습니다."

sentence_list = tokenize(paragraph_str)

for sentence in sentence_list:
    print(sentence)
```


