semantic_router.splitters.utils.split_to_sentences
- semantic_router.splitters.utils.split_to_sentences(text)
Split a given text into sentences using an enhanced regex pattern, which aims to be more accurate than a naive split on punctuation.
The enhanced regex pattern includes handling for:
  - Direct speech and quotations.
  - Abbreviations, initials, and acronyms.
  - Decimal numbers and dates.
  - Ellipses and other punctuation marks used in informal text.
  - Removal of control characters and format characters.
- Return type:
List[str]
- Args:
text (str): The text to split into sentences.
- Returns:
list: A list of sentences extracted from the text.
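The kind of handling described above can be illustrated with a deliberately simplified sketch. This is not the pattern used by semantic_router: the function name `simple_split_to_sentences`, the abbreviation list, and the regex below are all hypothetical, chosen only to show how a regex-based splitter might avoid breaking on abbreviations and decimal numbers and strip control and format characters.

```python
import re
import unicodedata
from typing import List

# Hypothetical abbreviation list for illustration only; the library's own
# handling of abbreviations is not documented here and is likely different.
_ABBREVIATIONS = {"Mr.", "Mrs.", "Dr.", "e.g.", "i.e."}


def simple_split_to_sentences(text: str) -> List[str]:
    """Simplified stand-in for a regex sentence splitter (an assumption,
    not the semantic_router implementation)."""
    # Turn newlines and tabs into spaces, then drop remaining Unicode
    # control (Cc) and format (Cf) characters such as zero-width spaces.
    cleaned = "".join(
        " " if ch in "\n\t\r" else ch
        for ch in text
        if ch in "\n\t\r" or unicodedata.category(ch) not in ("Cc", "Cf")
    )
    # Split after '.', '!' or '?' only when followed by whitespace and an
    # uppercase letter or an opening quote, so "3.50" is left intact.
    parts = re.split(r'(?<=[.!?])\s+(?=[A-Z"\'])', cleaned)

    sentences: List[str] = []
    for part in parts:
        part = part.strip()
        if not part:
            continue
        # Re-join a fragment that was cut right after a known abbreviation.
        if sentences and sentences[-1].split()[-1] in _ABBREVIATIONS:
            sentences[-1] += " " + part
        else:
            sentences.append(part)
    return sentences


result = simple_split_to_sentences(
    "Dr. Smith paid 3.50 dollars. Was it worth it? Yes!"
)
print(result)
# ['Dr. Smith paid 3.50 dollars.', 'Was it worth it?', 'Yes!']
```

Based on the signature above, the library function itself would be imported from `semantic_router.splitters.utils` and called as `split_to_sentences(text)`, returning a `List[str]`. Its output on a given input may differ from this sketch.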