Notes:
Regular expressions, also known as regex, are patterns written as sequences of literal characters and special symbols (metacharacters) that describe which strings to match in text. They are commonly used for text manipulation and data extraction tasks such as searching, replacing, and validating text. In text normalization, they find and then remove or replace characters or character patterns that are not needed or that may cause problems when the text is analyzed or processed, for example special characters, extra whitespace, or numbers. They can also match structured patterns such as email addresses, phone numbers, and dates in order to extract that information.
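As a rough illustration of the two uses described above (normalizing and extracting), here is a minimal sketch using Python's standard `re` module. The function names `normalize` and `extract`, and the specific patterns, are assumptions made for this example; the email and date patterns are deliberately simplified and will not cover every valid format.

```python
import re

# Simplified, illustrative patterns (assumptions, not complete validators).
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
DATE_RE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")  # ISO-style dates only, e.g. 2024-01-31


def normalize(text: str) -> str:
    """Lowercase the text, drop special characters and digits, collapse whitespace."""
    text = text.lower()
    # Replace every character that is not a lowercase letter or whitespace with a space.
    text = re.sub(r"[^a-z\s]", " ", text)
    # Collapse runs of whitespace into a single space.
    text = re.sub(r"\s+", " ", text)
    return text.strip()


def extract(text: str) -> dict:
    """Pull out email addresses and ISO-style dates found in the text."""
    return {
        "emails": EMAIL_RE.findall(text),
        "dates": DATE_RE.findall(text),
    }


if __name__ == "__main__":
    print(normalize("Hello,   World! 123"))
    print(extract("Contact bob@example.com by 2024-01-31."))
```

Extraction is usually run on the raw text, before normalization, since a normalization step like this one would strip the digits and punctuation that the email and date patterns rely on.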
See also:
Text Normalization & Dialog Systems