Regular expressions (regex) are sequences of characters that define search patterns. They are incredibly powerful tools for searching, validating, and manipulating text. In Java, regex is used extensively through the Pattern and Matcher classes. For beginners, it is important to understand the various components of a regex, such as meta characters, quantifiers, and flags, and how these elements work together to form a pattern.
Before we create and use regular expressions, let's learn a few key terms used in building them.
Meta characters are special symbols that have specific meanings in a regex. They help control how the pattern matches text. Here’s a table summarizing some common meta characters along with explanations:
Metacharacter | Description | Example |
---|---|---|
\| | Alternation operator: matches any one of the patterns separated by \|. | cat\|dog\|fish matches "cat", "dog", or "fish". |
. | Matches any single character (except a newline). | a.c matches "abc", "aXc", etc. |
^ | Anchors the match at the beginning of a string or line. | ^Hello matches any string that starts with "Hello". |
$ | Anchors the match at the end of a string or line. | World$ matches any string that ends with "World". |
\d | Matches any digit (0-9). | \d matches "5" in "a5b". |
\s | Matches any whitespace character (space, tab, newline). | \s matches a space in "Hello World". |
\b | Matches a word boundary (the position between a word and a non-word character). | \bword\b matches "word" as a whole word. |
\uxxxx | Matches the Unicode character specified by the hexadecimal number xxxx . | \u0041 matches "A". |
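The rows in the table above can be tried out directly in Java. The following is a minimal sketch, not a definitive implementation: the class name MetaCharacterDemo and the helper found are illustrative names that do not come from the lesson. One detail worth noting is that a backslash inside a Java string literal must itself be escaped, so the regex \d is written as "\\d" in source code.

```java
import java.util.regex.Pattern;

// Illustrative class; the name and helper are assumptions made for this sketch.
class MetaCharacterDemo {

    // Returns true if the pattern matches anywhere inside the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // Alternation: any one of the listed alternatives may match.
        System.out.println(found("cat|dog|fish", "hot dog"));    // true

        // Dot: any single character between 'a' and 'c'.
        System.out.println(found("a.c", "aXc"));                 // true

        // Anchors: start and end of the input.
        System.out.println(found("^Hello", "Hello World"));      // true
        System.out.println(found("World$", "Hello World"));      // true

        // Digit and whitespace classes (note the doubled backslashes).
        System.out.println(found("\\d", "a5b"));                 // true
        System.out.println(found("\\s", "Hello World"));         // true

        // Word boundaries: "word" must stand alone as a whole word.
        System.out.println(found("\\bword\\b", "a word here"));  // true
        System.out.println(found("\\bword\\b", "swordfish"));    // false

        // Unicode escape for the letter A, passed through to the regex engine.
        System.out.println(found("\\u0041", "A"));               // true
    }
}
```

The helper uses find(), which looks for a match anywhere in the input. That is why the anchors ^ and $ are needed to pin a pattern to the start or end of the text.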
Quantifiers define the number of times a character or group should appear in the input for a match to be valid. The table below explains the most common quantifiers:
Quantifier | Description | Example |
---|---|---|
n+ | Matches one or more occurrences of the preceding element n . | a+ matches "a", "aa", "aaa", etc. |
n* | Matches zero or more occurrences of the preceding element n . | a* matches "", "a", "aa", "aaa", etc. |
n? | Matches zero or one occurrence of the preceding element n . | a? matches "" or "a". |
n{x} | Matches exactly x occurrences of the preceding element n . | a{3} matches "aaa". |
n{x,y} | Matches between x and y occurrences (inclusive) of the preceding element n . | a{2,4} matches "aa", "aaa", or "aaaa". |
n{x,} | Matches at least x occurrences of the preceding element n . | a{2,} matches "aa", "aaa", etc. |
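The quantifiers above can be checked with a short Java sketch. It assumes whole-string matching through Pattern.matches, so that each quantifier's limits are visible. The class name QuantifierDemo and the helper matchesWhole are hypothetical names chosen for illustration.

```java
import java.util.regex.Pattern;

// Illustrative class; the name and helper are assumptions made for this sketch.
class QuantifierDemo {

    // Pattern.matches succeeds only if the entire input matches the pattern.
    static boolean matchesWhole(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // + requires at least one occurrence.
        System.out.println(matchesWhole("a+", "aaa"));        // true
        System.out.println(matchesWhole("a+", ""));           // false

        // * also accepts zero occurrences.
        System.out.println(matchesWhole("a*", ""));           // true

        // ? accepts zero or one occurrence, but not two.
        System.out.println(matchesWhole("a?", "a"));          // true
        System.out.println(matchesWhole("a?", "aa"));         // false

        // {x} requires exactly x occurrences.
        System.out.println(matchesWhole("a{3}", "aaa"));      // true
        System.out.println(matchesWhole("a{3}", "aa"));       // false

        // {x,y} accepts between x and y occurrences, inclusive.
        System.out.println(matchesWhole("a{2,4}", "aaaa"));   // true
        System.out.println(matchesWhole("a{2,4}", "aaaaa"));  // false

        // {x,} accepts x or more occurrences.
        System.out.println(matchesWhole("a{2,}", "aaaaa"));   // true
        System.out.println(matchesWhole("a{2,}", "a"));       // false
    }
}
```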
Flags, also known as modifiers, adjust the default behavior of a regex pattern. They can be embedded in the pattern itself or passed as parameters when compiling the regex. Here is a summary of common flags:
Flag/Modifier | Description | Example |
---|---|---|
(?i) or Pattern.CASE_INSENSITIVE | Enables case-insensitive matching, so uppercase and lowercase letters are treated as equal. | (?i)cat matches "Cat", "cAt", "CAT", etc. |
(?m) or Pattern.MULTILINE | Changes the behavior of ^ and $ so that they match the start and end of each line, respectively. | Useful when working with multi-line text. |
(?s) or Pattern.DOTALL | Allows the dot . to match newline characters, making it match any character, including line terminators. | Enables . to match across lines. |
(?x) or Pattern.COMMENTS | Permits whitespace and comments within the regex pattern, which are ignored, thus enhancing readability. | Helps in writing complex regex patterns with comments. |
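To see what each flag changes, the sketch below runs the same pattern with and without the flag. It is a minimal illustration under stated assumptions: the class name FlagDemo, the helper found, and the sample inputs are invented for this example. Passing 0 as the flags argument means that no flags are set.

```java
import java.util.regex.Pattern;

// Illustrative class; the name, helper, and sample inputs are assumptions.
class FlagDemo {

    // Compiles the regex with the given flags and searches the input for a match.
    static boolean found(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).find();
    }

    public static void main(String[] args) {
        String twoLines = "first\nsecond";

        // Case-insensitive matching: embedded flag or compile-time constant.
        System.out.println(found("cat", 0, "CAT"));                              // false
        System.out.println(found("(?i)cat", 0, "CAT"));                          // true
        System.out.println(found("cat", Pattern.CASE_INSENSITIVE, "cAt"));       // true

        // MULTILINE: ^ also matches at the start of each line.
        System.out.println(found("^second", 0, twoLines));                       // false
        System.out.println(found("^second", Pattern.MULTILINE, twoLines));       // true

        // DOTALL: the dot also matches the newline between the two lines.
        System.out.println(found("first.second", 0, twoLines));                  // false
        System.out.println(found("first.second", Pattern.DOTALL, twoLines));     // true

        // COMMENTS: whitespace and # comments inside the pattern are ignored.
        String commented =
                "\\d{3}  # three digits\n"
              + "-       # separator\n"
              + "\\d{4}  # four digits";
        System.out.println(found(commented, Pattern.COMMENTS, "555-1234"));      // true
    }
}
```

Several flags can be combined with the bitwise OR operator, for example Pattern.CASE_INSENSITIVE | Pattern.MULTILINE.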
Simple 10-Digit Number (Digits Only):
^\d{10}$
^ asserts the start of the string.
\d{10} matches exactly 10 digits.
$ asserts the end of the string.
Formatted with Hyphens (e.g., 123-456-7890):
^\d{3}-\d{3}-\d{4}$
\d{3} matches exactly 3 digits, followed by a hyphen (-).
Basic Email Address:
^[A-Za-z0-9+_.-]+@[A-Za-z0-9.-]+$
^ asserts the start of the string.
[A-Za-z0-9+_.-]+ matches one or more allowed characters (letters, digits, plus, underscore, dot, or hyphen) before the @ symbol.
@ matches the literal character @.
[A-Za-z0-9.-]+ matches one or more allowed characters for the domain.
$ asserts the end of the string.
Regular expressions are versatile tools for text processing in Java. By understanding the roles of meta characters, quantifiers, and flags, you can build complex patterns to validate and manipulate text.
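To tie the lesson together, the phone-number and email patterns above can be compiled and tried in Java. This is a minimal sketch rather than a production validator: the class name FormatValidator, its method names, and the sample inputs are hypothetical. The email pattern is deliberately simple and accepts some addresses that a stricter check would reject, such as a domain without a dot.

```java
import java.util.regex.Pattern;

// Illustrative class; all names here are assumptions made for this sketch.
class FormatValidator {

    // Patterns are compiled once and reused. Backslashes are doubled in Java strings.
    private static final Pattern PLAIN_PHONE =
            Pattern.compile("^\\d{10}$");
    private static final Pattern HYPHENATED_PHONE =
            Pattern.compile("^\\d{3}-\\d{3}-\\d{4}$");
    private static final Pattern SIMPLE_EMAIL =
            Pattern.compile("^[A-Za-z0-9+_.-]+@[A-Za-z0-9.-]+$");

    // True if the input is exactly ten digits.
    static boolean isPlainPhone(String input) {
        return PLAIN_PHONE.matcher(input).matches();
    }

    // True if the input has the form 123-456-7890.
    static boolean isHyphenatedPhone(String input) {
        return HYPHENATED_PHONE.matcher(input).matches();
    }

    // True if the input has allowed characters on both sides of a single @.
    static boolean isSimpleEmail(String input) {
        return SIMPLE_EMAIL.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isPlainPhone("1234567890"));         // true
        System.out.println(isPlainPhone("12345"));              // false
        System.out.println(isHyphenatedPhone("123-456-7890"));  // true
        System.out.println(isHyphenatedPhone("1234567890"));    // false
        System.out.println(isSimpleEmail("user@example.com"));  // true
        System.out.println(isSimpleEmail("user.example.com"));  // false
    }
}
```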
In the next lesson, we will learn how to use regular expressions for pattern matching and searching.