Java-Pattern and Matcher

Regular Expression Processing

The java.util.regex package supports regular expression processing. A regular expression is a string of characters that describes a character sequence. This general description, called a pattern, can then be used to find matches in other character sequences. Regular expressions can specify wildcard characters, sets of characters, and various quantifiers. Thus, you can specify a regular expression that represents a general form that can match several different specific character sequences. There are two classes that support regular expression processing: Pattern and Matcher. These classes work together. Use Pattern to define a regular expression. Match the pattern against another sequence using Matcher.

Pattern and Matcher

The Pattern class defines no constructors. Instead, a pattern is created by calling the compile(String pattern) factory method. Here, pattern is the regular expression that you want to use. The compile( ) method transforms the string in pattern into a pattern that can be used for pattern matching by the Matcher class. It returns a Pattern object that contains the pattern.

Once you have created a Pattern object, you will use it to create a Matcher. This is done by calling the matcher(CharSequence str) factory method. Here, str is the character sequence that the pattern will be matched against. This is called the input sequence. CharSequence is an interface that defines a read-only set of characters. It is implemented by the String class, among others. Thus, you can pass a string to matcher( ).

The Matcher class has no constructors. Instead, you create a Matcher by calling the matcher( ) factory method defined by Pattern, as just explained. Once you have created a Matcher, you will use its methods to perform various pattern matching operations. The simplest pattern matching method is matches( ), which simply determines whether the character sequence matches the pattern. It returns true if the sequence and the pattern match, and false otherwise. Understand that the entire sequence must match the pattern, not just a subsequence of it.
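The steps described so far can be sketched as follows. This is a minimal illustration, not a definitive implementation: the class name WholeSequenceSketch and the helper matchesWhole( ) are hypothetical names introduced only for this example. It shows that matcher( ) accepts any CharSequence (here, both a String and a StringBuilder) and that matches( ) succeeds only when the entire input sequence matches.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WholeSequenceSketch {

    // Hypothetical helper: compiles regex, creates a Matcher for input,
    // and reports whether the ENTIRE input sequence matches.
    static boolean matchesWhole(String regex, CharSequence input) {
        Pattern pat = Pattern.compile(regex);   // factory method, no constructor
        Matcher mat = pat.matcher(input);       // factory method, no constructor
        return mat.matches();
    }

    public static void main(String[] args) {
        // A StringBuilder is also a CharSequence, so it can be passed to matcher().
        StringBuilder builder = new StringBuilder("Java").append("FX");

        System.out.println(matchesWhole("Java", "Java"));     // true
        System.out.println(matchesWhole("Java", "JavaFX"));   // false: only a subsequence matches
        System.out.println(matchesWhole("JavaFX", builder));  // true
    }
}
```

The second call returns false even though “Java” occurs inside “JavaFX”, because matches( ) requires the whole sequence to match.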

Regular Expression Syntax

Before demonstrating Pattern and Matcher, it is necessary to explain how to construct a regular expression. Although no single rule is complicated, there are a large number of them. A few of the more commonly used constructs are described here. In general, a regular expression is composed of normal characters, character classes (sets of characters), wildcard characters, and quantifiers.

A normal character: A normal character is matched as-is. Thus, if a pattern consists of “xy”, then the only input sequence that will match it is “xy”. Characters such as newline and tab are specified using the standard escape sequences, which begin with a \. For example, a newline is specified by \n. In the language of regular expressions, a normal character is also called a literal.
A character class: A character class is a set of characters. A character class is specified by putting the characters in the class between brackets. For example, the class [wxyz] matches w, x, y, or z. To specify an inverted set, precede the characters with a ^. For example, [^wxyz] matches any character except w, x, y, or z. You can specify a range of characters using a hyphen. For example, to specify a character class that will match the digits 1 through 9, use [1-9].
The wildcard character: The wildcard character is the . (dot), and it matches any single character (by default, any character except a line terminator). Thus, a pattern that consists of “.” will match these (and other) input sequences: “A”, “a”, “x”, and so on.
A quantifier: A quantifier determines how many times an expression is matched. The quantifiers are shown here:

+    Match one or more.
*    Match zero or more.
?    Match zero or one.

For example, the pattern “x+” will match “x”, “xx”, and “xxx”, among others. One other point: In general, if you specify an invalid expression, a PatternSyntaxException will be thrown.
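The constructs above can be tried out with a short sketch such as the following. The class name RegexSyntaxSketch and the helpers matchesWhole( ) and isValid( ) are hypothetical names chosen for illustration; only Pattern, Matcher, and PatternSyntaxException come from java.util.regex. Each call uses matches( ), so the whole input must match the pattern.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class RegexSyntaxSketch {

    // Hypothetical helper: true only if the entire input matches regex.
    static boolean matchesWhole(String regex, CharSequence input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    // Hypothetical helper: true if regex compiles, false if compile()
    // throws PatternSyntaxException.
    static boolean isValid(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Literal (normal) characters
        System.out.println(matchesWhole("xy", "xy"));       // true

        // Character classes
        System.out.println(matchesWhole("[wxyz]", "y"));    // true
        System.out.println(matchesWhole("[^wxyz]", "y"));   // false: y is excluded
        System.out.println(matchesWhole("[1-9]", "0"));     // false: 0 is outside the range

        // Wildcard
        System.out.println(matchesWhole(".", "A"));         // true
        System.out.println(matchesWhole(".", "AB"));        // false: dot matches one character

        // Quantifiers
        System.out.println(matchesWhole("x+", "xxx"));      // true: one or more
        System.out.println(matchesWhole("x+", ""));         // false: needs at least one x
        System.out.println(matchesWhole("x*", ""));         // true: zero or more
        System.out.println(matchesWhole("x?", "xx"));       // false: at most one x

        // Invalid expression: the character class is never closed
        System.out.println(isValid("[wxyz"));               // false
    }
}
```

Catching PatternSyntaxException in a helper like isValid( ) is one possible design; because it is an unchecked exception, code that compiles only fixed, known-good patterns often does not catch it at all.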

Demonstrating Pattern Matching

The best way to understand how regular expression pattern matching operates is to work through some examples. The first example, shown in the program below, looks for a match with a literal pattern. The program begins by creating a pattern that contains the sequence “JavaFX”. Next, a Matcher is created for that pattern with the input sequence “JavaFX”. Then, the matches( ) method is called to determine whether the input sequence matches the pattern. Because the sequence and the pattern are the same, matches( ) returns true. Next, a new Matcher is created with the input sequence “JavaSwing” and matches( ) is called again. In this case, the pattern and the input sequence differ, and no match is found. Remember, the matches( ) method returns true only when the entire input sequence matches the pattern. It will not return true just because a subsequence matches.

Program Source

import java.util.regex.Pattern;
import java.util.regex.Matcher;

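public class Javaapp {

    public static void main(String[] args) {

        // Compile the literal pattern once; it can create any number of matchers.
        Pattern pat = Pattern.compile("JavaFX");

        // The first input sequence is identical to the pattern.
        Matcher mat = pat.matcher("JavaFX");

        if (mat.matches())
            System.out.println("1 : Matches");
        else
            System.out.println("1 : No Match");

        // The second input sequence differs from the pattern.
        mat = pat.matcher("JavaSwing");

        if (mat.matches())
            System.out.println("2 : Matches");
        else
            System.out.println("2 : No Match");
    }
}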

 
