Java-Pattern Flags

Pattern Flags

The compile(String regex, int flags) overload of the Pattern.compile() method enables you to control more closely how the pattern is applied when looking for a match. The second argument is a value of type int that specifies one or more of the following flags, which are defined in the Pattern class:

CASE_INSENSITIVE Matches ignoring case, but assumes only US-ASCII characters are being matched.

DOTALL Makes the expression . match any character, including line terminators. By default, . does not match line terminators.

LITERAL Treats the input string that specifies the pattern as a sequence of literal characters. Metacharacters and escape sequences in that string are given no special meaning.

COMMENTS Allows whitespace and comments in a pattern. Whitespace is ignored, and a comment starts with # and runs to the end of the line.

MULTILINE Enables ^ and $ to match at the beginning and end of each line within the input. Without this flag, they match only at the beginning and end of the entire input sequence.

UNIX_LINES Enables UNIX lines mode, where only '\n' is recognized as a line terminator in the behavior of ., ^, and $.

UNICODE_CASE When this is specified in addition to CASE_INSENSITIVE, case-insensitive matching is consistent with the Unicode standard.

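UNICODE_CHARACTER_CLASS Enables the Unicode version of predefined character classes and POSIX character classes, so that, for example, \w also matches letters outside US-ASCII. This flag implies UNICODE_CASE.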

CANON_EQ Matches taking account of canonical equivalence of combined characters. For example, a character with a diacritic may be represented either as a single precomposed character or as a base character followed by a combining diacritic character. This flag treats the two representations as a match.
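The effect of a few of these flags can be seen by compiling the same expression with and without the flag. The following is a minimal sketch rather than a definitive reference; the class name FlagEffects, the helper countMatches, and the sample strings are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FlagEffects {

    // Hypothetical helper: counts how many times the regex, compiled with
    // the given flags, is found in the input.
    static int countMatches(String regex, int flags, String input) {
        Matcher m = Pattern.compile(regex, flags).matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "one\ntwo\nthree";

        // Without MULTILINE, ^ matches only at the start of the whole input.
        System.out.println(countMatches("^\\w+", 0, text));                 // 1
        // With MULTILINE, ^ also matches just after each line terminator.
        System.out.println(countMatches("^\\w+", Pattern.MULTILINE, text)); // 3

        // By default . does not match '\n'; DOTALL lets it do so.
        System.out.println(countMatches("one.two", 0, text));              // 0
        System.out.println(countMatches("one.two", Pattern.DOTALL, text)); // 1

        // LITERAL treats the . as an ordinary character, not a metacharacter.
        System.out.println(countMatches("a.c", 0, "abc a.c"));               // 2
        System.out.println(countMatches("a.c", Pattern.LITERAL, "abc a.c")); // 1
    }
}
```

Passing 0 as the flags argument is equivalent to calling compile(String regex) with no flags at all.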

All these flags are unique single-bit values within a value of type int, so you can combine them by ORing them together. For example, you can specify the CASE_INSENSITIVE and the UNICODE_CASE flags with the expression Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE. Simple addition, as in Pattern.CASE_INSENSITIVE + Pattern.UNICODE_CASE, produces the same value as long as each flag appears only once, so ORing is the safer choice.
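As a rough illustration of combining flags, the sketch below matches a lowercase accented letter against its uppercase form, first with CASE_INSENSITIVE alone and then with UNICODE_CASE ORed in. The class name CombineFlags and the helper matches are hypothetical names chosen for this example; the characters are written as Unicode escapes so the result does not depend on the source file encoding.

```java
import java.util.regex.Pattern;

class CombineFlags {

    // Hypothetical helper: true if the whole input matches the regex
    // when it is compiled with the given flags.
    static boolean matches(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).matches();
    }

    public static void main(String[] args) {
        String lower = "\u00e9"; // lowercase e with acute accent
        String upper = "\u00c9"; // uppercase E with acute accent

        // CASE_INSENSITIVE alone only folds case for US-ASCII characters.
        System.out.println(matches(lower, Pattern.CASE_INSENSITIVE, upper)); // false

        // ORing in UNICODE_CASE makes the case folding Unicode-aware.
        int both = Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE;
        System.out.println(matches(lower, both, upper)); // true

        // | and + agree when each flag is used exactly once.
        System.out.println(both == Pattern.CASE_INSENSITIVE + Pattern.UNICODE_CASE); // true

        // Repeating a flag is harmless with |, but + produces a different bit.
        int orTwice = Pattern.CASE_INSENSITIVE | Pattern.CASE_INSENSITIVE;
        int addTwice = Pattern.CASE_INSENSITIVE + Pattern.CASE_INSENSITIVE;
        System.out.println(orTwice == Pattern.CASE_INSENSITIVE);  // true
        System.out.println(addTwice == Pattern.CASE_INSENSITIVE); // false
    }
}
```

The last two lines show why ORing is the more robust habit: the result stays the same no matter how often a flag is repeated.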

Program

Program Source

import java.util.regex.Pattern;
import java.util.regex.Matcher;

public class Javaapp {
  
    public static void main(String[] args) {
        
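        // UNICODE_CHARACTER_CLASS makes \w match Unicode letters such as À,
        // not only the US-ASCII set [a-zA-Z_0-9].
        Pattern pat = Pattern.compile("\\w+", Pattern.UNICODE_CHARACTER_CLASS);
        Matcher mat = pat.matcher("ABCÀÁÂÃDEF");

        // Print each subsequence of the input that matches the pattern.
        int i = 0;
        while (mat.find()) {
            i++;
            System.out.println("Match " + i + ": " + mat.group());
        }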
    }
}
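To see what the flag changes, it can help to run the same expression with and without UNICODE_CHARACTER_CLASS. The sketch below does that for the input used in the program; the class name WordClassComparison and the helper findAll are illustrative names, and the accented letters are written as Unicode escapes to avoid any dependence on source file encoding.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordClassComparison {

    // Hypothetical helper: collects every subsequence of the input that
    // matches the regex when compiled with the given flags.
    static List<String> findAll(String regex, int flags, String input) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex, flags).matcher(input);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // Same text as the program: ABC, four accented capital A's, then DEF.
        String input = "ABC\u00c0\u00c1\u00c2\u00c3DEF";

        // Default \w is [a-zA-Z_0-9], so the accented letters split the input.
        List<String> ascii = findAll("\\w+", 0, input);
        System.out.println(ascii.size() + " matches: " + ascii); // 2 matches: [ABC, DEF]

        // With the flag, \w covers Unicode letters and the whole input matches once.
        List<String> unicode = findAll("\\w+", Pattern.UNICODE_CHARACTER_CLASS, input);
        System.out.println(unicode.size() + " match, equal to input: "
                + unicode.get(0).equals(input)); // 1 match, equal to input: true
    }
}
```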
