RegExPlus

POSIX character classes

Java already supports POSIX character classes through the \p operator. RegExPlus adds support for the POSIX bracket expression form, [:class:].

\p{class}

Java already has support for POSIX character classes via the \p operator.

For example, 0x\p{XDigit}++ matches a hex number such as 0xFF.
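
The hex-number pattern above can be tried with the standard java.util.regex package, with no RegExPlus involved. The following is a minimal sketch; the class name XDigitDemo and the helper isHexLiteral are illustrative names, not part of any library.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class XDigitDemo {
    // "0x" followed by one or more hex digits; "++" is a possessive quantifier.
    static final Pattern HEX = Pattern.compile("0x\\p{XDigit}++");

    // Hypothetical helper: true if the whole string is a hex literal.
    static boolean isHexLiteral(String s) {
        return HEX.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isHexLiteral("0xFF")); // true
        System.out.println(isHexLiteral("0xG1")); // false, G is not a hex digit

        // find() locates the literal inside a longer string.
        Matcher m = HEX.matcher("mask = 0x1fz");
        if (m.find()) {
            System.out.println(m.group()); // 0x1f
        }
    }
}
```

Because \p{XDigit} is already a complete class, it needs no surrounding square brackets here.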

Supported class names
  • \p{Lower} A lower-case alphabetic character: [a-z]
  • \p{Upper} An upper-case alphabetic character: [A-Z]
  • \p{ASCII} All ASCII: [\x00-\x7F]
  • \p{Alpha} An alphabetic character: [\p{Lower}\p{Upper}]
  • \p{Digit} A decimal digit: [0-9]
  • \p{Alnum} An alphanumeric character: [\p{Alpha}\p{Digit}]
  • \p{Punct} Punctuation: One of !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
  • \p{Graph} A visible character: [\p{Alnum}\p{Punct}]
  • \p{Print} A printable character: [\p{Graph}\x20]
  • \p{Blank} A space or a tab: [ \t]
  • \p{Cntrl} A control character: [\x00-\x1F\x7F]
  • \p{XDigit} A hexadecimal digit: [0-9a-fA-F]
  • \p{Space} A whitespace character: [ \t\n\x0B\f\r]

[:class:]

RegExPlus adds support for POSIX classes in the form [:class:]. This is the same form supported by PCRE and Ruby.

For example, 0x[[:xdigit:]]++ matches a hex number such as 0xFF.

Supported class names
  • [:lower:] A lower-case alphabetic character: [a-z]
  • [:upper:] An upper-case alphabetic character: [A-Z]
  • [:ascii:] All ASCII: [\x00-\x7F]
  • [:alpha:] An alphabetic character: [[:lower:][:upper:]]
  • [:digit:] A decimal digit: [0-9]
  • [:alnum:] An alphanumeric character: [[:alpha:][:digit:]]
  • [:punct:] Punctuation: One of !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
  • [:graph:] A visible character: [[:alnum:][:punct:]]
  • [:print:] A printable character: [[:graph:]\x20]
  • [:blank:] A space or a tab: [ \t]
  • [:cntrl:] A control character: [\x00-\x1F\x7F]
  • [:xdigit:] A hexadecimal digit: [0-9a-fA-F]
  • [:space:] A whitespace character: [ \t\n\x0B\f\r]
  • [:word:] A word character: [\w]
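
The correspondence between the two lists can be sketched as a textual rewrite from [:class:] names to the \p{Class} forms that stock Java understands. This is only an illustration of the mapping, not how RegExPlus is implemented; the class PosixBracketSketch and its translate method are hypothetical, and the naive substitution ignores escapes, quoting, and whether the name actually sits inside a character class.

```java
import java.util.regex.Pattern;

final class PosixBracketSketch {
    // Each [:name:] paired with its java.util.regex equivalent, per the lists above.
    private static final String[][] CLASSES = {
        {"lower", "\\p{Lower}"}, {"upper", "\\p{Upper}"}, {"ascii", "\\p{ASCII}"},
        {"alpha", "\\p{Alpha}"}, {"digit", "\\p{Digit}"}, {"alnum", "\\p{Alnum}"},
        {"punct", "\\p{Punct}"}, {"graph", "\\p{Graph}"}, {"print", "\\p{Print}"},
        {"blank", "\\p{Blank}"}, {"cntrl", "\\p{Cntrl}"}, {"xdigit", "\\p{XDigit}"},
        {"space", "\\p{Space}"}, {"word", "\\w"},
    };

    // Hypothetical helper: replace every literal "[:name:]" with its \p form.
    static String translate(String regex) {
        String out = regex;
        for (String[] entry : CLASSES) {
            out = out.replace("[:" + entry[0] + ":]", entry[1]);
        }
        return out;
    }

    public static void main(String[] args) {
        String translated = translate("0x[[:xdigit:]]++");
        System.out.println(translated); // 0x[\p{XDigit}]++
        System.out.println(Pattern.matches(translated, "0xFF")); // true
    }
}
```

Note that the outer square brackets survive the rewrite, so the result is still a character class containing the named class.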

[:class:] outside of a character class

Note that unlike the \p{class} syntax, the [:class:] syntax is only allowed within a character class. For example, if you accidentally omit the enclosing character class, as in 0x[:xdigit:]++, a PatternSyntaxException is thrown.
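
To make the rule concrete, here is a rough sketch of the kind of check that could reject a misplaced [:class:]. It is an assumption-laden illustration, not RegExPlus's actual code: the class BracketPlacementCheck and its check method are hypothetical, and the scan handles only backslash escapes and bracket nesting, not \Q...\E quoting or other corner cases.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

final class BracketPlacementCheck {
    // Matches a POSIX bracket expression such as [:xdigit:].
    private static final Pattern POSIX_NAME = Pattern.compile("\\[:\\p{Alpha}+:\\]");

    // Hypothetical check: throw if a [:class:] appears outside a character class.
    static void check(String regex) {
        int depth = 0; // nesting depth of enclosing character classes
        for (int i = 0; i < regex.length(); i++) {
            char c = regex.charAt(i);
            if (c == '\\') {
                i++; // skip the escaped character
            } else if (c == '[') {
                Matcher m = POSIX_NAME.matcher(regex).region(i, regex.length());
                if (m.lookingAt()) {
                    if (depth == 0) {
                        throw new PatternSyntaxException(
                            "POSIX class outside of a character class", regex, i);
                    }
                    i = m.end() - 1; // jump past the whole [:name:]
                } else {
                    depth++; // an ordinary character class opens
                }
            } else if (c == ']' && depth > 0) {
                depth--;
            }
        }
    }

    public static void main(String[] args) {
        check("0x[[:xdigit:]]++"); // accepted
        try {
            check("0x[:xdigit:]++");
        } catch (PatternSyntaxException e) {
            System.out.println("rejected at index " + e.getIndex()); // index 2
        }
    }
}
```

An early error is helpful here because the standard java.util.regex.Pattern reads [:xdigit:] as an ordinary class of the literal characters between the brackets, so the mistake would otherwise go unnoticed.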