Regex Sets and Ranges

Summary: in this tutorial, you’ll learn how to use regex sets and ranges to create regular expressions that match any one of a set of characters.

Sets

A set is one or more characters specified in square brackets. For example:

[abc]

Since a set matches any single character listed in the square brackets, the [abc] set matches the character a, b, or c.

The following example uses a set to match the string Jill or Hill:

<?php

$pattern = '/[JH]ill/';
$title = 'Jack and Jill Went Up the Hill';

if (preg_match_all($pattern, $title, $matches)) {
    print_r($matches[0]);
}

Output:

Array
(
    [0] => Jill
    [1] => Hill
)

In this example, the set [JH] matches the character J or H. Therefore, the regular expression /[JH]ill/ matches Jill and Hill.

Ranges

Suppose you want to match many characters in a set, e.g., every letter from a to z. Listing all of these characters inside the square brackets would be tedious and error-prone.

Ranges allow you to specify a range of characters. For example, the range [a-z] matches any lowercase letter from a to z.

Also, you can specify multiple ranges inside the square brackets. For example, [a-z0-9] matches any lowercase letter from a to z or any digit from 0 to 9.
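As a minimal sketch of combining two ranges in one set, the following snippet uses [a-z0-9] to pick the lowercase letters and digits out of a string. The sample string and variable names are illustrative assumptions.

```php
<?php

// Hypothetical sample input, chosen only for illustration.
$pattern = '/[a-z0-9]/';
$text = 'PHP 8 Regex';

// preg_match_all() collects every single-character match into $matches[0].
if (preg_match_all($pattern, $text, $matches)) {
    print_r($matches[0]);
}
```

Because the uppercase letters and the spaces fall outside both ranges, the matches are 8, e, g, e, and x.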

Similarly, when matching ASCII text, [a-zA-Z0-9_] is the same as the \w character class, and [0-9] is the same as \d.
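One way to check this equivalence is to run each range and its shorthand class against the same string and compare the results. This is only a sketch: the sample string and variable names are assumptions, and the comparison is meant for plain ASCII input.

```php
<?php

// Hypothetical sample input, chosen only for illustration (ASCII only).
$text = 'user_42 logged in at 09:15';

// Word characters: explicit ranges versus the \w shorthand.
preg_match_all('/[a-zA-Z0-9_]/', $text, $wordRange);
preg_match_all('/\w/', $text, $wordClass);

// Digits: explicit range versus the \d shorthand.
preg_match_all('/[0-9]/', $text, $digitRange);
preg_match_all('/\d/', $text, $digitClass);

// Each comparison checks that both patterns found the same characters.
var_dump($wordRange[0] === $wordClass[0]);
var_dump($digitRange[0] === $digitClass[0]);
```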

Negate sets and ranges

To negate a set or range, place the caret character (^) at the beginning of the set or range, right after the opening square bracket. For example, [^0-9] matches any character except a digit. It is the same as \D.

Notice that the caret (^) is also an anchor that matches the beginning of a string. Inside the square brackets, however, a leading caret (^) acts as a negation operator, not an anchor.
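To make the two roles of the caret concrete, the sketch below applies it once outside and once inside the square brackets. The sample string and variable names are illustrative assumptions.

```php
<?php

// Hypothetical sample input, chosen only for illustration.
$text = 'Hello';

// Outside the brackets, ^ anchors the match to the start of the string.
preg_match_all('/^H/', $text, $anchored);

// Inside the brackets, a leading ^ negates the set.
preg_match_all('/[^H]/', $text, $negated);

print_r($anchored[0]); // only the H at the start of the string
print_r($negated[0]);  // every character except H
```

The anchored pattern finds the single H at the start, while the negated set finds e, l, l, and o.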

The following example uses the caret (^) to negate the set [aeoiu], so the pattern matches every character that is not a lowercase vowel. In the string 'Hello', these are the consonants:

<?php

$pattern = '/[^aeoiu]/';
$title = 'Hello';

if (preg_match_all($pattern, $title, $matches)) {
    print_r($matches[0]);
}

Output:

Array
(
    [0] => H
    [1] => l
    [2] => l
)

Summary

  • A set matches any single character specified in the square brackets.
  • A range matches any character in a range of characters.
  • To negate a set or range, place the caret character (^) right after the opening square bracket: [^...].