Regular expressions in PHP
In this part of the PHP tutorial, we cover regular expressions in PHP.
Regular expressions are used for text searching and more advanced text manipulation. They are built into tools like grep and sed, text editors like vi and emacs, and programming languages like Tcl, Perl, and Python. PHP has built-in support for regular expressions too.
In PHP, there have been two modules for regular expressions: the POSIX Regex and the PCRE. The POSIX Regex is deprecated; its ereg functions were removed in PHP 7. In this chapter, we use PCRE in the examples. PCRE stands for Perl Compatible Regular Expressions.
Two things are needed when we work with regular expressions: Regex functions and the pattern.
A pattern is a regular expression that defines the text we are searching for or manipulating. It consists of text literals and metacharacters. The pattern is placed inside two delimiters. These are usually //, ##, or @@ characters. They inform the regex function where the pattern starts and ends.
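As a minimal sketch of how delimiters behave, the snippet below runs the same pattern with three delimiter pairs and then shows why a delimiter other than the slash can be convenient. The sample strings and variable names are illustrative assumptions, not part of the original examples.

```php
<?php

// The same pattern written with three different delimiter pairs.
$patterns = ["/Jane/", "#Jane#", "@Jane@"];
$results = [];

foreach ($patterns as $p) {
    // Each call returns 1, because the delimiters are not part of the regex.
    $results[$p] = preg_match($p, "I saw Jane.");
}

print_r($results);

// With # as the delimiter, slashes inside the pattern need no escaping.
$withHash = preg_match('#^/usr/bin/#', "/usr/bin/php");

// With / as the delimiter, every slash in the pattern must be escaped.
$withSlash = preg_match('/^\/usr\/bin\//', "/usr/bin/php");

echo "hash: $withHash, slash: $withSlash\n";
```

Both forms describe the same regular expression; choosing a delimiter that does not occur in the pattern simply keeps it easier to read.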
Here is a partial list of metacharacters used in PCRE.
. | Matches any single character. |
* | Matches the preceding element zero or more times. |
[ ] | Bracket expression. Matches a character within the brackets. |
[^ ] | Matches a single character that is not contained within the brackets. |
^ | Matches the starting position within the string. |
$ | Matches the ending position within the string. |
| | Alternation operator. |
PCRE functions
We list some PCRE regex functions. They all have a preg prefix.
preg_split() - splits a string by regex pattern
preg_match() - performs a regex match
preg_replace() - search and replace string by regex pattern
preg_grep() - returns array entries that match the regex pattern
Next we will have an example for each function.
php > print_r(preg_split("@\s@", "Jane\tKate\nLucy Marion"));
Array
(
    [0] => Jane
    [1] => Kate
    [2] => Lucy
    [3] => Marion
)
We have four names divided by whitespace: a tab, a newline, and a space. The \s is a shorthand character class which stands for any whitespace character. The preg_split() function returns the split strings in an array.
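The following sketch, which uses a made-up input string, shows one thing to watch for when splitting on whitespace: a single \s splits at every whitespace character, so runs of whitespace produce empty strings, while \s+ treats each run as one separator.

```php
<?php

// Names separated by runs of mixed whitespace (illustrative sample).
$text = "Jane  Kate\t\nLucy";

// A single \s splits at every whitespace character,
// so two adjacent whitespace characters yield an empty string.
$single = preg_split('/\s/', $text);

// \s+ treats each run of whitespace as one separator.
$runs = preg_split('/\s+/', $text);

print_r($single);
print_r($runs);
```

Here $single has five elements, two of them empty, and $runs has just the three names.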
php > echo preg_match("#[a-z]#", "s");
1
The preg_match() function checks whether the 's' character is in the character class [a-z]. The class stands for all lowercase letters from a to z. The function returns 1 for a match and 0 when there is no match.
php > echo preg_replace("/Jane/","Beky","I saw Jane. Jane was beautiful.");
I saw Beky. Beky was beautiful.
The preg_replace() function replaces all occurrences of the word 'Jane' with the word 'Beky'.
php > print_r(preg_grep("#Jane#", ["Jane", "jane", "Joan", "JANE"]));
Array
(
    [0] => Jane
)
The preg_grep() function returns an array of words that match the given pattern. In this example, only one word is returned in the array. This is because by default, the search is case sensitive.
php > print_r(preg_grep("#Jane#i", ["Jane", "jane", "Joan", "JANE"]));
Array
(
    [0] => Jane
    [1] => jane
    [3] => JANE
)
In this example, we perform a case insensitive grep. We put the i modifier after the right delimiter. The returned array now has three words. Note that the keys of the original array are preserved.
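Because preg_grep() keeps the keys of the input array, the result can have gaps in its indexes. The sketch below (variable names are illustrative) reindexes the result with array_values() and uses the PREG_GREP_INVERT flag to collect the entries that do not match.

```php
<?php

$names = ["Jane", "jane", "Joan", "JANE"];

// Keys 0, 1 and 3 are kept from the original array.
$matched = preg_grep("#Jane#i", $names);

// array_values() renumbers the result from zero.
$reindexed = array_values($matched);

// PREG_GREP_INVERT returns the entries that do NOT match.
$notMatched = preg_grep("#Jane#i", $names, PREG_GREP_INVERT);

print_r($reindexed);
print_r($notMatched);
```

Reindexing is only needed when later code relies on consecutive numeric keys, for example when looping with a counter.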
The dot metacharacter
The . (dot) metacharacter stands for any single character in the text. By default, it does not match a newline character.
single.php
<?php

$words = [ "Seven", "even", "Maven", "Amen", "Leven" ];

$pattern = "/.even/";

foreach ($words as $word) {

    if (preg_match($pattern, $word)) {
        echo "$word matches the pattern\n";
    } else {
        echo "$word does not match the pattern\n";
    }
}

?>
In the $words array, we have five words.
$pattern = "/.even/";
Here we define the search pattern. The pattern is a string. The regular expression is placed within delimiters. The delimiters are mandatory. In our case, we use forward slashes / / as delimiters. Note that we can use different delimiters if we want. The dot character stands for any single character.
if (preg_match($pattern, $word)) {
    echo "$word matches the pattern\n";
} else {
    echo "$word does not match the pattern\n";
}
We test whether each of the five words matches the pattern.
$ php single.php
Seven matches the pattern
even does not match the pattern
Maven does not match the pattern
Amen does not match the pattern
Leven matches the pattern
The Seven and Leven words match our search pattern. The word even does not match, because the dot requires one character before 'even'.
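One detail of the dot is worth a quick check: in PCRE it does not match a newline unless the s modifier is added. The following sketch uses an illustrative three-character string to show the difference.

```php
<?php

// The string consists of 'a', a newline, and 'c'.
$text = "a\nc";

// Without modifiers, the dot does not match the newline.
$default = preg_match('/a.c/', $text);

// The s modifier lets the dot match a newline as well.
$dotall = preg_match('/a.c/s', $text);

echo "default: $default, with s: $dotall\n";
```

This matters mostly when a pattern is applied to multi-line text.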
Anchors
Anchors match positions of characters inside a given text.
In the next example, we check whether a string is located at the beginning of a sentence.
anchors.php
<?php

$sentence1 = "Everywhere I look I see Jane";
$sentence2 = "Jane is the best thing that happened to me";

if (preg_match("/^Jane/", $sentence1)) {
    echo "Jane is at the beginning of the \$sentence1\n";
} else {
    echo "Jane is not at the beginning of the \$sentence1\n";
}

if (preg_match("/^Jane/", $sentence2)) {
    echo "Jane is at the beginning of the \$sentence2\n";
} else {
    echo "Jane is not at the beginning of the \$sentence2\n";
}

?>
We have two sentences. The pattern is ^Jane. The pattern checks if the 'Jane' string is located at the beginning of the text.
$ php anchors.php
Jane is not at the beginning of the $sentence1
Jane is at the beginning of the $sentence2
php > echo preg_match("#Jane$#", "I love Jane");
1
php > echo preg_match("#Jane$#", "Jane does not love me");
0
The Jane$ pattern matches a string in which the word Jane is at the end.
Exact word match
In the following examples we show how to look for exact word matches.
php > echo preg_match("/mother/", "mother");
1
php > echo preg_match("/mother/", "motherboard");
1
php > echo preg_match("/mother/", "motherland");
1
The mother pattern fits the words mother, motherboard and motherland. Say we want to look just for exact word matches. We use the aforementioned anchor characters ^ and $.
php > echo preg_match("/^mother$/", "motherland");
0
php > echo preg_match("/^mother$/", "Who is your mother?");
0
php > echo preg_match("/^mother$/", "mother");
1
Using the anchor characters, we get an exact word match for a pattern. Note that the whole string must then equal the word, which is why the sentence "Who is your mother?" does not match.
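The ^ and $ anchors compare the pattern against the whole string. When the word may appear inside a longer sentence, one option is the \b word boundary assertion, which PCRE also supports. The sketch below is an illustration with assumed variable names, not a replacement for the anchor approach.

```php
<?php

// \b matches at a position between a word character and a non-word character.
$pattern = '/\bmother\b/';

$subjects = ["mother", "motherland", "Who is your mother?"];
$results = [];

foreach ($subjects as $subject) {
    $results[$subject] = preg_match($pattern, $subject);
}

print_r($results);
```

The word motherland is rejected because the letter l directly follows mother, so there is no word boundary at that position.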
Quantifiers
A quantifier after a token or a group specifies how often that preceding element is allowed to occur.
?     - 0 or 1 match
*     - 0 or more
+     - 1 or more
{n}   - exactly n
{n,}  - n or more
{0,n} - n or fewer (older PCRE versions do not treat {,n} as a quantifier)
{n,m} - range n to m
The above is a list of common quantifiers.
The question mark ? indicates there is zero or one of the preceding element.
zeroorone.php
<?php

$words = [ "color", "colour", "comic", "colourful", "colored",
    "cosmos", "coloseum", "coloured", "colourful" ];

$pattern = "/colou?r/";

foreach ($words as $word) {

    if (preg_match($pattern, $word)) {
        echo "$word matches the pattern\n";
    } else {
        echo "$word does not match the pattern\n";
    }
}

?>
We have nine words in the $words array.
$pattern = "/colou?r/";
Color is used in American English, colour in British English. This pattern matches both cases.
$ php zeroorone.php
color matches the pattern
colour matches the pattern
comic does not match the pattern
colourful matches the pattern
colored matches the pattern
cosmos does not match the pattern
coloseum does not match the pattern
coloured matches the pattern
colourful matches the pattern
This is the output of the zeroorone.php script.
The * metacharacter matches the preceding element zero or more times.
zeroormore.php
<?php

$words = [ "Seven", "even", "Maven", "Amen", "Leven" ];

$pattern = "/.*even/";

foreach ($words as $word) {

    if (preg_match($pattern, $word)) {
        echo "$word matches the pattern\n";
    } else {
        echo "$word does not match the pattern\n";
    }
}

?>
In the above script, we have added the * metacharacter. The .* combination means zero or more occurrences of any character.
$ php zeroormore.php
Seven matches the pattern
even matches the pattern
Maven does not match the pattern
Amen does not match the pattern
Leven matches the pattern
Now the pattern matches three words: Seven, even and Leven.
php > print_r(preg_grep("#o{2}#", ["gool", "root", "foot", "dog"]));
Array
(
    [0] => gool
    [1] => root
    [2] => foot
)
The o{2} pattern matches strings that contain two consecutive 'o' characters.
php > print_r(preg_grep("#^\d{2,4}$#", ["1", "12", "123", "1234", "12345"]));
Array
(
    [1] => 12
    [2] => 123
    [3] => 1234
)
We have this ^\d{2,4}$ pattern. The \d is a character set; it stands for digits. The pattern matches numbers that have 2, 3, or 4 digits.
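To make the bounded quantifiers concrete, here is a small sketch with made-up sample strings. It writes the upper-bound-only form as {0,n}, and it shows that without anchors a quantifier such as o{2} only needs to find its match somewhere inside the string.

```php
<?php

$samples = ["", "o", "oo", "ooo"];

// {0,2}: between zero and two 'o' characters, and nothing else.
$atMostTwo = preg_grep('/^o{0,2}$/', $samples);

// {2,}: two or more 'o' characters, and nothing else.
$twoOrMore = preg_grep('/^o{2,}$/', $samples);

// Without anchors, o{2} is satisfied by any two 'o' characters in a row,
// even when the string contains more of them.
$unanchored = preg_match('/o{2}/', "goooal");

print_r($atMostTwo);
print_r($twoOrMore);
echo "unanchored: $unanchored\n";
```

Adding ^ and $ is what turns a quantifier into a statement about the length of the whole string.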
Alternation
The next example explains the alternation operator |. This operator enables us to create a regular expression with several choices.
alternation.php
<?php

$names = [ "Jane", "Thomas", "Robert", "Lucy", "Beky",
    "John", "Peter", "Andy" ];

$pattern = "/Jane|Beky|Robert/";

foreach ($names as $name) {

    if (preg_match($pattern, $name)) {
        echo "$name is my friend\n";
    } else {
        echo "$name is not my friend\n";
    }
}

?>
We have eight names in the $names array.
$pattern = "/Jane|Beky|Robert/";
This is the search pattern. The pattern looks for 'Jane', 'Beky', or 'Robert' strings.
$ php alternation.php
Jane is my friend
Thomas is not my friend
Robert is my friend
Lucy is not my friend
Beky is my friend
John is not my friend
Peter is not my friend
Andy is not my friend
This is the output of the script.
Subpatterns
We can use parentheses () to create subpatterns inside patterns.
php > echo preg_match("/book(worm)?$/", "bookworm");
1
php > echo preg_match("/book(worm)?$/", "book");
1
php > echo preg_match("/book(worm)?$/", "worm");
0
We have the following regex pattern: book(worm)?$. The (worm) is a subpattern. The ? character follows the subpattern, which means that the subpattern might appear zero or one time in the final pattern. The $ character is here for the exact end match of the string. Without it, words like bookstore and bookmania would match too.
php > echo preg_match("/book(shelf|worm)?$/", "book");
1
php > echo preg_match("/book(shelf|worm)?$/", "bookshelf");
1
php > echo preg_match("/book(shelf|worm)?$/", "bookworm");
1
php > echo preg_match("/book(shelf|worm)?$/", "bookstore");
0
Subpatterns are often used with alternation. The (shelf|worm) subpattern enables us to create several word combinations.
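Grouping also matters when alternation is combined with anchors. As a sketch with illustrative names, the comparison below contrasts an unanchored alternation, which accepts any string containing one of the choices, with a grouped and anchored one, which accepts only the exact names.

```php
<?php

// Matches any string that contains one of the three names.
$loose = '/Jane|Beky|Robert/';

// The group limits the alternation; the anchors apply to the whole group.
$strict = '/^(Jane|Beky|Robert)$/';

$names = ["Jane", "Janet", "Roberta", "Beky"];
$results = [];

foreach ($names as $name) {
    $results[$name] = [
        "loose"  => preg_match($loose, $name),
        "strict" => preg_match($strict, $name),
    ];
}

print_r($results);
```

Janet and Roberta pass the loose pattern because they contain Jane and Robert, but they fail the strict one.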
Character classes
We can combine characters into character classes with the square brackets. A character class matches any character that is specified in the brackets.
characterclass.php
<?php

$words = [ "sit", "MIT", "fit", "fat", "lot" ];

$pattern = "/[fs]it/";

foreach ($words as $word) {

    if (preg_match($pattern, $word)) {
        echo "$word matches the pattern\n";
    } else {
        echo "$word does not match the pattern\n";
    }
}

?>
We define a character set with two characters.
$pattern = "/[fs]it/";
This is our pattern. The [fs] is the character class. Note that we work only with one character at a time. We either consider f, or s, but not both.
$ php characterclass.php
sit matches the pattern
MIT does not match the pattern
fit matches the pattern
fat does not match the pattern
lot does not match the pattern
This is the outcome of the script.
We can also use shorthand metacharacters for character classes. The \w stands for word characters (letters, digits, and the underscore), \d for digits, and \s for whitespace characters.
shorthand.php
<?php

$words = [ "Prague", "111978", "terry2", "mitt##" ];

$pattern = "/\w{6}/";

foreach ($words as $word) {

    if (preg_match($pattern, $word)) {
        echo "$word matches the pattern\n";
    } else {
        echo "$word does not match the pattern\n";
    }
}

?>
In the above script, we test for words consisting of alphanumeric characters. The \w{6} stands for six consecutive word characters. Only the word mitt## does not match, because it has only four word characters in a row.
php > echo preg_match("#[^a-z]{3}#", "ABC");
1
The #[^a-z]{3}# pattern stands for three characters that are not in the class a-z. The "ABC" characters match the condition.
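php > print_r(preg_grep("#\d{2,4}#", [ "32", "234", "2345", "3d3", "2"]));
Array
(
    [0] => 32
    [1] => 234
    [2] => 2345
)
In the above example, we have a pattern that matches strings containing a run of two to four digits. Because the pattern is not anchored, a string with a longer run of digits would match too.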
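Two properties of the \w shorthand are easy to overlook: it also accepts the underscore, and an unanchored \w{6} is satisfied by any six word characters in a row inside a longer string. The following sketch, using made-up test words, compares the unanchored pattern with an anchored one.

```php
<?php

$unanchored = '/\w{6}/';   // six word characters somewhere in the string
$anchored   = '/^\w{6}$/'; // the whole string is exactly six word characters

$words = ["terry2", "under_score", "a_b_c_", "mitt##"];
$results = [];

foreach ($words as $word) {
    $results[$word] = [
        preg_match($unanchored, $word),
        preg_match($anchored, $word),
    ];
}

print_r($results);
```

The word under_score passes only the unanchored test, and a_b_c_ passes both because the underscore counts as a word character.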
Extracting matches
The preg_match() function takes an optional third parameter. If it is provided, it is filled with the results of the search. The variable is an array whose first element contains the text that matched the full pattern, the second element contains the first captured parenthesized subpattern, and so on.
extract_matches.php
<?php

$times = [ "10:10:22", "23:23:11", "09:06:56" ];

$pattern = "/(\d\d):(\d\d):(\d\d)/";

foreach ($times as $time) {

    $r = preg_match($pattern, $time, $match);

    if ($r) {

        echo "The $match[0] is split into:\n";

        echo "Hour: $match[1]\n";
        echo "Minute: $match[2]\n";
        echo "Second: $match[3]\n";
    }
}

?>
In the example, we extract parts of a time string.
$times = [ "10:10:22", "23:23:11", "09:06:56" ];
We have three time strings in English locale.
$pattern = "/(\d\d):(\d\d):(\d\d)/";
The pattern is divided into three subpatterns using parentheses. We want to refer to each of these parts separately.
$r = preg_match($pattern, $time, $match);
We pass a third parameter to the preg_match() function. In case of a match, it contains text parts of the matched string.
if ($r) {

    echo "The $match[0] is split into:\n";

    echo "Hour: $match[1]\n";
    echo "Minute: $match[2]\n";
    echo "Second: $match[3]\n";
}
The $match[0] contains the text that matched the full pattern, $match[1] contains text that matched the first subpattern, $match[2] the second, and $match[3] the third.
$ php extract_matches.php
The 10:10:22 is split into:
Hour: 10
Minute: 10
Second: 22
The 23:23:11 is split into:
Hour: 23
Minute: 23
Second: 11
The 09:06:56 is split into:
Hour: 09
Minute: 06
Second: 56
This is the output of the example.
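The pattern above checks only the shape of the string, so a value like 25:61:00 would also be accepted. As a minimal sketch, the hypothetical helper parse_time() below (not a PHP built-in) anchors the pattern and then checks the captured numbers against their valid ranges.

```php
<?php

// parse_time() is a hypothetical helper written for this illustration.
// It returns an array with the time parts, or null for invalid input.
function parse_time(string $time): ?array
{
    // The anchors make sure the whole string is a time, not just a part of it.
    if (!preg_match('/^(\d\d):(\d\d):(\d\d)$/', $time, $match)) {
        return null;
    }

    // Captured subpatterns are strings; convert them to integers.
    $hour   = (int) $match[1];
    $minute = (int) $match[2];
    $second = (int) $match[3];

    // The regex only checks the shape; the ranges are checked here.
    if ($hour > 23 || $minute > 59 || $second > 59) {
        return null;
    }

    return ["hour" => $hour, "minute" => $minute, "second" => $second];
}

var_dump(parse_time("09:06:56"));
var_dump(parse_time("25:61:00"));
```

Leaving the range check to ordinary code keeps the pattern short; encoding the ranges in the regex itself is possible but harder to read.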
Email example
Next we have a practical example. We create a regex pattern for checking email addresses.
emails.php
<?php

$emails = [ "luke@gmail.com", "andy@yahoocom", "34234sdfa#2345",
    "f344@gmail.com"];

# regular expression for emails
$pattern = "/^[a-zA-Z0-9._-]+@[a-zA-Z0-9-]+\.[a-zA-Z.]{2,18}$/";

foreach ($emails as $email) {

    if (preg_match($pattern, $email)) {
        echo "$email matches \n";
    } else {
        echo "$email does not match\n";
    }
}

?>
Note that this example provides only one solution. It does not have to be the best one.
$pattern = "/^[a-zA-Z0-9._-]+@[a-zA-Z0-9-]+\.[a-zA-Z.]{2,18}$/";
This is the pattern. The first ^ and the last $ characters are here to get an exact pattern match. No characters before and after the pattern are allowed. The email is divided into five parts. The first part is the local part. This is usually a name of a company, individual, or a nickname. The [a-zA-Z0-9._-]+ lists all possible characters we can use in the local part. They can be used one or more times. The second part is the literal @ character. The third part is the domain part. It is usually the domain name of the email provider, like yahoo or gmail. The [a-zA-Z0-9-]+ is a character set providing all characters that can be used in the domain name. The + quantifier requires one or more of these characters. The fourth part is the dot character. It is preceded by the escape character (\). This is because the dot character is a metacharacter and has a special meaning. By escaping it, we get a literal dot. The final part is the top level domain. The pattern is as follows: [a-zA-Z.]{2,18}
Top level domains can have from 2 to 18 characters, like sk, net, info, travel, cleaning, travelinsurance. The maximum length can be 63 characters, but most domains are shorter than 18 characters today. There is also a dot character. This is because some top level domains have two parts; for example co.uk.
$ php emails.php
luke@gmail.com matches
andy@yahoocom does not match
34234sdfa#2345 does not match
f344@gmail.com matches
This is the output of the emails.php example.
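For comparison, PHP also ships a built-in validator that can be used instead of a hand-written pattern: filter_var() with the FILTER_VALIDATE_EMAIL filter. The sketch below runs it on the same addresses. Its rules are not identical to the pattern above, so the two approaches may disagree on some addresses.

```php
<?php

$emails = ["luke@gmail.com", "andy@yahoocom", "34234sdfa#2345",
    "f344@gmail.com"];

$results = [];

foreach ($emails as $email) {
    // filter_var() returns the address on success and false on failure.
    $results[$email] = filter_var($email, FILTER_VALIDATE_EMAIL) !== false;
}

foreach ($results as $email => $ok) {
    echo $email, $ok ? " is accepted\n" : " is rejected\n";
}
```

Neither approach proves that a mailbox exists; both only check the form of the address.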
Recap
Finally, we provide a quick recap of the regex patterns.
Jane            the 'Jane' string
^Jane           'Jane' at the start of a string
Jane$           'Jane' at the end of a string
^Jane$          exact match of the string 'Jane'
[abc]           a, b, or c
[a-z]           any lowercase letter
[^A-Z]          any character that is not an uppercase letter
(Jane|Becky)    matches either 'Jane' or 'Becky'
[a-z]+          one or more lowercase letters
^[98]?$         digit 9, digit 8, or an empty string
([wx])([yz])    wy, wz, xy, or xz
[0-9]           any digit
[^A-Za-z0-9]    any symbol (not a number or a letter)
In this chapter, we have covered regular expressions in PHP.