This article covers PHP’s regular expression functions and their pattern modifiers. The functions in question are the Perl-compatible (PCRE) ones, all of which carry the preg prefix. preg_match() is probably the most basic of them, and several examples in this article use it.
Regular expressions provide versatile ways of searching, or ‘matching’, text. A string called a pattern plays the role of a search phrase, and in this case the patterns use essentially the same syntax as regular expressions in the Perl language.
$match = preg_match ("/Hypertext/", "PHP: Hypertext Preprocessor"); //match will be a 1 because it did match part of the string
The pattern is the first parameter. The second parameter is the subject, the string we are searching. /Hypertext/ is the full pattern; the leading and trailing slashes are delimiters, which every pattern requires (the slash is the conventional choice, though other characters can serve). The part that matches is just the word Hypertext, and the pattern matches only that exact spelling and case because we did not tell it to be case insensitive.
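As a minimal sketch of the points above, the snippet below reuses the same subject string and compares three patterns. The variable names are chosen only for illustration, and the # delimiter is simply one alternative to the slash.

```php
<?php
// The subject string from the example above.
$subject = "PHP: Hypertext Preprocessor";

// Exact case: the pattern is found, so preg_match() returns 1.
$exact = preg_match("/Hypertext/", $subject);

// Different case and no modifier: preg_match() returns 0.
$wrong_case = preg_match("/hypertext/", $subject);

// Same pattern as the first, delimited with # instead of /.
$hash_delim = preg_match("#Hypertext#", $subject);

var_dump($exact, $wrong_case, $hash_delim); // int(1), int(0), int(1)
```

Note that preg_match() reports only whether a match was found (1 or 0), which is why all the examples in this article compare against those two values.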
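A useful tool in patterns is the * character, which means ‘zero or more of the preceding item’. It is often combined with the . character, which matches any single character (other than a newline by default), when you simply want part of the pattern to match anything. The combination .* is often called a wildcard.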
$match = preg_match ("/Paris.*France/", "Paris, France"); // $match will be 1, it matched the whole string
$match_b = preg_match ("/paris .*France/", "Paris, France"); // $match_b will be 0 because the full pattern could not make a match
As seen here, the whole pattern must match. The pattern for $match_b was close and contained a wildcard, but the wildcard is limited by the rest of the pattern: the lowercase p and the space after paris have no counterpart in the text. On its own, .* can match any run of characters, including an empty string. To require at least one character, change the * to a + ( .+ matches “a” but not “” ).
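The difference between * and + can be checked directly. In this small sketch the ^ and $ anchors (not used elsewhere in this article) are added so that each pattern has to account for the entire subject string. The patterns are written in single quotes so that PHP does not try to interpret the $ sign.

```php
<?php
// .* accepts zero or more characters, so it matches even an empty string.
$star_empty = preg_match('/^.*$/', "");

// .+ requires at least one character, so an empty string fails.
$plus_empty = preg_match('/^.+$/', "");

// With one character available, .+ succeeds.
$plus_a = preg_match('/^.+$/', "a");

var_dump($star_empty, $plus_empty, $plus_a); // int(1), int(0), int(1)
```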
If any of those concepts were unfamiliar, you may want to go through the basics of regular expressions in the PHP manual. One thing you may be wondering from the examples is how to make the whole pattern insensitive to capitalization, i.e. case insensitive. The answer is pattern modifiers.
$match_b = preg_match ("/paris.*france/i", "Paris, France"); // $match_b will be 1
Normally this would not match, but with an i after the closing / the case of each letter no longer matters. That is how pattern modifiers work: they affect the whole pattern, and they can be combined by placing several letters after the closing / character.
$place = "Atlanta\n, Georgia";
$match = preg_match ("/atlanta.*georgia/i", $place); // $match will be 0
$match_b = preg_match ("/atlanta.*georgia/is", $place); // $match_b will be 1
In the last few examples we used a wildcard in the middle of the pattern. That is useful when the user typed in a location and we can’t predict the separating characters. The first match here fails because the text contains a newline. The second succeeds because of the s after the i in the pattern modifiers. The s makes the . character match newlines as well as every other character; normally newlines are not matched by . or .* in patterns.
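One way the location scenario might be wrapped up is sketched below. The function name location_matches() and its parameters are hypothetical and exist only for this illustration. preg_quote() is used so that any special characters in the city or region are treated literally inside the pattern.

```php
<?php
// Hypothetical helper: reports whether $input names $city followed by
// $region, with any separator (including line breaks) in between.
function location_matches(string $city, string $region, string $input): bool
{
    // preg_quote() escapes characters that have special meaning in a pattern;
    // its second argument also escapes the / delimiter.
    $pattern = '/' . preg_quote($city, '/') . '.*' . preg_quote($region, '/') . '/is';

    return preg_match($pattern, $input) === 1;
}

var_dump(location_matches("Atlanta", "Georgia", "atlanta\nGEORGIA"));  // bool(true)
var_dump(location_matches("Atlanta", "Georgia", "Atlanta - Georgia")); // bool(true)
var_dump(location_matches("Atlanta", "Georgia", "Georgia, Atlanta"));  // bool(false)
```

The last call fails because the pattern still requires the city to come before the region; the modifiers relax case and newline handling, not the order of the parts.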
Since .* matches so broadly, how much does it consume when the rest of the pattern could match in two different places? To see this clearly we will use preg_replace() instead of preg_match(), and we will show how a particular pattern modifier gives us more control.
$place = "Brad's USA Furniture Atlanta Georgia USA 2250";
echo preg_replace("/Brad's.*USA/is", "---", $place); // you will see --- 2250
echo preg_replace("/Brad's.*USA/isU", "---", $place); // you will see --- Furniture Atlanta Georgia USA 2250
We get different results from the greedy and ungreedy behaviors of the wildcard. By default a quantifier such as * is greedy and takes as much text as it can, so the match runs to the last USA. The capital U makes quantifiers ungreedy by default, so the match stops at the first USA and the smaller portion is replaced. This concludes the section on pattern modifiers; we didn’t get to them all, but their syntax and usefulness have been demonstrated.
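As a footnote to the greedy and ungreedy example, the same effect can be had for a single quantifier by writing a ? after it, without the U modifier. The sketch below compares the three forms on the same string; the variable names are illustrative only.

```php
<?php
$place = "Brad's USA Furniture Atlanta Georgia USA 2250";

// Greedy .* runs to the last "USA" in the string.
$greedy = preg_replace("/Brad's.*USA/is", "---", $place);

// The U modifier makes every quantifier in the pattern ungreedy.
$ungreedy_u = preg_replace("/Brad's.*USA/isU", "---", $place);

// A ? after the quantifier makes only that one quantifier ungreedy.
$ungreedy_q = preg_replace("/Brad's.*?USA/is", "---", $place);

echo $greedy, "\n";     // --- 2250
echo $ungreedy_u, "\n"; // --- Furniture Atlanta Georgia USA 2250
echo $ungreedy_q, "\n"; // --- Furniture Atlanta Georgia USA 2250
```

The ? form is handy when a pattern contains several quantifiers and only one of them should stop at the earliest possible point.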