Regular expression – PHP

A regular expression is a match pattern that you apply to a string to search through it. Here is a cheat sheet to download for the regular expression syntax.

The regular expression matching function in PHP that I use is preg_match_all

int preg_match_all ( string pattern, string subject, array &matches [, int flags [, int offset]] )

which means that it returns an integer count of how many full matches it found within the subject text (or false if an error occurred). The flags and offset parameters are optional.
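As a minimal sketch of that return value and of what ends up in the matches array, assuming a made-up subject string and pattern that are not part of the link example:

```php
<?php
// Hypothetical sample text, used only for this illustration.
$subject = "cat bat rat";

// With the default flag (PREG_PATTERN_ORDER), $matches[0] holds every full
// match and $matches[1] holds what the first capture group matched each time.
$count = preg_match_all('/(\w)at/', $subject, $matches);

echo "Found $count matches\n";
print_r($matches[0]); // the full matches: cat, bat, rat
print_r($matches[1]); // the captured first letters: c, b, r

// The optional offset starts the search part way into the subject.
// Offset 4 skips over "cat ", so only "bat" and "rat" are left to find.
$countFromOffset = preg_match_all('/(\w)at/', $subject, $later, PREG_PATTERN_ORDER, 4);
echo "Found $countFromOffset matches from offset 4\n";
```

The count is the number of full matches, so it is the same as count($matches[0]).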

Here is an example that pulls out HTML A links from a text. A links start with <a, have some optional attributes (e.g. href, title), then >, the text to display on the screen, and finally </a> (the closing tag). So in XML speak:

<a[optional attributes]>value</a>

So the basics of searching for HTML A links within a string is to look for the opening tag "<a ...>", which is followed by the value and then the closing tag "</a>". To put that in regular expression terms, broken down: we first search for the start <a, followed by a white space character and any number of characters, \s.*, and then the ending tag <\/a>. The "\" is there because the / is the delimiter that marks the start and end of the regular expression in PHP, so a literal / inside the pattern has to be escaped. Put together it looks like

<a\s.*<\/a>
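To illustrate the escaping of the /, here is a small sketch that assumes a made-up one-link string. It runs the pattern above twice: once with / as the delimiter, and once with # as the delimiter, where the / in the closing tag no longer needs a backslash.

```php
<?php
// Hypothetical sample text with a single link.
$text = 'see <a href="page.html">top</a> now';

// "/" is the delimiter, so the "/" inside the closing tag is escaped as "\/".
$withSlash = preg_match_all('/<a\s.*<\/a>/', $text, $a);

// "#" is the delimiter, so the "/" is an ordinary character here.
$withHash = preg_match_all('#<a\s.*</a>#', $text, $b);

echo "slash delimiter: $withSlash, hash delimiter: $withHash\n";
echo $a[0][0] . "\n"; // the matched link
```

Both calls find the same single link; which delimiter to use is a matter of taste, and the rest of this post sticks with /.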

But <\/a> can occur more than once in a string, and either occurrence could end the match, because "\s.*" accepts any number of characters. \s means a white space character, "." means any character other than \n, and "*" means zero or more repetitions of the preceding item. Because * is greedy, .* runs on to the last <\/a> it can find, so two links in the same string end up in one match. The problem is that we are not telling the regular expression to stop at the first < character. To do this we use [] to match a set of characters; a ^ at the start of the set means any character not in the list that follows, and "<" is the stopping character. So the full regular expression will be

<a\s.[^<]*<\/a>
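The difference between the two patterns above can be checked with a short sketch, assuming a made-up string that contains two links:

```php
<?php
// Hypothetical sample text with two links.
$text = 'one <a href="a.html">A</a> two <a href="b.html">B</a> three';

// Greedy version: .* runs on to the last </a>, so both links are one match.
$greedyCount = preg_match_all('/<a\s.*<\/a>/', $text, $greedy);

// Version with [^<]*: matching cannot run past a "<", so each link is separate.
$fixedCount = preg_match_all('/<a\s.[^<]*<\/a>/', $text, $fixed);

echo "greedy: $greedyCount match(es)\n";
echo $greedy[0][0] . "\n";
echo "with [^<]*: $fixedCount match(es)\n";
echo implode("\n", $fixed[0]) . "\n";
```

One limitation of stopping at the first < is that a link whose text contains another tag, for example <a href="x"><b>X</b></a>, is not matched by this pattern at all.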

Here is the PHP code that will display all of the HTML A links from a string.

<?php
// here is a string with 2 a href links inside it and surrounding text.
$matchAStr = "coding friends is my site :) <a href=\"http://www.codingfriends.com\">codingfriends.com</a> hi there.. " . 
	    " mountains are nice. <a href=\"http://www.norfolkhospice.org.uk/\">norfolkhospice.org.uk</a>" . 
	    " support the tapping house in Norfolk";
 
// preg_match_all uses the regular expression to match and pull out strings; $matches holds the matches found.
// The pattern is single quoted so that PHP passes the backslashes through to the regex engine untouched.

$matched = preg_match_all('/<a\s.[^<]*<\/a>/', $matchAStr, $matches);
echo "There were $matched match(es)\n";

echo "<ol>\n";
// $matches[0] is the list of full pattern matches
foreach ($matches[0] as $url)
{
  echo "<li>".$url."</li>\n";
}
echo "</ol>\n";
?>

There were 2 match(es)
<ol>
<li><a href="http://www.codingfriends.com">codingfriends.com</a></li>
<li><a href="http://www.norfolkhospice.org.uk/">norfolkhospice.org.uk</a></li>
</ol>
