A regular expression is a match condition (a pattern) that is tested against a string to search through. Here is a cheat sheet to download to view regular expression syntax.
The regular expression matching function in PHP that I use is preg_match_all:
int preg_match_all ( string pattern, string subject, array &matches [, int flags [, int offset]] )
which basically means that it will return an integer value of how many matches it has found within the subject text; the matches themselves are written into the matches array, which is passed by reference. The flags and offset are optional values.
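As a small sketch of the return value, the by-reference matches array and the optional offset, here is a made-up example (the sample string and variable names are only for illustration and are not part of the link example that follows):

```php
<?php
// Hypothetical sample text, just for illustration.
$subject = "cat bat rat";

// Find every lowercase letter followed by "at".
$count = preg_match_all('/[a-z]at/', $subject, $matches);

echo $count . "\n";      // 3
print_r($matches[0]);    // $matches[0] holds the full matched strings

// Same search, but starting at byte offset 4 (the "b" of "bat").
// PREG_PATTERN_ORDER is the default flag, written out here so the offset can follow it.
$countFromOffset = preg_match_all('/[a-z]at/', $subject, $later, PREG_PATTERN_ORDER, 4);

echo $countFromOffset . "\n";   // 2
```

The first call finds all three words, while the second skips "cat" because the search only begins at the given offset.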
Here is an example that pulls out HTML A links from a piece of text. A links start with <a, have some optional attributes (e.g. href, title), then >, the text to display on the screen, and finally </a> (the closing tag). So in XML speak:
<a[optional attributes]>value</a>
So the basics of searching for HTML A links within a string is to look for the opening "<a ...>" tag, which is followed by the value and then the closing tag "</a>". To put that into regular expression talk, and to break it down: we first search for the start <a followed by x amount of characters \s.* with an ending tag of <\/a>. The "\" is there because / is used as the delimiter around the pattern in the PHP code, so a literal / inside the pattern has to be escaped. Put together it would look like
<a\s.*<\/a>
Here \s means a white space character, "." means any character that is not \n, and "*" means zero or more times of the previous check. But there is a potential problem: if the string holds two links, <\/a> happens twice, and because ".*" takes as many characters as it can, the match runs on to the last <\/a> and both links come back as one match. We are not telling the regular expression to stop at the first < character. To do this we use [] to match a set of characters; the ^ in this context means any character not in the list about to come, and "<" is the stopping character. So the full regular expression will be
<a\s.[^<]*<\/a>
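To see the difference between the two patterns, here is a minimal sketch that runs both against the same text. The sample string and variable names are made up for illustration.

```php
<?php
// Hypothetical sample text with two links, just for illustration.
$text = '<a href="one.html">one</a> and <a href="two.html">two</a>';

// First version: ".*" runs on to the last </a>, so both links come back as one match.
$greedyCount = preg_match_all('/<a\s.*<\/a>/', $text, $greedy);

// Full version: "[^<]*" cannot cross a "<", so each link is matched on its own.
$strictCount = preg_match_all('/<a\s.[^<]*<\/a>/', $text, $strict);

echo "first pattern: $greedyCount\n";   // 1
echo "full pattern: $strictCount\n";    // 2
print_r($strict[0]);
```

One thing to keep in mind with the negated class is that it also stops at any tag inside the link text, so a link such as <a href="x"><b>bold</b></a> would not be matched by this pattern.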
Here is the PHP code that will display all of the HTML A links from a string.
<?php
// here is a string with 2 a href links inside it and surrounding text.
$matchAStr = "coding friends is my site :) <a href=\"http://www.codingfriends.com\">codingfriends.com</a> hi there.. " .
    " mountains are nice. <a href=\"http://www.norfolkhospice.org.uk/\">norfolkhospice.org.uk</a>" .
    " support the tapping house in Norfolk";

// the preg_match_all uses regular expression to match and pull out strings, $matches is the matches found.
$matched = preg_match_all("/<a\s.[^<]*<\/a>/",$matchAStr,$matches);

echo "There was $matched matche(s)\n";
echo "<ol>\n";
foreach($matches[0] as $url) {
    echo "<li>".$url."</li>\n";
}
echo "</ol>\n";
?>
There was 2 matche(s)
<ol>
<li><a href="http://www.codingfriends.com">codingfriends.com</a></li>
<li><a href="http://www.norfolkhospice.org.uk/">norfolkhospice.org.uk</a></li>
</ol>
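If only the URL and the link text are wanted rather than the whole tag, one possible variation is to add capture groups and pass a value for the optional flags parameter. The pattern here is my own assumption and not the one used above: it expects a double-quoted href attribute and plain text inside the link, so treat it as a sketch rather than a general HTML parser.

```php
<?php
// Two links, the same URLs as in the example above.
$html = "<a href=\"http://www.codingfriends.com\">codingfriends.com</a> and "
      . "<a href=\"http://www.norfolkhospice.org.uk/\">norfolkhospice.org.uk</a>";

// Assumed pattern: group 1 captures the href value, group 2 the link text.
$pattern = '/<a\s[^>]*href="([^"]*)"[^>]*>([^<]*)<\/a>/';

// PREG_SET_ORDER groups the results per match instead of per capture group.
$found = preg_match_all($pattern, $html, $links, PREG_SET_ORDER);

foreach ($links as $link) {
    // $link[0] is the whole tag, $link[1] the URL, $link[2] the text.
    echo $link[2] . " -> " . $link[1] . "\n";
}
```

With PREG_SET_ORDER each entry of $links is one complete match, which makes the foreach loop read more naturally than indexing into separate arrays per group.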