WTF!? preg_replace() returns null?
On one of our sites we ran into a problem when we passed HTML content from a database through an email obfuscation function meant to keep spiders from scraping our clients’ email addresses. We quickly discovered that some of the longer pages were showing up completely blank. The preg_replace() call we were using to run the obfuscation code on email addresses was returning null. After some hunting I found the answer.
PHP 5.2 introduced, with little fanfare, a Perl Compatible Regular Expressions (PCRE) setting called pcre.backtrack_limit, which, for the first time, capped the number of backtracks a regular expression could perform before it stops and reports an error. Unfortunately, when PCRE hits this limit, PHP doesn’t raise a notice, warning, or error. The function simply returns NULL, something the preg family of functions otherwise almost never does. There were a lot of entries on the PHP.net site reporting this behavior as a bug, and regex-heavy sites (like wiki sites) scrambled to figure out WTF was going on.
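If you want to see the silent failure for yourself, here is a minimal sketch. The pattern and subject string are illustrative only (not our obfuscation code), the limit is set absurdly low on purpose, and turning off pcre.jit is my assumption for newer PHP versions, where the JIT may treat limits differently.

```php
<?php
// Assumption: on newer PHP builds the PCRE JIT may handle limits differently,
// so it is switched off for this demo. On builds without the setting,
// ini_set() just returns false and nothing else changes.
ini_set('pcre.jit', '0');

// Deliberately tiny limit so the failure is easy to reproduce.
ini_set('pcre.backtrack_limit', '1000');

// Nested quantifiers plus a subject with no "!" or "?" force heavy backtracking.
$subject = 'foobar foobar foobar';
$result  = preg_replace('/(?:\D+|<\d+>)*[!?]/', '', $subject);

// No notice, no warning: the only symptom is the null return value.
var_dump($result); // NULL
```

Echo that result into a template and you get exactly what we saw: a blank page.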
The only way to actually determine that this type of PCRE error took place in your code is to call preg_last_error() after you’ve tried to run your regex. Of course, before PHP 5.2, backtrack errors were handled much more gracefully (if they were even triggered), by returning the original string that was passed to the regex function.
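As a rough sketch of that check, something like the following could work. The helper pcre_error_name() is hypothetical (my own name, not part of PHP), the pattern is only there to force the error, and the low limit and disabled JIT are demo assumptions.

```php
<?php
// Demo-only assumptions: JIT off and a tiny limit, just to force the error.
ini_set('pcre.jit', '0');
ini_set('pcre.backtrack_limit', '1000');

// Hypothetical helper: map a preg_last_error() code to its constant name.
function pcre_error_name($code)
{
    $names = array(
        PREG_NO_ERROR              => 'PREG_NO_ERROR',
        PREG_INTERNAL_ERROR        => 'PREG_INTERNAL_ERROR',
        PREG_BACKTRACK_LIMIT_ERROR => 'PREG_BACKTRACK_LIMIT_ERROR',
        PREG_RECURSION_LIMIT_ERROR => 'PREG_RECURSION_LIMIT_ERROR',
        PREG_BAD_UTF8_ERROR        => 'PREG_BAD_UTF8_ERROR',
    );

    return isset($names[$code]) ? $names[$code] : 'unknown PCRE error (' . $code . ')';
}

$result = preg_replace('/(?:\D+|<\d+>)*[!?]/', '', 'foobar foobar foobar');

if ($result === null) {
    // preg_last_error() describes the most recent PCRE call only,
    // so check it immediately after the call that failed.
    echo 'Regex failed: ' . pcre_error_name(preg_last_error()) . "\n";
}
```

The important part is the strict === null comparison: an empty string is a perfectly valid result from preg_replace(), so a loose check would misreport it as a failure.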
To get around the backtrack limit when you’re running regex on large pages (or really long strings), increase pcre.backtrack_limit in your php.ini settings. I increased ours from 100,000 to 1,000,000. Of course, you still run the risk of producing an error on really, really long strings, which is why the second step you should take is to add better error handling any place where you might run a PCRE function on a really long string. Should an error be produced, it’s up to you how to handle it, whether that means returning the original string or breaking your string up into smaller pieces and running them separately.
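Both steps might look something like this. It is a sketch, not our production code: safe_preg_replace() is a hypothetical wrapper name, the email pattern is a simplified stand-in, and falling back to the original string is just one of the options mentioned above.

```php
<?php
// Step 1: raise the limit. The equivalent php.ini line would be:
//   pcre.backtrack_limit = 1000000
// ini_set() does the same thing for the current script only.
ini_set('pcre.backtrack_limit', '1000000');

// Step 2 (hypothetical wrapper): never let a PCRE failure blank the page.
function safe_preg_replace($pattern, $replacement, $subject)
{
    $result = preg_replace($pattern, $replacement, $subject);

    if ($result === null) {
        // PCRE error: this is the place to log preg_last_error() if you want a record.
        // Fallback chosen here: hand back the untouched input.
        return $subject;
    }

    return $result;
}

// Simplified stand-in pattern for an email address.
$emailPattern = '/[\w.+-]+@[\w-]+(?:\.[\w-]+)+/';

echo safe_preg_replace($emailPattern, '[email hidden]', 'Write to jane@example.com today.');
// Write to [email hidden] today.
```

Returning the original string means an address could slip through unobfuscated on a failure, so depending on the site you may prefer the split-into-smaller-pieces approach instead.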
Ultimately, the best thing you can do (and should always do) is optimize your regex as much as possible, and for some people that just means knowing when to use regex and when a simple str_replace() will suffice.
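For instance, when the thing being replaced is a fixed string, no regex engine is involved at all. This toy example is an assumption about what a simple obfuscation might look like, not the function we actually use:

```php
<?php
$html = 'Contact: jane@example.com';

// A literal search-and-replace never touches PCRE,
// so pcre.backtrack_limit cannot come into play.
echo str_replace('@', ' [at] ', $html);
// Contact: jane [at] example.com
```

It’s cruder than a pattern-based approach, since it rewrites every @ in the string, but it cannot fail the way preg_replace() did for us.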
Tags: PCRE, perl, php, preg_replace, regex, regular expressions










October 6th, 2008 at 9:14 am
Was debugging like mad in Typo3 till I hunted this down
October 6th, 2009 at 10:46 pm
thank you as well – you made my day! I already searched the hell out of myself…