WTF!? preg_replace() returns null?

On one of our sites we were running into a problem when we passed HTML content from a database through an email obfuscation function meant to keep spiders from scraping our clients’ email addresses. We quickly discovered that some of the longer pages were showing up completely blank. The preg_replace() call we were using to obfuscate the email addresses was returning NULL. After some hunting I found the answer.

PHP 5.2 introduced, with little fanfare, a setting for the Perl Compatible Regular Expressions (PCRE) extension called pcre.backtrack_limit, which, for the first time, capped the number of backtracks a regular expression may perform before it stops and reports an error. Unfortunately, when PCRE hits this limit, PHP doesn’t raise a notice, warning, or error. The function simply returns NULL, something the preg family of functions otherwise almost never does. There were a lot of entries on the PHP.net site reporting this behavior as a bug, and regex-heavy sites (like wikis) scrambled to figure out WTF was going on.

The only way to tell that this type of PCRE error took place in your code is to call preg_last_error() right after you’ve run your regex; for a backtrack failure it returns PREG_BACKTRACK_LIMIT_ERROR. Of course, before PHP 5.2, backtrack errors were handled much more gracefully (if they were even triggered), by returning the original string that was passed to the regex function.
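Here is a minimal sketch of how the failure can be reproduced and detected. The input string, the pattern, and the tiny limit of 1,000 are made up for illustration, and PCRE's JIT (PHP 7 and later) is switched off on the assumption that it counts the limit differently; treat it as a demonstration, not as our actual obfuscation code.

```php
<?php
// Assumption: turn off PCRE JIT so the limit is counted the classic way.
// On builds without JIT support this call just returns false.
ini_set('pcre.jit', '0');

// Deliberately tiny limit so a modest input is enough to trip it.
ini_set('pcre.backtrack_limit', '1000');

// Made-up input: one long "paragraph" of 50,000 characters.
$html = '<p>' . str_repeat('x', 50000) . '</p>';

// The lazy .*? records a backtracking point for every character it consumes.
$result = preg_replace('/<p>(.*?)<\/p>/s', '<div>$1</div>', $html);

if ($result === null && preg_last_error() === PREG_BACKTRACK_LIMIT_ERROR) {
    echo "Backtrack limit exhausted; preg_replace() returned NULL\n";
} elseif ($result === null) {
    echo "Some other PCRE error, code " . preg_last_error() . "\n";
} else {
    echo "Replaced OK\n";
}
```

Note that no notice or warning shows up in the log; the only clues are the NULL return value and the error code.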

To get around the backtrack limit when you’re running regex on large pages (or really long strings), increase pcre.backtrack_limit in your php.ini settings. I increased ours from 100,000 to 1,000,000. Of course, you still run the risk of producing an error on really, really long strings, which is why the second step you should take is to add better error handling any place where you might run a PCRE function on a really long string. Should an error be produced, it’s up to you how to handle it, whether that means returning the original string or breaking your string up into smaller pieces and running them separately.
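Both steps might look something like the sketch below. The ini_set() call is the runtime equivalent of putting pcre.backtrack_limit = 1000000 in php.ini. The helper name safe_preg_replace() is hypothetical (it is not a PHP built-in), and it assumes a string subject and takes the "return the original string" route; splitting the input into chunks is left out because where to split safely depends on your content.

```php
<?php
// Runtime equivalent of pcre.backtrack_limit = 1000000 in php.ini.
ini_set('pcre.backtrack_limit', '1000000');

/**
 * Hypothetical helper: run preg_replace() and fall back to the
 * untouched subject if PCRE reports an error by returning NULL.
 */
function safe_preg_replace($pattern, $replacement, $subject)
{
    $result = preg_replace($pattern, $replacement, $subject);

    if ($result === null) {
        // Log the error code so the failure is no longer silent.
        error_log('preg_replace() failed, preg_last_error() = ' . preg_last_error());
        return $subject;
    }

    return $result;
}

// Made-up example input.
echo safe_preg_replace('/@/', ' [at] ', 'Contact: info@example.com'), "\n";
// prints "Contact: info [at] example.com"
```

With a fallback like this, a page that trips the limit is served with its email addresses unobfuscated instead of coming up blank, so check the log and decide whether that trade-off suits you.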

Ultimately, the best thing you can do (and should always do) is optimize your regex as much as possible, and for some people that just means knowing when to use regex and when a simple str_replace() will suffice.
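As a rough illustration of what optimizing can mean, the sketch below runs two patterns over the same made-up input under the same artificially low limit. The patterns are examples, not the ones from our obfuscation function, and the tighter one is not a drop-in replacement: it will not match a paragraph that contains nested tags. JIT is again disabled as an assumption about how the limit is counted.

```php
<?php
ini_set('pcre.jit', '0');                // assumption: non-JIT matching
ini_set('pcre.backtrack_limit', '1000'); // artificially low for the demo

$html = '<p>' . str_repeat('x', 50000) . '</p>';

// Lazy dot: one backtracking point per character, so it blows the limit.
$lazy = preg_replace('/<p>.*?<\/p>/s', '', $html);

// Negated class with a possessive quantifier: the whole run is consumed
// in one go and PCRE never backtracks into it.
$tight = preg_replace('/<p>[^<]*+<\/p>/', '', $html);

var_dump($lazy === null); // true: the lazy version failed
var_dump($tight);         // empty string: the paragraph was removed

// For a fixed literal, no regex is needed at all.
echo str_replace('@', '&#64;', 'info@example.com'), "\n";
// prints "info&#64;example.com"
```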


11 Responses to “WTF!? preg_replace() returns null?”

  1. Mojo Says:

    Was debugging like mad in TYPO3 till I hunted this down

  2. phpwutz Says:

    thank you as well – you made my day! I already searched the hell out of myself…
