PHP preg_match performance and overcoming maximum limit
Recently, I ran into a problem where PHP PCRE regex matching failed for no apparent reason:
preg_match("@ <b>(.*?)</b> .* <p>(.*?)</p> @isx", $d, $m);
After some debugging, it turned out that PHP doesn’t cope with regex matches that span more than a couple of kilobytes. In this case, if there are a couple of hundred lines between the closing </b> and the next <p>, preg_match returns without completing, resulting in 0 matches. The fix is to use preg_match_all and replace .* with a | (logical OR):
preg_match_all("@ <b>(.*?)</b> | <p>(.*?)</p> @isx", $d, $m);
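With the alternation, each match fills only one of the two capture groups, so the results have to be picked out of $m afterwards. A minimal sketch of that step, assuming the default PREG_PATTERN_ORDER layout; the sample string and the $keep, $bold and $paragraphs names are made up for illustration:

```php
<?php
// Sample input, invented for this example.
$d = "<b>Title</b>\nsome text\n<p>Body</p>";

preg_match_all("@ <b>(.*?)</b> | <p>(.*?)</p> @isx", $d, $m);

// $m[1] collects the <b> captures and $m[2] the <p> captures.
// A group that did not take part in a given match shows up as an
// empty string, so those entries are filtered out here.
$keep = function ($s) { return $s !== ''; };
$bold       = array_values(array_filter($m[1], $keep));
$paragraphs = array_values(array_filter($m[2], $keep));

print_r($bold);
print_r($paragraphs);
```

Note that the filter would also drop tags that are genuinely empty, such as <b></b>; that is a simplification of this sketch.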
However, this is considerably slower (5x slower in my case). It turns out that the simplest approach is also the fastest (faster than both of the above):
preg_match("@ <h1>(.*?)</h1> @isx", $d, $m1);
preg_match("@ <p>(.*?)</p> @isx", $d, $m2);
So it’s better to use the last approach when the match would span a very long string and you need several separate pieces from it. Avoid preg_match_all unless you really need it, for example to match all links in a document.
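Since the failure gives no visible indication, it may help to check the return value explicitly. The sketch below wraps the "one small pattern per piece" approach in a helper that separates "no match" (0) from "PCRE gave up" (false). The first_group function and the sample document are hypothetical, not part of the original code; preg_last_error() is the standard PHP function for reading the PCRE error code.

```php
<?php
// Hypothetical helper: run one small pattern and return its first
// capture group, null when there is no match, or throw when the
// PCRE engine stopped with an error.
function first_group(string $pattern, string $subject): ?string
{
    $result = preg_match($pattern, $subject, $m);
    if ($result === false) {
        // preg_last_error() says why the engine stopped,
        // e.g. PREG_BACKTRACK_LIMIT_ERROR.
        throw new RuntimeException('PCRE error code: ' . preg_last_error());
    }
    return $result === 1 ? $m[1] : null;
}

// Sample document, invented for illustration.
$d = "<h1>Title</h1>\n" . str_repeat("filler line\n", 300) . "<p>Body text</p>";

// One short pattern per piece, as in the last approach above.
$heading   = first_group("@ <h1>(.*?)</h1> @isx", $d);
$paragraph = first_group("@ <p>(.*?)</p> @isx", $d);

var_dump($heading, $paragraph);
```

Because each pattern only has to span a single tag, the length of the text between the tags no longer matters to the regex engine.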
Well, I try to avoid regex as much as possible, because regexes are heavy and use more CPU.
Thanks for this. I just spent 45 minutes wondering why preg_match_all() started failing on a much bigger string, with no indication! Argh.
if (strlen($subject) > 7000)
    ini_set('pcre.backtrack_limit', '200000');
This is just a wild observation, but I guess you’ll have to find an optimal equation for increasing the limit. Don’t get too happy though: you could simply crash PHP by setting the limit too high.
http://docs.php.net/manual/en/pcre.configuration.php
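One way to keep a raised limit from leaking into the rest of the script is to set it for a single call and restore it afterwards. This is only a sketch: the preg_match_with_limit wrapper and the sample subject are hypothetical, and the value 200000 is simply the one from the snippet above, not a recommended setting.

```php
<?php
// Hypothetical wrapper: raise pcre.backtrack_limit for one preg_match
// call, then put the previous value back.
function preg_match_with_limit(string $pattern, string $subject, int $limit): array
{
    $m = [];
    // ini_set() returns the old value as a string, or false on failure.
    $previous = ini_set('pcre.backtrack_limit', (string) $limit);
    try {
        $result = preg_match($pattern, $subject, $m);
        $error  = preg_last_error();
    } finally {
        if ($previous !== false) {
            ini_set('pcre.backtrack_limit', $previous);
        }
    }
    return ['result' => $result, 'error' => $error, 'groups' => $m];
}

// Sample subject, invented for illustration: two tags with 8000
// characters of filler between them.
$subject = '<b>a</b>' . str_repeat('x', 8000) . '<p>b</p>';
$out = preg_match_with_limit("@ <b>(.*?)</b> .* <p>(.*?)</p> @isx", $subject, 200000);

// 'result' is 1 on a match, 0 on no match, false on a PCRE error;
// 'error' can be compared against PREG_BACKTRACK_LIMIT_ERROR.
var_dump($out['result'], $out['error'] === PREG_NO_ERROR);
```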
How can I limit the number of matches?
ex: preg_match('/[.,]{1}/', $string);
where {1} is the limit, and if there is more than 1 then I can do something with it
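One possible reading of this question is "how do I find out whether a separator occurs more than once?". Under that assumption, note that {1} only says each individual match is one character long; it does not cap how many matches there are. A sketch that counts the occurrences instead, using the return value of preg_match_all; the count_separators name and the sample strings are made up:

```php
<?php
// Hypothetical helper: count how many '.' or ',' characters a string
// contains. preg_match_all() returns the number of full matches.
function count_separators(string $string): int
{
    $count = preg_match_all('/[.,]/', $string);
    // preg_match_all() returns false on error; treat that as zero here.
    return $count === false ? 0 : $count;
}

$string = '1,234.56';
if (count_separators($string) > 1) {
    // More than one separator: react here.
    echo "too many separators\n";
}
```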