Geek Alert!
I’m pretty clueless when it comes to regular expressions, although I’m finally beginning to learn how to use them effectively. Here’s my latest problem, for which any suggestions from regex experts would be greatly appreciated.
Occasionally someone will post a comment containing a very long string of unbroken characters, which causes the comments area to expand, and thoroughly wrecks the display. I’d like to have a regular expression (PHP style) that checks comments for long strings, inserts spaces as appropriate (say, on a 40-character boundary), yet leaves any long strings inside HTML tags alone (because hyperlinks can be longer without causing a problem).
Any solutions, oh mighty hordes of regex-savvy lizardoids?
UPDATE: Thanks to all for the suggestions and offers to help; reader Doug Stewart found a piece of code at php.net that solves the problem nicely. Here’s the final result, which breaks any word longer than 40 characters by inserting a space, but ignores long words inside HTML tags:
$wrap_at = 40;
// Requires PHP older than 7.0: this relies on the /e (eval) pattern modifier.
$htmltext = stripslashes(preg_replace('%(\s*)([^>]{'.$wrap_at.',})(<|$)%e', "'\\1'.wordwrap('\\2', '".$wrap_at."', ' ', 1).'\\3'", $htmltext));
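The /e modifier used above was deprecated in PHP 5.5 and removed in PHP 7.0, so on newer versions the same idea might be sketched with preg_replace_callback instead. This is only a rough port under a few assumptions: the function name wrap_long_words is made up for illustration, the pattern is unchanged from the snippet above, and stripslashes is dropped because a callback receives the matched text without the extra slashes that /e adds.

```php
<?php
// Hypothetical helper (the name is not from the original snippet).
// Breaks any word longer than $wrap_at characters by inserting a space,
// while leaving long strings inside HTML tags alone.
function wrap_long_words($htmltext, $wrap_at = 40)
{
    $wrap_at = (int) $wrap_at;

    // Same pattern as the /e version, minus the modifier: a run of at least
    // $wrap_at characters containing no '>', followed by '<' or end of input.
    // Text inside a tag is followed by '>', so it never matches.
    $pattern = '%(\s*)([^>]{' . $wrap_at . ',})(<|$)%';

    return preg_replace_callback(
        $pattern,
        function ($m) use ($wrap_at) {
            // With ' ' as the break string and cutting enabled, wordwrap
            // only alters words longer than $wrap_at.
            return $m[1] . wordwrap($m[2], $wrap_at, ' ', true) . $m[3];
        },
        $htmltext
    );
}
```

Calling $htmltext = wrap_long_words($htmltext); would then stand in for the one-liner. Note that both wordwrap and the pattern count bytes rather than characters, so multibyte UTF-8 text could be split in the middle of a character.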