HTML Purifier removes the CC email address between angle brackets

I'm using HTML Purifier to sanitize inputs in my PHP application.

I have CC and BCC inputs that look something like this:

Test Admin <[email protected]>

When I purify this string, only Test Admin is kept; the email address between the angle brackets is removed.

Please advise!

Answered by pinkgothic

This probably means you've got an escaping issue somewhere in your pipeline before HTML Purifier: something is putting email text into an HTML context without HTML-escaping it. HTML Purifier then treats the angle-bracketed address as an unknown tag and removes it. HTML that currently looks like this:

<p><label for="email">Email</label><span id="email">Test Admin <[email protected]></span></p>

...should really look like this:

<p><label for="email">Email</label><span id="email">Test Admin &lt;[email protected]&gt;</span></p>

If you have no control over the step that's inserting data into the HTML you ultimately want to purify before displaying, you can use this to preprocess your HTML before feeding it to HTML Purifier:

$htmlWithEmail = preg_replace('/<([^<>@]*@[^<>@]*)>/', '&lt;${1}&gt;', $htmlWithEmail);
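As a rough sketch of how that preprocessing step behaves, here is the same pattern wrapped in a small function. The function name escapeBareEmailBrackets and the address admin@example.com are placeholders assumed for illustration; neither comes from the question.

```php
<?php
// Hypothetical helper; the pattern is the one from the line above.
function escapeBareEmailBrackets(string $html): string
{
    // Match <...> only when the content holds exactly one "@" and no nested
    // angle brackets, then re-emit it with the brackets HTML-escaped.
    // preg_replace() returns null on a regex error, so fall back to the input.
    return preg_replace('/<([^<>@]*@[^<>@]*)>/', '&lt;${1}&gt;', $html) ?? $html;
}

// Placeholder input, assumed for illustration.
$html = '<p><span id="email">Test Admin <admin@example.com></span></p>';

// Real tags are left alone because they contain no "@".
echo escapeBareEmailBrackets($html), PHP_EOL;
// prints: <p><span id="email">Test Admin &lt;admin@example.com&gt;</span></p>
```

One caveat worth checking against your own data: the pattern cannot tell a bare address from a genuine tag whose attributes contain an "@" (a mailto: link, for instance), so such tags would be escaped as well.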

On the other hand (and I mention this because I know only a little about your use case right now): if you're not actually trying to preserve HTML, and the string you're purifying is literally just Test Admin <[email protected]> with nothing else (unlike the example I crafted above), then htmlspecialchars() should be your weapon of choice when outputting into HTML, not HTML Purifier.
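For that plain-string case, a minimal sketch might look like the following. The helper name escapeForHtml, the id "cc", and the address are assumptions made up for the example; the flags shown are one reasonable choice, not the only one.

```php
<?php
// Hypothetical helper: escape a plain-text value for output inside HTML.
function escapeForHtml(string $text): string
{
    // ENT_QUOTES also escapes single quotes, which keeps the value safe inside
    // attribute values; ENT_SUBSTITUTE replaces invalid UTF-8 sequences
    // instead of returning an empty string.
    return htmlspecialchars($text, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// Placeholder CC value, assumed for illustration.
$cc = 'Test Admin <admin@example.com>';

echo '<span id="cc">', escapeForHtml($cc), '</span>', PHP_EOL;
// prints: <span id="cc">Test Admin &lt;admin@example.com&gt;</span>
```

The stored value stays untouched; the escaping happens only at the point where the string is written into HTML, so the browser displays the address with its angle brackets intact.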

HTML Purifier isn't meant for all-purpose data sanitization. It exists only for the use case where you have HTML, you want to preserve it if it's well-behaved, and you then output it as HTML. You can find some more info about escaping for context here: https://stackoverflow.com/a/37641037/245790