Is there a way to prevent tab characters from being filtered with the FILTER_FLAG_STRIP_LOW option?


Is there a regular expression, or a better way, to remove the same characters that the FILTER_FLAG_STRIP_LOW option of PHP's filter_var FILTER_SANITIZE_STRING filter removes, except for the tab character?

See https://www.php.net/manual/en/filter.filters.sanitize.php for the FILTER_SANITIZE_STRING filter and FILTER_FLAG_STRIP_LOW option.

I'm using PHP's filter_var function with the FILTER_SANITIZE_STRING filter and the FILTER_FLAG_NO_ENCODE_QUOTES, FILTER_FLAG_STRIP_LOW, and FILTER_FLAG_STRIP_HIGH options to remove quite a few characters and 'word' patterns, but it also removes tab characters, which I want to keep.

If there is a good way to remove the 'low' characters other than tab, I can drop the FILTER_FLAG_STRIP_LOW option from the filter_var call, feed the result to a regular expression substitution (or something better), and finish the validation.

jspit On

Yes, with preg_replace you can remove the 'low' characters other than tab (\x09).

$input = "ab \x00\x01\t\r\nAcd0\x19";
// Remove \x00-\x08 and \x0A-\x1F; tab (\x09) is left out of the class, so it is kept.
$sanitizedString = preg_replace('~[\x00-\x08\x0A-\x1F]~u', '', $input);

The hexadecimal notation of $input is

\x61\x62\x20\x00\x01\x09\x0d\x0a\x41\x63\x64\x30\x19

All characters with a code below \x20, except tab (\x09), are removed. The result in hex notation is

\x61\x62\x20\x09\x41\x63\x64\x30

or as string "ab \tAcd0".
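
The same substitution can be wrapped in a small helper and checked with bin2hex. This is only a sketch: the function name stripLowExceptTab is hypothetical, and it covers just the regex step, so the filter_var call (without FILTER_FLAG_STRIP_LOW) would still be applied separately. It also assumes the 'u' modifier can be dropped, since the character class only matches single bytes below \x20 and UTF-8 multibyte sequences never contain such bytes.

```php
<?php
// Hypothetical helper; the name is illustrative and not part of the answer above.
function stripLowExceptTab(string $value): string
{
    // Matches bytes \x00-\x08 and \x0A-\x1F. Tab (\x09) is not in the class.
    // Assumption: no 'u' modifier, so invalid UTF-8 input does not make
    // preg_replace return null.
    $result = preg_replace('~[\x00-\x08\x0A-\x1F]~', '', $value);

    // preg_replace returns null only on an internal PCRE error; fall back to ''.
    return $result ?? '';
}

$input = "ab \x00\x01\t\r\nAcd0\x19";
$clean = stripLowExceptTab($input);

echo bin2hex($input), PHP_EOL; // 6162200001090d0a4163643019
echo bin2hex($clean), PHP_EOL; // 6162200941636430
```

Comparing the two hex dumps shows that only \x00, \x01, \x0d, \x0a and \x19 were dropped, while \x09 survived.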