DB: MariaDB 10.2
Why does this simple regexp match an emoji, even though the emoji is 4 bytes long? Shouldn't it match only the question mark character?
([email protected]:3306) [test]> select '😃' RLIKE '^[?]+$';
+-----------------------------------+
| '\xF0\x9F\x98\x83' RLIKE '^[?]+$' |
+-----------------------------------+
| 1 |
+-----------------------------------+
1 row in set (0,00 sec)
([email protected]:3306) [test]> SHOW VARIABLES LIKE 'collation%';
+----------------------+--------------------+
| Variable_name | Value |
+----------------------+--------------------+
| collation_connection | utf8mb4_general_ci |
| collation_database | utf8mb4_general_ci |
| collation_server | utf8mb4_general_ci |
+----------------------+--------------------+
3 rows in set (0,00 sec)
I can replicate on 10.6.12 with:
I thought it might be related to this issue:
https://jira.mariadb.org/browse/MDEV-11777?jql=project%20%3D%20MDEV%20AND%20text%20~%20%22regexp%22
but the response to danblack's MDEV-32904 indicates it may be happening because of a mismatch in your character_set% variables. For example:
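A minimal sketch of how one might check for such a mismatch, assuming the emoji literal is being converted to '?' somewhere between the client and the server before the regexp is evaluated. The emoji used here is the one whose bytes appear in the question's column header; everything else is standard MariaDB syntax, but the diagnosis itself is only a hypothesis:

```sql
-- List the client, connection, results, database, and server character sets.
-- A mix such as utf8 (3-byte) or latin1 on one side and utf8mb4 on the other
-- is the kind of mismatch suspected here.
SHOW VARIABLES LIKE 'character_set%';

-- Inspect the bytes the server actually received for the literal.
-- F09F9883 would suggest the emoji arrived intact; 3F would suggest it had
-- already been replaced by '?' before RLIKE ever saw it.
SELECT HEX('😃');

-- Assumption: aligning the connection character sets removes the conversion.
-- SET NAMES sets character_set_client, character_set_connection, and
-- character_set_results together.
SET NAMES utf8mb4;
SELECT '😃' RLIKE '^[?]+$';
```

If the HEX output changes after SET NAMES utf8mb4, that would point at the connection character sets rather than at the regexp library.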