I have a DataFrame in Python pandas like the one below:
sentence
------------
I like it
+1
One :-) :)
hah
I need to select only the rows that contain emoticons or emojis, so the result should look like this:
sentence
------------
+1
One :-) :)
How can I do that in Python?
You can select the Unicode emojis with a regex character range:
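A minimal sketch of the range approach, not a definitive implementation. The sample data is hypothetical (the thumbs-up emoji after "+1" is an assumption), and the bounds U+263A to U+1F645 are one possible choice rather than an official emoji definition. A range this wide also covers many non-emoji characters, CJK ideographs for example, so it is only reasonable for text that is otherwise mostly ASCII.

```python
import pandas as pd

# Hypothetical sample data; the emoji after "+1" is an assumption.
df = pd.DataFrame(
    {"sentence": ["I like it", "+1 \U0001F44D", "One :-) :)", "hah"]}
)

# Character class from U+263A (white smiling face) to U+1F645.
# This is a normal (non-raw) string, so Python inserts the literal
# characters and the regex engine sees a plain character range.
EMOJI_RANGE = "[\u263a-\U0001f645]"

# Keep only the rows whose text contains at least one character in the range.
emoji_rows = df[df["sentence"].str.contains(EMOJI_RANGE)]
print(emoji_rows)
```

With this data only the row containing the emoji is kept; the ASCII smileys in "One :-) :)" are not matched by the range.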
output:
ASCII "emojis" are much more ambiguous, however: there is no standard definition, and the possible combinations are practically endless. If you limit the match to smiley faces that contain eyes (':' or ';') and a mouth (')' or '('), you could use:
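One way to sketch this is to combine the Unicode range with an alternation for eyes-and-mouth smileys. The sample data is again hypothetical, and allowing one optional non-space character between the eyes and the mouth (so that a nose as in ":-)" is accepted) is an assumption added here.

```python
import pandas as pd

# Hypothetical sample data; the emoji after "+1" is an assumption.
df = pd.DataFrame(
    {"sentence": ["I like it", "+1 \U0001F44D", "One :-) :)", "hah"]}
)

# Unicode range from U+263A to U+1F645 (literal characters, non-raw string).
EMOJI_RANGE = "[\u263a-\U0001f645]"

# Eyes (":" or ";"), at most one optional non-space character
# (for example a nose, an assumption), then a mouth (")" or "(").
ASCII_SMILEY = r"[:;]\S?[)(]"

# A row matches if it contains either kind of symbol.
pattern = f"{EMOJI_RANGE}|{ASCII_SMILEY}"
matches = df[df["sentence"].str.contains(pattern)]
print(matches)
```

With this data the "+1" row and the "One :-) :)" row are kept. Ordinary punctuation can also trigger the smiley part, for example a colon directly followed by a closing parenthesis, so some false positives are to be expected.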
output:
But you would still miss plenty of other possible ASCII emoticons:
:O, :P, 8D, etc.