Dangling metacharacter * sparksql


The regex below works in Hive but not in Spark SQL.

In Spark it throws the error "dangling metacharacter * at index 3":

select regexp_extract('a|b||c','^(\\|*(?:(?!\\|\\|\\w(?!\\|\\|)).)*)');

I also tried escaping the * as \\*, but it still throws the same error: dangling metacharacter * at index 3.


Wiktor Stribiżew (best answer)

You can use

regexp_replace(col, '^(.*)[|]{2}.*$', '$1')

See the regex demo.

Regex details:

  • ^ - start of string
  • (.*) - Capturing group 1 (its value is referred to with the $1 backreference in the replacement pattern): any zero or more chars other than line break chars, as many as possible, so the group extends up to the last || on the line
  • [|]{2} - double pipe (|| string); the character class matches a literal pipe without needing any backslash escaping
  • .* - the rest of the line
  • $ - end of string.
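
Since Spark SQL's regex functions are built on Java regular expressions, the pattern above can be tried outside Spark with plain java.util.regex. The following is only a sketch: the class and the helper name beforeLastDoublePipe are made up for illustration, and String-level replaceAll is assumed to behave like regexp_replace for this pattern.

```java
import java.util.regex.Pattern;

public class DoublePipePrefix {
    // Same pattern as in the answer; [|] matches a literal pipe without any backslash.
    private static final Pattern BEFORE_LAST_DOUBLE_PIPE =
            Pattern.compile("^(.*)[|]{2}.*$");

    // Hypothetical helper standing in for
    // regexp_replace(col, '^(.*)[|]{2}.*$', '$1').
    public static String beforeLastDoublePipe(String input) {
        return BEFORE_LAST_DOUBLE_PIPE.matcher(input).replaceAll("$1");
    }

    public static void main(String[] args) {
        // Greedy (.*) backtracks to the last "||" in the input.
        System.out.println(beforeLastDoublePipe("a|b||c"));   // a|b
        System.out.println(beforeLastDoublePipe("a||b||c"));  // a||b
        // No "||" at all: the pattern does not match, input is returned unchanged.
        System.out.println(beforeLastDoublePipe("abc"));      // abc
    }
}
```

Note that an input without || comes back unchanged, which may or may not be what you want from the original regexp_extract call.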

Rajesh

This worked for me:

regexp_replace("***", "\\\*", "a")
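
A possible explanation for the error index in the question, shown as a Java sketch: if every backslash in front of the pipe is consumed by string-literal unescaping before the regex engine sees the pattern (an assumption, it depends on how the query is submitted and on Spark's string-literal settings), the pattern starts with ^(|* and the * at index 3 has nothing to repeat. The class and helper names below are hypothetical.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class DanglingStarDemo {
    // Returns the error index if the pattern fails to compile, or -1 if it compiles.
    public static int compileErrorIndex(String regex) {
        try {
            Pattern.compile(regex);
            return -1;
        } catch (PatternSyntaxException e) {
            return e.getIndex();
        }
    }

    public static void main(String[] args) {
        // Assumed case: no backslash survives, so "|" is alternation and "*" dangles.
        System.out.println(compileErrorIndex("^(|*)"));     // 3
        // One backslash reaches the engine: \| is a literal pipe, "*" repeats it.
        System.out.println(compileErrorIndex("^(\\|*)"));   // -1
        // A character class needs no backslash at any escaping level.
        System.out.println(compileErrorIndex("^([|]*)"));   // -1
        // Literal asterisk: the engine must receive \* (written "\\*" in Java source).
        System.out.println("***".replaceAll("\\*", "a"));   // aaa
    }
}
```

How many backslashes have to be typed in the SQL text depends on how many unescaping layers sit between the query and the regex engine, which is why a character class such as [|] or [*] is often the less fragile choice.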