JavaScript regex to convert _xxx_ to [i]xxx[/i], ignoring any text between <> (URLs) or `` (code)


I'm trying to reformat Slack formatting to bbcode and need a little help. Slack does italics like this:

_this is italic_ and this isn't

My current expression (/\_([^\_]*)\_/gm) works but unfortunately picks up underscores in URLs and inside code snippets. Slack formats URLs and code like this:

<www.thislink.com|here's a link>
`here's a code snippet`

How can I tell regex not to match any underscore pairs inside a link or code snippet? I've been trying negative lookahead and lookbehind but without success.
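
To illustrate the problem, here is a minimal sketch. The sample string is made up for this example (it is not real Slack output), and it simply shows the current expression rewriting underscores inside a link and a code span:

```javascript
// The expression from the question, unchanged.
const naive = /\_([^\_]*)\_/gm;

// Hypothetical sample: a link and a code span that both contain underscores.
const sample = "<www.this_link.com/a_b|link> and `snake_case_name`";

// Every underscore pair is treated as italics, including those in the URL and the code.
console.log(sample.replace(naive, "[i]$1[/i]"));
// prints: <www.this[i]link.com/a[/i]b|link> and `snake[i]case[/i]name`
```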

There is 1 best solution below

Wiktor Stribiżew

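Match and capture the text you want to change, and match without capturing the text you want to leave alone (links and code spans). Because the link and code alternatives come first and consume their whole span, any underscores inside them are never seen by the italic alternative.

Once you get a match, check whether the capturing group participated and return either the converted text or the unchanged match: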

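// Alternatives 1 and 2 consume links and code spans without capturing; only alternative 3 captures.
const re = /<[^<>|]*(?:\|[^<>]*)?>|`[^`]*`|_([^_]*)_/g;
const text = "<www.thislink.com|here's a link>\n`here's a code snippet`\n_this is italic_ and this isn't";
// g is undefined unless the italic alternative matched; in that case return the match m as is.
console.log( text.replace(re, (m,g) => g !== undefined ? "[i]" + g + "[/i]" : m ) )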

Regex details:

  • <[^<>|]*(?:\|[^<>]*)?> - a <, then zero or more chars other than <, > and |, then an optional sequence of a | and then zero or more chars other than < and > and then a > char
  • | - or
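  • `[^`]*` - a backtick, then zero or more chars other than a backtick, then a closing backtick
  • | - or
  • _([^_]*)_ - a _, then Group 1 capturing zero or more chars other than _, then a closing _. Group 1 is defined only when this alternative matches, which is what the g !== undefined check relies on.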
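
As a rough sketch of how this could be packaged for reuse, the regex above can be wrapped in a small function. The function name and the sample string are assumptions made for this example. The sample puts underscores inside the link and the code span to show that they survive untouched:

```javascript
// Hypothetical helper name; the regex is the one from the answer above.
function slackItalicsToBbcode(text) {
  const re = /<[^<>|]*(?:\|[^<>]*)?>|`[^`]*`|_([^_]*)_/g;
  return text.replace(re, (match, italic) =>
    // italic is undefined when a link or code span matched, so return it unchanged.
    italic !== undefined ? "[i]" + italic + "[/i]" : match
  );
}

// Made-up input with underscores in the URL and in the code span.
const sample = "<www.this_link.com/a_b|a link> `snake_case_name` _italic_ plain";
console.log(slackItalicsToBbcode(sample));
// prints: <www.this_link.com/a_b|a link> `snake_case_name` [i]italic[/i] plain
```

The comparison is against undefined rather than a truthiness test because an empty pair such as __ captures an empty string, which is falsy but still a Group 1 match.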