Making a regexp to retrieve the href/src of HTML tags

I'm trying to get the href from links and the src from script tags. Since this is my first time using regex, I don't yet have a good understanding of how groups, negation, and conditional selection work.

This is what I've got so far:

(\/|\.\/)((^([http|https]:\/\/))*.*?\w+)([a-zA-Z]\w+)(?=\.(js|css|ts)\b)(.*?\w+)

It works in these cases:

<script src="./script.js"/>
<script src="/script.js"/>
<script src="/my/custom/dir/script.js"/>
<!-- The same for links -->

Non-working cases:

<script src="https://my.cdn.fav/script.js"/>
<script src="http://my.cdn.fav/script.js"/>

Or rather, it does take the leading http and https into account but doesn't select them. It's this part that is doing something wrong: (\/|\.\/)((^([http|https]:\/\/))

There is 1 solution below

guest271314

Given your input string, you can replace everything you are not trying to match with a space and then split the result on whitespace, for example with .replace(/<script\ssrc="|"\/>/g, " ").split(/[\n\s]+/).filter(Boolean):

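// Strip the tag markup around each src value, then split on whitespace
// and drop the empty strings left at the ends.
let matches = `<script src="https://my.cdn.fav/script.js"/>
<script src="http://my.cdn.fav/script.js"/><script src="./script.js"/>
<script src="/script.js"/>
<script src="/my/custom/dir/script.js"/>`.replace(/<script\ssrc="|"\/>/g, " ").split(/[\n\s]+/).filter(Boolean);

// Logs the five src values, from "https://my.cdn.fav/script.js" to "/my/custom/dir/script.js"
console.log(matches);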
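
As an alternative to stripping the markup, a capture group can pull the attribute values out directly. The following is only a sketch and not part of the answer above: the helper name extractUrls and the <link> sample line are assumptions made for illustration, and a pattern like this is only suitable for simple, well-formed snippets such as the ones in the question. The first two checks also show why [http|https] in the question's pattern misbehaves: square brackets define a character class that matches a single character, so alternation needs a group such as (?:http|https), or simply https?.

```javascript
// [http|https] is a character class: it matches ONE character out of h, t, p, |, s.
console.log(/^[http|https]:\/\//.test("https://my.cdn.fav"));   // false
// A non-capturing group gives real alternation between the two words.
console.log(/^(?:http|https):\/\//.test("https://my.cdn.fav")); // true

// Hypothetical helper (an assumption, not from the original answer):
// returns the value of every src/href attribute found on <script>, <link>, or <a> tags.
function extractUrls(html) {
  // <(?:script|link|a)\b   tag name
  // [^>]*?                 any attributes before the one we want (lazy)
  // \s(?:src|href)\s*=\s*  the attribute name, preceded by whitespace
  // (["'])(.*?)\1          quoted value; group 2 holds the URL
  const pattern = /<(?:script|link|a)\b[^>]*?\s(?:src|href)\s*=\s*(["'])(.*?)\1/gi;
  return Array.from(html.matchAll(pattern), (m) => m[2]);
}

// Sample input: the script tags from the question plus one assumed <link> tag.
const html = `<script src="https://my.cdn.fav/script.js"/>
<script src="http://my.cdn.fav/script.js"/><script src="./script.js"/>
<script src="/script.js"/>
<script src="/my/custom/dir/script.js"/>
<link href="/style.css"/>`;

console.log(extractUrls(html));
```

Because the URL is captured as a whole quoted value, absolute (http/https) and relative paths are handled the same way, with no need to special-case the protocol. For arbitrary real-world HTML, a DOM parser is generally more robust than a regex.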