Extract values from HTML when parent div contains a specific word (multi-nested divs)


I copied the HTML of a "multi-select" list from a page, which looks like this: [screenshot of the list]. I then pasted the HTML (after beautifying it online) into Notepad++.

I now want to use a regex to extract the entries that are enabled in that list. In other words, I want to see which options I had selected in that dropdown. There are too many entries to scroll through and find them all by hand. So the best way I can think of is to search the HTML for the outer divs whose class contains "selected"; their inner divs should hold the values I am looking for.

The HTML is shown below:

   <div class="ui-multiselect-option-row" data-value="1221221111">
      <div class="ui-multiselect-checkbox-wrapper">
         <div class="ui-multiselect-checkbox"></div>
      </div>
      <div class="ui-multiselect-option-row-text">(BASE) OneOneOne (4222512512)</div>
   </div>
   <div class="ui-multiselect-option-row ui-multiselect-option-row-selected" data-value="343333434334">
      <div class="ui-multiselect-checkbox-wrapper">
         <div class="ui-multiselect-checkbox"></div>
         <div class="ui-multiselect-checkbox-selected">✔</div>
      </div>
      <div class="ui-multiselect-option-row-text">(BASE) TwoTwoTwo (5684641230)</div>
   </div>

The outcome should return the following value only (based on the above): (BASE) TwoTwoTwo (5684641230)

So far, I have tried using the following regex in notepad++:

<div class="ui-multiselect-option-row ui-multiselect-option-row-selected"(.*?)(?=<div class="ui-multiselect-option-row")

but it is impossible to mark all the matched lines at once and remove the unmarked ones: Notepad++ only marks the first line of each multi-line match. So I am wondering whether there is a better way, such as a more complex regex that extracts the value directly. In short:

a) I either want to make the above work with another regex in Notepad++ (I am open to Visual Studio if that makes it faster)

b) Or an easier way using the console in Chrome to parse the selected values. I would still like to see the regex solution but for Chrome console I have an

Update 1:

I used this line $('div.ui-multiselect-option-row-selected > div:nth-child(2)') and all I need now, as I am not that familiar with exporting from the Chrome console, is to get the innerHTML of each of the returned elements: [screenshot of the console output]

Update 2:

var rows = $('div.ui-multiselect-option-row-selected > div:nth-child(2)');
// An index loop visits only the matched elements; for...in would also enumerate jQuery's own properties.
for (var i = 0; i < rows.length; i++) {
    console.log(rows[i].innerHTML);
}

which works, and I now only have to export the outcome.

There are 2 best solutions below

MonkeyZeus On BEST ANSWER

Open up Chrome's Console tab and execute this:

$x('//div[contains(@class, "ui-multiselect-option-row-selected")]/div[@class="ui-multiselect-option-row-text"]/text()')

Here is how it should look using your limited HTML sample, duplicated: [screenshot of the console output]

If you have multiple multi-selects and no unique identifier then count which one you need to target (notice the [1]):

$x('//div[contains(@class, "ui-multiselect-option-row-selected")][1]/div[@class="ui-multiselect-option-row-text"]/text()')

Chase On

All you have to do is use a CSS selector followed by a .map to get the innerHTML of all the matched elements as a list:

[...$('div.ui-multiselect-option-row-selected > div:nth-child(2)')].map(n => n.innerHTML)

The CSS selector is div.ui-multiselect-option-row-selected > div:nth-child(2), which, as I've already mentioned in my comment, selects the 2nd immediate child of every div with the ui-multiselect-option-row-selected class.

Then we just use some JavaScript to turn the result into an array and map over it to extract each element's innerHTML, as you asked.
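If jQuery is not available on the page, roughly the same thing can be done with the standard DOM API. The following is only a minimal sketch, assuming the row markup shown in the question. selectedOptionTexts is a hypothetical helper name, and it targets the text div by its class rather than by position:

```javascript
// Hypothetical helper: collect the visible text of every selected row.
// `root` is anything with querySelectorAll, e.g. `document` or a container element.
function selectedOptionTexts(root) {
  const selector =
    'div.ui-multiselect-option-row-selected > div.ui-multiselect-option-row-text';
  // Array.from accepts a mapping function, so no separate .map call is needed.
  return Array.from(root.querySelectorAll(selector), (n) => n.textContent.trim());
}

// In the browser console: selectedOptionTexts(document)
```

Using textContent instead of innerHTML returns the decoded text, which is usually what you want when exporting the values.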

If the list is sufficiently big, you might consider storing the result in a variable using

const e = [...$('div.ui-multiselect-option-row-selected > div:nth-child(2)')].map(n => n.innerHTML);

and then doing

copy(e);

This will copy the list to your clipboard, so the next time you press Ctrl + V you'll paste the list.
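For the regex route the question also asks about, the extraction can be sketched against the pasted HTML text instead of the live page. This is only a sketch under assumptions: it expects the class attribute and row layout to be exactly as in the question's sample, and extractSelectedTexts is a hypothetical helper name. It uses [\s\S]*? rather than .*? so that the match can cross line breaks, and it captures the text div directly instead of marking whole rows:

```javascript
// Hypothetical helper: pull the text of selected rows out of an HTML string.
// Assumes the selected row's class attribute is written exactly as in the sample.
function extractSelectedTexts(html) {
  const pattern =
    /<div class="ui-multiselect-option-row ui-multiselect-option-row-selected"[^>]*>[\s\S]*?<div class="ui-multiselect-option-row-text">([^<]*)<\/div>/g;
  // Group 1 holds the text between the row-text div's opening and closing tags.
  return Array.from(html.matchAll(pattern), (m) => m[1].trim());
}

const sample = `
   <div class="ui-multiselect-option-row" data-value="1221221111">
      <div class="ui-multiselect-checkbox-wrapper">
         <div class="ui-multiselect-checkbox"></div>
      </div>
      <div class="ui-multiselect-option-row-text">(BASE) OneOneOne (4222512512)</div>
   </div>
   <div class="ui-multiselect-option-row ui-multiselect-option-row-selected" data-value="343333434334">
      <div class="ui-multiselect-checkbox-wrapper">
         <div class="ui-multiselect-checkbox"></div>
         <div class="ui-multiselect-checkbox-selected">✔</div>
      </div>
      <div class="ui-multiselect-option-row-text">(BASE) TwoTwoTwo (5684641230)</div>
   </div>`;

console.log(extractSelectedTexts(sample)); // [ '(BASE) TwoTwoTwo (5684641230)' ]
```

A regex like this is brittle against any change in attribute order or nesting, so the selector-based approaches above are the safer choice when the live page is available.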