XPath get text between two "p" tags

40 Views Asked by North Legion At 09 January 2024 at 14:08

I can't get the text between the <p> and </p> tags.

This is what I have so far:

//span[@class='_39I1Z _2ITlL _3D3LC _3NGVr embTL _1zNVc']/span/p

It finds the span and lists the "p" tags, but what should I do next?

Structure:

<div class="" xpath="1">
<span class="_39I1Z _2ITlL _3D3LC _3NGVr embTL _1zNVc">
    <span>
        <p>Text</p>
        <p>Text</p>
        <p>Text</p>
        <p><b>Text:</b></p>
        <p></p>
        <ul>
            <li>Text</li>
            <li>Text</li>
            <li>Text</li>
        </ul>
    </span>
</span>
</div>

I have tried various combinations, but I keep getting confused.


There is 1 best solution below

North Legion On 10 January 2024 at 09:27

I solved this problem.

The spider was looking at the wrong page. This is the selector I use now:

description = cleanhtml(response.xpath("//span[@class='_39I1Z _1KwXc _3kzFG _2zux5 lU8Yn _1o3OU']/span").getall())

Because the description comes back dirty (with HTML tags), I clean it with:

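import re

def cleanhtml(raw_html):
    # getall() returns a list of strings; join it so that str() does not
    # leave list brackets and quotes in the result
    if isinstance(raw_html, (list, tuple)):
        raw_html = ' '.join(raw_html)
    # Non-greedy match removes each <...> tag and keeps the text between them
    return re.sub(r'<.*?>', '', raw_html)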
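For comparison, the text can also be selected per "p" tag instead of stripping tags afterwards. The following is only a minimal sketch using the standard library's xml.etree.ElementTree, under the assumption that the markup is well-formed enough to parse as XML (real pages often are not). The sample markup, its placeholder texts, and the names SAMPLE, P_PATH and paragraph_texts are made up for illustration. In Scrapy itself, the usual equivalent is to end the XPath expression with //text() (for example .../span/p//text()) and call .getall() on the result.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample modelled on the structure in the question;
# the paragraph texts are placeholders, not real page content.
SAMPLE = """<div class="">
<span class="_39I1Z _2ITlL _3D3LC _3NGVr embTL _1zNVc">
    <span>
        <p>First</p>
        <p>Second</p>
        <p>Third</p>
        <p><b>Label:</b></p>
        <p></p>
        <ul>
            <li>Item</li>
        </ul>
    </span>
</span>
</div>"""

# ElementTree supports only a subset of XPath, but attribute predicates
# and child steps like these are part of that subset.
P_PATH = ".//span[@class='_39I1Z _2ITlL _3D3LC _3NGVr embTL _1zNVc']/span/p"


def paragraph_texts(markup, path=P_PATH):
    """Return the non-empty text of every <p> matched by path."""
    root = ET.fromstring(markup)
    texts = []
    for p in root.findall(path):
        # itertext() also walks nested tags such as <b>, so the text
        # inside <p><b>...</b></p> is included.
        text = "".join(p.itertext()).strip()
        if text:
            texts.append(text)
    return texts


print(paragraph_texts(SAMPLE))
```

Selecting the text per element keeps the paragraphs separate and skips the empty <p></p>, whereas the regex approach returns one cleaned string.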
