Scrapy: how to convert a string into an object that I can run XPath on?

10

Let's say I have some plain text in an HTML-like format, such as: <div id="foo"><p id="bar">Some random text</p></div>. I need to run XPath on it to retrieve an inner element. How can I convert the plain text into some kind of object that I can query with XPath?


Are you looking for a Scrapy-only solution?

I am looking for a solution that works with Scrapy, but it doesn't have to be Scrapy-only.

1 Answer

2

You can pass the HTML sample as a string to lxml.html.fromstring() and query the resulting element with XPath:

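from lxml import html

code = """<div id="foo"><p id="bar">Some random text</p></div>"""
# fromstring() parses the string into an lxml element
source = html.fromstring(code)
# xpath() returns a list of matches
print(source.xpath('//div/p/text()'))  # ['Some random text']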
