Using BeautifulSoup to extract text from div

8

I am using the snippet below to parse a section of HTML from the link in the code. The divs I am after look like this:

<div id="avg-price" class="price big-price">4.02</div>
<div id="best-price" class="price big-price">0.20</div>
<div id="worst-price" class="price big-price">15.98</div>

This is the code I am using:

import requests
from bs4 import BeautifulSoup
r = requests.get('https://herf.io/bids?search=tatuaje%20tattoo')
soup = BeautifulSoup(r.text, 'html.parser')

avgPrice = soup.find("div", {"id": "avg-price"})
lowPrice = soup.find("div", {"id": "best-price"})
highPrice = soup.find("div", {"id": "worst-price"})

print(avgPrice)
print(lowPrice)
print(highPrice)
print("Average Price: {}".format(avgPrice))
print("Low Price: {}".format(lowPrice))
print("High Price: {}".format(highPrice))

However, the output does not include the prices between the div tags. The result looks like this:

<div class="price big-price" id="avg-price"></div>
<div class="price big-price" id="best-price"></div>
<div class="price big-price" id="worst-price"></div>
Average Price: <div class="price big-price" id="avg-price"></div>
Low Price: <div class="price big-price" id="best-price"></div>
High Price: <div class="price big-price" id="worst-price"></div>

Any ideas? I'm sure I'm overlooking something small, but I'm at my wits' end right now.

Use Selenium, because the values are filled in by JavaScript.

The values are generated by JavaScript running in the browser, so they are not included in r.text. If you can only use requests, you have to make the same requests the browser makes.
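
A minimal sketch of the Selenium route suggested above, assuming the selenium package and a local Chrome install are available, and assuming the page fills in the same three div ids once its scripts have run. The function names fetch_rendered_html and extract_prices are made up for this illustration, and the 10-second timeout is an arbitrary choice.

```python
from bs4 import BeautifulSoup

PRICE_IDS = ("avg-price", "best-price", "worst-price")


def fetch_rendered_html(url, timeout=10):
    """Load the page in headless Chrome and return the HTML after scripts run.

    Assumes the selenium package and a Chrome install are available.
    """
    # Imported here so the parsing helper below works without Selenium.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait

    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # Wait until the first price div has non-empty text.
        WebDriverWait(driver, timeout).until(
            lambda d: d.find_element(By.ID, PRICE_IDS[0]).text.strip()
        )
        return driver.page_source
    finally:
        driver.quit()


def extract_prices(html):
    """Map each price div id to its text, or None if the div is missing."""
    soup = BeautifulSoup(html, "html.parser")
    prices = {}
    for div_id in PRICE_IDS:
        tag = soup.find("div", {"id": div_id})
        prices[div_id] = tag.get_text(strip=True) if tag is not None else None
    return prices


# Usage (not run here, since it needs a browser):
# html = fetch_rendered_html("https://herf.io/bids?search=tatuaje%20tattoo")
# print(extract_prices(html))
```

Keeping the parsing step separate from the browser step means extract_prices can be checked against a saved HTML string without launching Chrome.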

3 Answers

13

You can get the text through the text attribute:

print("Average Price: {}".format(avgPrice.text))
print("Low Price: {}".format(lowPrice.text))
print("High Price: {}".format(highPrice.text))
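
One caveat: .text returns an empty string when the div has no content, which is the situation shown in the question's output. A small sketch of handling that, assuming you want a float or None back; the helper name price_or_none is hypothetical.

```python
from bs4 import BeautifulSoup


def price_or_none(soup, div_id):
    """Return the div's text as a float, or None if missing, empty, or non-numeric."""
    tag = soup.find("div", {"id": div_id})
    if tag is None:
        return None
    text = tag.get_text(strip=True)
    try:
        return float(text)
    except ValueError:
        return None


filled = BeautifulSoup('<div id="avg-price" class="price big-price">4.02</div>', "html.parser")
empty = BeautifulSoup('<div id="avg-price" class="price big-price"></div>', "html.parser")

print(price_or_none(filled, "avg-price"))  # 4.02
print(price_or_none(empty, "avg-price"))   # None
```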

11

Try

avgPrice.text  # find() returns a single tag, so no [0] index is needed

For the rest, do the same.
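
Whether an index like [0] applies depends on which method produced the result. A short sketch of the difference, using a two-div sample string in place of the live page:

```python
from bs4 import BeautifulSoup

html = """<div id="avg-price" class="price big-price">4.02</div>
<div id="best-price" class="price big-price">0.20</div>"""
soup = BeautifulSoup(html, "html.parser")

# find() returns one Tag (or None); square brackets on a Tag read attributes.
single = soup.find("div", {"id": "avg-price"})
single_text = single.text
single_id = single["id"]

# find_all() returns a list-like ResultSet, so positional indexing works.
matches = soup.find_all("div", {"class": "price"})
first_text = matches[0].text

print(single_text, single_id, first_text)  # 4.02 avg-price 4.02
```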

4

You can use the text attribute:

In [644]: import bs4

In [645]: s = """<div id="avg-price" class="price big-price">4.02</div>
     ...: <div id="best-price" class="price big-price">0.20</div>
     ...: <div id="worst-price" class="price big-price">15.98</div>"""

In [646]: soup = bs4.BeautifulSoup(s, 'html.parser')

In [649]: avgPrice = soup.find("div", {"id": "avg-price"})
     ...: lowPrice = soup.find("div", {"id": "best-price"})
     ...: highPrice = soup.find("div", {"id": "worst-price"})
     ...:
     ...: print(avgPrice)
     ...: print(lowPrice)
     ...: print(highPrice)
     ...: print("Average Price: {}".format(avgPrice.text))
     ...: print("Low Price: {}".format(lowPrice.text))
     ...: print("High Price: {}".format(highPrice.text))
     ...:
<div class="price big-price" id="avg-price">4.02</div>
<div class="price big-price" id="best-price">0.20</div>
<div class="price big-price" id="worst-price">15.98</div>
Average Price: 4.02
Low Price: 0.20
High Price: 15.98
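
As a possible variation on the session above, the three lookups can be collapsed into one CSS selector query. This is only a sketch on the same sample string, and it assumes every div with class "price" also carries an id:

```python
from bs4 import BeautifulSoup

s = """<div id="avg-price" class="price big-price">4.02</div>
<div id="best-price" class="price big-price">0.20</div>
<div id="worst-price" class="price big-price">15.98</div>"""
soup = BeautifulSoup(s, "html.parser")

# select() takes a CSS selector; "div.price" matches divs with class "price".
prices = {tag["id"]: tag.get_text(strip=True) for tag in soup.select("div.price")}

for div_id, value in prices.items():
    print("{}: {}".format(div_id, value))
```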
