Python Beautiful Soup scrape page containing JSP/JS

1715 views javascript
0

I am trying to scrape the price from this page: url = https://www.renodepot.com/en/steph-round-base-shower-kit-69375118

The price is inside a span tag, but I am not able to scrape it. The simple code I am using is:

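from requests import get
from bs4 import BeautifulSoup

# The product page given above
url = 'https://www.renodepot.com/en/steph-round-base-shower-kit-69375118'
response = get(url)
html_soup = BeautifulSoup(response.text, 'html.parser')
# Look up the div that is expected to wrap the price
ProductPrice = html_soup.find('div', class_='product_price_wrapper')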

But this returns nothing (None). I think the marker

 BEGIN RenoProdDetailPriceSnippet.jsp 

which appears just above the price div tag, is causing the information to be protected.

I also tried Selenium, without success, and many other combinations, but none of them returned the price.

So I am looking for ideas on how to solve this. Thanks.


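You cannot scrape dynamically generated pages with requests: it only downloads the raw HTML and does not execute JavaScript, so anything rendered in the browser is missing from response.text. Use Selenium or a similar web driver.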
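
A minimal sketch of the web-driver approach, assuming Selenium 4 with a local Chrome installation. The function name fetch_rendered_html, the headless flag, and the 10-second timeout are illustrative choices, not something from the question. A browser driver renders JavaScript, but it does not by itself get past bot checks such as a CAPTCHA.

```python
def fetch_rendered_html(url: str, css_selector: str, timeout: float = 10.0) -> str:
    """Load url in headless Chrome, wait for css_selector, return the rendered HTML.

    Hypothetical helper for illustration only.
    """
    # Selenium is a third-party package; it is imported here so that merely
    # defining this function does not require it to be installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")  # assumption: a recent Chrome build
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # Block until the element exists in the DOM; raises TimeoutException
        # if it never appears within `timeout` seconds.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, css_selector))
        )
        return driver.page_source
    finally:
        # Always close the browser, even when the wait times out.
        driver.quit()
```

The returned string can be passed to BeautifulSoup(html, 'html.parser') in place of response.text, for example after calling fetch_rendered_html(url, 'div.product_price_wrapper').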

cur.execute("CREATE TABLE IF NOT EXISTS book (id INTEGER PRIMARY KEY AUTOINCREMENT, title text, author text, year int, isbn int)"). It is also good practice to name the columns you are inserting into: cur.execute("INSERT INTO book (title, author, year, isbn) VALUES (?, ?, ?, ?)", (title, author, year, isbn))

2 Answers

12

You cannot scrape the page because it requires the completion of a reCAPTCHA to access. This is specifically designed to stop bots.

If you examine html_soup you will find that you are actually searching the reCAPTCHA page, not the desired product page.
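
One way to run that check before parsing is sketched below. It is a rough heuristic, not a reliable detector: it assumes a CAPTCHA page mentions "recaptcha" somewhere in its markup and that the real product page contains the product_price_wrapper class from the question. The helper names diagnose_response and fetch_and_diagnose are made up for this example.

```python
def diagnose_response(html: str, price_class: str = "product_price_wrapper") -> str:
    """Classify fetched HTML as 'product', 'captcha', or 'unknown'."""
    lowered = html.lower()
    # If the price wrapper is present, this is the page we wanted, even if
    # the page also happens to load a reCAPTCHA script somewhere.
    if price_class.lower() in lowered:
        return "product"
    # Assumption: a challenge page names the reCAPTCHA widget in its markup.
    if "recaptcha" in lowered:
        return "captcha"
    return "unknown"


def fetch_and_diagnose(url: str) -> str:
    """Download url with requests and classify what came back."""
    # requests is third-party; imported lazily so that diagnose_response
    # can be used on its own without it.
    import requests

    response = requests.get(url, timeout=10)
    return diagnose_response(response.text)
```

Calling print(fetch_and_diagnose(url)) with the URL from the question shows which kind of page requests actually received; printing html_soup.title is a quick manual alternative.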

11

First, SQLite recommends avoiding AUTOINCREMENT unless it is strictly needed; whenever possible, choose fields that naturally define a unique record as the primary key. Second, if you want SQLite to assign the id for you, declare the column as INTEGER PRIMARY KEY, which makes it an alias for rowid; a column declared as “int” does not get this behavior, and the AUTOINCREMENT keyword is only accepted after INTEGER PRIMARY KEY. Third, avoid using * in your SELECT statement. If you simply need a row number back, query the fields you need and add the built-in column “rowid”.
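
These points can be sketched with Python's built-in sqlite3 module. The table follows the book schema quoted in the comment earlier in the thread, with AUTOINCREMENT deliberately left out; the in-memory database, the sample row, and the function names are assumptions made for the example.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with one sample book row."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    # INTEGER PRIMARY KEY makes `id` an alias for rowid, so SQLite assigns
    # it automatically without the AUTOINCREMENT keyword.
    cur.execute(
        "CREATE TABLE IF NOT EXISTS book ("
        "id INTEGER PRIMARY KEY, title TEXT, author TEXT, "
        "year INTEGER, isbn INTEGER)"
    )
    # Name the target columns explicitly instead of relying on column order.
    cur.execute(
        "INSERT INTO book (title, author, year, isbn) VALUES (?, ?, ?, ?)",
        ("Example Title", "Example Author", 2001, 1234567890),  # made-up values
    )
    conn.commit()
    return conn


def fetch_books(conn: sqlite3.Connection) -> list:
    """Select named columns plus rowid rather than using SELECT *."""
    return conn.execute(
        "SELECT rowid, title, author, year, isbn FROM book"
    ).fetchall()
```

Calling fetch_books(build_demo_db()) returns a list with one tuple per row, with the rowid in the first position.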

