Getting "Missing scheme in request url" while trying to parse a page with Scrapy


When I try to get the full page content, I get this error in the console:

2018-11-08 20:55:34 [scrapy.core.engine] INFO: Spider opened
2018-11-08 20:55:34 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2018-11-08 20:55:34 [scrapy.extensions.telnet] DEBUG: Telnet console listening on 127.0.0.1:6023
2018-11-08 20:55:34 [scrapy.core.engine] ERROR: Error while obtaining start requests
Traceback (most recent call last):
  File "c:\python36\lib\site-packages\scrapy\core\engine.py", line 127, in _next_request
    request = next(slot.start_requests)
  File "c:\python36\lib\site-packages\scrapy\spiders\__init__.py", line 83, in start_requests
    yield Request(url, dont_filter=True)
  File "c:\python36\lib\site-packages\scrapy\http\request\__init__.py", line 25, in __init__
    self._set_url(url)
  File "c:\python36\lib\site-packages\scrapy\http\request\__init__.py", line 62, in _set_url
    raise ValueError('Missing scheme in request url: %s' % self._url)

This is what my code looks like:

import scrapy

class Shopee(scrapy.Spider):

    name = 'Shopee'
    start_urls = ['http://www.shopee.sg/Games-Hobbies-cat.14']


    def parse(self, response):
        print(response.text)


The error seems correct - according to isitup.org, the site is down: isitup.org/www.clearshopee.sg

Yeah, the URL is down. That is why.

I'm sorry, I corrected the link and updated the error.

1 Answer


Try changing it to

self.start_urls = ["http://www.shopee.sg/Games-Hobbies-cat.14"]
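For context, the traceback in the question shows Scrapy looping over start_urls and building a Request from each entry; the ValueError is raised when an entry has no scheme such as http:// or https://. One common way to hit this, as far as I know, is when start_urls ends up as a plain string instead of a list, because the loop then sees single characters like "h". The sketch below uses only the standard library to check for that; find_urls_missing_scheme is a hypothetical helper written for illustration and is not part of Scrapy.

```python
from urllib.parse import urlparse


def find_urls_missing_scheme(start_urls):
    """Return the entries of start_urls that have no URL scheme.

    Hypothetical helper for illustration; not part of Scrapy.
    The argument is iterated as given, so a bare string is walked
    character by character, which is how a misconfigured
    start_urls would look to a loop over its entries.
    """
    return [url for url in start_urls if not urlparse(url).scheme]


if __name__ == "__main__":
    # A list whose only URL carries a scheme: nothing is reported.
    print(find_urls_missing_scheme(["http://www.shopee.sg/Games-Hobbies-cat.14"]))

    # A URL without http:// or https:// is reported.
    print(find_urls_missing_scheme(["www.shopee.sg/Games-Hobbies-cat.14"]))

    # A bare string instead of a list: every character is reported.
    print(find_urls_missing_scheme("http://www.shopee.sg/Games-Hobbies-cat.14"))
```

If the helper returns anything for your spider's start_urls, that entry is a likely cause of the error; if it returns an empty list, the problem is probably somewhere else.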
