Scrapy Proxy Integration

What is Scrapy?

Scrapy is a Python framework for web crawling and scraping, which allows users to extract structured data from websites. It is open-source, fast, and extensible. Scrapy can be used for various purposes, such as data mining, monitoring, and automated testing.

Scrapy integration with Bright Data proxies

Open your preferred IDE and start a new Scrapy project by typing the following in the command line:

      scrapy startproject <project_name>
    

This creates a new folder with the project name. Inside that folder, open a Python file to hold your spider code.

  • Go to your Bright Data Control Panel and click the ‘Proxies & Scraping Infra’ icon.
  • Create a new proxy zone by clicking ‘Add’, choosing a network type, configuring the proxy, and clicking save.
  • Under your proxy zone’s ‘Access parameters’ tab, you will find your ‘USERNAME’ and ‘PASSWORD’ values.
  • In your Scrapy spider code file, set the ‘proxy’ key of the request’s meta parameter to the following, using the ‘USERNAME’ and ‘PASSWORD’ values from the previous step: “http://USERNAME:PASSWORD@HOST:22225”, where HOST stands for your zone’s proxy host name.
  • For example:
      import scrapy


      class BrightdatascrapyexampleSpider(scrapy.Spider):
          name = "BrightDataScrapyExample"

          def start_requests(self):
              request = scrapy.Request(url="http://example.com", callback=self.parse)
              # Route this request through the proxy zone.
              # HOST is a placeholder for your zone's proxy host name.
              request.meta["proxy"] = "http://USERNAME:PASSWORD@HOST:22225"
              yield request

          def parse(self, response):
              # Print the raw body to confirm the request went through
              print(response.body)
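
Hardcoding credentials in the spider file makes them easy to leak. As a minimal sketch, the proxy URL could instead be assembled from environment variables. The helper names build_proxy_url and proxy_url_from_env and the variable names BRIGHTDATA_USERNAME, BRIGHTDATA_PASSWORD, BRIGHTDATA_HOST and BRIGHTDATA_PORT are assumptions made for illustration, not part of Scrapy or Bright Data. Percent-encoding keeps a credential that contains characters such as ‘@’ or ‘:’ from breaking the URL.

```python
import os
from urllib.parse import quote


def build_proxy_url(username, password, host, port=22225):
    """Return an HTTP proxy URL with percent-encoded credentials."""
    # safe="" also encodes '/', '@' and ':' so they cannot be
    # mistaken for URL delimiters.
    user = quote(username, safe="")
    secret = quote(password, safe="")
    return f"http://{user}:{secret}@{host}:{port}"


def proxy_url_from_env(environ=os.environ):
    """Build the proxy URL from (hypothetical) environment variables."""
    return build_proxy_url(
        environ["BRIGHTDATA_USERNAME"],
        environ["BRIGHTDATA_PASSWORD"],
        environ["BRIGHTDATA_HOST"],
        int(environ.get("BRIGHTDATA_PORT", "22225")),
    )
```

In the spider, request.meta["proxy"] = proxy_url_from_env() would then take the place of the hardcoded string.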
    

Then run the following command in your command line:

      scrapy runspider <Pythonfilename.py>
    

How To Use Bright Data Proxy Manager With Scrapy

  • Create a proxy zone, the same way as in the direct integration above.
  • Install the Proxy Manager.
  • Click ‘add new port’ and configure it for your use case.
  • In your Scrapy spider code file, set the ‘proxy’ key of the request’s meta parameter to the following: “http://IP:PORTNUMBER”
  • IP is 127.0.0.1 (the local host) if the Proxy Manager is installed on your own machine. If the Proxy Manager is installed on an external server, use that server’s IP address.
  • PORTNUMBER is the port created in the Proxy Manager, which has the form 24XXX; for example, 24000 is the default first port number.
  • For example:
      import scrapy


      class BrightdatascrapyexampleSpider(scrapy.Spider):
          name = "BrightDataScrapyExample"

          def start_requests(self):
              request = scrapy.Request(url="http://example.com", callback=self.parse)
              # Route this request through the local Proxy Manager port
              request.meta["proxy"] = "http://127.0.0.1:24000"
              yield request

          def parse(self, response):
              # Print the raw body to confirm the request went through
              print(response.body)
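
Setting meta["proxy"] by hand becomes repetitive once a spider issues many requests. One possible alternative, sketched here under assumptions, is a small downloader middleware that applies the Proxy Manager address to every request. The class name ProxyManagerMiddleware, the setting name PROXY_MANAGER_URL and the module path myproject.middlewares are hypothetical; the process_request(request, spider) hook and the DOWNLOADER_MIDDLEWARES setting are Scrapy's own.

```python
class ProxyManagerMiddleware:
    """Downloader middleware that sends requests through one proxy URL."""

    def __init__(self, proxy_url):
        self.proxy_url = proxy_url

    @classmethod
    def from_crawler(cls, crawler):
        # PROXY_MANAGER_URL is a hypothetical custom setting (see below).
        return cls(crawler.settings.get("PROXY_MANAGER_URL"))

    def process_request(self, request, spider=None):
        # Keep any proxy already set on the request; otherwise apply ours.
        if self.proxy_url:
            request.meta.setdefault("proxy", self.proxy_url)
        # Returning None lets Scrapy continue processing the request.
        return None


# settings.py (assumed project layout, shown as comments):
# PROXY_MANAGER_URL = "http://127.0.0.1:24000"
# DOWNLOADER_MIDDLEWARES = {
#     # 350 is an arbitrary choice below 750, the default priority of
#     # Scrapy's built-in HttpProxyMiddleware, so this one runs first.
#     "myproject.middlewares.ProxyManagerMiddleware": 350,
# }
```

With this in place, the spider's start_requests method no longer needs to touch request.meta, and a single request can still opt for a different port by setting its own ‘proxy’ value.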
    

Powered by an award-winning proxy infrastructure

Over 72 million residential IPs, best-in-class technology and the ability to target any country, city, ZIP Code, carrier, & ASN make our premium proxy services a top choice for developers.

About Bright Data proxies

Residential Proxies

  • 72,000,000+ IPs
  • Available in 195 countries
  • The largest rotating real-peer IP network
  • Access & crawl all sophisticated websites

Datacenter Proxies

  • 770,000+ IPs
  • Available in 98 countries
  • Shared and dedicated IP pools available
  • Access & crawl all sophisticated websites

ISP Proxies

  • 700,000+ IPs
  • Available in 35 countries
  • Real static residential IPs without IP rotation
  • Best for logging into multiple accounts

Mobile Proxies

  • 7,000,000+ IPs
  • Available in 195 countries
  • Largest real-peer 3G/4G IP network in the world
  • Verify mobile ads & crawl mobile sites

The best customer experience in the industry

You ask, we develop

New feature releases every day

24/7 global support

To answer any question right when you need it

Full transparency

Real-time network performance dashboard

Dedicated Account Managers

To optimize your performance

Tailored solutions

To meet your data collection goals

The category leader in proxies and data collection

650TB of public data collected every day

New feature releases every day

Serving 7/10 of the world’s leading universities

4.6/5 Trustpilot rating
