Docker: scale up a tor proxy for web scraping
A tor proxy can be used for web scraping. Its benefit is that the tor IP changes every few minutes, so even after an IP ban you get a new IP within a few minutes. It still has problems, though: you have only one tor IP at a time, which limits how many requests you can make, and after a ban you have to wait for a new IP. These problems can be mitigated by running multiple tor proxies with Docker.
How to create a docker tor proxy service
We will use the peterdavehello/tor-socks-proxy:latest image! Note that docker service commands require swarm mode, so run docker swarm init first if this node is not already part of a swarm.
$ docker service create -p9150:9150/tcp --name tor peterdavehello/tor-socks-proxy:latest
We can verify that the tor service started and that we got a tor IP assigned:
$ curl --socks5-hostname 127.0.0.1:9150 https://ipinfo.tw/ip
185.220.101.19
Now we can use the socks5 tunnel with our application.
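As a minimal sketch of what "using the tunnel" can look like from a scraping script, the helper below wraps curl so that every request goes through the proxy. The function name fetch_via_tor and the TOR_PROXY variable are hypothetical names chosen for this example; the proxy address is the one published by the service above.

```shell
# Address of the tor SOCKS5 proxy published by the docker service.
# TOR_PROXY is a hypothetical variable name; override it if you publish another port.
TOR_PROXY="${TOR_PROXY:-127.0.0.1:9150}"

# fetch_via_tor URL
# Fetches URL through the tor proxy. --socks5-hostname also resolves DNS
# through the proxy, so the target hostname is not looked up locally.
fetch_via_tor() {
  curl --silent --show-error --fail --max-time 30 \
       --socks5-hostname "$TOR_PROXY" "$1"
}

# Usage (not run here):
#   fetch_via_tor https://ipinfo.tw/ip
```

The --fail flag makes curl return a non-zero exit code on HTTP errors, which is convenient if the script should retry a request that was blocked.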
How to scale the tor proxy service
Scaling the tor service is super easy with the docker service scale command. We will scale our service up to 10 containers, each with its own tor IP address:
$ docker service scale tor=10
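The new containers need a moment to start and connect to tor. A rough sketch of how a script could wait for them is shown below: it polls docker service ls until the replica count reads "10/10". The function name wait_for_replicas and its retry defaults are assumptions made for this example, and the name filter matches by prefix, so it assumes no other service name starts with "tor".

```shell
# wait_for_replicas SERVICE WANT [TRIES]
# Polls until SERVICE reports WANT/WANT running replicas.
# Returns 0 when ready, 1 if it is still not ready after TRIES polls.
wait_for_replicas() {
  service="$1"
  want="$2"
  tries="${3:-30}"
  while [ "$tries" -gt 0 ]; do
    # The Replicas column looks like "7/10" (running/desired).
    have=$(docker service ls --filter "name=$service" \
             --format '{{.Replicas}}' | head -n 1)
    [ "$have" = "$want/$want" ] && return 0
    tries=$((tries - 1))
    sleep 2
  done
  return 1
}

# Usage (not run here):
#   wait_for_replicas tor 10 && echo "all tor containers are running"
```

A running container does not guarantee that tor has already built a circuit, so the first requests through a fresh container may still fail.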
Now we can verify that we have ten tor addresses by repeating the following command; notice that each response is the IP assigned to a different tor container:
kpatronas@prometheus:~$ curl --socks5-hostname 127.0.0.1:9150 https://ipinfo.tw/ip
109.70.100.34
kpatronas@prometheus:~$ curl --socks5-hostname 127.0.0.1:9150 https://ipinfo.tw/ip
104.244.76.127
kpatronas@prometheus:~$ curl --socks5-hostname 127.0.0.1:9150 https://ipinfo.tw/ip
192.42.116.26
kpatronas@prometheus:~$ curl --socks5-hostname 127.0.0.1:9150 https://ipinfo.tw/ip
185.220.100.255
kpatronas@prometheus:~$ curl --socks5-hostname 127.0.0.1:9150 https://ipinfo.tw/ip
109.70.100.33
kpatronas@prometheus:~$ curl --socks5-hostname 127.0.0.1:9150 https://ipinfo.tw/ip
5.45.106.207
kpatronas@prometheus:~$ curl --socks5-hostname 127.0.0.1:9150 https://ipinfo.tw/ip
185.14.97.176
kpatronas@prometheus:~$ curl --socks5-hostname 127.0.0.1:9150 https://ipinfo.tw/ip
89.58.16.21
kpatronas@prometheus:~$ curl --socks5-hostname 127.0.0.1:9150 https://ipinfo.tw/ip
109.70.100.83
kpatronas@prometheus:~$ curl --socks5-hostname 127.0.0.1:9150 https://ipinfo.tw/ip
109.70.100.34
Did you notice that the last IP is the same as the first? This is because Docker load-balances requests round robin across the containers: once it has cycled through them, it starts all over again.
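Instead of running curl by hand, the same check can be scripted. The sketch below sends a number of requests through the proxy and counts how many distinct IPs came back. The function name count_unique_ips and the TOR_PROXY variable are hypothetical names for this example; the URL is the one used above.

```shell
# Address of the tor SOCKS5 proxy (hypothetical variable name).
TOR_PROXY="${TOR_PROXY:-127.0.0.1:9150}"

# count_unique_ips [N]
# Sends N requests (default 10) through the proxy and prints the number
# of distinct IP addresses that were returned.
count_unique_ips() {
  n="${1:-10}"
  i=0
  while [ "$i" -lt "$n" ]; do
    curl --silent --max-time 30 --socks5-hostname "$TOR_PROXY" https://ipinfo.tw/ip
    echo   # make sure every response ends with a newline
    i=$((i + 1))
  done | sort -u | grep -c .   # count the non-empty unique lines
}

# Usage (not run here):
#   count_unique_ips 10
```

The result can be lower than the number of containers, since two containers may happen to use the same tor exit node.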
I hope you enjoyed this article! :)