Back in 2022, I started a company called Product Sonar that aimed to scrape ecommerce prices from popular websites in the hardware store industry. Sounds pretty simple, right? Pull up Screaming Frog or Scraping Bee and you’re off to the races, right? Wrong.
These sites are protected by the likes of Akamai and CHEQ, companies with a lot of resources and former Israeli defence programmers. Good luck!
But really… like how hard can it be?
Here’s my story of just how hard it was, and how I was actually able to get past the countermeasures with Chrome WebDriver and a bunch of clever programming.
What’s protecting the castle
Most bot detection systems work like this:
To consistently scrape a site, you need to thread the needle and handle all the yellow boxes. So let’s break this down.