Stop Scrapers and Scalpers with Akamai Content Protector
Welcome to the era of evasive web scrapers — and to Akamai’s tailored solution to stop them: Akamai Content Protector.
In ecommerce, scrapers play a critical and productive role in the ecosystem; for example, search bots crawl for new content that you want to show up in public searches, consumer shopping bots highlight your products on comparison sites, bots efficiently gather the latest product information for your partners to share with their customers, and so forth.
Unfortunately, scraper bots can also cause a range of problems, including slower sites, lower conversion rates, competitive losses, and counterfeiting, in which attackers use your content to make their goods look legitimate. And scrapers continue to become more evasive and sophisticated.
Attackers’ increased profit potential
Before the COVID-19 pandemic, scraper bots were generally considered to be unsophisticated and relatively easy to detect. Starting in 2020, however, the profit potential for attackers rose due to several factors, including:
Supply chain shocks and shortages, including everything from basic groceries and baby formula to kitchen appliances and cars
The scarcity of vaccines (and vaccine appointments!) in the early days of the pandemic
The popularity of airline tickets and hotel reservations once everyone started traveling again
The feverish desire for traditionally hot items, like concert tickets, as entertainment-starved populations wanted to go out again
Profit potential drives bot operators’ motivation
These money-making opportunities motivate bot operators to innovate. Scrapers have become more specialized and sophisticated, using unique evasion techniques engineered by multiple attackers who join forces to launch well-funded attacks on highly researched targets.
Scrapers also use techniques that are unique to this class of bot, so they require detections that look specifically for those techniques. In fact, most scraping attacks use a combination of bots and other tools (such as plug-ins) to carry out the attack chain.
What are the detrimental effects of scraping attacks?
Scraping attacks can result in many expensive issues for organizations, including:
Expensive decision-making errors. When firms can’t tell bot traffic from human traffic, they make bad decisions about which products are popular and how to optimize their marketing results.
Higher IT costs. Scrapers run continuously until stopped, so they increase server and delivery costs as organizations serve unwanted bot traffic.
Site performance degradation. Organizations provide an impaired user experience because of slower site and app performance.
Lower sales conversion rates. Consumers despise slow sites. And when scrapers harm site performance, those consumers shop elsewhere. Cart abandonment and fewer return site visits mean lower conversions and fewer sales for transactional sites.
Competitive intelligence/espionage. Competitors scrape information from an organization’s site to undercut its pricing and adjust their own offers, fueling a constant arms race to win customers.
Inventory hoarding/scalping surveillance. Scrapers (sometimes called scalpers in this case) are the first step in an inventory hoarding attack chain. Scalpers ping targeted sites constantly to find available products, then they add them to carts, making those products unavailable for real customers.
Imposters pretending to be the original organization or products. Counterfeiters use scraped content to make fake sites and product catalogs to trick users into thinking they’re buying legitimate goods instead of counterfeits.
Stealing audiences and “eyeballs” from media companies. Attackers can scrape content and place it on their own sites, causing the legitimate organization to lose visitors and potential advertising revenue.
Akamai Content Protector: a specialized solution for scrapers and scalpers
Akamai is introducing Content Protector, a solution developed to stop harmful scrapers without blocking the good scrapers that companies need for their business success. Content Protector includes detections that are specifically tailored to identify damaging scraping attacks.
Content Protector’s tailored capabilities include:
Detection
Risk classification
Response strategy
Detection
Detection includes a set of machine learning–powered detection methods that assess the data collected on both the client side and the server side:
Protocol-level assessment. Protocol fingerprinting evaluates how the client establishes the connection with the server at different layers of the stack and verifies that the negotiated parameters align with those expected from the most common web browsers and mobile applications.
Application-level assessment. This assessment evaluates whether the client can run business logic written in JavaScript. When the client runs JavaScript, Content Protector collects the device and browser characteristics and user preferences (the fingerprint). These data points are then compared with and cross-checked against the protocol-level data to verify consistency; a simplified cross-check sketch appears after this list.
User interaction. Metrics evaluate whether a human is interacting with the client through standard peripherals such as a touch screen, keyboard, and mouse. A lack of interaction, or abnormal interaction, is typically associated with bot traffic.
User behavior. This capability analyzes the user journey through the website. Botnets typically go after specific content, which results in significantly different behavior than legitimate traffic.
Headless browser detection. This is a custom JavaScript that runs client side and looks for indicators left behind by headless browsers, even when they run in stealth mode; a minimal example of this kind of check appears right after this list.
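To make the headless browser detection concrete, here is a minimal sketch of the kind of client-side checks such a script can perform, written in TypeScript for a browser environment. The indicator names and the specific checks are illustrative assumptions, not Content Protector's actual detection logic; real detections look for far subtler inconsistencies, especially against stealth-mode automation.

```typescript
// Minimal illustrative sketch of client-side headless-browser indicator checks.
// NOT Akamai's implementation; indicator names and checks are assumptions.

interface Indicator {
  name: string;
  triggered: boolean;
}

function collectHeadlessIndicators(): Indicator[] {
  const ua = navigator.userAgent;
  return [
    // Automation frameworks often expose navigator.webdriver = true.
    { name: "webdriver-flag", triggered: navigator.webdriver === true },
    // Headless Chrome historically advertises itself in the user agent.
    { name: "headless-user-agent", triggered: /HeadlessChrome/.test(ua) },
    // A browser that claims to be Chrome but exposes no plugins is suspicious.
    {
      name: "no-plugins",
      triggered: /Chrome/.test(ua) && navigator.plugins.length === 0,
    },
    // Real browsers normally report at least one preferred language.
    { name: "no-languages", triggered: navigator.languages.length === 0 },
    // A zero-size screen is typical of some virtualized or headless setups.
    { name: "zero-screen", triggered: screen.width === 0 || screen.height === 0 },
  ];
}

// The triggered indicators would be sent to the server alongside the
// protocol- and application-level fingerprints for cross-checking.
const suspicious = collectHeadlessIndicators().filter((i) => i.triggered);
console.log("suspicious indicators:", suspicious.map((i) => i.name));
```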
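The cross-check between the protocol-level and application-level assessments can likewise be illustrated with a small consistency test: a client whose TLS fingerprint matches one browser family while its User-Agent or JavaScript fingerprint claims another is anomalous. The fingerprint hashes and the browser mapping below are hypothetical placeholders, not real fingerprint data.

```typescript
// Illustrative consistency check between protocol-level and application-level
// signals. The fingerprint keys below are hypothetical placeholders; real
// fingerprint databases are far larger and vendor-specific.

type BrowserFamily = "chrome" | "firefox" | "safari" | "unknown";

// Hypothetical mapping from a TLS fingerprint digest (e.g., a JA3-style hash
// of the ClientHello parameters) to the browser family that produces it.
const TLS_FINGERPRINT_DB: Record<string, BrowserFamily> = {
  "<hypothetical-chrome-tls-hash>": "chrome",
  "<hypothetical-firefox-tls-hash>": "firefox",
};

function familyFromTls(tlsHash: string): BrowserFamily {
  return TLS_FINGERPRINT_DB[tlsHash] ?? "unknown";
}

function familyFromUserAgent(userAgent: string): BrowserFamily {
  if (/Firefox\//.test(userAgent)) return "firefox";
  if (/Chrome\//.test(userAgent)) return "chrome";
  if (/Safari\//.test(userAgent)) return "safari";
  return "unknown";
}

// An anomaly is raised when the two layers disagree about what the client is.
function crossCheck(tlsHash: string, userAgent: string): string[] {
  const anomalies: string[] = [];
  const tlsFamily = familyFromTls(tlsHash);
  const uaFamily = familyFromUserAgent(userAgent);
  if (tlsFamily !== "unknown" && uaFamily !== "unknown" && tlsFamily !== uaFamily) {
    anomalies.push(`tls-ua-mismatch: ${tlsFamily} vs ${uaFamily}`);
  }
  return anomalies;
}

// Example: a scraper using a non-browser TLS stack while spoofing a Chrome
// User-Agent typically produces an unknown or mismatching TLS fingerprint.
console.log(
  crossCheck("<hypothetical-firefox-tls-hash>", "Mozilla/5.0 ... Chrome/120.0 Safari/537.36"),
);
```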
Risk classification
Content Protector provides a deterministic and actionable classification of the traffic (low risk, medium risk, or high risk) based on the anomalies found during the evaluation. The classification is designed so that traffic labeled high risk carries a low false positive rate.
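As a rough illustration of how anomalies like the ones flagged in the sketches above could be folded into a deterministic label, consider the following scoring sketch. The anomaly names, weights, and thresholds are assumptions for illustration only and do not reflect Content Protector's actual model.

```typescript
// Illustrative sketch of turning detected anomalies into a deterministic
// low/medium/high risk label. Weights and thresholds are assumptions only.

type RiskLevel = "low" | "medium" | "high";

// Hypothetical per-anomaly weights; strong indicators weigh more.
const ANOMALY_WEIGHTS: Record<string, number> = {
  "webdriver-flag": 5,
  "tls-ua-mismatch": 4,
  "no-user-interaction": 3,
  "abnormal-navigation-pattern": 2,
};

function classify(anomalies: string[]): RiskLevel {
  const score = anomalies.reduce(
    (sum, name) => sum + (ANOMALY_WEIGHTS[name] ?? 1),
    0,
  );
  // Conservative thresholds: only strong, corroborated evidence reaches
  // "high", which keeps the false positive rate of that bucket low.
  if (score >= 8) return "high";
  if (score >= 4) return "medium";
  return "low";
}

console.log(classify([]));                                   // "low"
console.log(classify(["no-user-interaction"]));              // "low"
console.log(classify(["webdriver-flag", "tls-ua-mismatch"])); // "high"
```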
Response strategy
The product comprises a set of response strategies, from the simple monitor and deny actions to more advanced techniques, such as a tarpit, which purposely delays incoming connections, and various types of challenge actions. Crypto challenges are generally more user-friendly than CAPTCHA challenges when the goal is to lower the false positive rate.
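To show how a tarpit differs from a plain deny, here is a minimal sketch of an origin-side response handler, assuming a Node.js HTTP server and a hypothetical x-risk-label request header carrying the classification from the previous step. In a real deployment these responses are applied at the edge rather than in application code.

```typescript
// Illustrative response strategies: deny, tarpit (deliberate delay), or serve
// normally, chosen by risk label. A hypothetical sketch for a Node.js origin.

import * as http from "node:http";

type RiskLevel = "low" | "medium" | "high";

// Placeholder: in practice the risk label would come from the bot detection
// layer (e.g., set upstream as a request header or looked up per session).
function riskOf(req: http.IncomingMessage): RiskLevel {
  const label = req.headers["x-risk-label"];
  return label === "high" || label === "medium" ? label : "low";
}

const server = http.createServer((req, res) => {
  switch (riskOf(req)) {
    case "high":
      // Simple deny: reject the request outright.
      res.writeHead(403);
      res.end("Forbidden");
      break;
    case "medium":
      // Tarpit: hold the connection open and answer only after a long delay,
      // which wastes the scraper's resources and slows the scraping run.
      setTimeout(() => {
        res.writeHead(200);
        res.end("OK (delayed)");
      }, 10_000);
      break;
    default:
      res.writeHead(200);
      res.end("OK");
  }
});

server.listen(8080);
```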