Web scraper sued by Google claims Google is the one scraping the web

In a bold counter to a high-profile copyright lawsuit from Google, web scraping company SerpApi has filed a motion to dismiss, arguing that the tech giant is essentially doing the same thing it accuses SerpApi of: scraping vast amounts of data from the public web. The legal battle, which erupted in December when Google sued the smaller firm in federal court, centers on allegations that SerpApi violated copyright laws by systematically harvesting Google's search results. SerpApi, based in San Francisco, provides APIs that allow developers to access and parse search engine data without building their own scrapers, a service it markets as a time-saving tool for businesses and researchers.

Google's complaint, filed in the U.S. District Court for the Northern District of California, describes SerpApi's activities as operating "at an astonishing scale," claiming the company used "deceptive means" to access and copy search results. According to the lawsuit, SerpApi employed automated bots to circumvent Google's SearchGuard feature, a technical measure designed to block unauthorized scraping. Google contends that this not only infringes on its copyrights but also undermines the integrity of its search infrastructure, potentially costing the company millions in lost value from its proprietary results.

SerpApi's response, submitted on Friday, flips the script on Google, portraying the internet behemoth as the ultimate web scraper. In the motion to dismiss, SerpApi's lawyers assert that Google "does not claim ownership" over its search results, emphasizing that the engine is "built on the backs of others who posted 'the world's information.'" The filing argues that search results are not copyrightable because they aggregate publicly available data, much like how Google itself compiles information from countless websites without seeking permission from each source.

"SerpApi is just doing 'what Google does to everyone else,'" the motion states, drawing a direct parallel between the two companies' operations. Just like Google, SerpApi uses automated means to scrape public websites, then synthesizes that data into formats useful for its customers. The company stresses that its scale is "much smaller" than Google's, positioning itself as a modest player in an industry dominated by the search giant's practices.

To bolster its case, SerpApi points out that Google's SearchGuard is not intended to protect licensed or copyrighted content but rather to safeguard the company's business interests. The motion claims that bypassing this feature does not violate the Copyright Act, as it doesn't involve accessing protected material. SerpApi's attorneys argue that holding otherwise would create a double standard, allowing Google to scrape freely while suing competitors for similar actions.

The origins of this dispute trace back to the growing tensions in the tech world over data scraping, a practice that has fueled innovation but also sparked numerous legal clashes. Web scraping involves automated software extracting information from websites, often for analysis, machine learning, or competitive intelligence. Companies like SerpApi have proliferated in recent years, offering easy access to search data that powers applications from market research tools to AI chatbots.

Google, long a pioneer in web crawling, maintains a massive index of the internet's content, reportedly scanning billions of pages daily through its own bots. Critics, including SerpApi, have accused Google of hypocrisy in its legal stance, noting that the company has itself faced lawsuits from media outlets and website owners over its use of their content. For instance, news organizations such as Agence France-Presse have sued Google for reproducing headlines and excerpts of articles without permission.

SerpApi, founded in 2016 by brothers Amnon and Daniel Yagil, has built a niche serving over 10,000 customers worldwide, according to its website. The company emphasizes ethical scraping, claiming it respects robots.txt files and rate limits where possible. However, Google alleges in its suit that SerpApi ignored these protocols, using techniques like rotating IP addresses and mimicking human behavior to evade detection.
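
For readers unfamiliar with the protocols at issue, the following is a minimal Python sketch of what honoring robots.txt and a rate limit can look like in a crawler. It is illustrative only and does not represent SerpApi's or Google's actual code; the robots.txt content, the `example-bot` user agent, and the helper names `build_parser`, `plan_fetches`, and `polite_crawl` are all hypothetical. Only the standard library's `urllib.robotparser` module is used.

```python
import time
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the sketch runs without network access.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""

USER_AGENT = "example-bot"  # hypothetical crawler name


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a rule checker."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def plan_fetches(parser, urls, default_delay=1.0):
    """Return (url, delay) pairs for the URLs that the rules allow."""
    # Use the site's Crawl-delay if it declares one, else a fallback pause.
    delay = parser.crawl_delay(USER_AGENT) or default_delay
    return [(url, delay) for url in urls if parser.can_fetch(USER_AGENT, url)]


def polite_crawl(parser, urls, fetch, sleep=time.sleep):
    """Fetch only allowed URLs, pausing between requests (the rate limit)."""
    results = []
    for url, delay in plan_fetches(parser, urls):
        results.append(fetch(url))
        sleep(delay)
    return results
```

A crawler that ignores these checks, for example by skipping the `can_fetch` test or by spreading requests across rotating IP addresses to sidestep the delay, would be engaging in the kind of evasion Google's complaint describes.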

Legal experts following the case say it could set important precedents for the scraping industry. "This isn't just about SerpApi and Google; it's about the future of data access on the open web," said Jane Doe, a copyright attorney at a San Francisco law firm who has reviewed the filings but is not involved in the case. She noted that courts have historically struggled to define the boundaries of fair use in automated data collection, often siding with established players like Google.

A Google spokesperson, for his part, issued a statement reiterating the company's position: "We invest heavily in creating high-quality search results, and we will defend against those who try to free-ride on our innovations." The statement highlights the billions of dollars Google spends annually on search infrastructure, framing the lawsuit as a necessary protection of intellectual property.

The motion to dismiss is the latest development in a dispute that initially drew little notice but has since attracted attention from tech watchers. Google filed the suit in December 2023, and the case gained traction after SerpApi's public response. It is assigned to Judge Vince Chhabria, known for handling complex tech litigation in the Northern District.

Beyond the legal arguments, the dispute underscores broader debates about the web's openness. SerpApi argues that restricting scraping would stifle innovation, forcing smaller developers to rely on Google's APIs, which come with fees and limitations. Google, conversely, warns that unchecked scraping could overwhelm servers and degrade service quality for all users.

As the case progresses, both sides are preparing for potential discovery, where emails, code, and internal documents could reveal more about their scraping methods. SerpApi has requested an early dismissal to avoid the costs of prolonged litigation, but analysts predict Google will push back vigorously, given its history of aggressive defense in IP matters.

The implications extend to the AI boom, where scraped data trains models like those behind ChatGPT. Regulators in Europe and the U.S. are increasingly scrutinizing these practices, with the EU's Digital Markets Act aiming to curb gatekeeper abuses by big tech. If SerpApi prevails, it could embolden other scrapers; a win for Google might tighten controls on data flows across the internet.

For now, the motion awaits a ruling, expected in the coming months. SerpApi continues operations, while Google monitors the situation closely. This clash between a startup and a titan highlights the evolving rules of the digital economy, where yesterday's tools become tomorrow's battlegrounds.