Perplexity, an AI search engine startup, is embroiled in controversy following a recent investigation by Wired. The report alleges that Perplexity has been scraping content from websites that explicitly prohibit such actions.
As Fast Company reported, Perplexity’s “answer engine” operates by crawling vast amounts of web content to build a searchable index of information. Instead of typing keywords into a search box, users pose questions through Perplexity’s web portal or app and receive narrative answers that include citations and links to sources gathered from its web crawls.
Many websites use the Robots Exclusion Protocol, typically a robots.txt file, to tell web crawlers which content they may not access. While compliance with the protocol is voluntary, Wired and independent researchers claim to have evidence that Perplexity has disregarded these directives and continued to scrape content from sites that have explicitly forbidden it.
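To make the mechanism concrete, here is a minimal sketch of how a well-behaved crawler might consult a site's robots.txt rules before fetching a page, using Python's standard urllib.robotparser module. The rules, the crawler name "ExampleBot", and the example.com URLs are hypothetical and are not taken from Wired's or Perplexity's actual configuration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: one named crawler is blocked from the
# whole site, and every other crawler is blocked only from /private/.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("ExampleBot", "OtherBot"):
        for url in ("https://example.com/story", "https://example.com/private/x"):
            print(agent, url, is_allowed(ROBOTS_TXT, agent, url))
```

Nothing in this check is enforced by the website. The crawler itself decides whether to run it and honor the answer, which is why compliance is described as voluntary.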
Aravind Srinivas, CEO of Perplexity, addressed these accusations in a recent interview, stating, “Perplexity has not ignored the Robots Exclusion Protocol, nor have we misled anyone.” He emphasized that Perplexity relies not only on its own web crawlers but also on third-party providers for content indexing services.
Srinivas said that the mysterious web crawler identified by Wired does not belong to Perplexity but to a third-party supplier specializing in web crawling and indexing services. Citing confidentiality agreements, Srinivas declined to name the supplier, and he was noncommittal when asked whether Perplexity had immediately told the supplier to stop scraping Wired’s content.
Regarding the Robots Exclusion Protocol, which dates back to 1994, Srinivas pointed out that it is not a legal framework. He argued that the emergence of AI necessitates a new working relationship between content creators or publishers and websites hosting their content.
Wired further alleged that by prompting Perplexity’s answer engine with titles or content from Wired articles, users could provoke Perplexity into closely paraphrasing or inaccurately rewriting Wired’s reports. In one instance, Perplexity erroneously claimed that a California police officer had committed a crime.
In response, Srinivas suggested that Wired designed prompts to elicit such behavior from the Perplexity tool, behavior that normal users would not typically encounter. He added, “We have never claimed that we do not generate these results.”
In early June, Forbes accused Perplexity of content theft. This followed Perplexity’s launch of a new product named “Pages” in May, allowing users to create articles or blog posts based on a series of questions posed to the answer engine or a single prompt on a specific topic. Users can add AI-generated or uploaded images and adjust text or formatting before publishing online.
Pages by Perplexity featured content from exclusive Forbes reports without proper attribution. Perplexity even created an AI-narrated podcast based on Forbes’ reporting, also lacking proper sourcing.
Since its launch, Perplexity has made accurate source attribution a core principle of its Pages product. Following Forbes’ concerns, Srinivas informed Fast Company that updates to Pages now include source tags in generated article texts.
Srinivas has consistently stated that Perplexity’s success hinges on fostering a healthy internet ecosystem. “We are happy to build a business with lower market value and profit margins, as long as we can profit and succeed, ensuring victory across the internet,” he told audiences at Fast Company’s Most Innovative Companies Awards in May. “If people cannot create new content online, Perplexity is useless.”
He disclosed that Perplexity is negotiating revenue-sharing agreements with selected publishers but did not name them, so it is unclear whether Condé Nast (Wired’s owner) or Forbes is involved in these discussions. Wired’s findings regarding content scraping and indexing could push Perplexity to reach fair agreements with publishers sooner.
While publishers express concerns over Perplexity’s use of their content, the startup continues to garner support as it challenges Google’s dominance in search with a novel approach. Nevertheless, losing further support could jeopardize Perplexity’s continued development.