Reddit is blocking the Wayback Machine from archiving users’ posts

Reddit will reportedly prevent the Internet Archive’s Wayback Machine from saving user posts. The social media platform says the measure is designed to stop AI companies from scraping archived comments to train their algorithms. Or at least, to prevent them from doing so without paying.
As The Verge reports, Reddit will prevent the Wayback Machine from archiving users’ post detail pages, comments, and profiles. The Reddit homepage is still fair game, meaning that the titles of top posts will still be preserved each day, but nothing else will be indexed in the Internet Archive’s digital library.
Reddit frames this decision as a measure to protect its users, noting that AI companies violated its policies by scraping data from the Wayback Machine.
“Until [the Internet Archive is] …,” Reddit spokesman Tim Rathschmidt told The Verge.
Despite such assertions, Reddit has shown that it is willing to hand users’ data over to AI companies as long as they pay. In 2024, Reddit blocked search engines such as Microsoft Bing and DuckDuckGo from crawling its platform. But a $60 million deal between Reddit and Google allowed the tech giant to continue training its AI algorithms on Redditors’ data and surfacing it in searches. Reddit also reached a similar deal with ChatGPT creator OpenAI.
Reddit CEO Steve Huffman told The Verge last August that without such agreements, the company has no say in or knowledge of how its data is displayed and what it is used for, which has put it in the position of blocking those who are unwilling to agree to terms on how that data may or may not be used.
Ironically, Reddit users themselves have little say in how companies use their public posts, because Reddit does not allow them to opt out of such deals or of having their data used to train AI algorithms. The only way for Redditors to prevent such use is to stop posting to the platform entirely, although this still doesn’t address their previous posts.
While concern for user privacy may be a factor, Reddit’s decision to block the Wayback Machine seems more clearly motivated by money. AI companies have evidently been scraping Reddit posts for free, and cutting off such access will allow the social media platform to license that data for a large fee.
“…,” Huffman told The New York Times in 2023. “But we don’t need to donate all of this value to some of the largest companies in the world for free.”
Reddit has been working to reduce its financial losses in recent years, resulting in unpopular changes such as charging developers to access its application programming interface (API), eliminating the ability to opt out of ad personalization, and planning paid subreddits. Unfortunately, Reddit still has a long way to go to claw its way out of the red. The self-proclaimed “heart of the internet” reported a net loss of US$484.3 million in 2024, more than five times its net loss in 2023.



