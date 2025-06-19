-
kyu3a Ambassador
Hi, I have questions. What does the Vivaldi staff think about Vivaldi Social posts being used to train generative AI?
Is this instance blocking crawlers that collect training data for generative AI, or at least telling them not to crawl?
Pathduck Moderator Soprano Supporters
@kyu3a Looks like they don't want GPTbot at least...
λ curl "https://social.vivaldi.net/robots.txt"
# See http://www.robotstxt.org/robotstxt.html for documentation on how to use the robots.txt file
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /media_proxy/
Disallow: /interact/
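If you want to check what those rules mean for a given bot, here is a minimal sketch using Python's standard `urllib.robotparser`. It assumes the robots.txt content is exactly what the curl call above returned. The `can_crawl` helper, the `SomeOtherBot` name and the `/about` path are made up for illustration. Note that this only tells you what a well-behaved crawler should do; it does nothing to stop one that ignores the file.

```python
from urllib.robotparser import RobotFileParser

# Copy of the rules returned by the curl call above (assumed unchanged).
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /media_proxy/
Disallow: /interact/
"""


def can_crawl(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url.

    Hypothetical helper for illustration: it parses the text locally
    instead of fetching it from the server.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    base = "https://social.vivaldi.net"
    # GPTBot is disallowed everywhere on the site.
    print("GPTBot /about:", can_crawl("GPTBot", base + "/about"))
    # Any other bot falls under the "*" group, which only blocks two paths.
    print("SomeOtherBot /about:", can_crawl("SomeOtherBot", base + "/about"))
    print("SomeOtherBot /media_proxy/1:", can_crawl("SomeOtherBot", base + "/media_proxy/1"))
```

So with these rules only GPTBot is asked to stay away entirely; every other crawler is only asked to skip `/media_proxy/` and `/interact/`.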
Problem is, there are hundreds of these friggin' evil bots, and they keep coming, spamming the web servers with requests for their VC-funded techbro owners. And there's no guarantee they will even respect robots.txt. The only sure fix is to block their IPs, but they often run in cloud data centers, which makes that hard to keep up with.
Here's a REAL blocklist:
https://www.jwz.org/robots.txt
Only real solution is to burn these companies to the ground before they destroy the web and the planet.
greenenemy
Just assume that everything you put on the internet can and will be used to train AI, legally or not. It is what it is.
@greenenemy, any search engine does this too. There are always crawler and spider bots from search engines collecting data on every publicly accessible site, including this forum. Nothing new.