This isn’t good, considering how large a share of most people’s searches Reddit tends to make up. On the other hand, I’ve had Reddit blocked in my /etc/hosts file for a few months now, so at least nothing will change for me with Mojeek.
Yeah, off the back of this a lot of people consider it (or at least say they consider it) a positive, with comments like:
I don’t see this as a loss at all.
As @Colin said to a journo at Ars Technica, quoted in this piece:
The CEO added that while being blocked by Reddit alone “is not a huge deal,” he is concerned about the precedent it could set. “Search engines are the main traffic source for most websites, and a spreading of this behavior will further choke off traffic. And smaller sites will be impacted even more than large sites,” he said.
I agree; what this possibly portends is very much more important than this one website.
It’s not a lot different from paywalling in one respect. We might see a further increase of the big platforms abandoning the web with walling of gardens, but the medium and long tail can’t practically act like them, and will still need and appreciate the search engine traffic.
Reddit have acted with a blunt instrument which comes down on both AI crawlers/scrapers (understandably) and search engine crawlers. Others have in the main been less blunt, blocking AI crawlers while still letting search engine crawlers in. One would hope that the more intelligent and nuanced behaviour will prevail, and also that new protocols emerge which are respected.
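To make the blunt vs. nuanced difference concrete, here is a rough sketch using Python's standard `urllib.robotparser`. The two robots.txt bodies are made up for illustration and are not Reddit's (or anyone's) actual file; `GPTBot` and `CCBot` are just used as examples of AI-related crawler user agents, and the `allowed` helper and the example.com URL are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Illustrative only: a "blunt" policy that shuts out every crawler.
BLUNT = """\
User-agent: *
Disallow: /
"""

# Illustrative only: a "nuanced" policy that blocks some AI crawlers
# by name but leaves everything else (search engines included) alone.
NUANCED = """\
User-agent: GPTBot
User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""


def allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    url = "https://example.com/r/some-thread"  # hypothetical URL
    for name, policy in (("blunt", BLUNT), ("nuanced", NUANCED)):
        for agent in ("GPTBot", "MojeekBot"):
            print(name, agent, allowed(policy, agent, url))
```

Under the blunt file a well-behaved search crawler is turned away along with everything else; under the nuanced one only the named agents are. Either way this only binds crawlers that choose to honour robots.txt.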
i hate that Moj obeys crawl restrictions (robots.txt, etc.) but i understand the reason for it, at least to some degree, legal being one
the problem is that sites like reddit hurt people by pulling this crap, whereas a crawler that tells them to stick it where the sun don’t shine isn’t hurting anyone, plus they’re helping people
i can see why a corporation can’t do this, which is why an open-source, uncensorable, decentralized, p2p index is so desperately needed (yeah, there’s a few, but none that have a lot of content)
we do now?
the only reasonably fair and open net is the “dark” web