The shady world of Brave selling copyrighted data for AI training

This article relates to the Brave Search API. It doesn’t go into much detail, but it appears to be about using the Brave Search API to return a “snippet” of 150+ words, which you can then feed to an LLM.

I know for a fact that Wikipedia operates under a CC BY-SA 4.0 license, which explicitly states that if you’re going to use the data, you must give attribution. Search engines can get away with it because linking back to the Wikipedia article on the same page as the search results is considered attribution.

But in the case of Brave, not only are they disregarding the license, they’re also charging money for the data and then giving third parties “rights” to that data.


They posted an update to this based on Brave’s reply.