Google Search Really Has Gotten Worse, Researchers Find

gnome · 17 January 2024 23:17

This is an article on the topic of search engines (the ones tested are Google, Bing, and DuckDuckGo) and how they are being overwhelmed by spam. The article actually concludes that Google is doing a better job of managing spam than Bing-based search engines.

AI-generated content seems to be making managing spam a lot harder. Predictably. Somewhat ironically, some users are turning to LLMs like ChatGPT to get answers to questions they would normally submit to a search engine.

I’m somewhat undecided on how bad the spam issue is with Mojeek. It was bad a few years ago, when I would get low-quality results at the top and sometimes not what I was looking for at all, but the results have improved a lot.

Josh · 18 January 2024 18:44

This is an interesting one, as this is academic work being done around something which I’ve seen a lot of people say, but not investigate in any concerted way. I read the piece itself, but the first time I saw it I actually went straight for the research, which is here. It is all of 16 pages with two being references, and it is not in any way hard to get through so I would recommend it.

They monitored Google, Bing, and DDG for just under 7,500 product review queries. It’s a decent sample for sure, and it is pulled from an area which is quite spam-prone.

Focussing on the product review genre, we find that only a small portion of product reviews on the web uses affiliate marketing, but the majority of all search results do.

We further observe an inverse relationship between affiliate marketing use and content complexity, and that all search engines fall victim to large-scale affiliate link spam campaigns. However, we also notice that the line between benign content and spam in the form of content and link farms becomes increasingly blurry—a situation that will surely worsen in the wake of generative AI.

Affiliate links tend to sit on pages which have not had much effort put into them (if we can use complexity as a proxy for effort), and yet this study shows them turning up a lot in the results. The thought is put into the ranking and not into the quality, a problem for sure.

SEO is a double-edged sword. On the one hand, it makes high-quality pages easier to find, but is on the other hand also a sharp tool for pushing up low-quality results in the search rankings.

It also looks like they had to use Startpage to get the Google results; even researchers get captcha’d. When I saw that I thought, “It’s interesting they had that information on SP but not on DDG.”

The SERPs were scraped repeatedly over the course of a year from Startpage (a privacy frontend to Google), Bing, and DuckDuckGo.

but then

Although DuckDuckGo claims to utilize many different data sources, we found the results to be extremely similar to Bing.

I’ve set a note to myself to go back to this because it’s got a lot in it. If anyone else has given it a read then I’m interested in takes.

mike · 20 January 2024 10:18

If we limit the discussion to an improved algorithm, I don’t see an easy solution. I agree that search engines fundamentally understand our information needs. But measures of page quality appear to correlate with spam.

I think Mojeek Focus is a good tool. But it needs work.

Josh · 22 January 2024 10:45

Any particular features or changes you’d like to see?

mike · 22 January 2024 11:37

For example, I can’t add a path to a host. So, I can’t put the URL https://www.youtube.com/@GamersNexus in a Focus. I would frame that as saying Focus is not compatible with “creators”, which is where a lot of editorial content comes from these days.
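
A minimal sketch of what host-plus-path matching could look like, assuming a Focus is just a list of entries that are either a bare host or a host followed by a path. The function name `in_focus` and the entry format are hypothetical and are not Mojeek's actual implementation.

```python
from urllib.parse import urlsplit


def in_focus(url: str, entries: list[str]) -> bool:
    """Return True if url falls under any host or host/path entry (hypothetical format)."""
    target = urlsplit(url)
    t_host = (target.hostname or "").lower()
    t_path = target.path.rstrip("/")
    for entry in entries:
        # Accept entries written with or without a scheme.
        parsed = urlsplit(entry if "://" in entry else "//" + entry)
        e_host = (parsed.hostname or "").lower()
        e_path = parsed.path.rstrip("/")
        if t_host != e_host:
            continue
        # An empty path means the whole host; otherwise match only on a
        # path-segment boundary so "/@GamersNexus" does not match "/@GamersNexusFake".
        if not e_path or t_path == e_path or t_path.startswith(e_path + "/"):
            return True
    return False


entries = ["https://www.youtube.com/@GamersNexus"]
print(in_focus("https://www.youtube.com/@GamersNexus/videos", entries))
print(in_focus("https://www.youtube.com/@GamersNexusFake", entries))
```

Prefix matching like this only covers pages that live under the creator's path. Individual video URLs on the same host would need some other signal, which this sketch does not attempt.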

I would also create a default “Product Review” Focus; reduce the steps to “copy and edit” a Focus; and include an interactive guide.

I would also fundamentally change the view for Focus. I would move the results into the main result pages, and I would tag entries with a Focus. This is as opposed to having a separate, filtered view, as now. The purpose would be to make it a lot easier to add and remove sites (URLs) from a Focus. Clicking on a tag (beneath a result) would then take you to the current, filtered view for that tag.
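
The tagging idea can be sketched roughly as follows, assuming each Focus is a named set of hosts and results are plain URLs. The names `tag_results` and `filter_by_tag` and the example hosts are hypothetical, purely for illustration.

```python
from urllib.parse import urlsplit


def tag_results(result_urls, focuses):
    """Pair each result URL with the sorted names of the Focuses that list its host."""
    tagged = []
    for url in result_urls:
        host = (urlsplit(url).hostname or "").lower()
        tags = sorted(name for name, hosts in focuses.items() if host in hosts)
        tagged.append((url, tags))
    return tagged


def filter_by_tag(tagged, tag):
    """The filtered view: keep only the results carrying the clicked tag."""
    return [url for url, tags in tagged if tag in tags]


# Hypothetical Focus lists and results.
focuses = {
    "Product Review": {"reviews.example"},
    "Reference": {"dictionary.example", "reviews.example"},
}
results = [
    "https://reviews.example/best-kettles",
    "https://spam.example/top-10-kettles",
    "https://dictionary.example/kettle",
]
tagged = tag_results(results, focuses)
print(tagged)
print(filter_by_tag(tagged, "Product Review"))
```

Untagged results stay in the main list, so the tags add information without hiding anything until one is clicked.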

My basic assumption is that Focus is good for reference (dictionary; atlas), but it is less useful for one-off searches like products. Especially, without a default product review Focus, it is unlikely someone will take the time to assemble a Focus to use it once.

snatchlightning · 18 April 2024 03:14

This sounds like Goodhart’s law: “When a measure becomes a target, it ceases to be a good measure”. I’ve read that recipes have become lengthy due to Google using the word count as a metric.

I don’t see an easy solution either, but it might require thinking outside the box, that is, not relying on the algorithm alone. I suggest something similar to spam filters, adding observation of behaviors and user input to the mix. This will not completely eliminate the problem, but it should help a lot in reducing the low-quality results.
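
As a rough illustration of mixing user input into ranking, here is a sketch that scales an algorithmic score down by a smoothed user-report rate. The function names, the `weight` and `prior_impressions` parameters, and the example numbers are all assumptions made for illustration, not how any real search engine works.

```python
def adjusted_score(rank_score, reports, impressions, weight=0.5, prior_impressions=20):
    """Scale an algorithmic score down by a smoothed user-report rate."""
    # Adding prior_impressions to the denominator means a page shown only a
    # few times is not heavily penalized by one or two reports.
    report_rate = reports / (impressions + prior_impressions)
    return rank_score * (1.0 - weight * min(report_rate, 1.0))


def rerank(results):
    """Sort (url, rank_score, reports, impressions) tuples by adjusted score, best first."""
    return sorted(results, key=lambda r: adjusted_score(r[1], r[2], r[3]), reverse=True)


# Hypothetical data: the heavily reported page starts with the higher raw score.
results = [
    ("https://spam.example/top-10", 0.9, 400, 500),
    ("https://reviews.example/kettles", 0.8, 2, 500),
]
for url, *_ in rerank(results):
    print(url)
```

Like any measure, report counts can themselves be gamed once they become a target, so a signal like this would need its own safeguards.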

Your conclusion matches this article’s, with curating content being the solution suggested to combat the deluge of SEO garbage, essentially giving back some control in decision-making to humans. Focus does exactly this by having lists of websites compiled by people. Another would be collecting forums and blogs on separate tabs, like what Marginalia Search does.