Unbiased = un-relevant - what's the solution?

i’m not here to knock Mojeek - it’s a much needed indexing search engine in a world full of filter bubbles, unethical Big Tech corporations with strong ties to intelligence communities, and radical political persuasions, however, Mojeek is currently not a great daily driver search engine for me

anyone who remembers Google in the early days probably found it as refreshing as i did - it actually returned relevant results! (no longer true depending on what you’re searching for)

i don’t know how Mojeek orders results, but it doesn’t always order them in the way one might expect

for example, if i search for ‘ford car’ (unquoted) in the U.S., i would expect ford.com to be the first result as it is on Bing and Google, however the domain doesn’t appear until the 2nd page of results on Mojeek

with sensitive stuff – political for example – i certainly don’t want to see a carbon copy of Google’s biased/censored/filtered/de-ranked results, but i also want to find the information i need quickly

does anyone else see this as a problem and, if so, how can it be addressed?

unlike the ‘ford car’ search, following are 2 examples where Mojeek does very well when searching for a highly politically charged subject, the holocaust

with the single keyword ‘holocaust’, all 3 engines (Bing, Google, Mojeek) return expected results:

Bing results
Google results
Mojeek results

however if we search for ‘holocaust revisionism’ (unquoted), Bing and Google fail miserably while Mojeek returns much better results - please ignore the sensitivity of this subject - it doesn’t matter what one believes, the point here is that, when searching for ‘holocaust revisionism’, we should be seeing results about revisionism, not the exact opposite, anti-revisionism, as is the case with Bing and Google:

Bing results
Google results
Mojeek results

so what needs to change in order to have Mojeek return great results in the cases of both sensitive subject matter and everything else?

i’ve been thinking about this for a while and i have no great answers, but i have brought this up with Josh in email and the following was my idea…

my humble suggestion would be to return the results exactly as they are by default, thus no additional server overhead, but offer an option, both in prefs and on the search page, to sort the existing results in different ways, such as according to popularity (user clicks) which would not compromise privacy but which would, in many/most cases, put the desired result in the top 1, 5 or 10
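A minimal sketch of that re-sorting idea in Python, assuming the engine already holds an aggregate click count per URL. The `clicks` mapping, the function name, and the example URLs are all hypothetical; nothing here reflects how Mojeek actually stores or ranks results.

```python
from typing import Dict, List


def rerank_by_clicks(results: List[str], clicks: Dict[str, int]) -> List[str]:
    """Re-order an existing result list by aggregate click count (highest first).

    sorted() is stable, so results with equal click counts (including
    results with no recorded clicks) keep the engine's original order.
    """
    return sorted(results, key=lambda url: -clicks.get(url, 0))


# Hypothetical default ordering returned by the engine.
default_order = [
    "example-dealer.test/ford",
    "example-wiki.test/Ford",
    "ford.com",
    "example-news.test/ford",
]

# Hypothetical aggregate (not per-user) click counts.
click_counts = {"ford.com": 120, "example-wiki.test/Ford": 45}

by_popularity = rerank_by_clicks(default_order, click_counts)
print(by_popularity)
```

Because only the already-returned list is re-ordered, the default ranking stays untouched and the option could be applied on top of it.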

I can’t speak for Mojeek, but I don’t think there’s such a thing as an “unbiased” search engine. If an engine was “unbiased”, rankings wouldn’t exist. There are various ways of determining relevance; AFAIK Google’s algorithm tends to rely on biases it believes you have. If your algorithm determines relevance based solely on SEO keywords, you’re going to end up with a lot of junk. See part of Seirdy’s post on Search Engines:

No search engine is truly unbiased. Most engines’ ranking algorithms incorporate a method similar to PageRank, which biases them towards sites with many backlinks. Search engines have to deal with unwanted results occupying the confusing overlap between SEO spam, shock content, and duplicate content. When this content’s manipulation of ranking algos causes it to rank high, engines have to address it through manual action or algorithm refinement. Choosing to address it through either option, or choosing to leave it there for popular queries after receiving user reports, reflects bias.
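To illustrate the backlink bias described in that quote, here is a simplified PageRank-style power iteration in Python. It is only a sketch: the toy link graph, the damping factor of 0.85, and the iteration count are assumptions for illustration, and no real engine's ranking is this simple.

```python
from typing import Dict, List


def pagerank(links: Dict[str, List[str]], damping: float = 0.85,
             iterations: int = 50) -> Dict[str, float]:
    """Simplified PageRank: each page shares its rank among the pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with the small "random jump" share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Toy web: three pages link to "hub", and "hub" links back to "a" only.
toy_web = {"a": ["hub"], "b": ["hub"], "c": ["hub"], "hub": ["a"]}
ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

The page with the most backlinks ends up on top regardless of what it says, which is the kind of built-in bias the quote is pointing at.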

I think many would find Mojeek’s biases less objectionable than Google/Bing’s if they viewed a search engine purely as a tool. If you’re searching for Russian news sites, even if the content is purely a reproduction of the Kremlin’s script, you probably want to see Russian news sites.

That said, yes, I don’t tend to get what I’m looking for quickly depending on the complexity. Colin suggested one way this could be improved in the future in this great response, where he shares the concept of personalisation options where you can promote/demote sites based on your preferences and more:

On our chosen path of independence, we have been building the ability to explore premium/subscription services. We will have specific news to share on that very soon; a little taster, if you have not seen before, is here. It is a new product that we restarted work on last year. Version 1 is now nearing completion. The roots of this go back to February 2006 when Mojeek “Personal Search” was announced. More details were reported on here in September 2006. This project was put aside to concentrate on building the Mojeek general web search engine that you use, and we run today. The new product is something that we are intending to bring to this community first, ahead of any general release and after internal testing.
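The promote/demote personalisation idea mentioned above could look roughly like this in Python. This is a hedged sketch only: the function name, the `promoted`/`demoted` sets, and the example URLs are hypothetical, and it says nothing about how Mojeek's own product works.

```python
from typing import AbstractSet, List
from urllib.parse import urlparse


def personalise(results: List[str],
                promoted: AbstractSet[str] = frozenset(),
                demoted: AbstractSet[str] = frozenset()) -> List[str]:
    """Move results from promoted hosts up and from demoted hosts down.

    Within each group the engine's original order is kept, because
    sorted() is stable.
    """
    def bucket(url: str) -> int:
        host = urlparse(url).hostname or ""
        if host in promoted:
            return 0
        if host in demoted:
            return 2
        return 1

    return sorted(results, key=bucket)


# Hypothetical result list and user preferences.
engine_results = [
    "https://spam.example/ford",
    "https://news.example/ford",
    "https://ford.com/",
]
my_results = personalise(engine_results,
                         promoted={"ford.com"},
                         demoted={"spam.example"})
print(my_results)
```

Since the preferences are just two sets of hostnames, they could in principle be kept on the user's side rather than tied to a profile.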

Another issue is that, while Mojeek’s index is large (5 billion pages!), Google and Bing have much larger indexes, so they have more to work with. For instance, Mojeek does not index any pages in Japanese, so it won’t return any results when you use a Japanese search term (it gives you a 403 Forbidden, actually). That said, as Mojeek’s index grows larger, it will be able to serve you better results.


One way to help improve Mojeek’s results is to provide feedback when you don’t get the result you’re looking for, using the “Submit feedback” button in the bottom right corner when searching.

Thanks @itsMe for this feedback, review and suggestions.

The ford.com website was 6th on the first page, with location set to the US, when we checked. This might have changed since you looked, or we are getting your location wrong. Location can be set in preferences and that has an effect on rankings. The reason it might have changed, and why the ford.com home page is missing, is because it’s on a server that we lost power to last week and it’s rebuilding. I’d expect the ford.com homepage to reappear within the next couple of days. Until then it’s hard to know if there are any other issues.

We are very well aware that we can sometimes struggle with navigational/named type queries. If you’re in the UK, the ford.co.uk homepage comes top. We might expect ford.com to jump back to the top when it’s recrawled; let’s see. We’ll have a new ranking algorithm available for testing soon, so we can also look at that.

On what we might call search neutrality, a term perhaps more representative of our approach than unbiased (@gnome), this is very well put:

it doesn’t matter what one believes, the point here is that, when searching for ‘holocaust revisionism’, we should be seeing results about revisionism, not the exact opposite, anti-revisionism, as is the case with Bing and Google

This is a specific case of a huge, more general problem that is becoming more and more prevalent in the Bing and Google search ecosystems.

The larger Mojeek’s index gets, the more comprehensive its link popularity map will get. It will take time, especially if many gov’t, edu and other authoritative sites block crawling by anyone but Google and Bing.

Unfortunately, no search engine can really tell you what the quality of a page is, only that it is popular. It can look for other indicators of quality (correct spelling, sentence complexity, punctuation, grammar, etc.) but those are just superficial. In its beta years Google seemed more relevant because it spidered much deeper than any other search engine. Just finding obscure references was a big deal. A bit later Google relied on ODP (and, I suspect, Yahoo Directory) as an important indicator of both relevance and quality. So in spite of all the hype about PageRank, Google at that time still relied on human review by ODP editors to tell them about quality.

ah, yes, i had the pref set to ‘None’ - i forgot about that


I’m very late to this thread, but I wanted to comment on this and signed up to do so:

“it doesn’t matter what one believes, the point here is that, when searching for ‘holocaust revisionism’, we should be seeing results about revisionism, not the exact opposite, anti-revisionism, as is the case with Bing and Google”

If someone searches for a phrase, they should just get web pages that contain the phrase, sorted as neutrally as possible.

If you search for “perpetual motion”, an unbiased search engine would return lots of examples of people saying that perpetual motion is nonsense, because that’s the context where the phrase most commonly appears. When we search for “perpetual motion”, we don’t expect a list of pages arguing that perpetual motion is possible because that’s not representative of the material being searched across.

The only way to achieve your suggestion would be if, when you searched for a theory by name, the search engine gave extra weight to web pages that argued for the theory, and discarded or marginalised those that argued against it. That would be a very biased search engine indeed!


Welcome @Jags and thanks for sharing your views. Much to agree with. Mojeek is not interested in who you are, and we don’t presume to know what you want or need. It is not for us to impose a viewpoint nor adjudicate on consensus. Truth emerges from a diversity of, and (ideally free) access to, information. Our job is to suggest to you places on the web that might provide such information, and to give you as much control as we can over that process.
