HTTP 400 Bad Request

Sometimes when I select a string and search using my browser’s right-click menu, the search string includes a percent-encoded newline: %0A. This consistently generates a 400 from Mojeek. For example:

https://www.mojeek.com/search?q=Lake+Pend+Oreille%0A

Most other search engines can handle this character. Can Mojeek strip this out and perform a search with the remaining characters?

hey @mike, can I check that this is with Vivaldi, or is it another browser/multiple? I can then raise an issue for it :pray:

Looks like it only happens with Vivaldi.

[Screen recording: Peek 2023-09-12 11-24]

https://www.mojeek.com/search?q=%s&t=34&cdate=1&ib=0&dlen=0&qss=Bing%2CDuckDuckGo%2CGoogle&rp_i=0&qsba=1

I can replicate this on Vivaldi.

When I triple-click to select multiple words (like a span or paragraph) and then right-click and search with Mojeek, %0A%0A gets added to the end of the search string. This results in the 400 error. This is exactly what we can see in @mike’s video. When double-clicking (to select a single word), there’s no problem.

When I click and drag to select and search using the context menu, the extra characters don’t get added and the search works fine.

Cheers all, raised :pray:

I was triple-clicking on menu headings today and then searching using Vivaldi’s right-click context menu. That was consistently giving me HTTP 400’s from Mojeek. I didn’t have a problem sending the same strings to DuckDuckGo.

https://www.mojeek.com/search?q=Lo+Mein%0A%0A

https://www.mojeek.com/search?q=Chop+Suey%0A%0A

(from https://www.chinakingharwoodheights.com/menu)

Thanks for this @mike, I’ve added that in. This is something we’re keeping in mind.

Hi all,
I was about to open a new thread when I saw this one.
I just found a (probable) bug with Unicode characters and the related sanitisation process.
First, an example. When inputting a string containing a tab, like “neil armstrong” (with a tab between the two words), the URL gets translated to “mojeek .com/search?q=neil%09armstrong”, and Mojeek returns a “400 Bad Request” error. This happens both when searching from the website and when searching from the URL bar (on Firefox). The error appears with the first 32 ASCII characters, from %00 (the NUL byte) up to %1F, plus the DEL character (%7F), all of which are “non-printable” control characters.

Now, I think the explanation is simple and straightforward: the character simply gets percent-encoded from its code point; %09 is the encoding of the horizontal tab (U+0009). So I tried including an escape character (U+001B), and indeed the string appears in the URL as “mojeek .com/search?q=neil%1Barmstrong” (and it still returns a 400 error). By contrast, when searching “mojeek .com/search?q=neil%26armstrong”, Mojeek correctly searches for “neil&armstrong” instead of returning a 400 error.
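For anyone who wants to check which characters in a given search URL fall in the failing range described above, here is a minimal Python sketch. The function name `control_chars_in_query` is made up for illustration, and the `q` parameter name is simply taken from the URLs in this thread. It only decodes the URL locally and says nothing about how Mojeek handles the request on its side.

```python
from urllib.parse import urlsplit, parse_qs


def control_chars_in_query(url, param="q"):
    """List control characters (U+0000-U+001F, U+007F) in a query parameter.

    Hypothetical helper: it percent-decodes the parameter locally and
    reports each control character as a "U+XXXX" string.
    """
    query = urlsplit(url).query
    # parse_qs splits on "&" first and then percent-decodes each value,
    # so an encoded ampersand (%26) stays inside the value.
    values = parse_qs(query, keep_blank_values=True).get(param, [])
    found = []
    for value in values:
        for ch in value:
            if ord(ch) < 0x20 or ord(ch) == 0x7F:
                found.append(f"U+{ord(ch):04X}")
    return found


if __name__ == "__main__":
    for url in (
        "https://www.mojeek.com/search?q=neil%09armstrong",
        "https://www.mojeek.com/search?q=Lo+Mein%0A%0A",
        "https://www.mojeek.com/search?q=neil%26armstrong",
    ):
        print(url, control_chars_in_query(url))
```

The first two URLs report a tab and two line feeds respectively, while the `%26` URL reports nothing, which matches the pattern of which searches return a 400.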

However, manually creating the URL like “mojeek .com/search?q=neil armstrong” —or with any other Unicode character for that matter, like “mojeek .com/search?q=neil🚀armstrong”— and pasting it in the URL bar and hitting enter does work, as Mojeek is correctly sanitising the input and interpreting it. Only some non-printable Unicode characters seem to not work; searching “neil🚀armstrong” directly from the Mojeek search bar, for example, does work.

So I don’t know what goes wrong where, but clearly the URL is not properly interpreted. I tried with other browsers, and still get the same error. Other search engines have no problem: I can search any string with any Unicode character, and it gets properly sanitised and interpreted (e.g. “google .com/search?q=neil%09armstrong” and “search .brave .com/search?q=neil%09armstrong” do correctly return results).

(I had to insert spaces to break URLs because I’m a new user)

We’ve logged this for investigation, to check whether or not it occurs frequently enough to reconsider the settings. Is the tab in place of a space coming from a particular method/setup, or from pasting the query in from a document?

Hi,
The tabs come from copying something from other webpages or documents. Often they come from copying Excel data, since cells are separated by tabs. It’s very annoying when I copy multiple cells and then have to manually insert spaces between all the text they contain because of this bug.
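Until this is handled on Mojeek’s side, a possible client-side workaround is to clean the copied text before building the search URL. The sketch below is only that, a workaround under my own assumptions: `clean_query` and `search_url` are hypothetical names, and replacing control characters with spaces is one reasonable choice, not a description of what Mojeek or any other search engine does.

```python
import re
from urllib.parse import urlencode

# Control characters reported to trigger the 400: U+0000-U+001F and U+007F.
_CONTROL = re.compile(r"[\x00-\x1f\x7f]")


def clean_query(text):
    """Replace control characters with spaces and collapse runs of whitespace."""
    text = _CONTROL.sub(" ", text)
    return " ".join(text.split())


def search_url(text, base="https://www.mojeek.com/search"):
    """Build a search URL from pasted text (hypothetical helper)."""
    return base + "?" + urlencode({"q": clean_query(text)})


if __name__ == "__main__":
    # Cells copied from a spreadsheet arrive separated by tabs.
    print(search_url("neil\tarmstrong"))
    # Triple-click selections can carry trailing newlines.
    print(search_url("Lo Mein\n\n"))
```

Replacing rather than deleting the control characters keeps “neil” and “armstrong” as separate words, which is what you want when the tab was standing in for a cell boundary.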

Thanks for clarifying that, it makes sense. I’ll add it into the log above :pray: