Umlaut domains get converted to punycode in address bar
-
Re: Domains with umlauts -> url is converted
When I enter internationalized domains in Vivaldi they get converted to Punycode.
Example:
Entering sächsische.de in the address bar:
The site gets rendered, but the address bar shows the Punycode form.
Shouldn't it be rendered in UTF-8?
→ VB-46503
Thanks!
-
@eiapopeia See this earlier thread for example.
-
@eiapopeia Yes, it should be rendered in UTF-8 or at least in ISO-8859-1, because it is not mixed mode with e.g. Cyrillic or Greek.
I have poked the devs and duplicated your bug to VB-5510 (yes, that old), so watch out for that bug.
-
The NIC.br accepts the following characters in the URL:
à, á, â, ã, é, ê, í, ó, ô, õ, ú, ü, ç
Hover the mouse over the URL above; the status bar shows something like http://sächsische.de
-
https://www.denic.de/wissen/idn-domains/idn-zeichenliste/
DENIC, which is the correct registrar for this domain, accepts 93 characters apart from the so-called "common" set.
-
Gwen, so VB-5510 depends on us, contrary to what I thought.
-
@quhno thx for the link.
-
@eiapopeia Punycode maps directly to Unicode (character) code points.
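To make the mapping concrete, here is a minimal sketch using Python's built-in `idna` codec. The helper names `to_ascii` and `to_unicode` are made up for illustration, and the codec implements the older IDNA 2003 rules, so it is not necessarily what a browser does internally.

```python
def to_ascii(host: str) -> str:
    """Convert a Unicode host name to its ASCII (Punycode) form."""
    return host.encode("idna").decode("ascii")


def to_unicode(host: str) -> str:
    """Convert an ASCII (Punycode) host name back to Unicode."""
    return host.encode("ascii").decode("idna")


if __name__ == "__main__":
    print(to_ascii("sächsische.de"))           # xn--schsische-v2a.de
    print(to_unicode("xn--schsische-v2a.de"))  # sächsische.de
```

The round trip loses nothing, so which form the address bar shows is purely a display decision.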
No need for a different (intermediate) representation (unless some internal UI implementation detail is unable to cope with the correct one).
The whole extended character set was/is generally a bad and security-degrading idea.
In addition to the impossibility for end users to distinguish conflicting visual representations or detect some other character combination tricks, the font used for display may just not have a required glyph.
-
@becm
https://www.xn--schsische-v2a.de/
does not show as "sächsische" even though it maps, is allowed, is perfectly valid, is no mix with Greek or Latin script, so no danger, and even though @eiapopeia has, with a probability of 99.99999%, a font installed that contains all German umlauts.
The "conflicting" problem was solved some time ago; there have been established and tested checks for a while, which is why all browsers now show it as an umlaut. Only Vivaldi doesn't.
Having said that:
Vivaldi fails because the devs still need to write the API that lets them access the Chromium code needed to solve the problem.
-
@quhno I was not referring to this exact case, but to what are (mitigated but existing) problems with the general idea.
The coding/codepoint/render pipeline is apparently a non-trivial problem to solve as well (also in the status bar).
The special characters were not limited to the actually reasonable 4 by DENIC?! If even we Germans allow this kind of proliferation, one must fear how this is handled in the rest of the world.
And even then: try explaining umlauts to the average American ("there are TWO dots").
-
@becm Well, but the exact case was a good example, and there really is no problem as long as certain script combinations get puny-shed, which the underlying Chromium code can do.
Yes, I know that vivaldi(dot)com is not the same as vіvaldі(dot)com, but this would be shown as punycode because it is a mix between 2 systems which are not allowed together.
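A rough sketch of that kind of mixed-script check, for illustration only. It guesses each letter's script from the first word of its Unicode character name, which is a crude heuristic and an assumption of this example; it is not the actual Chromium policy, and the function names are made up.

```python
import unicodedata


def scripts_in_label(label: str) -> set[str]:
    """Collect a rough script name for every letter in the label."""
    scripts = set()
    for ch in label:
        if ch.isalpha():
            # "LATIN SMALL LETTER A" -> "LATIN", "CYRILLIC SMALL LETTER ..." -> "CYRILLIC"
            scripts.add(unicodedata.name(ch, "UNKNOWN").split(" ", 1)[0])
    return scripts


def show_as_punycode(label: str) -> bool:
    """Fall back to Punycode when letters from more than one script are mixed."""
    return len(scripts_in_label(label)) > 1


if __name__ == "__main__":
    print(show_as_punycode("vivaldi"))              # pure Latin
    print(show_as_punycode("v\u0456vald\u0456"))    # Latin mixed with Cyrillic U+0456
    print(show_as_punycode("sächsische"))           # Latin with an umlaut
```

With this heuristic the umlaut domain stays readable, while the look-alike with the Cyrillic letter is flagged.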
-
Googled and found this old thread. Hope it's ok to 'revive' it.
Chrome and Firefox show internationalized domains in Unicode. Edge (the non-Chromium version; I haven't tested the Chromium one) uses Punycode. So I guess Microsoft agrees with Vivaldi. But unsophisticated users who see Punycode might draw the opposite conclusion and think it's dangerous: you type in one URL and end up at another.
Good thing all Vivaldi users are pros and understand what punycode is.
-
Old Opera showed punycode only if the top-level domain wasn't listed as "safe" (didn't have measures in place to prevent people from registering look-alike domains) - ideally Vivaldi should be doing the same. The example from the first post looks fine here.
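A minimal sketch of such a per-TLD policy, assuming a hand-made allowlist. `SAFE_TLDS` and `display_host` are hypothetical names, and the set below is a placeholder, not the list old Opera actually shipped.

```python
# Placeholder allowlist for illustration only; not old Opera's real list.
SAFE_TLDS = {"de", "br"}


def display_host(host: str, safe_tlds: set[str] = SAFE_TLDS) -> str:
    """Return the form of the host name that the address bar would show."""
    ascii_host = host.encode("idna").decode("ascii")  # normalize to Punycode first
    tld = ascii_host.rsplit(".", 1)[-1].lower()
    if tld in safe_tlds:
        # Registry is trusted to block look-alike registrations: show Unicode.
        return ascii_host.encode("ascii").decode("idna")
    return ascii_host  # otherwise keep the Punycode form


if __name__ == "__main__":
    print(display_host("sächsische.de"))   # sächsische.de
    print(display_host("sächsische.com"))  # xn--schsische-v2a.com
```

The decision hangs on the registry's policy rather than on the characters in the name, which is why the example from the first post would pass.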
-
@Gwen-Dragon said in Umlaut domains get converted to punycode in address bar:
This is really a hell with IDN domains. No progress to get this fixed.
Vivaldi has to give us an option in browser prefs for this, and a FIX like in Opera. IMO.
I read this thread for a single German umlaut problem. What about us?
So all of my language [АБВГДЕЖЗИЙЍКЛМНОПРСТУФХЦЧШЩЪЬѢЮѪЯабвгдежзийѝклмнопрстуфхцчшщъьҍюѫя] gets scrambled for the sake of a single average A?
-
@Gwen-Dragon said in Umlaut domains get converted to punycode in address bar:
to see what a special а in your charset would cause
Just did.
Result: ERR_NAME_NOT_RESOLVED URL: http ://xn--ppl-5cd2a.com
Is that really the case? I have to second my 1st reply. Users first.
-
A recent Krebs article showed this Unicode attack; there's a decent screenshot of a control panel.
https://krebsonsecurity.com/2022/11/disneyland-malware-team-its-a-puny-world-after-all/
Aside from the security risk, the ASCII requirement is embedded in the protocols and Vivaldi is handlin' it on that level.
DNS is restricted in practice to the use of ASCII characters, a practical limitation that initially set the standard for acceptable domain names. The internationalization of domain names is a technical solution to translate names written in language-native scripts into an ASCII text representation that is compatible with the DNS. Internationalized domain names can only be used with applications that are specifically designed for such use