uli I just sat down to spend five minutes on implementing dead keys, but I quickly realized that that may not be entirely trivial after all...
He, he, I thought that may be the case... 😅
uli Are the combined characters that are formed for a specific dead key (e.g. acute, grave, circumflex) the same for every keyboard layout in the entire universe, or are there exceptions? (I have not looked into that, but I would bet on the latter...)
If there are exceptions we need to encode all the combined characters in the key map. Any suggestions on how to go about that?
The combination is a bit complex, to be fair. I think it's better to use an example:
There are three elements in using diacritics: The diacritic alone, the glyph alone and the glyph with diacritic. An example of how you obtain the combined character:
Let's take "A". Its Unicode code point is \u0041. Now let's take the grave accent "`", \u0060. The way to get the A-grave character ("À", \u00C0) is to type the dead key "grave" (0x2f) and then "A" (0x04). On Spanish systems, and I guess it's similar for other layouts that use dead keys, you normally get a visual indication that you've pressed a dead key:
That is the grave character alone (\u0060) with an indication of some sort (in this case, a blue underline). What happens next depends on the second key. If you press a key that can be combined, you get the combination, for example "À" if you type "A" after the dead key. If you press space, you get the grave character alone, "`". And if you press a key that can't be combined, you get both characters, for example "`T" if you press "T" after the grave dead key.
Now, some fonts already include the combined (precomposed) characters. For example, the Unicode A-grave is \u00C0. So, if you press the grave dead key and then "A", you should always get \u00C0 (À), and it's the same for the rest of the combined characters.
But here it gets complex: there's also a combining grave accent, \u0300. It works a bit like old typewriters, for when there is no precomposed character: the combining character has no width of its own and is superimposed on the character before it, so you emit the regular "A" followed by the combining grave, and the accent is drawn on top of the "A".
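To make the difference between the two encodings concrete, here is a small sketch using Python's standard `unicodedata` module (Python is just my choice for illustration, not what the project uses). It shows that "A" followed by \u0300 normalizes to the single precomposed \u00C0, while a letter without a precomposed form stays as two code points:

```python
import unicodedata

combining_grave = "\u0300"  # COMBINING GRAVE ACCENT, follows its base letter

# "A" + combining grave is two code points that render as one glyph.
decomposed = "A" + combining_grave

# NFC normalization folds the pair into the precomposed character.
precomposed = unicodedata.normalize("NFC", decomposed)

# "T" has no precomposed grave form, so NFC leaves both code points.
no_precomposed = unicodedata.normalize("NFC", "T" + combining_grave)

print(len(decomposed), len(precomposed), len(no_precomposed))
print(precomposed == "\u00C0")
```

So if only precomposed characters are emitted, the combining marks never need to reach the font at all.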
But I do include the already combined characters in all the fonts I make, so I think that can be ignored, if you want. It's an old mechanism inherited from typewriters.
So the process should be: get the dead key, print the diacritic with some indication (underline, inversion, etc.) and then get the second key (it could be a lowercase letter, an uppercase letter typed with Shift, space or any other key). Look up in a table whether there's a corresponding character for that combination (say \u00C0 if "A" was pressed after a grave dead key) and, if so, replace the accent with the combined character. If there's no combined character, just print the second character following the accent (or just the accent if space was pressed). In every case, remove the "deadkey-pressed" indicator.
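The process above can be sketched as a tiny state machine. This is only an illustration in Python under some assumptions: the names `DeadKeyState`, `feed` and `COMPOSE` are made up, the table holds just a few sample entries, and a second dead key pressed while one is pending is treated like any other non-combining key (the thread doesn't say what should happen there).

```python
# Sample entries only; a real table would cover every supported combination.
COMPOSE = {
    ("`", "A"): "À",
    ("`", "a"): "à",
    ("´", "e"): "é",
    ("^", "i"): "î",
}


class DeadKeyState:
    """Tracks a pending dead key and resolves it on the next key press."""

    def __init__(self, compose=COMPOSE):
        self.compose = compose
        self.pending = None  # diacritic waiting for its second key, if any

    def feed(self, char, is_dead=False):
        """Return the text to commit for this key press ("" if nothing yet)."""
        if self.pending is None:
            if is_dead:
                self.pending = char  # caller shows the accent with an indicator
                return ""
            return char
        accent, self.pending = self.pending, None  # indicator is removed
        if char == " ":
            return accent  # space yields the bare accent
        # Combined character if one exists, otherwise accent plus the key.
        return self.compose.get((accent, char), accent + char)


state = DeadKeyState()
typed = [state.feed("`", is_dead=True), state.feed("A"),
         state.feed("`", is_dead=True), state.feed("T")]
print("".join(typed))
```

Keeping the table as a plain `(accent, key)` lookup means the state machine itself stays layout-independent; only the table and the set of dead keys would change per layout.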
For every diacritic, there's only one possible outcome if a combination exists. So for "`", "´", "¨", "^", "~" and "˚", if you press "A" you'll always get À, Á, Ä, Â, Ã and Å respectively, and the same goes for E, I, O, U in Spanish, including lowercase. (Funnily, I had to copy "˚" from a web page, since my keyboard is unable to produce that dead key, which is used in Nordic layouts.)
In Spanish there's also a funny one: Ñ. We have a dedicated key for it (0x33), but it can also be obtained by pressing the "~" dead key (AltGr + Ñ) and then the "N" key. So much for redundancy! 😆
For the Latin alphabet, I always create combined characters for the letters A, E, I, O, U, W, Y and N (N tilde, or Ñ)
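If a starting point for the table is useful, it could be generated rather than typed by hand. The sketch below is an assumption on my side, not how the fonts or key maps are actually built: it maps each spacing diacritic to its Unicode combining mark and asks Python's `unicodedata` for the precomposed form, keeping only pairs where one exists. The names `COMBINING`, `LETTERS` and `build_compose_table` are made up.

```python
import unicodedata

# Spacing diacritic (what the dead key shows) -> Unicode combining mark.
COMBINING = {
    "`": "\u0300",  # grave
    "´": "\u0301",  # acute
    "^": "\u0302",  # circumflex
    "~": "\u0303",  # tilde
    "¨": "\u0308",  # diaeresis
    "˚": "\u030A",  # ring above
}

LETTERS = "AEIOUWYN"  # the letters listed above; lowercase is added below


def build_compose_table():
    """Map (diacritic, letter) to the precomposed character, where one exists."""
    table = {}
    for accent, mark in COMBINING.items():
        for letter in LETTERS + LETTERS.lower():
            composed = unicodedata.normalize("NFC", letter + mark)
            if len(composed) == 1:  # NFC found a single precomposed code point
                table[(accent, letter)] = composed
    return table


TABLE = build_compose_table()
print(TABLE[("`", "A")], TABLE[("~", "N")], TABLE[("´", "e")])
```

The generated table contains every combination Unicode defines for those letters, which may be more than a given font or layout supports, so it would still need trimming (or per-layout exceptions) by hand.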
uli How do those keys in the layout above work that are dead keys but also have AltGr bindings (e.g. [)? I assume that those are semi-dead keys that are only dead for certain modifier states, right?
Yes, "`" (0x2f) and "´" (0x34) are dead keys when pressed alone or with Shift (where they produce "^" and "¨" respectively), but they work as normal keys when used with AltGr, where you obtain "[" and "{" respectively.
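One way to represent such semi-dead keys is to store the dead flag per (key code, modifier state) binding instead of per key. A minimal Python sketch, using only the three keys mentioned in this thread; the names `Binding`, `KEYMAP` and `lookup` and the modifier labels are assumptions for illustration:

```python
from typing import NamedTuple, Optional


class Binding(NamedTuple):
    char: str
    dead: bool = False  # True if this binding starts a dead-key sequence


# (key code, modifier state) -> binding, for the Spanish keys discussed here.
KEYMAP = {
    (0x2F, "none"): Binding("`", dead=True),
    (0x2F, "shift"): Binding("^", dead=True),
    (0x2F, "altgr"): Binding("["),
    (0x34, "none"): Binding("´", dead=True),
    (0x34, "shift"): Binding("¨", dead=True),
    (0x34, "altgr"): Binding("{"),
    (0x33, "none"): Binding("ñ"),
    (0x33, "shift"): Binding("Ñ"),
    (0x33, "altgr"): Binding("~", dead=True),
}


def lookup(code: int, modifier: str = "none") -> Optional[Binding]:
    """Return the binding for a key code in a modifier state, or None."""
    return KEYMAP.get((code, modifier))


print(lookup(0x2F), lookup(0x2F, "altgr"), lookup(0x33, "altgr"))
```

With this shape, the same physical key can be dead in one modifier state and a plain character in another without any special casing.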
These are the three states of my keyboard:
Normal, with the two dead keys:
Shift (same dead keys, but different diacritics):
And AltGr (the only diacritic in the Ñ key):
That's why I put the diacritics in the Spanish conf file with a backslash (\`, for example). The idea is that if you find that, you understand it's a dead key, and you know what diacritic to combine (the one right after the backslash) to get the appropriate result, but I don't know if it's the correct idea. In any case, you need a table of combined characters to print for the dead-key combinations. If you need it, I can make you a list of the combined characters and what each combination is (like `+A: À, ´+e: é, ^+i: î, etc).
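For what it's worth, reading that backslash notation could look roughly like this. It is a guess at the convention, not the actual conf file parser: the function name `parse_binding` is made up, and I'm assuming a doubled backslash would stand for a literal backslash, which the thread doesn't specify.

```python
def parse_binding(entry: str) -> tuple[str, bool]:
    """Split a conf-file entry into (character, is_dead_key).

    Assumed convention: a backslash followed by one character marks a dead
    key whose diacritic is that character; anything else is a normal key.
    """
    if len(entry) == 2 and entry[0] == "\\":
        if entry[1] == "\\":
            return ("\\", False)  # assumed escape for a literal backslash
        return (entry[1], True)
    return (entry, False)


# "\\`" in Python source is the two-character conf entry: backslash, grave.
print(parse_binding("\\`"), parse_binding("["))
```

The ambiguity with a literal backslash key is the main thing such a convention would need to settle.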