submitted 1 week ago* (last edited 1 week ago) by zkfcfbzr@lemmy.world to c/python@programming.dev

I have a rather large Python script that I use as basically a replacement for autohotkey. It uses pynput for keyboard and mouse control - and at least on Windows, it works exactly how I expect.

I recently started dual-booting with Linux and have been trying to get the script to work here as well. It does work but with mixed results - in particular, I found that pynput has bizarrely wrong output for special characters, in a way that's both consistent and inconsistent.

The simplest possible case I found that reproduces the error is this script:

import time
from pynput import keyboard

# Sleep statement is just to give time to move the mouse cursor to a text input field
time.sleep(2)

my_kb = keyboard.Controller()

text = '🍆' # Eggplant emoji
my_kb.type(text)

time.sleep(1)

text = '𝕥𝕖𝕤𝕥' # blackboard bold test
my_kb.type(text)

time.sleep(1)

text = '𝐭𝐞𝐬𝐭' # bold test
my_kb.type(text)

When I run that script right now, it produces the output "🍆𝕥𝕥𝕤𝕥𝐭𝐭𝐬𝐭". And if I run it again, it'll produce the same output. And if I change the eggplant emoji to something else, like the regular character 'A', it will still produce the same output (specifically "A𝕥𝕥𝕤𝕥𝐭𝐭𝐬𝐭"). But... If I log out and log back in, then the output changes to something else that's still wrong, but differently. For example, when I changed the eggplant to a regular 'A', then relogged, the output became "A𝕥𝕖𝕖𝕥𝐭𝐞𝐞𝐭". And then that wrong output will keep being the same wrong output until I log out and back in again. If the test strings don't change, then the incorrect outputs don't change on relog - but if they do, then they do.

In the larger script, errors seemed to chain together somehow - like if I produced an eggplant emoji, then tried to write blackboard bold test, I would get "🍆𝕖𝕤🍆". This is despite verifying just before running the pynput.keyboard.Controller.type function that what it was about to type was correct. The issue also happens if I type it character-by-character with press and release functions.

I am very new to Linux. I'm on Linux Mint. I'm running this in a python3 venv that just has pynput and two other external libraries installed. ChatGPT thinks the issue might be related to X11. The issue does not occur at all on Windows, using the exact same code. On Linux there seems to be no issues with typing regular text, just special characters.

top 14 comments
[-] logging_strict@programming.dev 1 points 23 hours ago

So this is for 1-byte characters that have both upper and lower case forms:

def keysym_group(ks1, ks2):
    """Generates a group from two *keysyms*.

    The implementation of this function comes from:

        Within each group, if the second element of the group is ``NoSymbol``,
        then the group should be treated as if the second element were the same
        as the first element, except when the first element is an alphabetic
        *KeySym* ``K`` for which both lowercase and uppercase forms are
        defined.

        In that case, the group should be treated as if the first element were
        the lowercase form of ``K`` and the second element were the uppercase
        form of ``K``.

    This function assumes that *alphabetic* means *latin*; this assumption
    appears to be consistent with observations of the return values from
    ``XGetKeyboardMapping``.

    :param ks1: The first *keysym*.

    :param ks2: The second *keysym*.

    :return: a tuple conforming to the description above
    """
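A minimal sketch of the grouping rule quoted in that docstring, written from the docstring alone rather than copied from pynput. It assumes NoSymbol is represented as 0 and, like the docstring, treats "alphabetic" as ASCII Latin letters only. The name group_keysyms is hypothetical.

```python
NO_SYMBOL = 0  # assumption: NoSymbol is represented as 0


def group_keysyms(ks1, ks2):
    """Return a two-element group for two keysyms.

    Sketch only: handles ASCII letters, whose keysyms equal their
    code points, and leaves every other keysym unchanged.
    """
    if ks2 != NO_SYMBOL:
        # Both elements are given, so the group is used as-is.
        return (ks1, ks2)
    if 0 <= ks1 < 0x80:
        char = chr(ks1)
        if char.isalpha():
            # A letter with both case forms: (lowercase, uppercase).
            return (ord(char.lower()), ord(char.upper()))
    # No second keysym and no case forms: repeat the first element.
    return (ks1, ks1)


print(group_keysyms(ord("a"), NO_SYMBOL))
print(group_keysyms(0x7C, NO_SYMBOL))
```

Under these assumptions a lone letter expands to its lowercase and uppercase keysyms, while a lone symbol such as the bar is simply repeated.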
[-] logging_strict@programming.dev 1 points 23 hours ago

Solves the mystery of the repeating entries

1-, 2- and 3-byte Unicode characters are converted to the corresponding keysym

and mapped into that tuple. Seems the author likes the number 4.

def keysym_normalize(keysym):
    """Normalises a list of *keysyms*.

    The implementation of this function comes from:

        If the list (ignoring trailing ``NoSymbol`` entries) is a single
        *KeySym* ``K``, then the list is treated as if it were the list
        ``K NoSymbol K NoSymbol``.

        If the list (ignoring trailing ``NoSymbol`` entries) is a pair of
        *KeySyms* ``K1 K2``, then the list is treated as if it were the list
        ``K1 K2 K1 K2``.

        If the list (ignoring trailing ``NoSymbol`` entries) is a triple of
        *KeySyms* ``K1 K2 K3``, then the list is treated as if it were the list
        ``K1 K2 K3 NoSymbol``.

    This function will also group the *keysyms* using :func:`keysym_group`.

    :param keysyms: A list of keysyms.

    :return: the tuple ``(group_1, group_2)`` or ``None``
    """
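A minimal sketch of the list expansion that docstring describes, written from the quoted text rather than from pynput's implementation. It assumes NoSymbol is 0 and skips the grouping step. The name normalize_keysyms is hypothetical, and truncating longer lists to four entries is an assumption, since the quoted rules only cover one to three keysyms.

```python
NO_SYMBOL = 0  # assumption: NoSymbol is represented as 0


def normalize_keysyms(keysyms):
    """Expand a keysym list to four entries, or return None if empty."""
    stripped = list(keysyms)
    # Ignore trailing NoSymbol entries.
    while stripped and stripped[-1] == NO_SYMBOL:
        stripped.pop()
    if not stripped:
        return None
    if len(stripped) == 1:
        k = stripped[0]
        return [k, NO_SYMBOL, k, NO_SYMBOL]   # K NoSymbol K NoSymbol
    if len(stripped) == 2:
        return stripped * 2                   # K1 K2 K1 K2
    if len(stripped) == 3:
        return stripped + [NO_SYMBOL]         # K1 K2 K3 NoSymbol
    # Assumption: keep only the first four entries of longer lists.
    return stripped[:4]


print(normalize_keysyms([16897386, 0, 0, 0]))
```

For a single keysym this yields K NoSymbol K NoSymbol, the same shape as the [keysym, 0, keysym, 0, ...] rows reported elsewhere in this thread. Whether this rule is what actually produces those rows is not established here.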
[-] zkfcfbzr@lemmy.world 1 points 10 hours ago* (last edited 9 hours ago)

disregard this comment

[-] zkfcfbzr@lemmy.world 1 points 10 hours ago* (last edited 8 hours ago)

Do you follow the reasoning for why they set it up this way? The comments in this function from _xorg in keyboard make it seem like it expects K1 K2 K3 K4.

        #: Finds a keycode and index by looking at already used keycodes
        def reuse():
            for _, (keycode, _, _) in self._borrows.items():
                keycodes = mapping[kc2i(keycode)]

                # Only the first four items are addressable by X
                for index in range(4):
                    if not keycodes[index]:
                        return keycode, index

I assume that second comment is the reason the person who wrote your function likes the number 4.

Which way is right/wrong here? It would seem at least part of the issue to me is that they don't make the list be K1 K2 K1 K2 as they say, since the function I quoted above often receives a list formatted like K1 K2 K1 NoSymbol.

Also, if I modify the function you quoted from to remove the duplications, I'm still finding that the first element is always duplicated to the third element anyways - it must be happening elsewhere as well. Actually, even if I modify the entire function to just be something nonsensical and predictable like this:

def keysym_normalize(keysym):
    # Remove trailing NoSymbol
    stripped = list(reversed(list(
        itertools.dropwhile(
            lambda n: n == Xlib.XK.NoSymbol,
            reversed(keysym)))))
    if not stripped:
        return
    else:
        return (keysym_group(stripped[0], stripped[0]), keysym_group(stripped[0], stripped[0]))

then the behavior of the output doesn't change at all from how it behaves when this function is how it normally is... It still messes up every third special character, duplicating the previously encountered special character

Later edit: After further investigation, the duplication of the first entry to the third entry seems to happen in the Xlib library, installed with pynput, in the display.py file, in the change_keyboard_mapping function, which only has a single line. Inspecting the output of the get_keyboard_mapping() function both before and after the change_keyboard_mapping function does its thing shows that it jumps right from [0, 0, 0, 0, 0, 0, 0, 0, 0, 0] to [keysym, 0, keysym, 0, 0, 0, 0, 0, 0, 0]. It's still unclear to me if this is truly intended or a bug.
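A small sketch of that before/after comparison, in case anyone wants to repeat it. The helper names are hypothetical. snapshot_row assumes a display object offering get_keyboard_mapping(first_keycode, count), as python-xlib's Display does, and the example rows are only shaped like the ones reported in this thread.

```python
def snapshot_row(display, keycode):
    """Return the keysym row for one keycode as a plain list.

    `display` is expected to offer get_keyboard_mapping(first_keycode,
    count), as python-xlib's Display does.
    """
    return list(display.get_keyboard_mapping(keycode, 1)[0])


def changed_indexes(before, after):
    """Return the indexes whose keysym differs between two snapshots."""
    return [i for i, (old, new) in enumerate(zip(before, after)) if old != new]


# Rows shaped like the ones reported in this thread.
before = [0] * 10
after = [16897386, 0, 16897386, 0, 0, 0, 0, 0, 0, 0]
print(changed_indexes(before, after))  # → [0, 2]
```

Taking one snapshot right before change_keyboard_mapping and one right after, then diffing them, makes it easy to see that a single write touched two indexes.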

[-] logging_strict@programming.dev 2 points 4 hours ago

The way forward is to make a unittest module (unfortunately the author is not using pytest), with characters taken as examples from each of the four forms.

THEN go to town testing each of the low level functions.

Suspect the test coverage is awful. mypy and flake8 also awful.
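A possible starting point for such a module, as a sketch only. To stay self-contained it tests a copy of the char_to_keysym function quoted in this thread instead of importing pynput, and the expected values are the ones quoted in the thread. The class name and the choice of sample characters are assumptions.

```python
import unittest


def char_to_keysym(char):
    """Copy of the function quoted in this thread, for a self-contained test."""
    ordinal = ord(char)
    if ordinal < 0x100:
        return ordinal
    return ordinal | 0x01000000


class CharToKeysymTest(unittest.TestCase):
    # (character, expected keysym)
    CASES = [
        ("1", 0x31),
        ("|", 0x7C),
        ("\u2622", 0x1002622),      # radioactive sign
        ("\U0001D565", 16897381),   # blackboard bold t
        ("\U0001D556", 16897366),   # blackboard bold e
        ("\U0001F346", 16905030),   # eggplant emoji
    ]

    def test_known_keysyms(self):
        for char, expected in self.CASES:
            with self.subTest(char=char):
                self.assertEqual(char_to_keysym(char), expected)

    def test_keysyms_are_distinct(self):
        keysyms = [char_to_keysym(char) for char, _ in self.CASES]
        self.assertEqual(len(set(keysyms)), len(keysyms))
```

Saved to a file, this can be run with python -m unittest. If both tests pass, the character-to-keysym step is probably not where the duplicated characters come from.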

[-] logging_strict@programming.dev 1 points 4 hours ago

There are several forms

K1 NoSymbol K2 NoSymbol: characters with lower/upper case forms

K1 K2 K1 K2: unicode <= 256 with no lower/upper case forms, like the | or + symbol

K1 K2 K3 NoSymbol: 2 bytes, latin extended character set

K1 K2 K3 K4: 3 bytes, like the nuke radiation emoji

Non-authoritative guess. Having played around with xev together with onboard virtual keyboard with my symbols layout.

[-] logging_strict@programming.dev 1 points 4 hours ago

Keysym indexes 0 and 2 are for lower and upper case, if the character has upper and lower case equivalents.

This is documented in keysym_group when it should be documented in keysym_normalize:

In that case, the group should be treated as if the first element were
the lowercase form of ``K`` and the second element were the uppercase
form of ``K``.
[-] logging_strict@programming.dev 1 points 23 hours ago* (last edited 23 hours ago)

eggplant math cuz it isn't in SYMBOLS

>>> s = '''🍆'''
>>> dec_s = ord(s)
>>> dec_s
127814
>>> hex(dec_s)
'0x1f346'

From the source code

def char_to_keysym(char):
    """Converts a unicode character to a *keysym*.

    :param str char: The unicode character.

    :return: the corresponding *keysym*, or ``0`` if it cannot be found
    """
    ordinal = ord(char)
    if ordinal < 0x100:
        return ordinal
    else:
        return ordinal | 0x01000000

Note that 0x100 and 0x01000000 are just int literals written in hex, so this compares an int to an int:

>>> int(0x100)
256
>>> int(0x01000000)
16777216

eggplant emoji keysym
>>> 127814 | 16777216
16905030
[-] logging_strict@programming.dev 4 points 6 days ago

v1.8.1 was released on 2025-03-17

So pynput is being maintained. BUT there are currently 155 open issues and 23 PRs. This means nothing other than that this is a popular package.

v1.8.1 Changes

  • Remove incorrectly merged line for the Xorg backend. Thanks to sphh!
  • Let events know about the new injected parameter. Thanks to phpjunkie420!

The changelog mentions that a PR dealt with an Xorg backend issue. Don't know if this addresses your particular issue.

What pynput version do you have in your venv?

Run this command please to get the package version

python -m pip list | grep pynput
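The same check can be done from inside the venv's interpreter with the standard library. This is a sketch, and installed_version is a hypothetical helper name.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


print(installed_version("pynput"))
```

This prints None when pynput is not installed in the active environment, which also helps confirm the script is running in the venv you expect.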

Can understand your issue report would be buried in all the other issues. So help there might not come within a timely manner.

The next step would be to create a pytest file to be able to repeat the tests, rather than run them each time manually. Obviously logging in and out is not possible. But it's probably also completely unnecessary.

I use Xorg and Linux. Competent enough with pytest (author of pytest-logging-strict).

Before I roll up my sleeves, I want you to confirm v1.8.1 still has this issue.

[-] zkfcfbzr@lemmy.world 2 points 3 days ago

Hey! Sorry for the very late reply. I've been checking the thread regularly and I swear just a few hours ago (when I made the cross-posts) it was still at 0 replies. I'm gonna blame federation issues.

The command you provided does indicate I'm on pynput 1.8.1, so I can confirm v1.8.1 has the issue.

[-] logging_strict@programming.dev 2 points 2 days ago

Took two days to think about your original post. Was thinking, hmmm this package and trouble you are having are both fresh and interesting.

Remote controlling both the mouse and keyboard seems worthy to spend time trying it out.

[-] zkfcfbzr@lemmy.world 2 points 1 day ago* (last edited 1 day ago)

Gonna make some notes since I made some progress tonight (so far).

Within pynput's keyboard's _xorg.py file, in the Controller class, self._keyboard_mapping maps from each key's unique keysym value, which is an integer, to a 2-tuple of integers. The actual keysym for each key in the mapping appears to be correct, but occasionally the 2-tuple duplicates that of another entry in self._keyboard_mapping - and these duplicates correspond precisely to the errors I see in pynput's outputs.

For example, '𝕥' has keysym = 16897381 and '𝕖' has keysym = 16897366, but both 16897381 and 16897366 map, in self._keyboard_mapping, to the 2-tuple (8, 1) - and '𝕖' is indeed printed as '𝕥' by pynput. (𝕥's keysym appears first in self._keyboard_mapping). (The 2-tuples that keysyms map to are not unique or consistent; they vary based on the order they were encountered and reset when X resets.)

Through testing, I found that this type of error happens precisely every third time a new keysym is added to self._keyboard_mapping, and that every third such mapping always duplicates the 2-tuple of the previous successful mapping.

From that register function I mentioned, the correct 2-tuple should be, I believe, (keycode, index). This is not happening correctly every 3rd registration, but I'm not yet sure why.

However, I did find a bit more than that: register() gets its keycode and index from one of the three functions above it: reuse(), borrow(), or overwrite(). The problematic keys always get their keycode and index from reuse() - and reuse() finds the first unused index 0-3 for a given keycode, then returns that. What I found here is that, for the array keycodes, the first element is always duplicated to the third position as well, so indexes 0 and 2 are identical. As an example, here are two values of keycodes from my testing:

keycodes = array('I', [16897386, 0, 16897386, 0, 0, 0, 0, 0, 0, 0])
keycodes = array('I', [16897384, 16897385, 16897384, 0, 0, 0, 0, 0, 0, 0])

With this in mind, I was actually able to fix the bug by changing the line "for index in range(4):" in reuse() to "for index in range(2):". With that change my script no longer produces any incorrect characters.

However, based on the comments in the function, I believe range(4) is the intended behavior and I'm just avoiding the problem instead of fixing it. I have a rather shallow understanding of what these functions or values are trying to accomplish. I don't know why the first element of the array is duplicated to the third element. There's also a different issue I noticed where even when this function returns an index of 3, that index of 3 is never used in self._keyboard_mapping - it uses 1 instead. I'm thinking these may be two separate bugs. Either way, these two behaviors combined explain why it's every third time a new keysym is added to self._keyboard_mapping that the issue happens: While they in theory support an index of 0 1 2 or 3 for each keycode, in practice only indices 0 1 and 3 work since 2 always copies 0 - and whenever 3 is picked it's improperly saved as 1 somewhere.

I may keep investigating the issues in search of a true fix instead of a cover-my-eyes fix.
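To check that the two behaviours described in this comment really add up to an error on every third new keysym, here is a toy model. It is not pynput's or Xlib's code. It assumes every character passed in needs a borrowed slot, that each borrowed keycode has four slots, that slot 2 ends up mirroring slot 0, and that an index of 3 is remembered as 1. All names are hypothetical.

```python
EMPTY = 0  # assumption: an unused slot holds 0

EGGPLANT = "\U0001F346"
BLACKBOARD = "\U0001D565\U0001D556\U0001D564\U0001D565"  # blackboard bold "test"
BOLD = "\U0001D42D\U0001D41E\U0001D42C\U0001D42D"        # bold "test"


def simulate_typing(text, scan=4):
    """Return what the toy model would type for `text`.

    `scan` is how many slots per keycode the free-slot search looks at,
    like range(4) in the quoted reuse() function.
    """
    keycodes = []   # each entry is a list of four slots
    recorded = {}   # char -> (keycode position, index) as remembered

    for char in text:
        if char in recorded:
            continue
        # Look for a free slot in an already borrowed keycode.
        spot = None
        for position, slots in enumerate(keycodes):
            for index in range(scan):
                if slots[index] == EMPTY:
                    spot = (position, index)
                    break
            if spot is not None:
                break
        if spot is None:
            # Nothing free: borrow a fresh keycode and use slot 0.
            keycodes.append([EMPTY] * 4)
            spot = (len(keycodes) - 1, 0)
        position, index = spot
        keycodes[position][index] = char
        if index == 0:
            # Observed quirk 1: slot 2 ends up mirroring slot 0.
            keycodes[position][2] = char
        # Observed quirk 2: an index of 3 is remembered as 1.
        recorded[char] = (position, 1 if index == 3 else index)

    return "".join(keycodes[p][i] for p, i in (recorded[c] for c in text))


print(simulate_typing(EGGPLANT + BLACKBOARD + BOLD))
print(simulate_typing(EGGPLANT + BLACKBOARD + BOLD, scan=2))
```

With scan=4 the model replaces every third new character with the one registered just before it, and with scan=2 the text comes back unchanged. That is consistent with the range(2) workaround, but it only shows the described quirks are sufficient to explain the pattern, not where they originate.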

[-] zkfcfbzr@lemmy.world 1 points 2 days ago

I agree and appreciate it. I've been trying to figure it out myself but feel a bit out of my element.

What I've found is that in pynput's keyboard's _xorg.py file, the Controller class's self._keyboard_mapping seems to map some different keycodes to the same value, and that seems to correlate exactly with the errors I'm seeing. I haven't figured out why yet. I got to thinking it had something to do with the register function in the _resolve_borrowing function but I forget why and I'm too tired to continue for now. I'll continue tomorrow though.

[-] logging_strict@programming.dev 1 points 23 hours ago

_util/xorg_keysyms.py

Contains mapping of keysym to unicode str

type this into the terminal, it'll open up a small window. With the window in focus, type.

xev -event keyboard

type 1

From xev

keysym 0x31, 1

Corresponding entry in pynput._util.xorg_keysyms.SYMBOLS

'1': (0x0031, u'\u0031'),

So the hex is padded to a minimum of four places: 0x0031 instead of 0x31.

From xev

keysym 0xac9, trademark

Corresponding entry in pynput._util.xorg_keysyms.SYMBOLS

'trademark': (0x0ac9, u'\u2122'),

From xev

type in nuke radiation emoji

keysym 0x1002622, U2622 bytes: (e2 98 a2) "☢"

So three bytes instead of one or two bytes

From xev

(keysym 0x7c, bar)
1 bytes: (7c) "|"

Corresponding entry in pynput._util.xorg_keysyms.SYMBOLS

'bar': (0x007c, u'\u007C'),
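Going the other way, from an xev keysym back to a character, can be sketched as below. It covers only the two cases that need no lookup table: Latin-1 keysyms, which equal their code points, and Unicode keysyms, which are the code point plus 0x01000000. Legacy keysyms such as 0xac9 (trademark) need a table like SYMBOLS, so the sketch returns None for them. The function name is hypothetical.

```python
def keysym_to_char(keysym):
    """Return the character for a keysym, or None if a table is needed."""
    # Latin-1 keysyms equal their code points.
    if 0x20 <= keysym <= 0x7E or 0xA0 <= keysym <= 0xFF:
        return chr(keysym)
    # Unicode keysyms are the code point with 0x01000000 added.
    if 0x01000100 <= keysym <= 0x0110FFFF:
        return chr(keysym - 0x01000000)
    # Legacy keysyms (for example 0xac9) need a lookup table.
    return None


for value in (0x31, 0x7C, 0x1002622, 0xAC9):
    print(hex(value), keysym_to_char(value))
```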

this post was submitted on 25 Mar 2025