Deleting/rewriting/generating keyboard events under Linux

Posted on 2024-09-24 13:01:18

I would like to hook into, intercept, and generate keyboard (make/break) events under Linux before they get delivered to any application. More precisely, I want to detect patterns in the key event stream and be able to discard/insert events into the stream depending on the detected patterns.

I've seen some related questions on SO, but:

either they only deal with how to get at the key events (key loggers etc.), and not how to manipulate the propagation of them (they only listen, but don't intercept/generate).
or they use passive/active grabs in X (read more on that below).

A Small DSL

I explain the problem below, but to make it a bit more compact and understandable, first a small DSL definition.

A_: for make (press) key A
A^: for break (release) key A
A^->[C_,C^,U_,U^]: on A^ send a make/break combo for C and then U further down the processing chain (and finally to the application). If there is no -> then there's nothing sent (but internal state might be modified to detect subsequent events).
$X: execute an arbitrary action. This can be sending some configurable key event sequence (maybe something like C-x C-s for Emacs), or executing a function. If I can only send key events, that would be enough, as I can then further process these in a window manager depending on which application is active.

Problem Description

Ok, so with this notation, here are the patterns I want to detect and what events I want to pass on down the processing chain.

1. A_, A^->[A_,A^]: for the notation see above; note that the send happens on A^.
2. A_, B_, A^->[A_,A^], B^->[B_,B^]: basically the same as 1, but overlapping events don't change the processing flow.
3. A_, B_, B^->[$X], A^: if there was a complete make/break of a key (B) while another key (A) was held, X is executed (see above), and the break of A is discarded.

(In principle it's a simple state machine over key events that can generate (multiple) key events as output.)
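To make the three patterns concrete, here is a minimal Python sketch of such a state machine. It only models the logic and is not wired to any real event source. The names `parse`, `KeyMachine`, `feed`, and `run` are hypothetical. How more than two overlapping keys should behave, and what happens on autorepeat or on a break with no matching make, is not specified above, so those cases are assumptions labeled in the comments.

```python
from typing import Callable, List, Set, Tuple

Event = Tuple[str, bool]  # (key name, True for make / False for break)
Action = Callable[[str, str], List[Event]]  # $X: (held key, tapped key) -> events


def parse(tokens: str) -> List[Event]:
    """Parse DSL input such as "A_, B_, A^" into a list of events."""
    events = []
    for token in tokens.split(","):
        token = token.strip()
        if not token:
            continue
        key, kind = token[:-1], token[-1]
        if kind not in "_^" or not key:
            raise ValueError(f"bad token: {token!r}")
        events.append((key, kind == "_"))
    return events


class KeyMachine:
    def __init__(self, action: Action) -> None:
        self._action = action
        self._held: List[str] = []       # keys currently down, in press order
        self._consumed: Set[str] = set()  # held keys whose break is discarded

    def feed(self, event: Event) -> List[Event]:
        """Consume one input event and return the events to pass on."""
        key, is_make = event
        if is_make:
            # Nothing is sent on a make. Assumption: a repeated make of a key
            # that is already held (autorepeat) is ignored.
            if key not in self._held:
                self._held.append(key)
            return []
        if key not in self._held:
            # Assumption: a break without a seen make is passed through.
            return [event]
        earlier = self._held[: self._held.index(key)]
        self._held.remove(key)
        if key in self._consumed:
            # Pattern 3, final A^: the break of the held key is discarded.
            self._consumed.discard(key)
            return []
        if earlier:
            # Pattern 3, B^: a full make/break happened while another key was
            # held. Assumption: with several held keys, all of them are
            # consumed and the earliest one is reported to the action.
            self._consumed.update(earlier)
            return self._action(earlier[0], key)
        # Patterns 1 and 2: emit the make/break pair on the break.
        return [(key, True), (key, False)]


def run(tokens: str, action: Action) -> List[Event]:
    """Feed a DSL string through a fresh machine and collect the output."""
    machine = KeyMachine(action)
    output: List[Event] = []
    for event in parse(tokens):
        output.extend(machine.feed(event))
    return output
```

For example, `run("A_, B_, B^, A^", action)` calls `action("A", "B")` once on B^ and then emits nothing for A^, while `run("A_, B_, A^, B^", action)` emits a make/break pair for A followed by one for B without calling the action.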

Additional Notes

The solution has to work at typing speed.
Consumers of the modified key event stream run under X on Linux (consoles, browsers, editors, etc.).
Only keyboard events influence the processing (no mouse etc.).
Matching can happen on keysyms (a bit easier), or keycodes (a bit harder). With the latter, I will just have to read in the mapping to translate from code to keysym.
If possible, I'd prefer a solution that works with both USB keyboards as well as inside a virtual machine (could be a problem if working at the driver layer, other layers should be ok).
I'm pretty open about the implementation language.
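For the keycode-to-keysym note above, one way to read in the mapping under X is to parse the table printed by `xmodmap -pke`. The sketch below assumes that tool is installed and that its output uses the usual `keycode N = keysym ...` line format. The function names and the sample text are illustrative, and the actual keycodes depend on the keyboard and layout.

```python
import subprocess
from typing import Dict, List


def parse_keymap(text: str) -> Dict[int, List[str]]:
    """Map each keycode to its list of keysym names (may be empty)."""
    mapping: Dict[int, List[str]] = {}
    for line in text.splitlines():
        parts = line.split()
        # Expected shape: keycode <number> = [keysym ...]
        if len(parts) < 3 or parts[0] != "keycode" or parts[2] != "=":
            continue
        mapping[int(parts[1])] = parts[3:]
    return mapping


def read_keymap() -> Dict[int, List[str]]:
    """Read the current X keymap; needs a running X display and xmodmap."""
    result = subprocess.run(
        ["xmodmap", "-pke"], capture_output=True, text=True, check=True
    )
    return parse_keymap(result.stdout)


# Illustrative sample only; real tables are longer and layout dependent.
SAMPLE = """\
keycode   8 =
keycode  38 = a A a A
keycode  50 = Shift_L NoSymbol Shift_L
"""
```

With the sample text, `parse_keymap(SAMPLE)[38][0]` is the unshifted keysym name `a`, and keycode 8 maps to an empty list.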

Possible Solutions and Questions

So the basic question is how to implement this.

I have implemented a solution in a window manager using passive grabs (XGrabKey) and XSendEvent. Unfortunately, passive grabs don't work in this case because they don't correctly capture B^ in the second pattern above. The reason is that the grab activated by A ends on A^ and is not continued to B^. If B is still held, a new grab is activated to capture it, but only after about 1 second. Otherwise a plain B^ is sent to the application. This can be verified with xev.

I could convert my implementation to use an active grab (XGrabKeyboard), but I'm not sure about the effect on other applications if the window manager has an active grab on the keyboard all the time. The X documentation refers to active grabs as being intrusive and designed for short-term use. If someone has experience with this and there are no major drawbacks to long-term active grabs, then I'd consider this a solution.

I'm willing to look at other layers of key event processing besides window managers (which operate as X clients). Keyboard drivers or mappings are a possibility as long as I can solve the above problem with them. This also implies that the solution doesn't have to be a separate application. I'm perfectly fine with having a driver or kernel module do this for me. Be aware, though, that I have never done any kernel or driver programming, so I would appreciate some good resources.

Thanks for any pointers!
