Linux's Automation Fragmentation


Posted on 2020-08-04

Note: I don't exactly think this post has that much substance in it as I was pretty distracted while writing, and didn't exactly have the best arguments around. I still think the underlying "concept" is a real thing (and that's why this is still going up), but this exact post might not be the best way to share it.

I recently had to deal with AutoHotkey (shortened to AHK) again, as I decided to reformat my Windows VM to continue my tradition of unnecessarily reformatting stuff when I get bored. My specific problem was that I had swapped Caps Lock and Escape on my host Linux system, and got really used to that combo even in such a short timespan⁰. Unlike Sway, Windows doesn't have that feature built in.

After setting up the mappings, I decided to check out some discussion sites relating to AHK. Looking around, I had this question in my head: why does it feel like people do a lot more tweaking with AHK (which, while very powerful, isn't exactly the most pleasant tool to work with) than with what most Linux environments can provide with their openness?

While it could be that most Linux people just don't talk about how their workflow works, I'm not entirely sure that's the answer.

I also used to use tools like AHK, specifically one called Tasker. It's a paid "app" for Android that is essentially the de facto automation tool there.

Like AHK, Tasker isn't that intuitive¹, but it is very powerful, and has extensions that make it more powerful than it is by default.

I used it for a LOT. One time I made an entire schedule displayer for my school classes, where it would display the current class, the time left until the break, and all that, copied and pasted for all 5 weekdays². More recently, since I no longer use my phone all that much as my computer is always around, I've kept it simple, with only stuff like alarms that can only be disabled via NFC triggers³.

Now, why am I talking about all this? Because I haven't done anything like these under Linux, ever.

Why? I personally think the problem is this:

There are too many tools!

There are too many tools, each doing something different. This isn't all that bad, and can actually be pretty good, because you can compose them however you like. The problems I have in mind, however, are:

There isn't any clear way to compose them all

Yes, you have the shell, and yes, it can do a lot, but most automation works by responding to events, like key presses, or button clicks, or receiving a message.

For custom events, some programs will, if you're lucky, have settings on custom event listeners, which might call external tools.

For key presses, all desktop/window environments have different settings, and global key listener tools like sxhkd might break them if installed on top. Also, if you're under Wayland you cannot listen for keys _at all_ unless you use compositor-specific settings⁴.
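To give an idea of what the evdev route involves, here is a rough Python sketch that decodes raw key events. It assumes the `struct input_event` layout used on 64-bit Linux (two native longs for the timestamp, then a 16-bit type, a 16-bit code and a 32-bit value) and that you are allowed to read the device file, which usually means root or membership in the `input` group. The function names and the device path are made up for illustration.

```python
import struct

# Assumed layout of struct input_event on 64-bit Linux:
# timestamp seconds (long), timestamp microseconds (long),
# type (u16), code (u16), value (s32).
EVENT_FORMAT = "llHHi"
EVENT_SIZE = struct.calcsize(EVENT_FORMAT)

EV_KEY = 0x01  # event type for key and button state changes
KEY_STATES = {0: "up", 1: "down", 2: "repeat"}


def parse_event(raw):
    """Decode one raw event; return (code, state) for key events, else None."""
    _sec, _usec, ev_type, code, value = struct.unpack(EVENT_FORMAT, raw)
    if ev_type != EV_KEY:
        return None
    return code, KEY_STATES.get(value, "unknown")


def watch_keys(device_path):
    """Yield (code, state) for each key event read from an evdev device."""
    # Unbuffered, so each read hands back events as the kernel delivers them.
    with open(device_path, "rb", buffering=0) as device:
        while True:
            raw = device.read(EVENT_SIZE)
            if raw is None or len(raw) < EVENT_SIZE:
                break
            event = parse_event(raw)
            if event is not None:
                yield event


# Hypothetical usage; the right event file differs per machine:
# for code, state in watch_keys("/dev/input/event0"):
#     print(code, state)
```

Note that this only observes keys. It doesn't grab or remap them, and it sits below whatever shortcut handling the compositor does, so it is far from a replacement for a real hotkey tool.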

Wayland also has problems⁵ like external programs not being able to interact with other programs, or draw overlays, which also limits what can be done on programs that don't have specific extension capabilities.

Also sometimes you want to do something only if a specific program is in focus. Good luck.
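On X at least, one way to approximate this is to ask `xdotool` for the focused window's title before running anything. The sketch below assumes an X11 session with `xdotool` installed; `run_if_focused` and its parameters are names I made up, and matching on a title substring is a crude stand-in for "a specific program".

```python
import subprocess


def active_window_title():
    """Ask xdotool (X11 only) for the title of the currently focused window."""
    result = subprocess.run(
        ["xdotool", "getactivewindow", "getwindowname"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def run_if_focused(needle, action, get_title=active_window_title):
    """Call action() only if the focused window's title contains needle."""
    if needle.lower() in get_title().lower():
        action()
        return True
    return False


# Hypothetical usage:
# run_if_focused("firefox", lambda: print("Firefox is focused"))
```

The `get_title` parameter is only there so the lookup can be swapped out, for example with a compositor-specific command.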

And even if you actually go through the trouble of all that and manage to get a program to call your handler...

All of these tools are another thing to learn

Well, how do you move the mouse onto a button? If you're on X, you can use something like `xdotool` to move the mouse, but how do you find out where the button is? You'll probably have to use some accessibility APIs to actually find and click the button⁶.
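The `xdotool` half is the easy part. A minimal sketch, assuming X11, `xdotool` on the PATH, and that you already know the coordinates somehow (which is exactly the part this doesn't solve); the function names are mine:

```python
import subprocess


def click_command(x, y, button=1):
    """Build an xdotool command line that moves the pointer to (x, y), then clicks."""
    return ["xdotool", "mousemove", str(x), str(y), "click", str(button)]


def click_at(x, y, button=1):
    """Run the command; only works inside an X11 session with xdotool installed."""
    subprocess.run(click_command(x, y, button), check=True)


# Hypothetical usage, with coordinates found by hand:
# click_at(640, 360)
```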

You want to move a window? Now you need to figure out `wmctrl`.
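For reference, the `wmctrl` incantation takes a window selected with `-r` and a `gravity,x,y,width,height` string passed to `-e`, where `-1` leaves a value unchanged. A small sketch, again assuming X11 and a window manager that cooperates with `wmctrl`; the helper names are made up:

```python
import subprocess


def move_command(title, x, y, width=-1, height=-1):
    """Build a wmctrl command line that moves (and optionally resizes) a window.

    The -e argument is "gravity,x,y,width,height"; gravity 0 is the default
    and -1 keeps the current value for that field.
    """
    geometry = f"0,{x},{y},{width},{height}"
    return ["wmctrl", "-r", title, "-e", geometry]


def move_window(title, x, y, width=-1, height=-1):
    """Run the command; title is matched against window titles by wmctrl."""
    subprocess.run(move_command(title, x, y, width, height), check=True)


# Hypothetical usage:
# move_window("Firefox", 0, 0, 1280, 720)
```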

You want to pop up a tiny GUI? Either you use something like dmenu/rofi, or you're going to have a bad time.
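The dmenu route looks roughly like this: feed it one option per line on stdin and read the chosen line back from stdout. This is a sketch that assumes `dmenu` is installed (something like `rofi -dmenu` should slot in through the `menu` parameter); `choose` and `menu_input` are names I picked.

```python
import subprocess


def menu_input(options):
    """dmenu-style menus expect one option per line on stdin."""
    return "\n".join(options) + "\n"


def choose(options, menu=("dmenu",)):
    """Show a menu and return the selected line, or None if it was cancelled."""
    result = subprocess.run(
        list(menu),
        input=menu_input(options),
        capture_output=True,
        text=True,
    )
    # A non-zero exit status is treated as "the user dismissed the menu".
    if result.returncode != 0:
        return None
    return result.stdout.strip() or None


# Hypothetical usage:
# action = choose(["lock", "suspend", "reboot"])
```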

You cannot discover them by accident

When using Tasker, for example, I liked being able to go through the list of triggers or actions to see what I could do. You cannot do that with the current tools, since they are all separate. And that's even if they are installed on your computer to begin with.

What about AutoKey?


While AutoKey exists, and definitely addresses some issues I talked about, its main focus is on keyboard macros and shortcuts.

Also it still has the discoverability issue, as most of the interesting things you can do with it are with shell commands.


0: Assuming I tooted the same day I made the change, it should be exactly a month plus 4 days since then.

1: It's definitely "simpler" than AHK because it's GUI based, but it also has an interesting learning curve.

2: Writing this, I just realized I could've probably just made it fetch data from the calendar, which would've been a LOT easier.

3: Do things that do this already exist? Yes. Could I be bothered to just download one and use it? No.

4: Well, you can _technically_ just read off of evdev via the /dev/input files, but why?

5: Some might call them security improvements, but I don't want that debate.

6: There might be tools that do this for you; I haven't researched that much.
