Lessons in destroying a perfectly fine Debian installation

2024-03-03

So about that AMD Framework mainboard upgrade. Well, actually, in the end it did go fine, from a hardware perspective at least. But the last few things in that post are about software, specifically getting my Debian installation working correctly with the new hardware. It sure seemed like it worked fine, and for some reason it really did behave for the rest of Friday. But on Saturday, I decided it'd be fun to finally switch to Sway. It's the default desktop environment on the MNT Reform's Debian sid image (I had a Reform for a while, but could never find much of a use for it, and so eventually sold it), and I enjoyed using it on that device a lot. The thing is, it was preconfigured there. And I just haven't had time to sit down and fiddle with something like Sway, so I never bothered, and just stuck with KDE. And KDE is great. If I wasn't, for some unknowable reason, compelled to torture myself with configuration files and obscure, un-searchable problems, I'd just stick with KDE and be happy for the rest of my days.

Anyway, I was working on getting Sway installed and configured from within KDE, but every 15-20 minutes or so, my Wifi would go down. I'd stay connected to my LAN, I could access other computers on the network, and even DNS seemed to work. But internet access itself wouldn't work. I could ping google.com just fine: I'd get a sensible and correct IPv6 address, and a successful ping response. But if I ran, say, wget www.google.com, the request would just hang. The download speed would start at my network's expected speed, and then slowly drop, until it eventually just hit 0 (presumably; I never actually let it run that far, and I never got data back anyway). I couldn't make requests in Firefox either. In KDE I could somewhat reliably disable the Wifi entirely and reconnect to the network, and I'd be back in action.

No other devices had this issue, so I was convinced it was something to do with the Intel AX210 Wifi chip I kept from the default 11th gen build and moved onto the AMD board. That chip is supposed to (and indeed, in the end does) work fine with any motherboard. There was an issue back on kernel 5.10, but that was resolved upstream and I'm running the latest Debian kernel, 6.1. I had a dmesg log about being unable to load a *-yoyo.bin firmware file, but this turns out to be a harmless warning, safe to ignore.

I hunted for ages trying to find any hardware configuration or firmware reason that could be causing this behaviour. Doing that, I of course ran across heaps of IPv6-related and DNS issues. I ignored the DNS ones, because DNS was working. NetworkManager was configuring the right DNS servers in /etc/resolv.conf, and they worked fine. I could resolve domains by pinging them, and get correct values. I did, however, have some issues with IPv6 once on my Raspberry Pi, which runs a Debian-based distro, so I figured, okay, I'll disable IPv6. Well, that was a load of nothing. I tried the GRUB command approach, I tried the NetworkManager configuration approach, I tried disabling it on my router, I tried so many things. All the while, every other device worked fine. And for some reason, I never tried connecting to another wireless network (like my phone's hotspot), so I remained convinced it was an issue with my hardware, or my Debian installation.

Well, I thought it was the Debian installation, because for a bit I could reboot the computer into the Debian rescue image and make successful network requests from the installer. I did not want to have to reinstall, so I started gutting things, trying to get everything back to "default" as best I could. But really, it was already there. Everything I tried just flipped a meaningless toggle, which did nothing, and which I'd then flip back. Everything was configured right. So then I thought, maybe it's Debian itself. I pulled out my Arch installer USB, loaded it up, iwctl'd my way to a connection, and yeah, nope, not a Debian issue. I couldn't make any network connections on the Arch install image either.

So then, for some reason, finally, I tried connecting to my phone hotspot. Surely, my home network is fine. I haven't messed with it. The settings are the defaults, for the most part, and the ones I've changed can't have caused an issue like this, and don't cause issues for any other device. Everything worked on my phone hotspot. Annoying! I should have just tried that earlier, and in a way, that's the lesson. I was too hung up on it being a hardware issue because coincidentally I started to have the issue when I changed hardware. I thought, surely it's a compatibility issue, surely I didn't seat the Wifi card correctly, surely the antennae are loose, surely, surely, surely, it isn't the thing I could have checked in two seconds and then fixed in only four. But yeah, it was. I shouldn't have ruled it out, and I shouldn't have ignored it.

Oh well. I did a hard reset on our router, and yeah, it's working fine. By this point I'd broken so many things trying to both (mis)configure Sway and get the Wifi working again that I'd already started reinstalling Debian. On the fresh install I was configuring Sway, and everything worked great at first. Until it didn't. First, swaybg stopped working, and I'd get errors like "Failed to load background image" and "Couldn’t recognize the image file format". I found an issue on GitHub that looked related, but the solution there was to upgrade to a newer version of swaybg, and I was already on a much later version anyway.

Then, after I installed Firefox, I was trying to log into work and WordPress Make Slack so I could stay in the loop about WC Asia stuff for next week. But Firefox wouldn't open Slack after I logged in in the browser. I wouldn't even get the little dialogue asking whether I wanted to open Slack. When I tried to modify the slack mime type under the Applications section of the Firefox preferences, Firefox would crash. I ran Firefox from the terminal to see if there were useful logs, and it was flooded with logs about pixbuf, a bunch of failed assertions, and being unable to load icons and other image files. The log at the top was something like this, copied from the GitHub issue where I finally found a solution to the problem I had, one hundred percent, created myself, through sheer ignorance and presumably malice towards my mental stability:

(bottles:4704): Gtk-WARNING **: 20:57:03.882: Could not load a pixbuf from /org/gtk/libgtk/icons/16x16/status/image-missing.png.
This may indicate that pixbuf loaders or the mime database could not be found.

The solution in that GitHub issue is to fix XDG_DATA_DIRS. Oh, I remember that environment variable. I'd messed around with it a bit trying to get bemenu-run to list flatpaks. Well, I didn't know what XDG_DATA_DIRS was, and I read some GitHub issue or other about Flatpak *.desktop files and somehow ended up setting XDG_DATA_DIRS in my ~/.zshenv to:

export XDG_DATA_DIRS="/var/lib/flatpak/exports/share/applications"

Yeah, that's not going to work. For one, I don't have any flatpaks installed on the new Debian installation yet anyway, so that particular directory doesn't even exist. And second, if I wanted to include flatpak's exported share directories, I'd need to set it to the share directory, not share/applications!
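A quick sanity check would have caught this. The following is only a sketch: check_xdg_data_dirs is a hypothetical helper name, and it assumes nothing beyond standard shell tools. It walks each entry in XDG_DATA_DIRS, reports whether the directory exists, and warns when /usr/share is not in the list at all.

```shell
# Hypothetical helper (not part of any package): inspect XDG_DATA_DIRS.
check_xdg_data_dirs() {
    # An unset or empty XDG_DATA_DIRS means this default, per the XDG Base Directory spec.
    data_dirs="${XDG_DATA_DIRS:-/usr/local/share:/usr/share}"

    # Split on ':' with tr rather than relying on word splitting,
    # which zsh does not do by default.
    printf '%s\n' "$data_dirs" | tr ':' '\n' | while IFS= read -r dir; do
        [ -n "$dir" ] || continue
        if [ -d "$dir" ]; then
            printf 'ok: %s\n' "$dir"
        else
            printf 'missing: %s\n' "$dir"
        fi
    done

    # Icons, mime data and .desktop files normally live under /usr/share.
    case ":$data_dirs:" in
        *:/usr/share:*) ;;
        *) printf 'warning: /usr/share is not listed\n' ;;
    esac
}

check_xdg_data_dirs
```

With the value above, on a machine with no flatpaks installed, this should flag the single entry as missing and print the /usr/share warning.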

Well, if I'd just RTFM'd, I'd have known this was all wrong. The issue in the end was that I didn't have the default /usr/local/share:/usr/share values in the variable at all anymore, as I'd gone and replaced them with the wrong (nonexistent) directory. I've fixed it to XDG_DATA_DIRS="$HOME/.local/share:/usr/local/share:/usr/share" and now all is well. I can have a desktop background, I can open files/links in other programs from Firefox, and well, (almost) everything just works now. Yay!
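For the original goal of getting flatpak exports listed, a safer pattern is to extend the variable instead of overwriting it. This is a minimal sketch, not something from the GitHub issue: append_data_dir is a hypothetical helper, and the two paths are flatpak's usual system-wide and per-user export directories, which may not exist until flatpak is installed. (As far as I can tell, $HOME/.local/share is normally covered by XDG_DATA_HOME, so it isn't added here, though listing it does no harm.)

```shell
# Start from whatever is already set, falling back to the spec's default
# so /usr/local/share and /usr/share are never lost.
XDG_DATA_DIRS="${XDG_DATA_DIRS:-/usr/local/share:/usr/share}"

# Hypothetical helper: append a directory only if it is not already listed.
append_data_dir() {
    case ":$XDG_DATA_DIRS:" in
        *":$1:"*) ;;                              # already present, do nothing
        *) XDG_DATA_DIRS="$XDG_DATA_DIRS:$1" ;;   # add to the end
    esac
}

# Note the share directory itself, not share/applications.
append_data_dir /var/lib/flatpak/exports/share
append_data_dir "$HOME/.local/share/flatpak/exports/share"

export XDG_DATA_DIRS
```

Because the helper skips entries that are already present, sourcing this more than once (say, from nested shells reading ~/.zshenv) shouldn't grow the variable.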

Almost, because there's something causing the font-awesome icon used for the RAM monitor in the waybar configuration I'm using as a base not to work. I'm sure that's an easy fix.

Lessons learned:

  1. Do not skip an easy troubleshooting step just because you presume another, more directly related issue must be the problem. Just try the simple stuff first.
  2. Don't change settings without knowing what they do or what they're for. Stop that!