Have you ever experienced this kind of scenario?
You wake up and do your usual morning routine. You wake your machine up and glance at the server dashboard, where everything seems fine. Why wouldn’t it be, when you just used it last night before bed? But it’s not. Nothing is working.
This is what happened to me today. I was going through my usual morning routine, and upon opening my media server I got a ‘Playback error’ message, which is weird because I played this same media file last night.
How? Why?
I use this server every day. I used it last night. And when I woke up, it didn’t work anymore.
So then, time to spend my weekend putting on my imaginary ‘deerstalker’ hat and Sherlocking the crap out of this.
Docker
The first suspect is, of course, the one that should be running things: Docker.
I spent a good amount of time analyzing my docker-compose file. Everything seemed in order. I changed the volume mounts of the media files for my Jellyfin server in case this was a permission issue.
No change.
I changed the ports in case something was conflicting with Jellyfin’s port number.
Problem persists.
I then tried upgrading Docker and docker-compose.
No dice.
So okay... maybe Docker is innocent, but I’m keeping a close eye on it...
Jellyfin
Next suspect: Jellyfin
Maybe the application had an update? Maybe some configuration changed along with the update? I scoured their official GitHub page and their official website for references, but there was no such update. If anything, Jellyfin is quite stable.
I then checked whether I had somehow misconfigured something. But the server has been working for almost a month now. Why now? Why today?
Jellyfin seems good.. moving on.
Linux (Ownerships and Permissions)
After checking Jellyfin and coming up empty, I decided to turn my attention to the OS. I checked some files in my media directory. Yeah, some files still had root:root ownership with 700 permissions. Maybe if I changed that, the error would be fixed.
sudo chown -R dufresne:dufresne /opt/jellyfin_media
sudo chmod -R 755 /opt/jellyfin_media
Still nothing! Same error.
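Before a blanket chown and chmod, it can help to list exactly which files a non-root service would be unable to read. This is a minimal sketch, assuming the container user falls under "other" for these files; list_unreadable_files is a made-up helper name, not something from Jellyfin or Docker.

```shell
# List regular files under a directory that lack the "other" read bit,
# i.e. files that a user who is neither owner nor group member cannot read.
# list_unreadable_files is a hypothetical helper for illustration only.
list_unreadable_files() {
    find "$1" -type f ! -perm -004
}

# Example (path taken from the commands above):
#   list_unreadable_files /opt/jellyfin_media
```

An empty result means permissions on the files themselves are probably not the culprit. As a side note, 755 also marks every media file as executable; something like chmod -R u=rwX,go=rX would grant read access everywhere while adding execute only on directories (and on files that were already executable).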
Browser compatibility
I may be grasping at straws with this one, but better to leave no stone unturned.
I used every browser I know of: Google Chrome, Safari, Firefox, Arc, Opera.
Yup, same issue. At this point, I was close to making a new virtual machine and deleting the old one. But should the issue come up again, would I build yet another VM to fix things? No. I don’t need a band-aid. I need a fix.
Linux (Disks)
My homelab has a passthrough to my old external hard drive. The hard drive itself is quite old. So maybe this is a hardware issue? To test this, I plugged the external hard drive into my other machine and spun up a Jellyfin container just to check whether the media was playable.
And what do you know? It played.
I learned two things from this:
- The disk is okay. (Kudos to Western Digital, this has been my external hard drive since 2013)
- The issue is isolated to my homelab, since the media played on my other machine.
There’s hope after all..
Proxmox
Since this may be a homelab issue, I dove into my homelab OS: Proxmox.
I checked whether I was running out of storage. Thankfully, I was not. I also checked the VM settings, because I had read somewhere that disks sometimes re-mount themselves and come up with different UUIDs. Different UUIDs are a problem for a media server, since the metadata references the disk mounts.
I could not find evidence that the disks had re-mounted, so I deleted all the metadata in the Jellyfin volume mount and created a fresh Jellyfin container.
I waited half an hour for Jellyfin to complete the metadata downloads (movie/show posters, casts, ratings, synopses, etc.), and then I tried to play a movie... Same error.
This is insane. I essentially wiped the entire Docker instance and recreated a fresh one. And still it does not work.
Alternatives
I decided to cool my mind and watch a show on my phone instead. Since Jellyfin has a mobile app, I just have to enter the server’s address and I can watch from my phone.
Then it hit me. All the media files were working on my phone...
I had entered the server’s address using the IP:Port combo, since I had not configured the DNS server in my phone’s network settings.
I immediately tested it out. In my browser, instead of using jellyfin.homelab.local, I used 192.168.x.x:8096.
And it worked!!
Now I know this is not fixed yet, but the pool of suspects just became very small.
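That comparison can be turned into a small repeatable check: request the same thing through the proxied hostname and through the direct IP:port, then compare the results. This is only a sketch. probe and diagnose are hypothetical helper names, plain http is an assumption about the setup, and the URL passed in should be one that actually reproduces the failure (a page that loads fine both ways proves little).

```shell
# probe: exit 0 only if the URL answers with a non-error HTTP status.
# -f makes curl return a non-zero exit code on HTTP 4xx/5xx responses.
probe() {
    curl -fsS -o /dev/null --max-time 10 "$1"
}

# diagnose: interpret two probe exit codes (0 = success).
#   $1 = result via the proxied hostname, $2 = result via direct IP:port
diagnose() {
    if [ "$2" -ne 0 ]; then
        echo "application or host"      # fails even without DNS and proxy
    elif [ "$1" -ne 0 ]; then
        echo "dns or reverse proxy"     # direct works, proxied path does not
    else
        echo "both paths ok"
    fi
}

# Example (hostname and address as written in this post; x.x left as-is):
#   probe "http://jellyfin.homelab.local/"; via_name=$?
#   probe "http://192.168.x.x:8096/";       direct=$?
#   diagnose "$via_name" "$direct"
```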
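To recap, a request to jellyfin.homelab.local passes through two servers in my homelab before it reaches Jellyfin: Pihole, which resolves the name, and Nginx, which proxies the request.
Clearly, the culprit is one of these two.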
Since Nginx is the reverse proxy, I decided to first test my DNS Pihole.
My DNS is set up so that every domain name points to a single IP address: Nginx. This way, Nginx can act as the reverse proxy, since Pihole’s DNS records map names to IP addresses only and cannot carry port numbers.
For my testing, I changed the IP address of jellyfin.homelab.local to point directly at the Jellyfin server instead of Nginx.
Because Pihole cannot handle port numbers, I had to enter the port manually in the URL: jellyfin.homelab.local:8096
And again, everything works!
Pihole is eliminated from the suspect list. All that remains is the Nginx server.
The Problem
Upon ssh-ing into the server, I repeated all of the applicable steps from troubleshooting the Jellyfin container: inspect the docker-compose file, double-check the file ownership and permissions of the volume mounts, and check for port and disk issues.
And there it was... When I checked the server’s disk utilization, the disk was completely full. Nginx could not do its job properly because it had no room left to write new data (for example, the temporary files a reverse proxy may use to buffer large responses).
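A check along these lines takes seconds to run on each server. This is a minimal sketch that flags any filesystem at or above a chosen usage threshold. full_filesystems is a hypothetical helper name, and the 90% default is an arbitrary assumption.

```shell
# Print "mountpoint usage%" for every filesystem at or above a threshold.
# Reads `df -P` style output on stdin, so it also works on saved or canned output.
# Note: mount points containing spaces are not handled by this simple version.
full_filesystems() {
    threshold="${1:-90}"
    awk -v t="$threshold" '
        NR > 1 {                       # skip the header line
            use = $5
            sub(/%/, "", use)          # "100%" -> "100"
            if (use + 0 >= t) print $6 " " $5
        }'
}

# Typical use on the host being investigated:
#   df -P | full_filesystems 90
```

No output means nothing is at or above the threshold. Any line that does appear is worth looking at before digging into configuration files.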
In hindsight, this could have been a very short investigation had I gone over the basics first instead of the complicated stuff. Oh well... live and learn.
I learned a lot from this experience. This is the good thing about having a homelab. I am always learning, re-learning and realizing new things every day.