Linux File System Hunting: 10 Discoveries That Changed How I See Linux

CS undergrad | Tech enthusiast | Focusing on Web Dev • DSA • ML | Building skills for real-world impact

I spent a few hours this week not running commands, but actually reading what Linux keeps lying around on disk. What I found was more interesting than I expected. Not because the files are complex, but because of how much the OS trusts you to figure things out yourself.

Here are ten things I found, what they do, and why they exist.


1. /etc/passwd is not a password file

The name is misleading. /etc/passwd does not store passwords. It stores user account metadata: username, UID, GID, home directory path, and default shell. The actual password hashes moved to /etc/shadow decades ago, when people realized world-readable files and password data should not coexist.

root:x:0:0:root:/root:/bin/bash
harsh:x:1000:1000:,,,:/home/harsh:/bin/bash

The x in the second field is just a placeholder. It means "go check /etc/shadow instead."

What I found interesting: every service account on the system shows up here too: daemon, syslog, nobody. These aren't real users; they're process isolation mechanisms. When a service runs as nobody, it owns no files and can only touch what is open to everyone, so even if it gets compromised, the damage is contained.
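To make the seven colon-separated fields concrete, here is a minimal Python sketch that splits one line into named parts. The `PasswdEntry` type and `parse_passwd_line` function are names I made up for illustration; for real lookups, Python's standard `pwd` module is the better choice.

```python
from typing import NamedTuple


class PasswdEntry(NamedTuple):
    name: str
    password: str  # "x" means the hash lives in /etc/shadow
    uid: int
    gid: int
    gecos: str     # free-form comment field (full name, room, phone, ...)
    home: str
    shell: str


def parse_passwd_line(line: str) -> PasswdEntry:
    """Split one /etc/passwd line into its seven fields."""
    name, password, uid, gid, gecos, home, shell = line.rstrip("\n").split(":")
    return PasswdEntry(name, password, int(uid), int(gid), gecos, home, shell)


entry = parse_passwd_line("harsh:x:1000:1000:,,,:/home/harsh:/bin/bash")
print(entry.name, entry.uid, entry.home, entry.shell)
```

Service accounts parse the same way; they typically just have a low UID and a shell that refuses logins.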


2. /etc/resolv.conf decides what happens when you type a domain name

Before your browser can connect to anything, Linux has to turn a hostname like google.com into an IP address. That lookup goes through /etc/resolv.conf.

nameserver 127.0.0.53
search home

On modern Ubuntu, 127.0.0.53 points to systemd-resolved, a local DNS stub resolver. It caches responses and handles fallback. You can inspect its actual upstream servers with:

resolvectl status

What I found interesting: if you replace 127.0.0.53 with something like 1.1.1.1 directly, DNS lookups bypass the local resolver entirely. (On Ubuntu the file is usually a symlink managed by systemd-resolved, so manual edits may not stick.) Some corporate environments override this file to force traffic through their own DNS servers. This is one common way content filtering works on managed networks.
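The file format is simple enough to read by hand. The sketch below pulls out only the `nameserver` and `search` lines and ignores everything else; `parse_resolv_conf` is a hypothetical helper, not part of any library, and it does not try to cover every directive the real resolver understands.

```python
def parse_resolv_conf(text: str) -> dict[str, list[str]]:
    """Collect nameserver and search entries from resolv.conf-style text."""
    config: dict[str, list[str]] = {"nameserver": [], "search": []}
    for raw in text.splitlines():
        line = raw.strip()
        # Blank lines and lines starting with '#' or ';' are comments.
        if not line or line[0] in "#;":
            continue
        keyword, *values = line.split()
        if keyword in config:
            config[keyword].extend(values)
    return config


sample = "nameserver 127.0.0.53\nsearch home\n"
print(parse_resolv_conf(sample))
```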


3. /etc/hosts runs before DNS and nobody talks about it

With the default configuration, the resolver checks /etc/hosts before it ever touches /etc/resolv.conf. This file maps hostnames to IPs locally, with no network request at all.

127.0.0.1   localhost
127.0.1.1   my-machine

You can add your own entries:

192.168.1.10   dev.local

Now dev.local resolves instantly on your machine without touching DNS. The order of resolution (hosts file vs DNS vs mDNS) is controlled by /etc/nsswitch.conf, which most people never open.

What I found interesting: ad blockers used to work entirely through /etc/hosts. Map a thousand ad domains to 0.0.0.0 and they never load. It's crude but effective and requires no browser extension.
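A rough model of a hosts-file lookup, assuming the simple "IP followed by one or more names" layout shown above. `parse_hosts` is an illustrative name, and the blocked domain is a made-up example.

```python
def parse_hosts(text: str) -> dict[str, str]:
    """Map each hostname in hosts-file text to its IP address."""
    mapping: dict[str, str] = {}
    for raw in text.splitlines():
        fields = raw.split("#", 1)[0].split()  # drop comments, split on whitespace
        if len(fields) < 2:
            continue
        ip, *names = fields
        for name in names:
            # Keep the first entry seen for a name, like a top-to-bottom scan would.
            mapping.setdefault(name.lower(), ip)
    return mapping


sample = """\
127.0.0.1   localhost
192.168.1.10   dev.local
0.0.0.0   ads.example.com   # hypothetical blocked domain
"""
hosts = parse_hosts(sample)
print(hosts.get("dev.local"))
print(hosts.get("ads.example.com"))
```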


4. /proc is a filesystem that doesn't exist on disk

/proc looks like a directory. It has files you can cat. But none of it is stored anywhere. The kernel generates the contents on demand when you read them.

cat /proc/cpuinfo       # CPU details
cat /proc/meminfo       # Memory stats
cat /proc/net/tcp       # Active TCP connections in hex

Each running process also gets a numbered folder. /proc/1234/ corresponds to PID 1234. Inside that folder:

cmdline   # The exact command that launched it
environ   # Environment variables at launch
fd/       # Open file descriptors
maps      # Memory map of the process
status    # Memory usage, parent PID, thread count

What I found interesting: /proc/net/tcp lists TCP sockets in a format that's technically readable but inconvenient: addresses and ports are printed in hex, and on a little-endian machine the IPv4 address bytes appear reversed. netstat parses this file and translates it into something human-readable; ss asks the kernel for the same data over a netlink socket. There is no magic.
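Decoding one of those hex endpoints by hand shows how little is going on. This sketch assumes an IPv4 entry read on a little-endian machine (x86 or typical ARM); `decode_ipv4_endpoint` is a name I picked for the example.

```python
import socket
import struct


def decode_ipv4_endpoint(hex_endpoint: str) -> tuple[str, int]:
    """Turn a /proc/net/tcp endpoint like '0100007F:1F90' into (ip, port)."""
    hex_ip, hex_port = hex_endpoint.split(":")
    # The address is printed as one 32-bit integer in host byte order.
    # Packing it back as little-endian recovers the bytes in network order.
    ip = socket.inet_ntoa(struct.pack("<I", int(hex_ip, 16)))
    port = int(hex_port, 16)  # the port is plain hex, no byte swapping
    return ip, port


print(decode_ipv4_endpoint("0100007F:1F90"))  # ('127.0.0.1', 8080)
```

IPv6 sockets live in /proc/net/tcp6 and use a longer address field, which this sketch does not handle.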


5. /etc/fstab controls what gets mounted at boot, and also defines the mount shorthand

Whenever Linux boots, it reads /etc/fstab to know what filesystems to mount where.

UUID=abc123   /         ext4   defaults   0   1
UUID=def456   /boot     vfat   umask=0077 0   2
tmpfs         /tmp      tmpfs  defaults   0   0

The last two columns matter. The second-to-last is the dump flag (mostly ignored today). The last is the fsck order: 1 means check first, 2 means check after, 0 means skip. / gets 1. Everything else gets 2 or 0.

What I found interesting: on many distributions /tmp is a tmpfs mount, which means it lives in RAM (and swap) rather than on disk. Nothing written to a tmpfs /tmp survives a reboot. This is intentional. It's why you should never rely on /tmp for anything persistent, and why a boot-time cleanup of /tmp is unnecessary when the mount is tmpfs. Files left there on a running machine still consume memory until they are deleted.
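The six columns map cleanly onto a small record type. Here is a sketch, with `FstabEntry` and `parse_fstab_line` as illustrative names; it treats a missing dump or fsck column as 0, which is how the fstab format defines the defaults.

```python
from typing import NamedTuple, Optional


class FstabEntry(NamedTuple):
    device: str        # e.g. UUID=..., a device path, or a pseudo-fs name
    mount_point: str
    fs_type: str
    options: list[str]
    dump: int          # legacy dump flag
    fsck_order: int    # 0 = skip, 1 = root, 2 = everything else


def parse_fstab_line(line: str) -> Optional[FstabEntry]:
    """Parse one fstab line; return None for blanks and comments."""
    fields = line.split("#", 1)[0].split()
    if len(fields) < 4:
        return None
    device, mount_point, fs_type, options = fields[:4]
    dump = int(fields[4]) if len(fields) > 4 else 0
    fsck_order = int(fields[5]) if len(fields) > 5 else 0
    return FstabEntry(device, mount_point, fs_type, options.split(","), dump, fsck_order)


entry = parse_fstab_line("tmpfs         /tmp      tmpfs  defaults   0   0")
print(entry.mount_point, entry.fs_type, entry.fsck_order)
```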


6. /etc/sudoers is more nuanced than "who can use sudo"

Most people know /etc/sudoers controls sudo access. What it actually controls is much more specific.

%sudo   ALL=(ALL:ALL) ALL
harsh   ALL=(ALL) NOPASSWD: /usr/bin/systemctl restart nginx

That second line lets me restart nginx without a password, but nothing else. I can't edit the config, can't run arbitrary commands as root, and can't even run systemctl with different arguments. Just that one exact command.

The file uses a syntax where (ALL:ALL) means "can run as any user and any group." Misread this file and you either lock yourself out or hand someone full root access without realizing it.

What I found interesting: you should never edit this file directly with a text editor. If you introduce a syntax error, sudo can stop working entirely, including the commands you'd need to fix it. Use visudo instead; it validates the syntax before saving.


7. /var/log/auth.log keeps a record of every login attempt

On Debian and Ubuntu, every successful and failed SSH login, every sudo invocation, and every su command gets written here. (Red Hat-family systems use /var/log/secure instead.)

grep "Failed password" /var/log/auth.log | head -20

On a VPS exposed to the internet, you'll find hundreds of failed login attempts from automated bots. They try common usernames (admin, ubuntu, pi) against common passwords, constantly.

What I found interesting: the timestamp distribution tells you something. Most automated attacks come in bursts, with thousands of attempts in a short window. A single failed attempt from an unusual location is more suspicious than a thousand identical attempts from the same IP, because the single one might be targeted.
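Grouping failures by source IP makes the burst pattern obvious. This sketch assumes the usual OpenSSH "Failed password for ... from IP port N" message; the sample lines are made up for illustration and use documentation-only IP ranges, and `count_failed_logins` is a hypothetical helper.

```python
import re
from collections import Counter

# Matches sshd lines such as:
#   Failed password for invalid user admin from 203.0.113.5 port 51234 ssh2
FAILED_RE = re.compile(r"Failed password for (?:invalid user )?(\S+) from (\S+) port")


def count_failed_logins(lines) -> Counter:
    """Count failed password attempts per source IP."""
    counts: Counter = Counter()
    for line in lines:
        match = FAILED_RE.search(line)
        if match:
            counts[match.group(2)] += 1
    return counts


sample_log = [
    "Mar  3 04:12:01 vps sshd[1042]: Failed password for invalid user admin from 203.0.113.5 port 51234 ssh2",
    "Mar  3 04:12:03 vps sshd[1042]: Failed password for root from 203.0.113.5 port 51240 ssh2",
    "Mar  3 04:15:47 vps sshd[1077]: Failed password for invalid user pi from 198.51.100.7 port 40022 ssh2",
    "Mar  3 09:30:12 vps sshd[1203]: Accepted publickey for harsh from 192.0.2.10 port 50000 ssh2",
]
for ip, attempts in count_failed_logins(sample_log).most_common():
    print(ip, attempts)
```

An IP that shows up once here deserves a closer look than one that shows up a thousand times.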


8. /etc/systemd/system/ is where services actually live

Systemd manages services on most modern Linux systems. Unit files are stored in /lib/systemd/system/ (defaults shipped by packages) and /etc/systemd/system/ (overrides and custom services).

A minimal unit file looks like this:

[Unit]
Description=My App
After=network.target

[Service]
ExecStart=/usr/bin/node /opt/app/index.js
Restart=always
User=nodeapp

[Install]
WantedBy=multi-user.target

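After=network.target only controls ordering: this unit starts after network.target has been reached. That does not guarantee the network is actually usable; a service that needs working connectivity should order itself after network-online.target and add Wants=network-online.target. Restart=always means systemd will restart the process whenever it exits, whether it crashed or quit cleanly, unless you stopped it with systemctl.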

What I found interesting: WantedBy=multi-user.target is how you tell systemd to start the service at boot. Running systemctl enable myapp creates a symlink from that target's .wants/ directory to your unit file. If you remove the symlink manually, systemctl disable just has nothing to clean up.
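The symlink mechanics can be imitated in a few lines. This is only a sketch of the filesystem side, run against a temporary directory so it touches nothing real; the actual systemctl enable also reads the [Install] section and reloads systemd. `enable_unit` and the paths are hypothetical.

```python
import tempfile
from pathlib import Path


def enable_unit(unit_file: Path, config_root: Path,
                target: str = "multi-user.target") -> Path:
    """Create <config_root>/<target>.wants/<unit> as a symlink to unit_file."""
    wants_dir = config_root / f"{target}.wants"
    wants_dir.mkdir(parents=True, exist_ok=True)
    link = wants_dir / unit_file.name
    if not link.is_symlink():
        link.symlink_to(unit_file)
    return link


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)  # stands in for /etc/systemd/system in this demo
    unit = root / "myapp.service"
    unit.write_text("[Install]\nWantedBy=multi-user.target\n")
    link = enable_unit(unit, root)
    print(link.relative_to(root), "->", link.resolve().name)
```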


9. /etc/environment sets variables for every login session, system-wide

There are multiple places to set environment variables on Linux: .bashrc, .profile, /etc/profile, /etc/environment. They're not equivalent.

/etc/environment is the system-wide, login-session version. It's not a shell script. It cannot have logic or conditionals. Just KEY=value pairs.

JAVA_HOME="/usr/lib/jvm/java-17-openjdk-amd64"
PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"

What I found interesting: systemd services do not inherit these variables unless you explicitly configure it. A variable in /etc/environment won't show up in a service started by systemd unless you add it to the unit file's [Service] section with Environment= or EnvironmentFile=. This trips people up constantly when a service works on the command line but fails when run as a unit.
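Because the file is plain KEY=value pairs with no shell logic, a parser fits in a dozen lines. This is a simplified sketch, not the exact rules the login stack applies; `parse_environment` is an illustrative name.

```python
def parse_environment(text: str) -> dict[str, str]:
    """Parse /etc/environment-style KEY=value lines into a dict."""
    env: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a KEY=value line, skip it
        value = value.strip()
        # Strip one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        env[key.strip()] = value
    return env


sample = 'JAVA_HOME="/usr/lib/jvm/java-17-openjdk-amd64"\nPATH="/usr/local/bin:/usr/bin:/bin"\n'
env = parse_environment(sample)
print(env["JAVA_HOME"])
```

Note that nothing is expanded: a value like $HOME/bin would stay a literal string, which is exactly the "no logic" behavior described above.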


10. /boot/grub/grub.cfg is read before the kernel even starts and you probably never look at it

GRUB is what loads before Linux even starts. Its config lives here, and it's generated from /etc/default/grub plus scripts in /etc/grub.d/. You're not supposed to edit grub.cfg directly because update-grub overwrites it.

cat /etc/default/grub
GRUB_TIMEOUT=5
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"

quiet suppresses kernel messages during boot. splash shows a graphical boot screen. Remove both and you'll see everything the kernel logs during startup, which is the actual useful diagnostic information.

What I found interesting: kernel parameters you've never configured are silently being applied every boot. GRUB_CMDLINE_LINUX_DEFAULT is where you'd add things like nomodeset (to work around graphics driver issues), mem=4G (to cap usable memory), or systemd.unit=rescue.target (to boot into single-user rescue mode). After editing /etc/default/grub, run update-grub so the change lands in grub.cfg. Most Linux troubleshooting guides eventually lead here.
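You can see which parameters the running kernel actually received by reading /proc/cmdline. The sketch below splits such a line into a dictionary; the sample string is a made-up example rather than output from a real machine, and `parse_kernel_cmdline` is a hypothetical helper.

```python
import shlex


def parse_kernel_cmdline(cmdline: str) -> dict:
    """Split a kernel command line into {name: value} (True for bare flags)."""
    params: dict = {}
    for token in shlex.split(cmdline):
        name, sep, value = token.partition("=")  # split on the first '=' only
        params[name] = value if sep else True
    return params


# Illustrative sample; on a real system use open("/proc/cmdline").read()
sample = "BOOT_IMAGE=/boot/vmlinuz root=UUID=abc123 ro quiet splash"
params = parse_kernel_cmdline(sample)
print(params.get("root"), params.get("quiet"))
```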


What I came away thinking

Linux does not hide what it's doing. The config files, the process details, the network state, all of it is readable text in predictable locations. The assumption is that whoever is running the machine is curious enough to look.

The interesting insight from this exercise is not any single file. It's that most of the tools you use on Linux are just reading these files, or asking the kernel for the same data, and presenting it differently. ip route asks the kernel for its routing table. ps reads /proc. systemctl status reads unit files and the journal. Understanding the underlying files makes the tools optional, not required.

Happy coding! 🚀
