When Your Config Is Correct But Your App Disagrees

A ten-hour debugging hunt through stale builds and silent environment binding — and the experiments that finally cornered the bug.

It had been years since a bug truly stumped me: the kind where you close the laptop, reopen it an hour later, and the problem is still sitting there grinning at you.

This one ran ten hours. It was also my first Java/Kotlin project, which is part of the story. And I have to admit: I enjoyed the puzzle. That feeling of circling something that should be impossible, then slowly cornering it, is a big part of why I got into this in the first place.

The bug: image uploads to Cloudflare R2 kept failing with a 500. Every config file was correct. Every environment variable was set. The app still pointed at the wrong server. Here's what the hunt taught me — lessons that apply to any backend, with a Spring Boot flavor.

The setup

A Spring Boot service uploads property photos to Cloudflare R2 (S3-compatible storage). Uploads returned HTTP 500. The stack trace pointed at an image-resizing library throwing UnsupportedFormatException. Obvious culprit, right? Remove the resizing, redeploy, done.

It wasn't done. That was lesson zero: the loudest error is not always the real one.

Lesson 1: Verify what's running, not what's written

The resizing code had already been deleted in source. Yet the crash kept firing with fresh timestamps. The reason was embarrassingly simple: the running .jar was an old build. Every fix made over the previous hours lived in source files that were never recompiled into the deployed artifact.

A config file on disk, a corrected line in application.yml, a fix in a .kt file — none of it matters until it's built and the process is restarted from the new build. When behavior contradicts your source, your first question should be: is the process even running this code?

How to check, concretely:

  • Confirm the build artifact's timestamp is newer than your last change.
  • Read the config inside the running artifact, not the source tree. For a Spring Boot jar: unzip -p app.jar BOOT-INF/classes/application.yml.
  • Grep the live logs for the symptom with a recent time filter.
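Those three checks are easy to script. The sketch below is one way to do it, not the project's actual tooling: `stale_sources` is a hypothetical helper, the jar path, source directory, and unit name in the usage comments are placeholders, and the `journalctl` line assumes the service logs to the systemd journal.

```shell
#!/bin/sh
# Hypothetical helper: list files under SRC_DIR that were modified after the
# artifact was built. Any output means the artifact predates your latest edits.
stale_sources() {
  jar=$1
  src_dir=$2
  find "$src_dir" -type f -newer "$jar"
}

# Usage (all paths and the unit name are placeholders):
#   1. Is the jar older than the source?
#        stale_sources build/libs/app.jar src/
#   2. What config is baked into the jar?
#        unzip -p build/libs/app.jar BOOT-INF/classes/application.yml
#   3. Is the symptom still appearing right now?
#        journalctl -u my-app.service --since "10 minutes ago" | grep UnsupportedFormatException
```

Point the first check at the jar the service actually launches, which is not necessarily the one in your build directory.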

Lesson 2: Trust the process environment, not the file

Once the jar was fresh, a second bug surfaced. The app logged its storage endpoint as localhost:9000 instead of the R2 URL — even though the environment variable was clearly set in the .env file the service loaded.

So I checked what the process actually had, not what the file said:

Bash
cat /proc/<pid>/environ | tr '\0' '\n' | grep -i storage

The variable was right there in the process environment. And yet Spring still resolved the localhost default. That's the kind of result that feels impossible — and "impossible" is a signal you're testing the wrong layer.
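A slightly stricter version of that check matches the variable name exactly instead of grepping for a substring, which avoids false positives from similarly named variables. This is a minimal sketch: `environ_get` is a hypothetical helper, and `STORAGE_ENDPOINT` and the PID in the usage comment are made-up stand-ins for whatever your service uses.

```shell
#!/bin/sh
# Hypothetical helper: print the value of NAME from a NUL-separated environ dump.
# ENVIRON_FILE is normally /proc/<pid>/environ, which exists on Linux only.
# Prints nothing if the variable is absent.
environ_get() {
  environ_file=$1
  name=$2
  # Turn NUL separators into newlines, then keep only the exact NAME= entry.
  tr '\0' '\n' < "$environ_file" | sed -n "s/^${name}=//p"
}

# Usage (PID and variable name are placeholders):
#   environ_get /proc/12345/environ STORAGE_ENDPOINT
```

Keep in mind that /proc/<pid>/environ shows the environment the process was started with, so it answers "what did the service manager hand to the JVM", which is the question that matters here.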

Lesson 3: Force the value at a higher precedence to localize the fault

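When config resolution misbehaves and you can't see why, stop reading config and run an experiment that the suspect layers cannot override. In Spring Boot, JVM system properties (-D) rank near the top of the property-source order: only command-line arguments and a few specialized sources (such as SPRING_APPLICATION_JSON) outrank them, and they sit above OS environment variables and the values in application.yml: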

Bash
java -Dapp.storage.endpoint=https://...r2.cloudflarestorage.com -jar app.jar

It logged the R2 endpoint immediately. That one test split the problem cleanly in two: the jar and the application code were fine; the fault was entirely in how the service-managed process bound an environment variable to a property. I'd spent hours suspecting the code. The experiment exonerated it in fifteen seconds.
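The logic of that experiment can be modeled in a few lines. This is a toy sketch of only the three layers named above, not Spring's actual resolution code; `resolve_endpoint` and the example URL are invented for illustration.

```shell
#!/bin/sh
# Toy model: return the first non-empty value, highest precedence first.
resolve_endpoint() {
  sysprop=$1   # value passed as -Dapp.storage.endpoint=...
  envvar=$2    # value bound from the process environment, if any
  default=$3   # fallback from application.yml
  if [ -n "$sysprop" ]; then
    printf '%s\n' "$sysprop"
  elif [ -n "$envvar" ]; then
    printf '%s\n' "$envvar"
  else
    printf '%s\n' "$default"
  fi
}

# The failing service: nothing forced, env value never bound, so the default wins.
resolve_endpoint "" "" "localhost:9000"

# The experiment: a forced system property wins regardless of the layers below it.
resolve_endpoint "https://r2.example.invalid" "" "localhost:9000"
```

If the forced value shows up in the logs, everything from the property lookup onward is working, and the fault has to be in a layer below the one you forced.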

Lesson 4: A working fix and a perfect fix are different things

The fix was to inject the storage values as -D system properties directly in the systemd unit, bypassing whatever was preventing the environment file from binding. It works. It's also not elegant — secrets in ExecStart are visible in ps, and the underlying env-binding mystery remains. But shipping a working fix and filing the cleanup separately beats staying blocked while you chase elegance.
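For illustration, here is roughly what that workaround looks like. This sketch uses a systemd drop-in file rather than editing the unit itself, which is a variation on what I did, not a record of it. `write_storage_dropin` is a hypothetical helper, and the java path, jar path, and unit name are assumptions.

```shell
#!/bin/sh
# Hypothetical helper: write a systemd drop-in that passes the storage endpoint
# as a -D system property. On a real host DROPIN_DIR would be something like
# /etc/systemd/system/my-app.service.d (unit name is a placeholder).
write_storage_dropin() {
  dropin_dir=$1
  endpoint=$2
  jar_path=$3
  mkdir -p "$dropin_dir"
  cat > "$dropin_dir/override.conf" <<EOF
[Service]
# An empty ExecStart= clears the command inherited from the main unit file.
ExecStart=
ExecStart=/usr/bin/java -Dapp.storage.endpoint=$endpoint -jar $jar_path
EOF
}

# Usage (placeholders throughout), followed by a reload and restart:
#   write_storage_dropin /etc/systemd/system/my-app.service.d "$R2_ENDPOINT" /opt/app/app.jar
#   systemctl daemon-reload && systemctl restart my-app.service
```

Anything placed on that ExecStart line is readable in the process list, so this pattern is tolerable for an endpoint URL and a poor home for access keys.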

The meta-lesson

The code was never broken. Ten hours evaporated because each correct fix was applied to something that wasn't actually running — first a stale binary, then an env mechanism that silently wasn't binding.

Worth being honest about one thing: I worked this through with an AI assistant, and it didn't spot the cause instantly either. It floated a few wrong theories (a nested-placeholder quirk, a phantom properties file, a profile override) before we landed on the truth. What actually moved us forward wasn't a flash of insight from either of us. It was running experiments that could rule things out: checking the live process environment, reading the config baked into the running jar, forcing the value with a -D flag. That is a lesson in its own right: don't expect a tool, or yourself, to one-shot a layered bug. Expect to test your way down to it.

When config looks correct but behavior says otherwise, don't re-read the source a seventh time. Interrogate the running process directly: what artifact is it executing, what environment does it actually have, and what does it do when you force the value past every layer that could be lying to you?
