Your container works. It builds, it runs, you can hit localhost:3000 and the app loads. So you tag it, push it to a registry, point your host at it, and call it production. It even works for a while. Then it falls over at 2am, doesn’t come back on its own, the logs are nowhere to be found, and the image you shipped turns out to contain your database password and 900MB of stuff nobody needed.
None of that is a Docker bug. It’s the gap between the container that’s fine on your laptop and the container you actually ship. If you’re still fuzzy on what an image or a container even is, go read Docker without the buzzwords first, then come back. This guide assumes you know docker build and docker run and want to know why the version of those you’ve been using is not the version that should be touching real users.
Let’s walk through what changes between dev and prod, one piece at a time, with the reason for each.
Why your dev image is not your prod image
A dev image is built for you, sitting at your keyboard, wanting fast rebuilds and a working terminal. A production image is built for a server you’ll never log into, wanting to be small, boring, and hard to break into. Those are different goals, and chasing both with one image is how you end up shipping a 900MB container that runs a file-watching dev server as the root user.
Here’s the same Node and Express app, the naive way and the production way, lined up so you can see what actually moves.
| Axis | Naive dev image | Production image |
|---|---|---|
| Base | node:20 (full Debian, ~1GB) | node:20-alpine (~50MB base) |
| Final size | 900MB+ | 150MB or less |
| Contents | all deps, build tools, source | built output + prod deps only |
| User | root | a non-root user you created |
| Command | npm run dev | node dist/server.js |
| Secrets | .env copied in or ENV baked | injected at runtime by the host |
| Logs | to a file inside the container | to stdout and stderr |
Every row in that table is a decision, not a default. Docker’s defaults are tuned for “make it run,” not “make it safe to leave running.” Let’s go row by row.
Smaller images, and what you give up
The official node:20 image is full Debian with a complete build toolchain. Handy when you’re poking around, but at runtime your app does not need a C compiler, git, or the headers for libraries it already finished linking against. All of that is weight you upload, store, and pull on every deploy.
Two easy wins:
- Start from a slim or alpine base. Alpine is a tiny Linux distribution (about 5MB), so node:20-alpine is a fraction of the full image.
- Don’t ship your dev dependencies. Things like your test runner, ESLint, and TypeScript matter while building and mean nothing once the app is built and running.
The tradeoff with Alpine is real, so I’ll say it plainly. Alpine uses musl instead of glibc (two different implementations of the standard C library that programs link against). Most Node packages don’t care, but a few native modules ship prebuilt binaries for glibc only and will either fall back to a slow compile or break outright. If you hit a weird native-module error on Alpine, that’s usually why, and node:20-slim (Debian, still much smaller than the full image, still glibc) is the safe middle ground. Start with Alpine, drop to slim if something native fights you.
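If you're not sure which C library a given container is running on, Node can tell you. Here's a minimal sketch, assuming Node's built-in diagnostic report (which includes a glibcVersionRuntime field only on glibc systems). The helper name detectLibc is made up for illustration.

```javascript
// Sketch: report which C library this Node process is linked against.
// `detectLibc` is a hypothetical helper name, not part of Node or npm.
function detectLibc(report = process.report.getReport()) {
  // The report header carries glibcVersionRuntime only when glibc is in use.
  const glibc = report.header && report.header.glibcVersionRuntime;
  return glibc ? `glibc ${glibc}` : "non-glibc";
}

// Usage inside a container, for example:
//   node -e "console.log(process.report.getReport().header.glibcVersionRuntime)"
// prints a version on Debian-based images and undefined on Alpine.
```

A "non-glibc" answer on Alpine is expected. It only matters when a native module starts failing, and then it tells you where to look.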
Multi-stage builds: build big, ship small
Here’s the tension. To build a Node app you need everything: all dependencies, the source, the compiler. To run it you need almost none of that, just the built output and your production dependencies. A single-stage Dockerfile can’t have it both ways, so it keeps all the build junk in the final image forever.
A multi-stage build fixes this by using more than one FROM in the same Dockerfile. The first stage (call it the builder) installs everything and produces your build. The second stage starts fresh from a clean base and copies only the finished artifacts out of the builder. The builder, with all its bulk, gets thrown away. Only the lean final stage ships.
Here’s a production-minded Dockerfile for a Node and Express app that compiles TypeScript to a dist/ folder.
# ---- Stage 1: builder ----
# All dependencies and build tooling live here. This whole stage gets discarded.
FROM node:20-alpine AS builder
WORKDIR /app
# Copy lockfiles first so this layer caches when only source changes.
COPY package*.json ./
# Install ALL dependencies, including devDependencies, to build the app.
RUN npm ci
# Copy source and build (e.g. tsc compiling src/ into dist/).
COPY . .
RUN npm run build
# Drop devDependencies now that the build is done, leaving only
# what runtime actually needs in node_modules.
RUN npm prune --omit=dev
# ---- Stage 2: runtime ----
# A clean, separate image. None of the builder's clutter comes along.
FROM node:20-alpine AS runtime
WORKDIR /app
ENV NODE_ENV=production
# Create a non-root user and group to run the app (more on this below).
RUN addgroup -S app && adduser -S app -G app
# Copy ONLY the built output and pruned production deps from the builder.
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/package.json ./package.json
# Stop being root before the app ever runs.
USER app
# Tell the orchestrator the app reports its own health (see below).
HEALTHCHECK --interval=30s --timeout=3s --start-period=10s \
CMD node -e "fetch('http://localhost:3000/healthz').then(r=>process.exit(r.ok?0:1)).catch(()=>process.exit(1))"
EXPOSE 3000
# Run the built app. Not the dev server. Never the dev server.
CMD ["node", "dist/server.js"]
The key line is COPY --from=builder. That’s the seam between the two stages. Everything the builder downloaded and compiled stays in the builder and never lands in the image you ship. Your final image is the runtime base plus your dist/ folder plus production node_modules, and nothing else.
A quick note on caching, because it carries straight over from the basics guide: copy package*.json and run npm ci before you copy your source. Your dependencies change far less often than your code, so Docker can reuse that cached install layer on most rebuilds instead of redownloading everything every time you fix a typo.
Stop running as root
By default, the process inside your container runs as root, the superuser that can do anything on the system. That feels harmless because it’s “just a container,” but a container shares the host’s kernel, and root inside the container is a much shorter hop to trouble on the host than an unprivileged user would be. If someone finds a hole in your app, you’d rather they land as a user who can barely do anything than as root.
The fix is two lines, both already in the Dockerfile above. Create a user, then switch to it with USER before your CMD.
# Alpine's syntax: -S makes a system user/group with no password or login.
RUN addgroup -S app && adduser -S app -G app
USER app
From that point on, the app runs as app, not root. One thing that bites beginners: do this switch after you’ve finished installing things, because a non-root user can’t write to system directories or run apk add. Build and install as root, then drop privileges right before the app starts. The order in the Dockerfile is the order it happens.
Health checks: “running” is not “working”
Your process can be alive and your app still broken. The Node process is up, but it’s stuck in a loop, or it lost its database connection and every request returns a 500. As far as a naive supervisor is concerned, the process is running, so all is well. It is not well.
A health check is a small command the orchestrator runs on a schedule to ask the app “are you actually okay?” rather than just checking that the process exists. You expose a route like /healthz that returns 200 when the app can really serve traffic, and the platform pings it. If it fails enough times, the platform knows to restart or stop sending traffic.
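What goes behind /healthz is up to you. Here's one possible shape, assuming you can write a small async "probe" per dependency that throws when the dependency is down. The names checkHealth and probes, and the 2-second timeout, are illustrative choices.

```javascript
// Sketch: the app is healthy only if every probe resolves within timeoutMs.
// `probes` maps a name to an async function that throws when that dependency is down.
async function checkHealth(probes, timeoutMs = 2000) {
  const failed = [];
  for (const [name, probe] of Object.entries(probes)) {
    let timer;
    const timeout = new Promise((_, reject) => {
      timer = setTimeout(() => reject(new Error("timed out")), timeoutMs);
    });
    try {
      await Promise.race([probe(), timeout]);
    } catch {
      failed.push(name);
    } finally {
      clearTimeout(timer);
    }
  }
  const ok = failed.length === 0;
  return { status: ok ? 200 : 503, body: { ok, failed } };
}

// Wiring it into an Express app might look like this (pool is a hypothetical pg Pool):
//   app.get("/healthz", async (_req, res) => {
//     const result = await checkHealth({ db: () => pool.query("SELECT 1") });
//     res.status(result.status).json(result.body);
//   });
```

Keep the probes cheap. The route gets hit on every check interval, so it should answer well inside whatever timeout the platform enforces.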
HEALTHCHECK --interval=30s --timeout=3s --start-period=10s \
CMD node -e "fetch('http://localhost:3000/healthz').then(r=>process.exit(r.ok?0:1)).catch(()=>process.exit(1))"
That checks every 30 seconds, fails a check that takes longer than 3 seconds, and gives the app a 10-second grace period on startup before counting failures. Exit 0 means healthy, anything else means sick. Hosted platforms like
Restart policies: come back after a crash
Things crash. A memory spike, an unhandled exception, the host rebooting for maintenance. On your laptop you just run the container again. In production nobody is watching at 2am, so the platform has to bring it back for you.
In Docker Compose that’s one line:
services:
  app:
    build: .
    restart: unless-stopped
    ports:
      - "3000:3000"
restart: unless-stopped means “if this container dies, start it again, but if I deliberately stop it, stay stopped.” That last part matters, because the cruder restart: always will bring back a container you stopped on purpose the next time the Docker daemon restarts, for example after a host reboot. Managed hosts do this automatically and you don’t write it yourself, which is one of the reasons people reach for them.
Logs go to stdout, not into the container
Reflex says write logs to a file, app.log or similar. Inside a container that’s close to useless. The container is disposable, so when it’s replaced the file goes with it, and your platform’s logging tools won’t see a file buried in a container’s private filesystem anyway.
The container convention is to write logs to stdout and stderr (standard output and standard error, the two text streams every program gets for free). The platform captures those streams and routes them to wherever you read logs. console.log and console.error in Node already go there, so most of the time the move is to stop redirecting logs into a file and let them flow to the terminal. Then docker logs, Railway’s log view, Render’s log view, and friends all just work.
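If you outgrow bare console.log, a tiny structured logger keeps the same stdout and stderr convention. This is a sketch with made-up helper names (formatLogLine, log). One JSON object per line is a common choice, not a requirement.

```javascript
// Sketch: one JSON object per line, errors to stderr and everything else to stdout.
// `formatLogLine` and `log` are hypothetical names for illustration.
function formatLogLine(level, message, fields = {}, now = new Date()) {
  return JSON.stringify({ time: now.toISOString(), level, message, ...fields });
}

function log(level, message, fields) {
  const line = formatLogLine(level, message, fields) + "\n";
  // Separate streams let the platform tell normal output from errors.
  (level === "error" ? process.stderr : process.stdout).write(line);
}

log("info", "server started", { port: 3000 });
log("error", "database unreachable", { retryInMs: 500 });
```

Nothing here opens a file. The platform decides where the lines end up.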
Secrets: an image is shippable, so a baked-in secret is a leaked secret
This is the one that ends up on the internet, so read it twice. An image is built to be copied, pushed to registries, and pulled by machines you don’t control. Anyone who can pull your image can also unpack its layers and read what’s inside. So a secret baked into the image isn’t being hidden. It’s being handed out to everyone who pulls the image.
That rules out a few tempting shortcuts:
- Don’t COPY your .env file into the image.
- Don’t hardcode keys with ENV DATABASE_URL=... in the Dockerfile.
- Don’t commit secrets to the repo your image builds from.
Anything in a Dockerfile layer can be read back with docker history and a little poking, even if a later instruction appears to delete it. The layer is still there underneath.
Instead, inject secrets at runtime through the platform’s secret store. Railway, Render, and Cloudflare all have a place to set environment variables and secrets that get handed to the container when it starts and are never part of the image. Locally with Compose you point at an env file that lives only on your machine and stays out of git.
services:
  app:
    build: .
    env_file: .env # local only; add .env to .gitignore
    environment:
      - NODE_ENV=production
  db:
    image: postgres:16-alpine
    environment:
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD} # read from your shell or .env
The mental model: build-time is public, runtime is private. Configuration that’s safe to ship can live in the image. Anything that grants access (database URLs, API keys, tokens) gets handed in when the container starts, by something that isn’t the image.
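On the app side, "handed in at runtime" means reading from the environment at startup and refusing to boot if something is missing. Here's a minimal sketch. DATABASE_URL comes from the example above, while SESSION_SECRET, PORT, and the loadConfig name are assumptions for illustration.

```javascript
// Sketch: read secrets from the environment at startup and fail fast if any are missing.
// DATABASE_URL appears in this guide; SESSION_SECRET is a made-up example name.
const REQUIRED = ["DATABASE_URL", "SESSION_SECRET"];

function loadConfig(env = process.env) {
  const missing = REQUIRED.filter((name) => !env[name]);
  if (missing.length > 0) {
    // Name the missing variables, never print their values.
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }
  return {
    databaseUrl: env.DATABASE_URL,
    sessionSecret: env.SESSION_SECRET,
    port: Number(env.PORT ?? 3000),
  };
}
```

Failing at startup beats failing on the first request that needs the database, because the platform sees the crash immediately instead of a container that looks alive.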
Scan the image before you trust it
Your image is a small Linux system plus your dependency tree, and both pick up known vulnerabilities over time. A package that was clean when you wrote the Dockerfile can have a reported flaw six months later. You don’t have to audit this by hand.
Two tools point at an image and list known problems by severity:
# Docker's built-in scanner.
docker scout cves my-app:latest
# Trivy, a popular open-source scanner.
trivy image my-app:latest
Read the report top down. Critical and high findings in something you actually use at runtime are worth acting on, usually by bumping the offending package or rebasing onto a newer base image. A pile of low-severity notes in a transitive dependency you never call is good to know about but rarely an emergency. Scanning won’t make the image safe, it just tells you what you’re shipping so “I had no idea” stops being the reason something got popped.
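If you'd rather have a script do the top-down read for you, Trivy can write its findings as JSON (trivy image --format json --output report.json my-app:latest). The sketch below tallies findings by severity. It assumes the report keeps its findings under Results[].Vulnerabilities[].Severity, and the helper names and the choice of which severities block a deploy are illustrative.

```javascript
// Sketch: count findings by severity in a Trivy JSON report.
// Assumed report shape: { Results: [ { Vulnerabilities: [ { Severity: "HIGH", ... } ] } ] }
function countBySeverity(report) {
  const counts = {};
  for (const result of report.Results ?? []) {
    // Targets with no findings may omit the Vulnerabilities array entirely.
    for (const vuln of result.Vulnerabilities ?? []) {
      counts[vuln.Severity] = (counts[vuln.Severity] ?? 0) + 1;
    }
  }
  return counts;
}

// Decide whether a report should stop a deploy; the default threshold is a judgment call.
function shouldBlockDeploy(report, blocking = ["CRITICAL", "HIGH"]) {
  const counts = countBySeverity(report);
  return blocking.some((severity) => (counts[severity] ?? 0) > 0);
}

// Usage, assuming report.json was written by the trivy command above:
//   const report = JSON.parse(require("node:fs").readFileSync("report.json", "utf8"));
//   if (shouldBlockDeploy(report)) process.exit(1);
```

This only counts. It can't tell whether a flagged package is one you actually call at runtime, so the judgment described above still applies.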
Why npm run dev is not a production command
This deserves its own section because it’s the single most common mistake, and it’s an easy one to make: the command that has worked for you since day one is right there, so people ship it.
npm run dev starts the dev server, and the dev server is built to make your life pleasant while you write code, not to serve strangers at scale. Concretely:
- It’s unoptimized. No production build, no minification, no tree-shaking. It serves source, not the lean compiled output.
- It watches your files for changes and rebuilds on save. In production nothing is changing, so that’s pure wasted CPU and memory sitting there waiting.
- It’s often single-process and tuned for one developer’s traffic, not many concurrent users.
- It can leak debug behavior: verbose errors, stack traces, source maps, and dev-only routes that you do not want pointed at the public internet.
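The last bullet is the one you can also guard against in your own code. Here's a sketch of an error body that only carries details outside production, keyed off NODE_ENV (which the Dockerfile above sets to production). The toErrorBody name is hypothetical. Express's default error handler makes a similar decision about stack traces based on NODE_ENV.

```javascript
// Sketch: build the JSON body for a failed request, exposing details only outside production.
// `toErrorBody` is a hypothetical helper name.
function toErrorBody(err, nodeEnv = process.env.NODE_ENV) {
  if (nodeEnv === "production") {
    // Strangers get a generic message; the real error belongs in your logs.
    return { error: "Internal Server Error" };
  }
  return { error: err.message, stack: err.stack };
}

// In an Express error-handling middleware this might be used as:
//   app.use((err, _req, res, _next) => res.status(500).json(toErrorBody(err)));
```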
Production runs the built app. For a plain Node and Express service that’s node dist/server.js. For something with a framework, it’s the framework’s production start, like npm start pointed at the build output, or node ./dist/server/entry.mjs for an Astro app with a server adapter. The shape is always the same: build once, then run the result. Don’t make production rebuild your app on the fly every time it starts.
Common mistakes
The same handful of slip-ups account for most “it worked in dev” production fires.
| Mistake | What goes wrong | The fix |
|---|---|---|
| Shipping the dev image | Bloated, runs the dev server, leaks debug behavior | Multi-stage build, run the built app |
| Running as root | A compromise lands with admin privileges | Create a user, USER app before CMD |
| Secrets baked into the image | The key ships inside the image for anyone to read | Inject at runtime via the platform’s secret store |
| npm run dev in prod | Unoptimized, file-watching, single-process | node dist/server.js or the framework’s start |
| No restart policy | A 2am crash stays down until you notice | restart: unless-stopped or a managed host |
| Giant images | Slow pulls, slow deploys, more to scan | Alpine or slim base, drop dev dependencies |
Tying it back to shipping something real
Picture the project you’d actually deploy: an Express API talking to PostgreSQL and Redis, maybe an Astro front end. The dev setup from Docker without the buzzwords, one Compose file that brings up the app, the database, and the cache, is still exactly how you should work locally. Nothing here replaces that. Local dev wants convenience, and that guide gives it to you.
Production is the same app with the goals flipped. Build it in two stages so the image is small and clean. Run it as a non-root user. Give it a health check so the platform knows the difference between running and working, and a restart policy so a crash comes back. Let logs flow to stdout. Hand secrets in at runtime and keep them out of the image. Scan before you trust. Run the built app, never the dev server.
Push that image to a registry (GitHub Container Registry works), point Railway or Render at it (or build straight from your repo and let them handle the restart policy and secrets for you), and you have a container that’s safe to leave running while you sleep. That’s the whole difference between a container that runs and a container you can ship.