PM2 Overview

PM2 is used as the process supervisor for long-running Node.js services in this environment.

It is responsible for:

keeping services alive
restarting on crash
managing multiple frontend and backend runners
exposing basic runtime metrics

PM2 is not a deployment tool. Builds and configuration happen outside PM2.

When PM2 Is Used

PM2 is used for:

Next.js production servers
background Node services
long-running automation processes

PM2 is not used for:

Dockerized services
short-lived scripts
cron-style jobs

Core Concepts

Process: A single running command
App name: Logical identifier in PM2
Fork mode: One Node process per app (default here)
Restart count: Number of times PM2 has restarted the process, whether after a crash or a manual restart

Basic Commands

List processes


pm2 list

Shows all registered processes, status, memory, and restart count.

View logs


pm2 logs
pm2 logs <name|id>

Inspect a process


pm2 describe <name|id>

Restart a process


pm2 restart <name|id>

Stop a process


pm2 stop <name|id>

Remove a process (unregister)


pm2 delete <name|id>

Starting New PM2 Runners

Start a Next.js app (explicit port)


pm2 start pnpm --name chat-frontend -- start --port 3001

Pattern:

pnpm is the binary PM2 tracks
--name sets a stable identifier
everything after -- is passed to pnpm

Start a generic Node service


pm2 start index.js --name seek-applier

Start from a working directory


pm2 start pnpm \
  --name mars-frontend \
  --cwd /srv/mars/frontend \
  -- start --port 3000

Use --cwd when running outside the current directory.
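
To keep names stable and avoid registering a second runner under an existing name, a start can be guarded by a lookup first. The helper below is a minimal sketch, not part of PM2: `start_once` is a hypothetical name, and it assumes `pm2 describe` exits non-zero when no process with that name is registered.

```shell
# Hypothetical helper: start a pnpm-based runner only if its name is not registered yet.
# Assumption: `pm2 describe <name>` exits non-zero for an unknown name.
start_once() {
  name="$1"
  shift
  if pm2 describe "$name" >/dev/null 2>&1; then
    echo "$name is already registered; use: pm2 restart $name" >&2
    return 0
  fi
  # Everything left in "$@" is passed through to pnpm after the -- separator.
  pm2 start pnpm --name "$name" -- "$@"
}
```

Usage: `start_once chat-frontend start --port 3001`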

Port Allocation (Current)

This is the expected port layout. NGINX routes external traffic to these.

Service	PM2 Name	Port
Main frontend	mars-frontend	3000
Chat frontend	chat-frontend	3001
Chat embed / overlays	chat-frontend	3001
Admin / misc frontend	pnpm start --port…	2001
Automation service	seek-applier	n/a (no HTTP)

Ports are not auto-assigned. Conflicts must be resolved manually.
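
Because ports are assigned by hand, it can help to confirm that a port is free before starting a runner on it. The following is a minimal sketch, assuming a Linux host where `ss` (from iproute2) is available; `port_in_use` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed (exit 0) if something is listening on the given TCP port.
# Assumption: `ss -ltnH` prints one listening socket per line with the
# local address:port in the fourth column.
port_in_use() {
  ss -ltnH | awk -v port="$1" '
    { n = split($4, parts, ":"); if (parts[n] == port) found = 1 }
    END { exit !found }
  '
}
```

Usage: `port_in_use 3001 && echo "port 3001 is already bound" >&2`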

Current PM2 State (Example)


┌────┬────────────────────┬──────────┬──────┬───────────┬──────────┬──────────┐
│ id │ name               │ mode     │ ↺    │ status    │ cpu      │ memory   │
├────┼────────────────────┼──────────┼──────┼───────────┼──────────┼──────────┤
│ 3  │ chat-frontend      │ fork     │ 16   │ online    │ 0%       │ 61.9mb   │
│ 6  │ mars-frontend      │ fork     │ 217  │ online    │ 0%       │ 95.7mb   │
│ 4  │ pnpm start --port… │ fork     │ 28   │ online    │ 0%       │ 83.6mb   │
│ 0  │ seek-applier       │ fork     │ 5    │ online    │ 0%       │ 140.8mb  │
│ 2  │ chat-frontend      │ fork     │ 0    │ stopped   │ 0%       │ 0b       │
│ 5  │ pnpm start --port… │ fork     │ 964  │ stopped   │ 0%       │ 0b       │
│ 1  │ pnpm start -p 2001 │ fork     │ 15   │ errored   │ 0%       │ 0b       │
└────┴────────────────────┴──────────┴──────┴───────────┴──────────┴──────────┘

Notes:

Duplicate names indicate multiple historical runners
High restart counts indicate instability
An errored status requires log inspection before restart

Failure Handling Rules

Restart loops must be investigated, not ignored
Errored processes should be stopped before debugging
Duplicate runners should be deleted once obsolete

Example cleanup:


pm2 delete 2
pm2 delete 5
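
Before restarting an errored process or deleting a runner, its recent error output can be reviewed without following the log. A minimal sketch: `inspect_errored` is a hypothetical helper name, and the 50-line window is an arbitrary choice.

```shell
# Hypothetical helper: show process details and recent error output for one
# process, without restarting it.
inspect_errored() {
  target="$1"
  pm2 describe "$target"
  # --nostream prints the buffered lines and exits instead of following the log.
  pm2 logs "$target" --err --lines 50 --nostream
}
```

Usage: `inspect_errored 1`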

Persistence Across Reboot

PM2 startup persistence must be enabled explicitly:


pm2 startup
pm2 save

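pm2 startup installs the boot-time init script (it prints a command that must be run with root privileges), and pm2 save snapshots the current process list. Without a saved list, processes will not come back after a reboot; run pm2 save again whenever the list changes.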
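
As a quick sanity check, the sketch below tests whether a non-empty saved process list exists for the current user. It assumes PM2's default layout, where the dump file is `dump.pm2` under `$PM2_HOME` (or `~/.pm2` when `PM2_HOME` is unset); `pm2_dump_exists` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if a non-empty PM2 dump file exists.
# Assumption: the dump lives at $PM2_HOME/dump.pm2, defaulting to ~/.pm2/dump.pm2.
pm2_dump_exists() {
  [ -s "${PM2_HOME:-$HOME/.pm2}/dump.pm2" ]
}
```

Usage: `pm2_dump_exists || echo "no saved process list; run pm2 save" >&2`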

Operational Notes

PM2 does not manage environment variables centrally
Port binding errors will surface as errored
Logs are not rotated by default; rotation must be configured (for example with the pm2-logrotate module)
PM2 memory numbers are per-process, not total usage
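
Log rotation settings can be made explicit with the `pm2-logrotate` module. A minimal sketch: the 10M size limit and 7 retained files are illustrative values, not this environment's settings, and `setup_logrotate` is a hypothetical wrapper name.

```shell
# Hypothetical wrapper: install pm2-logrotate and set explicit limits.
setup_logrotate() {
  pm2 install pm2-logrotate || return 1
  # Rotate a log file once it reaches 10 MB (illustrative value).
  pm2 set pm2-logrotate:max_size 10M || return 1
  # Keep at most 7 rotated files per log (illustrative value).
  pm2 set pm2-logrotate:retain 7
}
```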

Non-Goals

This overview does not cover:

PM2 ecosystem files
Docker integration
CI/CD automation
Load balancing or clustering

Ecosystem files, CI/CD automation, and clustering each have their own section below; Docker integration is documented elsewhere.

PM2 Ecosystem Files

PM2 ecosystem files define declarative process configuration and are used to standardize how services are run.

They are preferred over ad-hoc pm2 start commands once a service stabilizes.

Purpose

Ecosystem files are used to:

codify service names and commands
define ports and environments
ensure consistent restarts
support CI/CD-driven restarts

Example: `ecosystem.config.js`


module.exports = {
  apps: [
    {
      name: "chat-frontend",
      script: "pnpm",
      args: "start --port 3001",
      cwd: "/srv/mars/chat-frontend",
      exec_mode: "fork",
      env: {
        NODE_ENV: "production",
      },
    },
    {
      name: "mars-frontend",
      script: "pnpm",
      args: "start --port 3000",
      cwd: "/srv/mars/frontend",
      exec_mode: "fork",
    },
  ],
};

Running from ecosystem


pm2 start ecosystem.config.js
pm2 reload ecosystem.config.js

Rules

One ecosystem file per logical host
Explicit ports only
No secrets committed to ecosystem files
Environment variables come from the host or CI
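
One way to follow the last two rules is to keep variables in a file on the host, outside the repository, and load them into the shell that calls PM2. A minimal sketch under stated assumptions: no env file location is defined in this document, so the path is passed in by the caller; `reload_with_host_env` is a hypothetical name; and the file is assumed to contain plain `KEY=value` lines.

```shell
# Hypothetical helper: load host-level variables, then reload via the ecosystem file.
# The body runs in a subshell so the variables do not leak into the calling shell.
reload_with_host_env() (
  env_file="$1"
  if [ ! -r "$env_file" ]; then
    echo "missing env file: $env_file" >&2
    exit 1
  fi
  # set -a exports every variable assigned while the file is sourced.
  set -a
  . "$env_file"
  set +a
  # --update-env makes PM2 pick up the refreshed environment on reload.
  pm2 reload ecosystem.config.js --update-env
)
```

Pass an absolute path to the env file, since the `.` builtin may otherwise search `PATH` for it.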

CI/CD Automation (PM2)

PM2 is restart infrastructure, not a deployment system.

CI/CD is responsible for:

building artifacts
syncing files to the server
triggering PM2 reloads

Typical Flow


CI Pipeline
  → Build
  → Deploy files
  → pm2 reload

PM2 does not:

pull code
build assets
manage versions

Example: CI deploy step


ssh mars-server <<'EOF'
  # Abort on the first failing command so a failed build never reaches the reload
  set -e
  cd /srv/mars
  pnpm install --prod
  pnpm build
  pm2 reload ecosystem.config.js
EOF

Reload vs Restart

Command	Behavior
`pm2 restart`	Hard stop + start
`pm2 reload`	Zero-downtime (when supported)

For Next.js in fork mode, reload behaves like restart: the single process is stopped and started again, so expect a brief interruption.

CI Rules

Always reload via ecosystem file
Never pm2 delete from CI
CI must be idempotent
Failed builds must not reload PM2
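
The last rule can be enforced by gating the reload on the build's exit status. A minimal sketch, assuming it runs from the directory that holds `ecosystem.config.js`; `build_then_reload` is a hypothetical name. It uses `pm2 startOrReload`, which starts apps that are not yet registered and reloads those that are, so repeated runs converge on the same state.

```shell
# Hypothetical helper: reload PM2 only when the build succeeds.
build_then_reload() {
  if ! pnpm build; then
    echo "build failed; PM2 was not touched" >&2
    return 1
  fi
  # Starts missing apps and reloads running ones, which keeps the step idempotent.
  pm2 startOrReload ecosystem.config.js
}
```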

Load Balancing & Clustering

PM2 supports Node-level clustering, but this environment uses external load balancing instead.

PM2 Cluster Mode (Not Used)


exec_mode: "cluster",
instances: 4

This:

spawns multiple Node workers
load balances incoming connections across workers through the Node cluster module
shares the same port

Why Cluster Mode Is Avoided Here

Next.js already handles concurrency well
WebSockets complicate clustering
NGINX provides better observability
External load balancing scales better

Actual Model Used


Client
  → NGINX
    → Single PM2 process per service

Scaling is achieved by:

running multiple hosts
load balancing at NGINX / CDN level

WebSocket Considerations

For chat services:

stickiness is required
Redis is used for fanout
PM2 clustering adds complexity with no benefit

Rule

PM2 is for process supervision, not horizontal scaling.

Summary

Concern	Handled By
Process lifecycle	PM2
Configuration	Ecosystem files
Deployment	CI/CD
Load balancing	NGINX / CDN
Scaling	Multiple hosts

Non-Goals (Explicit)

PM2 is not responsible for:

auto-scaling
traffic routing
secret management
deployments
canary releases

Those concerns live elsewhere.