openai-serve is a small HTTP gateway that implements a subset of the OpenAI API and forwards requests to a local lcpd-grpcd (Requester) over gRPC. The Requester handles Lightning connectivity, quote/payment, and result retrieval. This lets existing OpenAI-compatible clients (SDKs, LangChain, curl, etc.) send requests over LCP, typically by changing only the base_url.
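For example, with the official OpenAI Python SDK only the client construction changes; the address, API key, and model below are the example values from the Quickstart further down (a minimal sketch, not a full client):
from openai import OpenAI

# Point the standard OpenAI client at openai-serve instead of api.openai.com.
# 127.0.0.1:8080, devkey1, and gpt-5.2 are the example values used in the Quickstart.
client = OpenAI(base_url="http://127.0.0.1:8080/v1", api_key="devkey1")

resp = client.chat.completions.create(
    model="gpt-5.2",
    messages=[{"role": "user", "content": "Say hello."}],
)
print(resp.choices[0].message.content)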

Architecture

OpenAI SDK / curl
   |
   |  HTTP (OpenAI-compatible)
   v
openai-serve
   |
   |  gRPC (LCPDService)
   v
lcpd-grpcd (Requester)  --- Lightning --- Provider peer (lcpd-grpcd Provider)
Notes:
  • openai-serve is intentionally stateless and does not connect to Lightning directly.
  • The Requester (lcpd-grpcd) is the component that can spend sats.

Logging and privacy

Logs are treated as sensitive. openai-serve is designed to be diagnosable without persisting raw user content.
  • Logs MUST NOT contain raw prompts (messages[].content) or raw model outputs.
  • Logs include only operational metadata (model/peer/job ids, price, timings, and byte/token counts).
  • OPENAI_SERVE_LOG_LEVEL=debug enables more verbose request logging; keep the default info level in production unless more detail is needed.

Supported endpoints

  • POST /v1/chat/completions (JSON or stream:true SSE passthrough)
  • POST /v1/responses (JSON or stream:true SSE passthrough)
  • GET /v1/models
  • GET /healthz
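A quick way to exercise the two GET endpoints from Python (a sketch assuming the Quickstart address and API key; /v1/models is assumed to return the standard OpenAI list shape):
import requests

BASE = "http://127.0.0.1:8080"                 # example address from the Quickstart
HEADERS = {"Authorization": "Bearer devkey1"}  # only needed if OPENAI_SERVE_API_KEYS is set

# Liveness probe: a 200 status means the gateway is up.
print(requests.get(f"{BASE}/healthz", timeout=5).status_code)

# List models; assuming the standard OpenAI list shape {"object": "list", "data": [...]}.
models = requests.get(f"{BASE}/v1/models", headers=HEADERS, timeout=5).json()
for m in models.get("data", []):
    print(m["id"])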

Quickstart

Prerequisite: run lcpd-grpcd separately (Requester mode) and ensure it can reach your Lightning node and LCP peers.
Build:
cd apps/openai-serve
go install ./cmd/openai-serve
Run:
export OPENAI_SERVE_HTTP_ADDR="127.0.0.1:8080"
export OPENAI_SERVE_LCPD_GRPC_ADDR="127.0.0.1:50051"

# Optional auth: comma-separated list of accepted API keys
export OPENAI_SERVE_API_KEYS="devkey1"

openai-serve
Test:
curl -sS http://127.0.0.1:8080/v1/chat/completions \
  -H 'content-type: application/json' \
  -H 'authorization: Bearer devkey1' \
  -d '{"model":"gpt-5.2","messages":[{"role":"user","content":"Say hello."}]}'
Streaming (Chat Completions):
curl -N http://127.0.0.1:8080/v1/chat/completions \
  -H 'content-type: application/json' \
  -H 'authorization: Bearer devkey1' \
  -d '{"model":"gpt-5.2","stream":true,"messages":[{"role":"user","content":"Say hello."}]}'
Streaming (Responses):
curl -N http://127.0.0.1:8080/v1/responses \
  -H 'content-type: application/json' \
  -H 'authorization: Bearer devkey1' \
  -d '{"model":"gpt-5.2","stream":true,"input":"Say hello."}'

Using from CLI tools

openai-serve speaks the OpenAI Chat Completions and Responses APIs (including stream:true). Many OpenAI-compatible CLI tools can be pointed at it by configuring a custom base_url and API key.

Codex CLI

Configure Codex CLI to use openai-serve as a custom model provider (Chat Completions wire format):
  1. Add this to ~/.codex/config.toml:
[model_providers.openai_serve]
name = "openai-serve"
base_url = "http://127.0.0.1:8080/v1"
env_key = "OPENAI_SERVE_API_KEY"
wire_api = "chat"

[profiles.lcp]
model = "gpt-5.2"
model_provider = "openai_serve"
  2. Run (the API key must match one of the keys configured in OPENAI_SERVE_API_KEYS):
export OPENAI_SERVE_API_KEY="devkey1"
codex --profile lcp "Say hello in Japanese."
SSE streaming uses stream:true and is passed through byte-for-byte; field support depends on the selected Provider.

LLM (llm)

llm can use an OpenAI-compatible base URL by defining an extra model. Create an extra-openai-models.yaml file in the llm user directory and add:
- model_id: lcp-gpt-5.2
  model_name: gpt-5.2
  api_base: http://127.0.0.1:8080/v1
  can_stream: true
  api_key_name: openai-serve
Then:
llm keys set openai-serve
llm -m lcp-gpt-5.2 "Say hello in Japanese."

Request/response behavior

  • openai-serve is a passthrough gateway for Chat Completions and Responses request/response bytes.
  • Minimal validation is applied before routing:
    • Body must be a valid JSON object.
    • model must be present and non-empty (no leading/trailing whitespace).
    • messages (chat completions) or input (responses) must be present and non-empty.
    • HTTP Content-Encoding must be omitted or identity (compressed request bodies are rejected).
  • stream:true delivers text/event-stream bytes as they arrive. Without stream (or false), the full JSON response body is returned.
  • Request body size is limited to 1 MiB.
  • Provider result bytes are returned as-is; HTTP Content-Type/Content-Encoding are taken from the LCP result metadata.
For configuration, routing rules, and safety knobs (max price, timeouts, model allowlists), see Configuration.
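The validation rules above can also be mirrored on the client before a request is sent (and before any sats are spent); the helper below is a hypothetical client-side illustration, not part of openai-serve:
import json

MAX_BODY_BYTES = 1024 * 1024  # openai-serve rejects request bodies larger than 1 MiB

def precheck_chat_body(body: dict) -> bytes:
    # Hypothetical client-side mirror of openai-serve's minimal validation.
    model = body.get("model")
    if not isinstance(model, str) or not model or model != model.strip():
        raise ValueError("model must be non-empty with no surrounding whitespace")
    if not body.get("messages"):
        raise ValueError("messages must be present and non-empty (input for /v1/responses)")
    raw = json.dumps(body).encode("utf-8")
    if len(raw) > MAX_BODY_BYTES:
        raise ValueError("request body exceeds the 1 MiB limit")
    # Send raw uncompressed: Content-Encoding other than identity is rejected.
    return raw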

LCP metadata headers

Responses from POST /v1/chat/completions and POST /v1/responses include the following headers:
  • X-Lcp-Peer-Id: chosen Provider peer id
  • X-Lcp-Job-Id: job id (hex)
  • X-Lcp-Price-Msat: accepted quote price
  • X-Lcp-Terms-Hash: accepted quote terms hash (hex)
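Any HTTP client can read these headers alongside the OpenAI-style JSON body; a sketch using requests and the Quickstart address:
import requests

resp = requests.post(
    "http://127.0.0.1:8080/v1/chat/completions",
    headers={"Authorization": "Bearer devkey1"},
    json={"model": "gpt-5.2", "messages": [{"role": "user", "content": "Say hello."}]},
    timeout=120,
)
resp.raise_for_status()

# LCP job metadata is returned as response headers next to the usual response body.
print("peer: ", resp.headers.get("X-Lcp-Peer-Id"))
print("job:  ", resp.headers.get("X-Lcp-Job-Id"))
print("price:", resp.headers.get("X-Lcp-Price-Msat"), "msat")
print("terms:", resp.headers.get("X-Lcp-Terms-Hash"))
print(resp.json()["choices"][0]["message"]["content"])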