openai-serve is a small HTTP gateway that implements a subset of the OpenAI API and forwards requests to a local lcpd-grpcd (Requester) over gRPC. The Requester handles Lightning connectivity, quote/payment, and result retrieval. This lets existing OpenAI-compatible clients (SDKs, LangChain, curl, etc.) send requests over LCP, typically by changing only the base_url.
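For example, with the official OpenAI Python SDK only the client construction changes; the address, API key, and model below are the example values from the Quickstart further down (a minimal sketch, not a full client):
from openai import OpenAI

# Point the standard OpenAI client at openai-serve instead of api.openai.com.
# 127.0.0.1:8080, devkey1, and gpt-5.2 are the example values used in the Quickstart.
client = OpenAI(base_url="http://127.0.0.1:8080/v1", api_key="devkey1")

resp = client.chat.completions.create(
    model="gpt-5.2",
    messages=[{"role": "user", "content": "Say hello."}],
)
print(resp.choices[0].message.content)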

Architecture

OpenAI SDK / curl
   |
   |  HTTP (OpenAI-compatible)
   v
openai-serve
   |
   |  gRPC (LCPDService)
   v
lcpd-grpcd (Requester)  --- Lightning --- Provider peer (lcpd-grpcd Provider)
Notes:
  • openai-serve is intentionally stateless and does not connect to Lightning directly.
  • The Requester (lcpd-grpcd) is the component that can spend sats.

Logging and privacy

Logs are treated as sensitive. openai-serve is designed to be diagnosable without persisting raw user content.
  • Logs MUST NOT contain raw prompts (messages[].content) or raw model outputs.
  • Logs include only operational metadata (model/peer/job ids, price, timings, and byte/token counts).
  • OPENAI_SERVE_LOG_LEVEL=debug enables more verbose request logging; keep the default info level in production unless more detail is needed.

Supported endpoints

  • POST /v1/chat/completions (JSON or stream:true SSE passthrough)
  • POST /v1/responses (JSON or stream:true SSE passthrough)
  • GET /v1/models
  • GET /healthz
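A quick way to exercise the two GET endpoints from Python (a sketch assuming the Quickstart address and API key; /v1/models is assumed to return the standard OpenAI list shape):
import requests

BASE = "http://127.0.0.1:8080"                 # example address from the Quickstart
HEADERS = {"Authorization": "Bearer devkey1"}  # only needed if OPENAI_SERVE_API_KEYS is set

# Liveness probe: a 200 status means the gateway is up.
print(requests.get(f"{BASE}/healthz", timeout=5).status_code)

# List models; assuming the standard OpenAI list shape {"object": "list", "data": [...]}.
models = requests.get(f"{BASE}/v1/models", headers=HEADERS, timeout=5).json()
for m in models.get("data", []):
    print(m["id"])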

Quickstart

Prerequisite: run lcpd-grpcd separately (Requester mode) and ensure it can reach your Lightning node and LCP peers.
Build:
cd apps/openai-serve
go install ./cmd/openai-serve
Run:
export OPENAI_SERVE_HTTP_ADDR="127.0.0.1:8080"
export OPENAI_SERVE_LCPD_GRPC_ADDR="127.0.0.1:50051"

# Optional auth: comma-separated list of accepted API keys
export OPENAI_SERVE_API_KEYS="devkey1"

openai-serve
Test:
curl -sS http://127.0.0.1:8080/v1/chat/completions \
  -H 'content-type: application/json' \
  -H 'authorization: Bearer devkey1' \
  -d '{"model":"gpt-5.2","messages":[{"role":"user","content":"Say hello."}]}'
Streaming (Chat Completions):
curl -N http://127.0.0.1:8080/v1/chat/completions \
  -H 'content-type: application/json' \
  -H 'authorization: Bearer devkey1' \
  -d '{"model":"gpt-5.2","stream":true,"messages":[{"role":"user","content":"Say hello."}]}'
Streaming (Responses):
curl -N http://127.0.0.1:8080/v1/responses \
  -H 'content-type: application/json' \
  -H 'authorization: Bearer devkey1' \
  -d '{"model":"gpt-5.2","stream":true,"input":"Say hello."}'

Using from CLI tools

openai-serve speaks the OpenAI Chat Completions and Responses APIs (including stream:true). Many OpenAI-compatible CLI tools can be pointed at it by configuring a custom base_url and API key.

Codex CLI

Configure Codex CLI to use openai-serve as a custom model provider (Chat Completions wire format):
  1. Add this to ~/.codex/config.toml:
[model_providers.openai_serve]
name = "openai-serve"
base_url = "http://127.0.0.1:8080/v1"
env_key = "OPENAI_SERVE_API_KEY"
wire_api = "chat"

[profiles.lcp]
model = "gpt-5.2"
model_provider = "openai_serve"
  2. Run (the API key must match one of the keys configured in OPENAI_SERVE_API_KEYS):
export OPENAI_SERVE_API_KEY="devkey1"
codex --profile lcp "Say hello in Japanese."
SSE streaming uses stream:true and is passed through byte-for-byte; field support depends on the selected Provider.

LLM (llm)

llm can use an OpenAI-compatible base URL by defining an extra model. Create an extra-openai-models.yaml file in the llm user directory and add:
- model_id: lcp-gpt-5.2
  model_name: gpt-5.2
  api_base: http://127.0.0.1:8080/v1
  can_stream: true
  api_key_name: openai-serve
Then:
llm keys set openai-serve
llm -m lcp-gpt-5.2 "Say hello in Japanese."

Request/response behavior

  • openai-serve is a passthrough gateway for Chat Completions and Responses request/response bytes.
  • Minimal validation is applied before routing:
    • Body must be a valid JSON object.
    • model must be present and non-empty (no leading/trailing whitespace).
    • messages (chat completions) or input (responses) must be present and non-empty.
    • HTTP Content-Encoding must be omitted or identity (compressed request bodies are rejected).
  • stream:true delivers text/event-stream bytes as they arrive. Without stream (or false), the full JSON response body is returned.
  • Request body size is limited to 1 MiB.
  • Provider result bytes are returned as-is; HTTP Content-Type/Content-Encoding are taken from the LCP result metadata.
For configuration, routing rules, and safety knobs (max price, timeouts, model allowlists), see Configuration.
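The validation rules above can also be mirrored on the client before a request is sent (and before any sats are spent); the helper below is a hypothetical client-side illustration, not part of openai-serve:
import json

MAX_BODY_BYTES = 1024 * 1024  # openai-serve rejects request bodies larger than 1 MiB

def precheck_chat_body(body: dict) -> bytes:
    # Hypothetical client-side mirror of openai-serve's minimal validation.
    model = body.get("model")
    if not isinstance(model, str) or not model or model != model.strip():
        raise ValueError("model must be non-empty with no surrounding whitespace")
    if not body.get("messages"):
        raise ValueError("messages must be present and non-empty (input for /v1/responses)")
    raw = json.dumps(body).encode("utf-8")
    if len(raw) > MAX_BODY_BYTES:
        raise ValueError("request body exceeds the 1 MiB limit")
    # Send raw uncompressed: Content-Encoding other than identity is rejected.
    return raw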

LCP metadata headers

Responses from POST /v1/chat/completions and POST /v1/responses include the following headers:
  • X-Lcp-Peer-Id: chosen Provider peer id
  • X-Lcp-Job-Id: job id (hex)
  • X-Lcp-Price-Msat: accepted quote price
  • X-Lcp-Terms-Hash: accepted quote terms hash (hex)
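Any HTTP client can read these headers alongside the OpenAI-style JSON body; a sketch using requests and the Quickstart address:
import requests

resp = requests.post(
    "http://127.0.0.1:8080/v1/chat/completions",
    headers={"Authorization": "Bearer devkey1"},
    json={"model": "gpt-5.2", "messages": [{"role": "user", "content": "Say hello."}]},
    timeout=120,
)
resp.raise_for_status()

# LCP job metadata is returned as response headers next to the usual response body.
print("peer: ", resp.headers.get("X-Lcp-Peer-Id"))
print("job:  ", resp.headers.get("X-Lcp-Job-Id"))
print("price:", resp.headers.get("X-Lcp-Price-Msat"), "msat")
print("terms:", resp.headers.get("X-Lcp-Terms-Hash"))
print(resp.json()["choices"][0]["message"]["content"])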