`openai-serve` is a small HTTP gateway that implements a subset of the OpenAI API and forwards requests to a local
`lcpd-grpcd` (Requester) over gRPC. The Requester handles Lightning connectivity, quote/payment, and result retrieval.
This lets existing OpenAI-compatible clients (SDKs, LangChain, curl, etc.) send requests over LCP by changing little
more than the `base_url`.
## Architecture
- `openai-serve` is intentionally stateless and does not connect to Lightning directly.
- The Requester (`lcpd-grpcd`) is the component that can spend sats.
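In outline, a request flows like this (a sketch; only the components named above are involved):

```
OpenAI-compatible client (SDK / curl)
        │  HTTP, OpenAI wire format
        ▼
openai-serve (stateless gateway)
        │  gRPC
        ▼
lcpd-grpcd (Requester: quotes, Lightning payment, result retrieval)
        │  LCP / Lightning
        ▼
Provider
```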
## Logging and privacy
Logs are treated as sensitive. `openai-serve` is designed to be diagnosable without persisting raw user content.

- Logs MUST NOT contain raw prompts (`messages[].content`) or raw model outputs.
- Logs include only operational metadata (model/peer/job ids, price, timings, and byte/token counts).
- `OPENAI_SERVE_LOG_LEVEL=debug` enables more verbose request logging; keep `info` (the default) for production unless needed.
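For example, to enable debug logging for a single run (the bare `openai-serve` invocation is an assumption; adjust to however you start the binary):

```sh
OPENAI_SERVE_LOG_LEVEL=debug openai-serve
```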
## Supported endpoints
- `POST /v1/chat/completions` (JSON or `stream:true` SSE passthrough)
- `POST /v1/responses` (JSON or `stream:true` SSE passthrough)
- `GET /v1/models`
- `GET /healthz`
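The two GET endpoints can be probed directly; for example (the listen address `localhost:8080` is an assumption, substitute your configured address):

```sh
curl -sS http://localhost:8080/healthz
curl -sS http://localhost:8080/v1/models
```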
## Quickstart
Prerequisite: run `lcpd-grpcd` separately (Requester mode) and ensure it can reach your Lightning node and LCP peers.
Build:
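Once built and running, a minimal end-to-end request can be made with `curl` (a sketch: the address, API key, and model id are placeholders for whatever your deployment uses):

```sh
curl -sS http://localhost:8080/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer <api-key>' \
  -d '{
    "model": "<model-id>",
    "messages": [{"role": "user", "content": "Say hello"}]
  }'
```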
## Using from CLI tools
`openai-serve` speaks the OpenAI Chat Completions and Responses APIs (including `stream:true`). Many
OpenAI-compatible CLI tools can be pointed at it by configuring a custom `base_url` and API key.
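For tools built on the official OpenAI SDKs, the standard environment variables are often enough (values are placeholders):

```sh
export OPENAI_BASE_URL="http://localhost:8080/v1"
export OPENAI_API_KEY="<key openai-serve expects>"
```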
### Codex CLI
Configure Codex CLI to use `openai-serve` as a custom model provider (Chat Completions wire format):
- Add this to `~/.codex/config.toml`:
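A sketch of such a provider entry, based on Codex CLI's `model_providers` table (the provider id, address, and model name are placeholders):

```toml
model = "<model-id>"                # model to request through openai-serve (placeholder)
model_provider = "openai-serve"

[model_providers.openai-serve]
name = "openai-serve"
base_url = "http://localhost:8080/v1"   # openai-serve's /v1 prefix (assumed address)
env_key = "OPENAI_API_KEY"              # env var holding the API key
wire_api = "chat"                       # Chat Completions wire format
```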
- Run (the API key must match what `openai-serve` expects):
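For example (a sketch; the key value is whatever `openai-serve` is configured to accept):

```sh
export OPENAI_API_KEY="<key openai-serve expects>"
codex
```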
Streaming uses `stream:true` and is passed through byte-for-byte; field support depends on the selected Provider.
### LLM (`llm`)
`llm` can use an OpenAI-compatible base URL by defining an extra model. Create an `extra-openai-models.yaml` file in the
`llm` user directory and add:
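A sketch of such an entry (the alias, model id, and address are placeholders):

```yaml
- model_id: openai-serve-chat          # local alias used with `llm -m`
  model_name: <model-id>               # model id forwarded to openai-serve (placeholder)
  api_base: "http://localhost:8080/v1" # assumed openai-serve address
```

Then a prompt can be routed through the gateway with `llm -m openai-serve-chat "Say hello"`.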
## Request/response behavior
- `openai-serve` is a passthrough gateway for Chat Completions and Responses request/response bytes.
- Minimal validation is applied before routing:
  - Body must be a valid JSON object.
  - `model` must be present and non-empty (no leading/trailing whitespace).
  - `messages` (chat completions) or `input` (responses) must be present and non-empty.
  - HTTP `Content-Encoding` must be omitted or `identity` (compressed request bodies are rejected).
- `stream:true` delivers `text/event-stream` bytes as they arrive (see the example after this list). Without `stream` (or `false`), the full JSON response body is returned.
- Request body size is limited to 1 MiB.
- Provider result bytes are returned as-is; HTTP `Content-Type`/`Content-Encoding` are taken from the LCP result metadata.
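To observe the SSE passthrough, a streaming request can be made with `curl -N` (address and model are placeholders):

```sh
curl -N -sS http://localhost:8080/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"model": "<model-id>", "stream": true,
       "messages": [{"role": "user", "content": "Stream a short reply"}]}'
```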
## LCP metadata headers
`POST /v1/chat/completions` and `POST /v1/responses` responses include:
- `X-Lcp-Peer-Id`: chosen Provider peer id
- `X-Lcp-Job-Id`: job id (hex)
- `X-Lcp-Price-Msat`: accepted quote price
- `X-Lcp-Terms-Hash`: accepted quote terms hash (hex)
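These headers can be inspected by dumping them with `curl -D -` (address and model are placeholders; the header values are whatever the accepted quote produced):

```sh
curl -sS -D - -o /dev/null http://localhost:8080/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"model": "<model-id>", "messages": [{"role": "user", "content": "hi"}]}'
```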