Prepaid credits · token-metered · OpenAI-compatible

Predictable GPU inference spend, without surprise bills.

Darktree provides OpenAI-compatible inference endpoints backed by prepaid compute credits. Credits are consumed by token usage (prompt_tokens + completion_tokens) returned in API responses. Add daily caps for budget control and an append-only audit ledger for reconciliation.


OpenAI-style /v1 endpoints · Token usage in every response · Daily caps (HTTP 429 on limit) · API key auth · Append-only usage ledger

1) Buy prepaid credits

Purchase a bundle via Stripe. Credits fund token-metered inference and help you lock your spend up front.

2) Send requests

Use OpenAI-compatible endpoints with standard headers. Most clients work with minimal changes.

3) Cap, meter, audit

Token usage is logged per request. Daily caps protect against runaway usage; the ledger supports reconciliation.

What you get

Cost certainty: prepaid bundles + customer-level daily caps
Auditability: append-only per-request usage ledger
Compatibility: OpenAI-style endpoints and usage fields
Support: direct operator help during onboarding

What credits are (and aren’t)

Credits are a prepaid balance consumed by token usage
Credits are not GPU hours, server rentals, or reserved hardware
Token usage returned by the API response is the billing source of truth

Larger models and heavier workloads may consume credits faster. The API’s token counts remain authoritative.

Stripe bundles · prepaid compute credits

Plans & Credits

Prepaid bundles fund token-metered inference. Each plan includes a conservative daily cap to prevent runaway spend. Caps reset daily and can be raised or lowered on request.

Plan	Solo	Team	Scale
Bundle price	$50	$150	$300
Tokens included	~66,667	~200,000	~400,000
Default daily cap	2,000 tokens/day	7,000 tokens/day	15,000 tokens/day
Requests/day (example, assuming ~800 tokens/request)	~2–3	~8–10	~18–20
Max spend/day at cap (at $0.75 per 1k tokens)	$1.50/day	$5.25/day	$11.25/day

Token totals assume $0.75 per 1,000 tokens. Daily caps reset at 00:00 UTC and act as a hard stop: once the cap is exceeded, requests return HTTP 429 until the next reset. Caps are set per customer by Darktree and can be adjusted on request.
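The figures in the table follow directly from the flat rate. Here is a minimal Python sketch of that arithmetic, assuming $0.75 per 1,000 tokens and ~800 tokens per request as stated above. The function names are illustrative and not part of any Darktree API.

```python
# Sketch of the plan arithmetic. Assumes the flat rate of $0.75 per 1,000
# tokens and the ~800 tokens/request example from the table.
# All names here are hypothetical helpers, not a Darktree API.
PRICE_PER_1K_USD = 0.75


def tokens_for_bundle(bundle_usd: float) -> int:
    """Approximate number of tokens a bundle funds at the flat rate."""
    return round(bundle_usd / PRICE_PER_1K_USD * 1000)


def max_daily_spend(cap_tokens: int) -> float:
    """Dollar value of one full day of usage at the daily cap."""
    return cap_tokens / 1000 * PRICE_PER_1K_USD


def requests_per_day(cap_tokens: int, tokens_per_request: int = 800) -> float:
    """Rough number of requests that fit under the daily cap."""
    return cap_tokens / tokens_per_request


if __name__ == "__main__":
    plans = [("Solo", 50, 2_000), ("Team", 150, 7_000), ("Scale", 300, 15_000)]
    for name, price_usd, cap in plans:
        print(
            f"{name}: ~{tokens_for_bundle(price_usd):,} tokens, "
            f"${max_daily_spend(cap):.2f}/day at cap, "
            f"~{requests_per_day(cap):.1f} requests/day"
        )
```

Actual requests per day depend on your prompt and completion lengths, so treat the 800-token figure as a planning placeholder.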

Latency note: Premium models are typically higher‑latency than Standard models. In steady state, qwen25-14b-awq is commonly ~100 tokens/sec and qwen25-32b-awq ~45 tokens/sec (typical medians; depends on prompt length, max_tokens, concurrency, and warm vs cold starts).
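For rough planning, generation time scales with the number of completion tokens divided by throughput. The sketch below uses the typical medians quoted above. It deliberately ignores prompt processing, queueing, and cold starts, so real latency will usually be higher. The helper name is hypothetical.

```python
# Rough generation-time estimate from the typical steady-state medians above.
# Ignores prompt processing, concurrency, and cold starts (assumption).
TYPICAL_TOKENS_PER_SEC = {
    "qwen25-14b-awq": 100.0,  # Standard
    "qwen25-32b-awq": 45.0,   # Premium
}


def estimated_generation_seconds(model: str, completion_tokens: int) -> float:
    """Completion tokens divided by the model's typical tokens/sec."""
    return completion_tokens / TYPICAL_TOKENS_PER_SEC[model]


if __name__ == "__main__":
    for model in TYPICAL_TOKENS_PER_SEC:
        seconds = estimated_generation_seconds(model, 128)
        print(f"{model}: ~{seconds:.2f} s for 128 completion tokens")
```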

Budgeting Guide · Usage & Billing PDF

Note: credits are prepaid and non‑refundable. Usage is measured in tokens, not time. Token usage returned by API responses is the authoritative record for credit deduction.

Quickstart

Copy the snippet below, replace <YOUR_API_KEY> with the key you receive after purchasing a bundle, and start calling the OpenAI‑compatible endpoint.

# Example with curl
curl https://api.darktree.us/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <YOUR_API_KEY>" \
  -d '{
        "model": "qwen25-14b-awq",
        "messages": [{"role":"user","content":"Hello, world!"}],
        "max_tokens": 128
      }'

The response includes prompt_tokens, completion_tokens, and total_tokens. Darktree uses these values to deduct credits from your prepaid balance.
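To track spend on the client side, you can read those token counts from each response. The Python sketch below assumes the counts sit under a `usage` object, as in OpenAI-style chat completion responses, and that the flat $0.75 per 1,000 tokens rate applies. The sample body is made up for illustration and is not a captured Darktree response.

```python
import json

PRICE_PER_1K_USD = 0.75  # flat rate from Plans & Credits (assumed to apply here)


def credits_used(response_body: str) -> dict:
    """Return billed tokens and approximate cost for one response body.

    Assumes an OpenAI-style `usage` object with prompt_tokens and
    completion_tokens fields.
    """
    usage = json.loads(response_body)["usage"]
    billed = usage["prompt_tokens"] + usage["completion_tokens"]
    return {
        "billed_tokens": billed,
        "cost_usd": billed * PRICE_PER_1K_USD / 1000,
    }


if __name__ == "__main__":
    # Illustrative body only; real responses carry more fields.
    sample = json.dumps(
        {"usage": {"prompt_tokens": 12, "completion_tokens": 88, "total_tokens": 100}}
    )
    print(credits_used(sample))
```

A local tally like this is only a convenience. The token usage returned by the API and the ledger remain the authoritative record.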

Headers & limits

Auth: Authorization: Bearer <API_KEY>
Customer tag: X-Customer-Id: <your-id>
Caps: enforced per customer per day (HTTP 429)
Billing source: token usage + audit ledger
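The headers and cap behavior above can be sketched with Python's standard library. This is a sketch under stated assumptions: it sends X-Customer-Id on every request (the page does not say whether the header is required), and it treats any HTTP 429 as a daily-cap stop that clears at the next 00:00 UTC reset. The function names are hypothetical.

```python
import json
import urllib.error
import urllib.request
from datetime import datetime, timedelta, timezone

API_URL = "https://api.darktree.us/v1/chat/completions"


def build_request(api_key, customer_id, prompt,
                  model="qwen25-14b-awq", max_tokens=128):
    """Build (but do not send) a chat completion request with the listed headers."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
            "X-Customer-Id": customer_id,  # assumption: sent on every call
        },
        method="POST",
    )


def seconds_until_reset(now=None):
    """Seconds from `now` (UTC) until the next 00:00 UTC cap reset."""
    now = now or datetime.now(timezone.utc)
    next_midnight = (now + timedelta(days=1)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return (next_midnight - now).total_seconds()


def send(request):
    """Send the request; on HTTP 429, report the time until the next reset."""
    try:
        with urllib.request.urlopen(request, timeout=60) as response:
            return json.load(response)
    except urllib.error.HTTPError as err:
        if err.code == 429:
            # Assumption: 429 here means the daily cap was hit.
            wait = seconds_until_reset()
            raise RuntimeError(
                f"Daily cap reached; next reset in about {wait:.0f} s"
            ) from err
        raise
```

Because the cap is a hard stop until the reset, retrying in a tight loop after a 429 will not help. Waiting for the reset or requesting a higher cap are the practical options.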

Models: qwen25-14b-awq (Standard) · qwen25-32b-awq (Premium)

Need higher caps or a dedicated lane? Email us your expected tokens/day and latency needs.

Bookkeeping · $50/hr

Need help keeping Stripe + usage ledger + QuickBooks clean? Book hourly bookkeeping / ops support.