API Reference

The ty\e API

Send messages, get streaming engineering responses. Bearer auth, Server-Sent Events, rate-limited per plan. Available on Pro and Max.

Overview

The ty\e API exposes a single endpoint that mirrors the conversational interface available in the web app. You send a list of messages, and the server replies with an event stream of text chunks followed by a final usage summary.

POST https://tylellm.com/chat
  • Request body: application/json
  • Response: text/event-stream (SSE)
  • Authentication: Authorization: Bearer gx_...
  • Available on plans: Pro · Max

Pro / Max only. API keys are issued from Settings → API after you upgrade. Free plan accounts cannot issue keys.

Authentication

Every request must include a Bearer token in the Authorization header. Keys are 56-character hex strings prefixed with gx_.

Authorization: Bearer gx_a4f8e2b1c9d7f3e5a2c8b1d4e6f9c3a7b061e2345678901abcdef23

Get a key

  1. Sign in at tylellm.com/app
  2. Upgrade to Pro or Max
  3. Settings → API → Show or Copy

Rotate a key

If a key leaks, click Regenerate in the API panel. The old key is invalidated immediately. There's a hard rate-limit of 3 regenerations per 5 minutes per user.

Never commit keys to a repo. Use environment variables. If your plan lapses to Free, the key stops working until you re-upgrade.
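
A minimal sketch of reading the key from the environment and building the request headers. The variable name TYLELLM_API_KEY follows the curl sample in this reference; the helper name auth_headers and the local gx_ prefix check are illustrative assumptions, not part of the API.

```python
import os


def auth_headers(api_key=None):
    """Build request headers carrying the Bearer token.

    Falls back to the TYLELLM_API_KEY environment variable when no key
    is passed. The gx_ prefix check is only a local sanity check (an
    assumption of this sketch); the server decides whether a key is valid.
    """
    key = api_key if api_key is not None else os.environ.get("TYLELLM_API_KEY", "")
    if not key.startswith("gx_"):
        raise ValueError("expected an API key starting with 'gx_'")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Calling auth_headers() with no argument keeps the key out of source code, so it never ends up in a commit.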

Models

ty\e routes your request to the right model based on your plan. You don't need to specify a model in the request — the server picks the best one available to your account.

Model | Plan | Best for | Speed
ty\e Fast (Engineering Depth) | Free, Pro, Max | Everyday engineering Q&A, calculations, CAD analysis, code generation. Hosted on L40S: solid 32B reasoning with snappy responses. | Fast
ty\e Pro (Priority Queue) | Pro, Max | Same model as Fast, plus priority queue and 10× higher daily message quota. Best for active project work with vision + audio attachments. | Priority
ty\e VA (Apex Complexity) | Max | Native ty\e engine, designed for highest-complexity multi-modal engineering tasks. Self-learning architecture. In active development; Q3 launch. | In dev
ty\e Enterprise (Custom fine-tune) | Organization | Your own data, your own model: fine-tuned on internal CAD libraries, standards, and historical drawings. Deployed on-prem or private cloud. | Custom

The model field in the request body is currently ignored; the highest-tier model your plan grants is selected automatically. Explicit model pinning is coming in v1.5.

Engineer roadmap

Where each ty\e tier is heading. Dates are best-effort and may shift as model training completes.

Tier | Now | Next | ETA
ty\e Fast | Qwen 32B-AWQ on L40S | QLoRA fine-tune on CAD schema (33K samples) for native <cad> output | Q3 2026
ty\e Pro | Same as Fast + priority queue | Concurrent multi-modal sessions, vision-heavy attachments | Q3 2026
ty\e VA | tyeng-VA 161M (in dev) | Scale to 1.3B + 7B native ty\e engine, self-learning loop | Q3 2026
ty\e Enterprise | Custom fine-tune | Self-serve LoRA portal, on-prem images | Q1 2027

POST /chat

Send a list of messages, receive a streaming reply.

Request body

Field | Type | Description
messages | array | Conversation so far. Each item: {role, content}. Required.
stream | boolean | true for SSE (recommended). false returns the full response at the end. Default true.
temperature | number | 0.0–1.0. Lower = more deterministic. Default 0.4.
top_p | number | Nucleus sampling. Default 0.85.
max_tokens | integer | Cap on reply length. Default 384 (auto-bumped for CAD/drawing prompts).

Message roles

  • user — your prompt
  • assistant — previous AI reply (for multi-turn context)
  • system — optional system override (rarely needed; Custom Instructions handle this server-side)

Example request

{
  "messages": [
    { "role": "user", "content": "Calculate head loss in 50m of DN100 pipe at 5 L/s" }
  ],
  "stream": true,
  "temperature": 0.3
}
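
The request body can also be assembled and checked in code before sending. The sketch below uses the field names, roles, and defaults from this section; the function name build_payload and the client-side checks (including rejecting an empty message list) are assumptions of the example, not documented server behavior.

```python
import json

# Roles listed under "Message roles".
ALLOWED_ROLES = {"user", "assistant", "system"}


def build_payload(messages, stream=True, temperature=0.4, top_p=0.85, max_tokens=None):
    """Validate and assemble a /chat request body as a dict."""
    if not messages:
        # Assumption: an empty list is treated as missing.
        raise ValueError("messages is required")
    for msg in messages:
        if msg.get("role") not in ALLOWED_ROLES:
            raise ValueError(f"unknown role: {msg.get('role')!r}")
    if not 0.0 <= temperature <= 1.0:
        raise ValueError("temperature must be within 0.0-1.0")
    payload = {
        "messages": list(messages),
        "stream": stream,
        "temperature": temperature,
        "top_p": top_p,
    }
    if max_tokens is not None:
        payload["max_tokens"] = max_tokens  # omitted: server default applies
    return payload


# Multi-turn context: earlier assistant replies are passed back in order.
conversation = [
    {"role": "user", "content": "Calculate head loss in 50m of DN100 pipe at 5 L/s"},
    {"role": "assistant", "content": "Using Darcy-Weisbach with"},
    {"role": "user", "content": "Repeat the calculation for DN80."},
]
body = json.dumps(build_payload(conversation, temperature=0.3))
```

Leaving max_tokens out of the payload lets the server apply its own default, including the automatic increase for CAD and drawing prompts.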

Streaming format

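When stream: true, the server returns text/event-stream. Each event is a single line of the form data: <json>, terminated by a blank line (\n\n). Read line by line, strip the data: prefix, and decode the JSON. The closing [DONE] sentinel is not JSON, so check for it before decoding.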

data: {"delta": "Using"}
data: {"delta": " Darcy-Weisbach"}
data: {"delta": " with"}
…
data: {"usage": {"prompt_tokens": 42, "completion_tokens": 218}}
data: [DONE]

Concatenate every delta in order. The final non-DONE event carries the usage object. After [DONE], close the stream.
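
The reading loop described above can be sketched as follows. The function name collect_stream is hypothetical, and the input is assumed to be an iterable of already-decoded text lines; the sample events are the ones shown in this section.

```python
import json


def collect_stream(lines):
    """Fold SSE lines into (full_text, usage).

    Blank separator lines and any line that does not start with
    "data:" are skipped. Reading stops at the [DONE] sentinel.
    """
    parts = []
    usage = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # sentinel is not JSON; stop before decoding
        event = json.loads(data)
        if "delta" in event:
            parts.append(event["delta"])
        if "usage" in event:
            usage = event["usage"]
    return "".join(parts), usage


sample = [
    'data: {"delta": "Using"}', "",
    'data: {"delta": " Darcy-Weisbach"}', "",
    'data: {"delta": " with"}', "",
    'data: {"usage": {"prompt_tokens": 42, "completion_tokens": 218}}', "",
    "data: [DONE]", "",
]
text, usage = collect_stream(sample)
```

In a real client the same loop would run over the lines of the HTTP response body, appending each delta to the display as it arrives rather than joining at the end.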

Errors

Errors come back as standard HTTP status codes with a JSON body.

Status | Reason | Action
401 | Missing / invalid API key | Check Authorization header. Confirm key not regenerated.
403 | Plan downgraded to Free | Re-upgrade to Pro or Max to restore API access.
429 | Rate limit exceeded | Back off. Retry-After header tells you when.
500 | Internal error | Transient: retry with exponential backoff. Contact support if persistent.

{ "detail": "Invalid API key" }
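
One way a client might turn these responses into exceptions is sketched below. The class ApiError and the function parse_error are hypothetical names; the status-to-action mapping is copied from the error table, and the fallback for a non-JSON body is an assumption of the sketch.

```python
import json

# Statuses the table marks as worth retrying.
RETRYABLE = {429, 500}

ACTIONS = {
    401: "check the Authorization header and whether the key was regenerated",
    403: "re-upgrade to Pro or Max to restore API access",
    429: "back off and honor Retry-After",
    500: "retry with exponential backoff",
}


class ApiError(Exception):
    """Error response from the API, with a suggested next step."""

    def __init__(self, status, detail):
        super().__init__(f"HTTP {status}: {detail}")
        self.status = status
        self.detail = detail
        self.retryable = status in RETRYABLE
        self.action = ACTIONS.get(status, "unexpected status; inspect the response")


def parse_error(status, body_text):
    """Extract the "detail" field, falling back to the raw body text."""
    try:
        detail = json.loads(body_text).get("detail", "")
    except (ValueError, AttributeError):
        detail = body_text  # body was not a JSON object
    return ApiError(status, detail)
```

Keeping the retryable flag on the exception lets calling code decide in one place whether to retry or surface the error.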

Rate limits

Limits are per-account, not per-key. Regenerating a key does not reset the daily counter.

Plan | Messages / day | Burst
Pro | 1,000 | 60 / minute
Max | Unlimited | 120 / minute

When you hit a limit the server returns 429 with a Retry-After header indicating seconds to wait. Implement exponential backoff in production clients.
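
A minimal sketch of the wait calculation, assuming Retry-After carries a number of seconds as stated above. The function name retry_delay, the base and cap values, and the use of full jitter are choices of this example rather than requirements of the API.

```python
import random


def retry_delay(status, headers, attempt, base=1.0, cap=60.0):
    """Return seconds to wait before retrying, or None if not retryable.

    `attempt` is zero-based. `headers` is any mapping with a .get method;
    with a plain dict the key lookup is case-sensitive.
    """
    if status == 429:
        retry_after = headers.get("Retry-After")
        if retry_after is not None:
            try:
                return max(0.0, float(retry_after))
            except ValueError:
                pass  # unparseable value: fall back to backoff below
    if status in (429, 500):
        # Exponential backoff with full jitter, capped at `cap` seconds.
        return random.uniform(0, min(cap, base * 2 ** attempt))
    return None  # 401 and 403 will not succeed on retry
```

A caller would sleep for the returned value and try again, giving up after a fixed number of attempts or when the result is None.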

Code samples

# Streaming chat completion
curl -N -X POST https://tylellm.com/chat \
  -H "Authorization: Bearer $TYLELLM_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "Calculate head loss in 50m of DN100 pipe at 5 L/s"}
    ],
    "stream": true
  }'
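
The same call can be sketched in Python using only the standard library. The function names build_request and stream_chat are hypothetical; the URL, headers, and body mirror the curl sample above.

```python
import json
import os
import urllib.request

API_URL = "https://tylellm.com/chat"


def build_request(messages, api_key=None):
    """Create the POST request without sending it."""
    key = api_key or os.environ["TYLELLM_API_KEY"]
    body = json.dumps({"messages": messages, "stream": True}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {key}",
            "Content-Type": "application/json",
        },
    )


def stream_chat(messages, api_key=None):
    """Yield text deltas as they arrive (performs a network call)."""
    with urllib.request.urlopen(build_request(messages, api_key)) as resp:
        for raw in resp:  # the response object iterates line by line
            line = raw.decode("utf-8").strip()
            if not line.startswith("data:"):
                continue
            data = line[len("data:"):].strip()
            if data == "[DONE]":
                break
            event = json.loads(data)
            if "delta" in event:
                yield event["delta"]
```

To print a reply as it streams, iterate over stream_chat(messages) and print each chunk with end="" and flush=True. Note that urlopen raises urllib.error.HTTPError for 4xx and 5xx responses, so production code would wrap the call and apply the retry guidance from the Errors and Rate limits sections.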
