API Reference

The ty\e API

Send messages, get streaming engineering responses. Bearer auth, Server-Sent Events, rate-limited per plan. Available on Pro and Max.

Overview

The ty\e API exposes a single endpoint that mirrors the conversational interface available in the web app. You send a list of messages, and the server replies with an event stream of text chunks followed by a final usage summary.

POST https://tylellm.com/chat

Request body: application/json
Response: text/event-stream (SSE)
Authentication: Authorization: Bearer gx_...
Available on plans: Pro · Max

Pro / Max only. API keys are issued from Settings → API after you upgrade. Free plan accounts cannot issue keys.

Authentication

Every request must include a Bearer token in the Authorization header. Keys are 56-character hex strings prefixed with gx_.

Authorization: Bearer gx_a4f8e2b1c9d7f3e5a2c8b1d4e6f9c3a7b061e2345678901abcdef23

Get a key

Sign in at tylellm.com/app
Upgrade to Pro or Max
Open Settings → API, then click Show or Copy

Rotate a key

If a key leaks, click Regenerate in the API panel. The old key is invalidated immediately. Regeneration is hard-limited to 3 per 5 minutes per user.

Never commit keys to a repo. Use environment variables. If your plan lapses to Free, the key stops working until you re-upgrade.
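As a minimal sketch of the environment-variable advice above, the helper below reads the key at runtime and builds the request headers. The function name auth_headers is hypothetical, and the variable name TYLELLM_API_KEY is borrowed from the code samples in this reference. The check covers only the gx_ prefix, not the full key format.

```python
import os


def auth_headers(env_var: str = "TYLELLM_API_KEY") -> dict:
    """Build request headers from an API key held in an environment variable.

    Hypothetical helper: only the gx_ prefix is checked, so a malformed
    key can still pass this check and be rejected by the server with 401.
    """
    key = os.environ.get(env_var, "")
    if not key.startswith("gx_"):
        raise RuntimeError(f"{env_var} is unset or does not look like a gx_ key")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Failing fast on a missing key keeps an empty Bearer token from reaching the server and surfacing later as a less obvious 401.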

Models

ty\e routes your request to the right model based on your plan. You don't need to specify a model in the request — the server picks the best one available to your account.

Model	Plan	Best for	Speed
ty\e Fast (Engineering Depth)	Free, Pro, Max	Everyday engineering Q&A, calculations, CAD analysis, code generation. Hosted on L40S — solid 32B reasoning with snappy responses.	Fast
ty\e Pro (Priority Queue)	Pro, Max	Same model as Fast, plus priority queue and 10× higher daily message quota. Best for active project work with vision + audio attachments.	Priority
ty\e VA (Apex Complexity)	Max	Native ty\e engine — designed for highest-complexity multi-modal engineering tasks. Self-learning architecture. In active development — Q3 launch.	In dev
ty\e Enterprise (Custom fine-tune)	Organization	Your own data, your own model — fine-tuned on internal CAD libraries, standards, and historical drawings. Deployed on-prem or private cloud.	Custom

The model field in the request body is currently ignored — the highest-tier model your plan grants is selected automatically. Explicit model pinning is coming in v1.5.

Engineer roadmap

Where each ty\e tier is heading. Dates are best-effort and may shift as model training completes.

Tier	Now	Next	ETA
ty\e Fast	Qwen 32B-AWQ on L40S	QLoRA fine-tune on CAD schema (33K samples) for native <cad> output	Q3 2026
ty\e Pro	Same as Fast + priority queue	Concurrent multi-modal sessions, vision-heavy attachments	Q3 2026
ty\e VA	tyeng-VA 161M (in dev)	Scale to 1.3B + 7B native ty\e engine, self-learning loop	Q3 2026
ty\e Enterprise	Custom fine-tune	Self-serve LoRA portal, on-prem images	Q1 2027

POST /chat

Send a list of messages, receive a streaming reply.

Request body

Field	Type	Description
messages	array	Conversation so far. Each item: `{role, content}`. Required.
stream	boolean	`true` for SSE (recommended). `false` returns full response at end. Default `true`.
temperature	number	0.0–1.0. Lower = more deterministic. Default `0.4`.
top_p	number	Nucleus sampling. Default `0.85`.
max_tokens	integer	Cap on reply length. Default `384` (auto-bumped for CAD/drawing prompts).

Message roles

user — your prompt
assistant — previous AI reply (for multi-turn context)
system — optional system override (rarely needed; Custom Instructions handle this server-side)

Example request

{
  "messages": [
    { "role": "user", "content": "Calculate head loss in 50m of DN100 pipe at 5 L/s" }
  ],
  "stream": true,
  "temperature": 0.3
}
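For multi-turn conversations, the body can be assembled programmatically. The sketch below is one possible approach, not an official client: build_chat_body is a hypothetical helper, its defaults are copied from the field table above, and the client-side checks (non-empty messages, known roles, temperature within 0.0 to 1.0) are assumptions about what the server would reject. Message content is assumed to be a plain string, and the assistant text is an illustrative placeholder.

```python
import json

ALLOWED_ROLES = {"user", "assistant", "system"}


def build_chat_body(messages, stream=True, temperature=0.4, top_p=0.85, max_tokens=None):
    """Assemble a POST /chat body, validating the documented fields client-side."""
    if not messages:
        raise ValueError("messages is required and must not be empty")
    for m in messages:
        if m.get("role") not in ALLOWED_ROLES:
            raise ValueError(f"unknown role: {m.get('role')!r}")
    if not 0.0 <= temperature <= 1.0:
        raise ValueError("temperature must be between 0.0 and 1.0")
    body = {
        "messages": list(messages),
        "stream": stream,
        "temperature": temperature,
        "top_p": top_p,
    }
    # Omit max_tokens unless set, so the server-side default still applies.
    if max_tokens is not None:
        body["max_tokens"] = max_tokens
    return body


# Multi-turn example: earlier turns are replayed so the server has context.
history = [
    {"role": "user", "content": "Calculate head loss in 50m of DN100 pipe at 5 L/s"},
    {"role": "assistant", "content": "(previous reply goes here)"},
    {"role": "user", "content": "Repeat the calculation for DN80"},
]
print(json.dumps(build_chat_body(history, temperature=0.3), indent=2))
```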

Streaming format

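When stream: true, the server returns text/event-stream. Each event follows the SSE convention: a line of the form data: <json>, terminated by a blank line (\n\n). Read the stream line by line, skip blank lines, and decode the JSON payload of each data: line. The terminal data: [DONE] sentinel is not JSON, so check for it before decoding.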

data: {"delta": "Using"}
data: {"delta": " Darcy-Weisbach"}
data: {"delta": " with"}
…
data: {"usage": {"prompt_tokens": 42, "completion_tokens": 218}}
data: [DONE]

Concatenate every delta in order. The final non-DONE event carries the usage object. After [DONE], close the stream.
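The decoding rules above can be sketched as a small function that folds SSE lines into the full reply text and the usage object. The name collect_stream is hypothetical. The sample input reuses the events shown in this section, so nothing here touches the network.

```python
import json


def collect_stream(lines):
    """Fold an iterable of SSE lines (str or bytes) into (full_text, usage)."""
    parts, usage = [], None
    for raw in lines:
        line = raw.decode() if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # blank separators and anything that is not a data line
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # sentinel is not JSON; stop before trying to decode it
        event = json.loads(payload)
        if "delta" in event:
            parts.append(event["delta"])
        if "usage" in event:
            usage = event["usage"]
    return "".join(parts), usage


sample = [
    'data: {"delta": "Using"}',
    "",
    'data: {"delta": " Darcy-Weisbach"}',
    "",
    'data: {"delta": " with"}',
    "",
    'data: {"usage": {"prompt_tokens": 42, "completion_tokens": 218}}',
    "",
    "data: [DONE]",
]
text, usage = collect_stream(sample)
print(text)   # Using Darcy-Weisbach with
print(usage)  # {'prompt_tokens': 42, 'completion_tokens': 218}
```

Because the function accepts any iterable of lines, it could be fed the output of an HTTP client's line iterator directly.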

Errors

Errors come back as standard HTTP status codes with a JSON body.

Status	Reason	Action
401	Missing / invalid API key	Check the `Authorization` header. Confirm the key has not been regenerated.
403	Plan downgraded to Free	Re-upgrade to Pro or Max to restore API access.
429	Rate limit exceeded	Back off. The Retry-After header gives the number of seconds to wait.
500	Internal error	Transient — retry with exponential backoff. Contact support if persistent.

{ "detail": "Invalid API key" }
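A client might translate these responses into actionable messages along the following lines. This is a sketch under assumptions: explain_error and the ACTIONS mapping are hypothetical names, the action text paraphrases the table above, and the body is assumed to carry a detail field as in the example.

```python
import json

# Suggested actions, paraphrased from the error table in this section.
ACTIONS = {
    401: "check the Authorization header and whether the key was regenerated",
    403: "re-upgrade to Pro or Max to restore API access",
    429: "back off and honor the Retry-After header",
    500: "retry with exponential backoff; contact support if it persists",
}


def explain_error(status: int, body: str) -> str:
    """Combine the HTTP status and the JSON detail field into one message."""
    try:
        detail = json.loads(body).get("detail", "")
    except (ValueError, AttributeError):
        detail = ""  # body was not a JSON object; fall back to the status alone
    action = ACTIONS.get(status, "unexpected status; inspect the response body")
    return f"{status}: {detail or 'no detail'} ({action})"


print(explain_error(401, '{ "detail": "Invalid API key" }'))
```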

Rate limits

Limits are per-account, not per-key. Regenerating a key does not reset the daily counter.

Plan	Messages / day	Burst
Pro	1,000	60 / minute
Max	Unlimited	120 / minute

When you hit a limit the server returns 429 with a Retry-After header indicating seconds to wait. Implement exponential backoff in production clients.
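One way to implement this is a delay function that prefers the server's Retry-After value and otherwise falls back to exponential backoff with full jitter. The sketch below is illustrative: retry_delay is a hypothetical name, and the 1-second base and 60-second cap are assumed values, not limits stated by this API.

```python
import random


def retry_delay(attempt, retry_after=None, base=1.0, cap=60.0):
    """Seconds to wait before retry number `attempt` (0-based).

    Uses Retry-After (seconds) when it is present and parseable; otherwise
    picks a random delay in [0, min(cap, base * 2**attempt)].
    """
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # not a plain number of seconds; use backoff instead
    return random.uniform(0.0, min(cap, base * 2 ** attempt))


print(retry_delay(0, retry_after="7"))  # 7.0
print(retry_delay(3))                   # some value between 0 and 8 seconds
```

In a request loop, the result would be passed to time.sleep before retrying a 429 or 500 response. The jitter spreads out retries from clients that hit the limit at the same moment.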

Code samples

# Streaming chat completion
curl -N -X POST https://tylellm.com/chat \
  -H "Authorization: Bearer $TYLELLM_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "Calculate head loss in 50m of DN100 pipe at 5 L/s"}
    ],
    "stream": true
  }'

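import json
import os
import requests

# Streaming chat completion
with requests.post(
    "https://tylellm.com/chat",
    headers={
        "Authorization": f"Bearer {os.environ['TYLELLM_API_KEY']}",
        "Content-Type":  "application/json",
    },
    json={
        "messages": [{"role": "user", "content": "Calculate head loss in 50m of DN100 pipe at 5 L/s"}],
        "stream":   True,
    },
    stream=True,
) as r:
    r.raise_for_status()  # surface 401/403/429/500 before reading the stream
    for line in r.iter_lines():
        # Skip blank separator lines; only "data: ..." lines carry events.
        if not line.startswith(b"data: "):
            continue
        payload = line[6:].decode()
        if payload == "[DONE]":
            break  # end-of-stream sentinel, not JSON
        event = json.loads(payload)
        if "delta" in event:
            print(event["delta"], end="", flush=True)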

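// Streaming chat completion. Requires Node.js 18+ and its built-in fetch:
// node-fetch returns a Node.js stream, which has no getReader() method.
// Run as an ES module (e.g. a .mjs file) so top-level await works.
const res = await fetch('https://tylellm.com/chat', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${process.env.TYLELLM_API_KEY}`,
    'Content-Type':  'application/json',
  },
  body: JSON.stringify({
    messages: [{ role: 'user', content: 'Calculate head loss in 50m of DN100 pipe at 5 L/s' }],
    stream:   true,
  }),
});
if (!res.ok) {
  throw new Error(`Request failed: ${res.status} ${await res.text()}`);
}

// Print the raw SSE lines ("data: {...}") as they arrive.
const reader = res.body.getReader();
const decoder = new TextDecoder();
while (true) {
  const { value, done } = await reader.read();
  if (done) break;
  // { stream: true } keeps multi-byte characters intact across chunk boundaries.
  process.stdout.write(decoder.decode(value, { stream: true }));
}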

package main

import (
    "bufio"
    "bytes"
    "fmt"
    "net/http"
    "os"
)

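func main() {
    body := bytes.NewBufferString(`{
        "messages": [{"role":"user","content":"Calculate head loss in 50m of DN100 pipe at 5 L/s"}],
        "stream": true
    }`)
    req, err := http.NewRequest("POST", "https://tylellm.com/chat", body)
    if err != nil {
        fmt.Fprintln(os.Stderr, "build request:", err)
        os.Exit(1)
    }
    req.Header.Set("Authorization", "Bearer "+os.Getenv("TYLELLM_API_KEY"))
    req.Header.Set("Content-Type", "application/json")

    // Check the error before touching res: on failure res is nil.
    res, err := http.DefaultClient.Do(req)
    if err != nil {
        fmt.Fprintln(os.Stderr, "send request:", err)
        os.Exit(1)
    }
    defer res.Body.Close()

    // Print each SSE line ("data: {...}") as it arrives.
    sc := bufio.NewScanner(res.Body)
    for sc.Scan() {
        fmt.Println(sc.Text())
    }
    if err := sc.Err(); err != nil {
        fmt.Fprintln(os.Stderr, "read stream:", err)
    }
}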

Ready to ship?

Upgrade to Pro to issue your first API key — keep your existing chat history, sync across devices, and unlock advanced models.
