The ty\e API
Send messages, get streaming engineering responses. Bearer auth, Server-Sent Events, rate-limited per plan. Available on Pro and Max.
Overview
The ty\e API exposes a single endpoint that mirrors the conversational interface available in the web app. You send a list of messages, and the server replies with an event stream of text chunks followed by a final usage summary.
- Request body: application/json
- Response: text/event-stream (SSE)
- Authentication: Authorization: Bearer gx_...
- Available on plans: Pro · Max
Authentication
Every request must include a Bearer token in the Authorization header. Keys are 56-character hex strings prefixed with gx_.
Authorization: Bearer gx_a4f8e2b1c9d7f3e5a2c8b1d4e6f9c3a7b061e2345678901abcdef23
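As a minimal sketch of the header described above, the helper below builds the request headers from a key. The function name build_auth_headers and the TYLELLM_API_KEY environment variable are illustrative choices (the variable matches the one used in the code samples further down). It only checks the gx_ prefix and does not try to validate key length.

```python
import os
from typing import Dict, Optional


def build_auth_headers(api_key: Optional[str] = None) -> Dict[str, str]:
    """Return headers carrying the Bearer token (hypothetical helper)."""
    # Fall back to an environment variable so the key stays out of source code.
    key = api_key if api_key is not None else os.environ.get("TYLELLM_API_KEY", "")
    # Only the documented "gx_" prefix is checked here.
    if not key.startswith("gx_"):
        raise ValueError("expected an API key starting with 'gx_'")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Reading the key from the environment rather than hard-coding it makes rotation easier: after a Regenerate, only the environment variable needs to change.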
Get a key
- Sign in at tylellm.com/app
- Upgrade to Pro or Max
- Settings → API → Show or Copy
Rotate a key
If a key leaks, click Regenerate in the API panel. The old key is invalidated immediately. There's a hard rate-limit of 3 regenerations per 5 minutes per user.
Models
ty\e routes your request to the right model based on your plan. You don't need to specify a model in the request — the server picks the best one available to your account.
| Model | Plan | Best for | Speed |
|---|---|---|---|
| ty\e Fast (Engineering Depth) | Free, Pro, Max | Everyday engineering Q&A, calculations, CAD analysis, code generation. Hosted on L40S — solid 32B reasoning with snappy responses. | Fast |
| ty\e Pro (Priority Queue) | Pro, Max | Same model as Fast, plus priority queue and 10× higher daily message quota. Best for active project work with vision + audio attachments. | Priority |
| ty\e VA (Apex Complexity) | Max | Native ty\e engine — designed for highest-complexity multi-modal engineering tasks. Self-learning architecture. In active development — Q3 launch. | In dev |
| ty\e Enterprise (Custom fine-tune) | Organization | Your own data, your own model — fine-tuned on internal CAD libraries, standards, and historical drawings. Deployed on-prem or private cloud. | Custom |
The model field in the request body is currently ignored; the highest-tier model your plan grants is selected automatically. Explicit model pinning is coming in v1.5.
Engineer roadmap
Where each ty\e tier is heading. Dates are best-effort and may shift as model training completes.
| Tier | Now | Next | ETA |
|---|---|---|---|
| ty\e Fast | Qwen 32B-AWQ on L40S | QLoRA fine-tune on CAD schema (33K samples) for native <cad> output | Q3 2026 |
| ty\e Pro | Same as Fast + priority queue | Concurrent multi-modal sessions, vision-heavy attachments | Q3 2026 |
| ty\e VA | tyeng-VA 161M (in dev) | Scale to 1.3B + 7B native ty\e engine, self-learning loop | Q3 2026 |
| ty\e Enterprise | Custom fine-tune | Self-serve LoRA portal, on-prem images | Q1 2027 |
POST /chat
Send a list of messages, receive a streaming reply.
Request body
| Field | Type | Description |
|---|---|---|
| messages | array | Conversation so far. Each item: {role, content}. Required. |
| stream | boolean | true for SSE (recommended). false returns full response at end. Default true. |
| temperature | number | 0.0–1.0. Lower = more deterministic. Default 0.4. |
| top_p | number | Nucleus sampling. Default 0.85. |
| max_tokens | integer | Cap on reply length. Default 384 (auto-bumped for CAD/drawing prompts). |
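The field table can be turned into a small payload builder. The sketch below is one possible approach, not an official client: build_chat_payload is a hypothetical name, and rejecting an empty messages list is an assumption (the table only says the field is required). Optional fields are omitted unless set, so the server-side defaults, including the max_tokens auto-bump, still apply.

```python
from typing import Any, Dict, List, Optional


def build_chat_payload(
    messages: List[Dict[str, str]],
    stream: bool = True,
    temperature: Optional[float] = None,
    top_p: Optional[float] = None,
    max_tokens: Optional[int] = None,
) -> Dict[str, Any]:
    """Build a POST /chat request body (hypothetical helper)."""
    # Assumption: an empty list is treated the same as a missing field.
    if not messages:
        raise ValueError("messages is required")
    # The documented range for temperature is 0.0 to 1.0.
    if temperature is not None and not 0.0 <= temperature <= 1.0:
        raise ValueError("temperature must be between 0.0 and 1.0")
    payload: Dict[str, Any] = {"messages": messages, "stream": stream}
    # Leave unset fields out so the server applies its own defaults.
    if temperature is not None:
        payload["temperature"] = temperature
    if top_p is not None:
        payload["top_p"] = top_p
    if max_tokens is not None:
        payload["max_tokens"] = max_tokens
    return payload
```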
Message roles
- user: your prompt
- assistant: previous AI reply (for multi-turn context)
- system: optional system override (rarely needed; Custom Instructions handle this server-side)
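For multi-turn context, the client resends the earlier turns with their roles. The sketch below shows one way to keep that history; add_message and ALLOWED_ROLES are hypothetical names, and the assistant reply text is a placeholder rather than real model output.

```python
from typing import Dict, List

# The three roles listed above.
ALLOWED_ROLES = {"user", "assistant", "system"}


def add_message(history: List[Dict[str, str]], role: str, content: str) -> List[Dict[str, str]]:
    """Return a new history list with one more {role, content} item."""
    if role not in ALLOWED_ROLES:
        raise ValueError(f"unknown role: {role!r}")
    # Build a new list instead of mutating the caller's history.
    return history + [{"role": role, "content": content}]


# Usage: first prompt, the previous reply, then a follow-up question.
history: List[Dict[str, str]] = []
history = add_message(history, "user", "Calculate head loss in 50m of DN100 pipe at 5 L/s")
history = add_message(history, "assistant", "(previous reply text goes here)")
history = add_message(history, "user", "Repeat for DN80")
```

The resulting history list is what goes into the messages field of the next request.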
Example request
{
  "messages": [
    { "role": "user", "content": "Calculate head loss in 50m of DN100 pipe at 5 L/s" }
  ],
  "stream": true,
  "temperature": 0.3
}
Streaming format
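When stream: true, the server returns text/event-stream. Each event follows the SSE convention: data: <json>\n\n. Read the stream line by line and decode the JSON payload of each data: line; the final [DONE] sentinel is not JSON and should be checked for before decoding.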
data: {"delta": "Using"}
data: {"delta": " Darcy-Weisbach"}
data: {"delta": " with"}
…
data: {"usage": {"prompt_tokens": 42, "completion_tokens": 218}}
data: [DONE]
Concatenate every delta in order. The final non-DONE event carries the usage object. After [DONE], close the stream.
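The reading procedure above can be sketched as a small parser. This is an illustration under stated assumptions, not a reference client: collect_stream is a hypothetical name, and it takes any iterable of byte lines (for example, what requests yields from iter_lines()), so it can be exercised without a network call.

```python
import json
from typing import Any, Dict, Iterable, Optional, Tuple


def collect_stream(lines: Iterable[bytes]) -> Tuple[str, Optional[Dict[str, Any]]]:
    """Join all delta chunks and return (text, usage) from SSE lines."""
    parts = []
    usage = None
    for raw in lines:
        line = raw.decode("utf-8").strip()
        # Skip blank separator lines and anything that is not a data field.
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        # The sentinel is plain text, not JSON, so check it before decoding.
        if payload == "[DONE]":
            break
        event = json.loads(payload)
        if "delta" in event:
            parts.append(event["delta"])
        if "usage" in event:
            usage = event["usage"]
    return "".join(parts), usage


# Usage with the sample events shown above.
sample = [
    b'data: {"delta": "Using"}',
    b"",
    b'data: {"delta": " Darcy-Weisbach"}',
    b"",
    b'data: {"usage": {"prompt_tokens": 42, "completion_tokens": 218}}',
    b"",
    b"data: [DONE]",
]
text, usage = collect_stream(sample)
print(text)   # Using Darcy-Weisbach
print(usage)  # {'prompt_tokens': 42, 'completion_tokens': 218}
```

Whitespace inside each delta is preserved because it lives inside the JSON string; only the outer line is stripped.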
Errors
Errors come back as standard HTTP status codes with a JSON body.
| Status | Reason | Action |
|---|---|---|
| 401 | Missing / invalid API key | Check Authorization header. Confirm key not regenerated. |
| 403 | Plan downgraded to Free | Re-upgrade to Pro or Max to restore API access. |
| 429 | Rate limit exceeded | Back off. Retry-After header tells you when. |
| 500 | Internal error | Transient — retry with exponential backoff. Contact support if persistent. |
{ "detail": "Invalid API key" }
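One way to act on the table above is to wrap the status code and the detail field in an exception. The sketch below is an assumption-laden illustration: ApiError, parse_error, ACTIONS, and RETRYABLE are hypothetical names, and treating only 429 and 500 as retryable is a reading of the Action column rather than documented behavior.

```python
import json

# Suggested actions, paraphrased from the error table.
ACTIONS = {
    401: "Check the Authorization header and confirm the key was not regenerated.",
    403: "Re-upgrade to Pro or Max to restore API access.",
    429: "Back off and wait for the Retry-After interval.",
    500: "Retry with exponential backoff; contact support if it persists.",
}
# Assumption: only these statuses are worth retrying automatically.
RETRYABLE = {429, 500}


class ApiError(Exception):
    """Carries the HTTP status, the server's detail text, and a suggested action."""

    def __init__(self, status: int, detail: str) -> None:
        super().__init__(f"HTTP {status}: {detail}")
        self.status = status
        self.detail = detail
        self.action = ACTIONS.get(status, "See the API documentation.")
        self.retryable = status in RETRYABLE


def parse_error(status: int, body: str) -> ApiError:
    """Build an ApiError from a status code and the raw response body."""
    try:
        detail = json.loads(body)["detail"]
    except (ValueError, KeyError, TypeError):
        # Body was not JSON or had no "detail" field; keep the raw text.
        detail = body
    return ApiError(status, str(detail))
```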
Rate limits
Limits are per-account, not per-key. Regenerating a key does not reset the daily counter.
| Plan | Messages / day | Burst |
|---|---|---|
| Pro | 1,000 | 60 / minute |
| Max | Unlimited | 120 / minute |
When you hit a limit the server returns 429 with a Retry-After header indicating seconds to wait. Implement exponential backoff in production clients.
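A minimal sketch of the backoff logic described above: prefer the server's Retry-After value when present, otherwise double the wait on each attempt. The function name retry_delay and the base and cap values are illustrative assumptions, not documented limits.

```python
from typing import Optional


def retry_delay(
    attempt: int,
    retry_after: Optional[str] = None,
    base: float = 1.0,
    cap: float = 60.0,
) -> float:
    """Return how many seconds to wait before retry number `attempt` (0-based)."""
    # The Retry-After header, given in seconds, takes priority when it parses.
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # Unparseable header: fall through to exponential backoff.
    # Exponential backoff: base, 2*base, 4*base, ... limited by cap.
    return min(cap, base * (2 ** attempt))


# Usage: the delays for five consecutive retries without a header.
print([retry_delay(n) for n in range(5)])  # [1.0, 2.0, 4.0, 8.0, 16.0]
```

In a production client it is common to add random jitter to the computed delay so that many clients do not retry at the same instant; that is left out here to keep the function deterministic.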
Code samples
# Streaming chat completion
curl -N -X POST https://tylellm.com/chat \
  -H "Authorization: Bearer $TYLELLM_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "Calculate head loss in 50m of DN100 pipe at 5 L/s"}
    ],
    "stream": true
  }'

import os
import requests

# Streaming chat completion
with requests.post(
    "https://tylellm.com/chat",
    headers={
        "Authorization": f"Bearer {os.environ['TYLELLM_API_KEY']}",
        "Content-Type": "application/json",
    },
    json={
        "messages": [{"role": "user", "content": "Calculate head loss in 50m of DN100 pipe at 5 L/s"}],
        "stream": True,
    },
    stream=True,
) as r:
    for line in r.iter_lines():
        if line and line.startswith(b"data: "):
            print(line[6:].decode(), flush=True)

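// Streaming chat completion
// Uses the fetch built into Node.js 18+, whose response body is a web
// ReadableStream and supports getReader(); node-fetch bodies do not.
// Run as an ES module (.mjs) so top-level await is available.
const res = await fetch('https://tylellm.com/chat', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${process.env.TYLELLM_API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    messages: [{ role: 'user', content: 'Calculate head loss in 50m of DN100 pipe at 5 L/s' }],
    stream: true,
  }),
});

const reader = res.body.getReader();
const decoder = new TextDecoder();
while (true) {
  const { value, done } = await reader.read();
  if (done) break;
  // stream: true keeps multi-byte characters intact across chunk boundaries
  process.stdout.write(decoder.decode(value, { stream: true }));
}
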
// Streaming chat completion
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"log"
	"net/http"
	"os"
)

func main() {
	body := bytes.NewBufferString(`{
		"messages": [{"role":"user","content":"Calculate head loss in 50m of DN100 pipe at 5 L/s"}],
		"stream": true
	}`)
	req, err := http.NewRequest("POST", "https://tylellm.com/chat", body)
	if err != nil {
		log.Fatal(err)
	}
	req.Header.Set("Authorization", "Bearer "+os.Getenv("TYLELLM_API_KEY"))
	req.Header.Set("Content-Type", "application/json")

	res, err := http.DefaultClient.Do(req)
	if err != nil {
		log.Fatal(err) // res is nil here, so do not touch res.Body
	}
	defer res.Body.Close()

	// Print each SSE line as it arrives.
	sc := bufio.NewScanner(res.Body)
	for sc.Scan() {
		fmt.Println(sc.Text())
	}
	if err := sc.Err(); err != nil {
		log.Fatal(err)
	}
}
Ready to ship?
Upgrade to Pro to issue your first API key — keep your existing chat history, sync across devices, and unlock advanced models.