Overview
The Chat API provides access to powerful language models for conversational AI, content generation, and text completion tasks. It is built for developers who need reliable, scalable AI solutions. All responses are streamed in real time, providing a better user experience for conversational applications.
Base URL
Authentication
All requests require a valid API key in the Authorization header.
Learn more about API key generation and management.
Request Format
Required Headers
| Header | Value | Description |
|---|---|---|
| Authorization | Bearer spj_ai_[your_api_key] | Required - Your API key for authentication |
| Content-Type | application/json | Required - Must be set to application/json |
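As a minimal sketch in Python, the two required headers can be assembled like this (the key value shown is a placeholder, not a real credential):

```python
def build_headers(api_key: str) -> dict:
    """Assemble the two required headers for a Chat API request."""
    return {
        "Authorization": f"Bearer {api_key}",  # key format: spj_ai_...
        "Content-Type": "application/json",
    }

headers = build_headers("spj_ai_example_key")
```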
Request Body
messages (array, required) - Array of conversation messages. Must contain at least one message.
model (string, optional) - AI model to use for generating responses. Defaults to llama-3.3-70b-versatile.
Message Object
Each message in the messages array must contain:
role (string, required) - The role of the message sender. Must be one of:
- user - Messages from the user/human
- assistant - Previous AI responses
- system - System instructions to guide AI behavior
content (string, required) - The actual message content. Cannot be empty.
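The rules above can be enforced client-side before a request is sent; a sketch of such a validator (the function name and error messages are illustrative):

```python
VALID_ROLES = {"user", "assistant", "system"}

def validate_message(message: dict) -> None:
    """Raise ValueError if a message object violates the Chat API message rules."""
    role = message.get("role")
    if role not in VALID_ROLES:
        raise ValueError(f"role must be one of {sorted(VALID_ROLES)}")
    content = message.get("content")
    if not isinstance(content, str) or not content.strip():
        raise ValueError("content must be a non-empty string")

validate_message({"role": "user", "content": "Hello!"})  # passes silently
```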
Available Models
| Model | Description | Best For |
|---|---|---|
| llama-3.3-70b-versatile | High-quality general-purpose model (default) | Most use cases, balanced performance |
Response Format
The API returns a streaming text response.
Response Body
The response is streamed as text chunks. Concatenate all chunks to get the complete AI response.
Examples
Basic Chat Request
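A minimal request body contains a single user message; since model is omitted, the default is used. A sketch in Python (only the JSON body is built here, as the base URL is not shown in this section):

```python
import json

payload = {
    "messages": [
        {"role": "user", "content": "Write a haiku about the sea."}
    ]
}
body = json.dumps(payload)  # send as the POST body with the required headers
```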
Multi-turn Conversation
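For a multi-turn conversation, include the prior turns so the model sees the full context; assistant messages carry the previous AI responses. A sketch (the conversation content is illustrative):

```python
# Alternate user and assistant turns, ending with the new user message.
payload = {
    "messages": [
        {"role": "user", "content": "What is the capital of France?"},
        {"role": "assistant", "content": "The capital of France is Paris."},
        {"role": "user", "content": "And what is its population?"},
    ]
}
```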
With System Message
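A system message placed first sets context and guides the model's behavior for the rest of the conversation. A sketch (the instructions shown are illustrative):

```python
# The system message shapes how the model answers the user's request.
payload = {
    "messages": [
        {"role": "system", "content": "You are a concise technical writer."},
        {"role": "user", "content": "Explain HTTP caching in two sentences."},
    ]
}
```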
Specifying Model
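The optional model field selects a model explicitly; the value below is the default from the Available Models table, so this request behaves the same as one that omits the field:

```python
payload = {
    "model": "llama-3.3-70b-versatile",  # optional; shown here for explicitness
    "messages": [
        {"role": "user", "content": "Summarize this API in one sentence."}
    ],
}
```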
Streaming Response Handling
The API returns responses as a stream of text chunks. Here's how to handle streaming:
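A sketch of the chunk-handling logic in Python. The live call is commented out because the base URL and key are placeholders; the helper assumes chunk boundaries fall on UTF-8 character boundaries:

```python
from typing import Iterable

def consume_stream(chunks: Iterable[bytes]) -> str:
    """Concatenate streamed text chunks into the complete AI response."""
    parts = []
    for chunk in chunks:
        if chunk:  # ignore keep-alive empty chunks
            parts.append(chunk.decode("utf-8"))
    return "".join(parts)

# Against the live API (using the requests library; url, headers, and
# payload built as shown earlier):
# response = requests.post(url, headers=headers, json=payload, stream=True)
# full_text = consume_stream(response.iter_content(chunk_size=None))

full_text = consume_stream([b"Hello, ", b"world!"])  # local demonstration
```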
Token Usage and Billing
Input Tokens
Counted based on the total length of all messages in your request
Output Tokens
Counted based on the length of the AI’s response
Usage Tracking
Token usage is tracked and counted toward your workspace limits
Optimization
Monitor usage through workspace analytics to optimize costs
Token Estimation
Example calculation:
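The exact tokenizer is not specified in this section; a common rule of thumb for English text is roughly four characters per token, which gives a quick pre-request estimate:

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~4 characters per token heuristic."""
    return max(1, len(text) // 4)

prompt = "Write a short product description for a mechanical keyboard."
estimated = estimate_tokens(prompt)  # 60 characters -> ~15 tokens
```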
Common Use Cases
Content Generation
Blog Post Writing
Email Writing
Marketing Copy
Code Assistance
Code Generation
Code Review
Debugging Help
Error Handling
Common Errors
Invalid API Key (401)
Causes:
- API key doesn’t exist or has been deleted
- Incorrect API key format
- Missing Authorization header
Solutions:
- Verify your API key is correct and active
- Check the Authorization header format: Bearer spj_ai_...
- Generate a new API key if necessary
Missing Messages (400)
Cause: Request body is missing the messages array
Solution: Ensure your request includes a valid messages array with at least one message
Invalid Message Format (400)
Causes:
- Message missing required content field
- Empty content string
- Invalid role value
Solution: Ensure each message has a valid role and a non-empty content field
Token Limit Exceeded (403)
Causes:
- Workspace has exceeded monthly token quota
- Request is too large
Solutions:
- Wait for monthly token reset
- Upgrade subscription plan
- Optimize prompts to use fewer tokens
Rate Limited (429)
Solutions:
- Implement exponential backoff
- Reduce request frequency
- Use batch processing for multiple prompts
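Exponential backoff can be sketched in a few lines of Python; the retry count, delays, and the shape of `request_fn` (returning a status code and body) are illustrative assumptions:

```python
import random
import time

def with_backoff(request_fn, max_retries: int = 5, base_delay: float = 1.0):
    """Retry request_fn with exponential backoff while it reports HTTP 429."""
    for attempt in range(max_retries):
        status, body = request_fn()
        if status != 429:
            return status, body
        # Exponential delay with jitter to avoid synchronized retries.
        time.sleep(base_delay * (2 ** attempt) + random.random() * base_delay)
    raise RuntimeError("still rate limited after retries")
```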
Server Error (500)
Causes:
- AI model temporarily unavailable
- Server overload
- Temporary service disruption
Best Practices
Message Design
Clear Instructions
Use specific, clear prompts for better results. Be explicit about what you want.
System Messages
Use system messages to set context and guide AI behavior for specialized tasks.
Conversation Context
Include relevant conversation history, but keep it concise to manage token usage.
Structured Prompts
Break complex requests into clear, structured instructions.
Performance Optimization
Handle Streaming Responses
Stream responses to provide better user experience in conversational applications.
Implement Error Handling
Always implement proper error handling and retry logic.
Monitor Token Usage
Track token usage to stay within limits and optimize costs.
Security Considerations
Input Validation
Validate and sanitize user inputs before sending to the API
Content Filtering
Implement content filtering for user-generated prompts
Rate Limiting
Implement application-side rate limiting to prevent abuse
Monitoring
Monitor API usage patterns for unusual activity
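The input-validation point above can be sketched as a small pre-flight check; the length limit and the specific sanitization rules are illustrative assumptions, not part of the API:

```python
MAX_PROMPT_CHARS = 8000  # illustrative application-side limit

def sanitize_prompt(text: str) -> str:
    """Basic application-side validation before sending user input to the API."""
    # Drop control characters but keep newlines.
    cleaned = "".join(ch for ch in text if ch == "\n" or ch.isprintable())
    cleaned = cleaned.strip()
    if not cleaned:
        raise ValueError("prompt is empty after sanitization")
    if len(cleaned) > MAX_PROMPT_CHARS:
        raise ValueError("prompt exceeds the application-side length limit")
    return cleaned
```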
Rate Limits
For detailed information about rate limits, token usage, and optimization strategies, see the Rate Limits guide.
- Token-based limits: Usage counts toward workspace token quotas
- Request rate: Standard rate limiting applies to prevent abuse
- Concurrent requests: Multiple simultaneous requests are supported
- Fair usage: Excessive usage may be throttled
Next Steps
Code Examples
View complete implementation examples in JavaScript, Python, and cURL
Rate Limits
Learn about token usage, optimization, and billing
Error Handling
Comprehensive guide to error codes and recovery patterns
API Reference
Complete API reference with schemas and interactive examples

