
Overview

The Chat API provides access to powerful language models for conversational AI, content generation, and text completion tasks. Built for developers who need reliable, scalable AI solutions with streaming responses.
All responses are streamed in real-time, providing a better user experience for conversational applications.

Endpoint

POST https://suite.sundaypyjamas.com/api/v1/chat

Authentication

All requests require a valid API key in the Authorization header:
Authorization: Bearer spj_ai_your_api_key_here

Request Format

Required Headers

Header | Value | Description
Authorization | Bearer spj_ai_[your_api_key] | Required - Your API key for authentication
Content-Type | application/json | Required - Must be set to application/json

Request Body

messages
Array<Message>
required
Array of conversation messages. Must contain at least one message.
model
string
default:"llama-3.3-70b-versatile"
AI model to use for generating responses. Optional parameter.

Message Object

Each message in the messages array must contain:
role
string
required
The role of the message sender. Must be one of:
  • user - Messages from the user/human
  • assistant - Previous AI responses
  • system - System instructions to guide AI behavior
content
string
required
The actual message content. Cannot be empty.
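
Putting these fields together, a complete request body might look like the following (the model field is optional, and the message contents are illustrative):

{
  "messages": [
    {
      "role": "system",
      "content": "You are a concise technical assistant."
    },
    {
      "role": "user",
      "content": "Summarize the difference between TCP and UDP."
    }
  ],
  "model": "llama-3.3-70b-versatile"
}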

Available Models

Model | Description | Best For
llama-3.3-70b-versatile | High-quality general-purpose model (default) | Most use cases, balanced performance
More models will be available soon! Check back for updates on specialized models.

Response Format

The API returns a streaming text response with the following headers:
Content-Type: text/event-stream
Cache-Control: no-cache
Connection: keep-alive

Response Body

The response is streamed as text chunks. Concatenate all chunks to get the complete AI response.
Hello! Here's a professional email greeting:

Dear [Recipient's Name],

I hope this email finds you well. I wanted to reach out regarding...

Examples

Basic Chat Request

curl -X POST https://suite.sundaypyjamas.com/api/v1/chat \
  -H "Authorization: Bearer spj_ai_your_api_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {
        "role": "user",
        "content": "Hello! Can you help me write a professional email?"
      }
    ]
  }'

Multi-turn Conversation

curl -X POST https://suite.sundaypyjamas.com/api/v1/chat \
  -H "Authorization: Bearer spj_ai_your_api_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      },
      {
        "role": "assistant", 
        "content": "The capital of France is Paris."
      },
      {
        "role": "user",
        "content": "What about its population?"
      }
    ]
  }'

With System Message

curl -X POST https://suite.sundaypyjamas.com/api/v1/chat \
  -H "Authorization: Bearer spj_ai_your_api_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {
        "role": "system",
        "content": "You are a professional copywriter specializing in marketing content."
      },
      {
        "role": "user",
        "content": "Write a product description for wireless headphones."
      }
    ]
  }'

Specifying Model

curl -X POST https://suite.sundaypyjamas.com/api/v1/chat \
  -H "Authorization: Bearer spj_ai_your_api_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {
        "role": "user",
        "content": "Explain quantum computing in simple terms."
      }
    ],
    "model": "llama-3.3-70b-versatile"
  }'

Streaming Response Handling

The API returns responses as a stream of text chunks. Here’s how to handle streaming in JavaScript:
const response = await fetch('https://suite.sundaypyjamas.com/api/v1/chat', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer spj_ai_your_api_key_here',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    messages: [
      { role: 'user', content: 'Hello!' }
    ]
  })
});

// Error responses are plain JSON, not a stream, so check before reading
if (!response.ok) {
  const error = await response.json();
  throw new Error(error.error);
}

const reader = response.body.getReader();
const decoder = new TextDecoder();
let fullResponse = '';

while (true) {
  const { done, value } = await reader.read();
  if (done) break;

  // { stream: true } keeps multi-byte characters intact across chunk boundaries
  const chunk = decoder.decode(value, { stream: true });
  fullResponse += chunk;
  console.log(chunk); // Process each chunk as it arrives
}

console.log('Complete response:', fullResponse);

Token Usage and Billing

Input Tokens

Counted based on the total length of all messages in your request

Output Tokens

Counted based on the length of the AI’s response

Usage Tracking

Token usage is tracked and counted toward your workspace limits

Optimization

Monitor usage through workspace analytics to optimize costs

Token Estimation

Roughly 4 characters = 1 token for English text. The API uses the same tokenization as the underlying model for precise counting.
Example calculation:
Input: "Hello, how are you today?" (26 characters) ≈ 7 tokens
Output: "I'm doing well, thank you for asking!" (36 characters) ≈ 9 tokens
Total: ~16 tokens

Common Use Cases

Content Generation

Blog content:
{
  "messages": [
    {
      "role": "system",
      "content": "You are a creative content writer specializing in blog posts."
    },
    {
      "role": "user", 
      "content": "Write an introduction for a blog post about sustainable living tips."
    }
  ]
}
Email writing:
{
  "messages": [
    {
      "role": "system",
      "content": "You are a professional email writer. Write clear, polite, and effective emails."
    },
    {
      "role": "user",
      "content": "Write a follow-up email for a job interview."
    }
  ]
}
Marketing copy:
{
  "messages": [
    {
      "role": "system",
      "content": "You are an expert copywriter. Write compelling marketing copy that drives action."
    },
    {
      "role": "user",
      "content": "Create homepage copy for a productivity app targeting busy professionals."
    }
  ]
}

Code Assistance

Code generation:
{
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful programming assistant. Provide clean, well-documented code."
    },
    {
      "role": "user",
      "content": "Write a Python function to calculate the fibonacci sequence."
    }
  ]
}
Code review:
{
  "messages": [
    {
      "role": "system",
      "content": "You are a senior software engineer. Provide constructive code reviews."
    },
    {
      "role": "user",
      "content": "Review this JavaScript function and suggest improvements: [code here]"
    }
  ]
}
Debugging:
{
  "messages": [
    {
      "role": "system",
      "content": "You are a debugging expert. Help identify and fix code issues."
    },
    {
      "role": "user",
      "content": "I'm getting a TypeError in this Python code. Can you help me fix it?"
    }
  ]
}

Error Handling

Common Errors

{
  "error": "Invalid API key"
}
Causes:
  • API key doesn’t exist or has been deleted
  • Incorrect API key format
  • Missing Authorization header
Solutions:
  • Verify your API key is correct and active
  • Check the Authorization header format: Bearer spj_ai_...
  • Generate a new API key if necessary
{
  "error": "Messages array is required"
}
Cause: Request body doesn’t include a messages array
Solution: Ensure your request includes a valid messages array with at least one message
{
  "error": "Last message must have valid content"
}
Causes:
  • Message missing required content field
  • Empty content string
  • Invalid role value
Solution: Ensure all messages have valid role and non-empty content fields
{
  "error": "Token limit exceeded"
}
Causes:
  • Workspace has exceeded monthly token quota
  • Request is too large
Solutions:
  • Wait for monthly token reset
  • Upgrade subscription plan
  • Optimize prompts to use fewer tokens
{
  "error": "Rate limit exceeded"
}
Cause: Making requests too quickly
Solutions:
  • Implement exponential backoff
  • Reduce request frequency
  • Use batch processing for multiple prompts
{
  "error": "Failed to generate response"
}
Causes:
  • AI model temporarily unavailable
  • Server overload
  • Temporary service disruption
Solution: Implement retry logic with exponential backoff

Best Practices

Message Design

Clear Instructions

Use specific, clear prompts for better results. Be explicit about what you want.

System Messages

Use system messages to set context and guide AI behavior for specialized tasks.

Conversation Context

Include relevant conversation history, but keep it concise to manage token usage (see the trimming sketch below).

Structured Prompts

Break complex requests into clear, structured instructions.
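
For conversation context, one approach is to trim older messages to a rough token budget before each request. This is a minimal sketch, assuming the ~4-characters-per-token estimate from the Token Estimation section; trimHistory and the 2000-token budget are illustrative, not part of the API:

// Keep only as many recent messages as fit a rough token budget
const estimateTokens = (text) => Math.ceil(text.length / 4);

const trimHistory = (messages, maxTokens = 2000) => {
  const kept = [];
  let total = 0;
  // Walk backwards from the newest message so recent context survives
  for (let i = messages.length - 1; i >= 0; i--) {
    total += estimateTokens(messages[i].content);
    if (total > maxTokens) break;
    kept.unshift(messages[i]);
  }
  return kept;
};

Call trimHistory(messages) before each request in long-running conversations. Note that a system message at the start of the array can be trimmed like any other; if you rely on one, pin it separately and re-prepend it after trimming.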

Performance Optimization

Stream responses to provide better user experience in conversational applications.
// ✅ Good - Handle streaming for real-time display
const processStreamingResponse = async (response) => {
  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    
    const chunk = decoder.decode(value, { stream: true });
    displayChunk(chunk); // Update UI immediately
  }
};
Always implement proper error handling and retry logic.
// ✅ Good - Robust error handling
const makeRequest = async (messages, retries = 3) => {
  for (let i = 0; i < retries; i++) {
    try {
      const response = await fetch('/api/v1/chat', {
        method: 'POST',
        headers: {
          'Authorization': `Bearer ${apiKey}`,
          'Content-Type': 'application/json'
        },
        body: JSON.stringify({ messages })
      });
      
      if (!response.ok) {
        const error = await response.json();
        throw new Error(error.error);
      }
      
      return response;
    } catch (error) {
      if (i === retries - 1) throw error;
      await new Promise(resolve => setTimeout(resolve, 1000 * Math.pow(2, i)));
    }
  }
};
Track token usage to stay within limits and optimize costs.
// ✅ Good - Track usage
const estimateTokens = (text) => Math.ceil(text.length / 4);

const makeRequestWithTracking = async (messages) => {
  const inputTokens = messages.reduce((sum, msg) => sum + estimateTokens(msg.content), 0);
  console.log(`Estimated input tokens: ${inputTokens}`);
  
  // Make request and track output tokens
  const response = await makeRequest(messages);
  const output = await readStreamingResponse(response);
  const outputTokens = estimateTokens(output);
  
  console.log(`Total tokens used: ${inputTokens + outputTokens}`);
  return output;
};

Security Considerations

Never expose API keys in client-side code. Always use backend proxies for frontend applications.
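
As a minimal sketch of the proxy pattern, the Node.js/Express server below keeps the key in a server-side environment variable (SPJ_AI_API_KEY is an illustrative name, as is the /chat route) and relays the streamed response to the browser. It assumes Node 18+ for built-in fetch and async-iterable response bodies:

import express from 'express';

const app = express();
app.use(express.json());

app.post('/chat', async (req, res) => {
  const upstream = await fetch('https://suite.sundaypyjamas.com/api/v1/chat', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${process.env.SPJ_AI_API_KEY}`, // key stays server-side
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ messages: req.body.messages }),
  });

  if (!upstream.ok) {
    res.status(upstream.status).json(await upstream.json());
    return;
  }

  // Relay the streamed chunks to the browser as they arrive
  res.setHeader('Content-Type', 'text/event-stream');
  for await (const chunk of upstream.body) {
    res.write(chunk);
  }
  res.end();
});

app.listen(3000);

The browser then calls /chat with no credentials, and the streaming code shown earlier works unchanged against the proxy.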

Input Validation

Validate and sanitize user inputs before sending to the API (a validation sketch follows below)

Content Filtering

Implement content filtering for user-generated prompts

Rate Limiting

Implement application-side rate limiting to prevent abuse (a pacing sketch follows below)

Monitoring

Monitor API usage patterns for unusual activity
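
For the validation and rate-limiting items above, here are two minimal sketches. The validator simply enforces the Message Object rules documented on this page; the pacing helper’s 200 ms spacing is an assumed value, not a documented limit, so match it to your plan:

// Enforce the documented message rules before sending
const isValidMessage = (m) =>
  m &&
  ['user', 'assistant', 'system'].includes(m.role) &&
  typeof m.content === 'string' &&
  m.content.trim().length > 0;

// Space calls out so at most one request starts per interval
const paced = (fn, minIntervalMs = 200) => {
  let nextSlot = 0;
  return async (...args) => {
    const wait = Math.max(0, nextSlot - Date.now());
    nextSlot = Date.now() + wait + minIntervalMs;
    await new Promise((resolve) => setTimeout(resolve, wait));
    return fn(...args);
  };
};

// Usage: validate, then send through the paced wrapper
// (makeRequest is the retry helper from the Best Practices section)
// const pacedChat = paced(makeRequest);
// if (messages.every(isValidMessage)) await pacedChat(messages);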

Rate Limits

For detailed information about rate limits, token usage, and optimization strategies, see the Rate Limits guide.
  • Token-based limits: Usage counts toward workspace token quotas
  • Request rate: Standard rate limiting applies to prevent abuse
  • Concurrent requests: Multiple simultaneous requests are supported
  • Fair usage: Excessive usage may be throttled

Next Steps