Overview
The Chat API provides access to powerful language models for conversational AI, content generation, and text completion tasks. It is built for developers who need reliable, scalable AI solutions. All responses are streamed in real time, providing a better user experience for conversational applications.
Base URL
Authentication
All requests require a valid API key in the Authorization header.
Learn more about API key generation and management.
Request Format
Required Headers
| Header | Value | Description |
|---|---|---|
| Authorization | Bearer spj_ai_[your_api_key] | Required - Your API key for authentication |
| Content-Type | application/json | Required - Must be set to application/json |
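As a minimal sketch in Python, the two required headers can be assembled like this (the key value shown is a placeholder, not a real credential):

```python
def build_headers(api_key: str) -> dict:
    """Assemble the two required headers for a Chat API request."""
    return {
        "Authorization": f"Bearer {api_key}",  # key format: spj_ai_...
        "Content-Type": "application/json",
    }

headers = build_headers("spj_ai_example_key")
```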
Request Body
messages (array, required) - Array of conversation messages. Must contain at least one message.
model (string, optional) - AI model to use for generating responses. Defaults to llama-3.3-70b-versatile.
Message Object
Each message in the messages array must contain:
role (string, required) - The role of the message sender. Must be one of:
- user - Messages from the user/human
- assistant - Previous AI responses
- system - System instructions to guide AI behavior
content (string, required) - The actual message content. Cannot be empty.
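The rules above can be enforced client-side before a request is sent; a sketch of such a validator (the function name and error messages are illustrative):

```python
VALID_ROLES = {"user", "assistant", "system"}

def validate_message(message: dict) -> None:
    """Raise ValueError if a message object violates the Chat API message rules."""
    role = message.get("role")
    if role not in VALID_ROLES:
        raise ValueError(f"role must be one of {sorted(VALID_ROLES)}")
    content = message.get("content")
    if not isinstance(content, str) or not content.strip():
        raise ValueError("content must be a non-empty string")

validate_message({"role": "user", "content": "Hello!"})  # passes silently
```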
Available Models
| Model | Description | Best For |
|---|---|---|
| llama-3.3-70b-versatile | High-quality general-purpose model (default) | Most use cases, balanced performance |
Response Format
The API returns a streaming text response.
Response Body
The response is streamed as text chunks. Concatenate all chunks to get the complete AI response.
Examples
Basic Chat Request
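A minimal request body contains a single user message; since model is omitted, the default is used. A sketch in Python (only the JSON body is built here, as the base URL is not shown in this section):

```python
import json

payload = {
    "messages": [
        {"role": "user", "content": "Write a haiku about the sea."}
    ]
}
body = json.dumps(payload)  # send as the POST body with the required headers
```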
Multi-turn Conversation
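For a multi-turn conversation, include the prior turns so the model sees the full context; assistant messages carry the previous AI responses. A sketch (the conversation content is illustrative):

```python
# Alternate user and assistant turns, ending with the new user message.
payload = {
    "messages": [
        {"role": "user", "content": "What is the capital of France?"},
        {"role": "assistant", "content": "The capital of France is Paris."},
        {"role": "user", "content": "And what is its population?"},
    ]
}
```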
With System Message
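A system message placed first sets context and guides the model's behavior for the rest of the conversation. A sketch (the instructions shown are illustrative):

```python
# The system message shapes how the model answers the user's request.
payload = {
    "messages": [
        {"role": "system", "content": "You are a concise technical writer."},
        {"role": "user", "content": "Explain HTTP caching in two sentences."},
    ]
}
```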
Specifying Model
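The optional model field selects a model explicitly; the value below is the default from the Available Models table, so this request behaves the same as one that omits the field:

```python
payload = {
    "model": "llama-3.3-70b-versatile",  # optional; shown here for explicitness
    "messages": [
        {"role": "user", "content": "Summarize this API in one sentence."}
    ],
}
```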
Streaming Response Handling
The API returns responses as a stream of text chunks. Here's how to handle streaming:
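A sketch of the chunk-handling logic in Python. The live call is commented out because the base URL and key are placeholders; the helper assumes chunk boundaries fall on UTF-8 character boundaries:

```python
from typing import Iterable

def consume_stream(chunks: Iterable[bytes]) -> str:
    """Concatenate streamed text chunks into the complete AI response."""
    parts = []
    for chunk in chunks:
        if chunk:  # ignore keep-alive empty chunks
            parts.append(chunk.decode("utf-8"))
    return "".join(parts)

# Against the live API (using the requests library; url, headers, and
# payload built as shown earlier):
# response = requests.post(url, headers=headers, json=payload, stream=True)
# full_text = consume_stream(response.iter_content(chunk_size=None))

full_text = consume_stream([b"Hello, ", b"world!"])  # local demonstration
```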
Token Usage and Billing
Input Tokens
Counted based on the total length of all messages in your request
Output Tokens
Counted based on the length of the AI’s response
Usage Tracking
Token usage is tracked and counted toward your workspace limits
Optimization
Monitor usage through workspace analytics to optimize costs
Token Estimation
Example calculation:
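The exact tokenizer is not specified in this section; a common rule of thumb for English text is roughly four characters per token, which gives a quick pre-request estimate:

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~4 characters per token heuristic."""
    return max(1, len(text) // 4)

prompt = "Write a short product description for a mechanical keyboard."
estimated = estimate_tokens(prompt)  # 60 characters -> ~15 tokens
```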
Common Use Cases
Content Generation
Blog Post Writing
Email Writing
Marketing Copy
Code Assistance
Code Generation
Code Review
Debugging Help
Error Handling
Common Errors
Invalid API Key (401)
Causes:
- API key doesn’t exist or has been deleted
- Incorrect API key format
- Missing Authorization header
Solutions:
- Verify your API key is correct and active
- Check the Authorization header format: Bearer spj_ai_...
- Generate a new API key if necessary
Missing Messages (400)
Cause: Request body is missing the messages array
Solution: Ensure your request includes a valid messages array with at least one message
Invalid Message Format (400)
Causes:
- Message missing required content field
- Empty content string
- Invalid role value
Solution: Ensure each message has a valid role and a non-empty content field
Token Limit Exceeded (403)
Causes:
- Workspace has exceeded monthly token quota
- Request is too large
Solutions:
- Wait for monthly token reset
- Upgrade subscription plan
- Optimize prompts to use fewer tokens
Rate Limited (429)
Solutions:
- Implement exponential backoff
- Reduce request frequency
- Use batch processing for multiple prompts
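Exponential backoff can be sketched in a few lines of Python; the retry count, delays, and the shape of `request_fn` (returning a status code and body) are illustrative assumptions:

```python
import random
import time

def with_backoff(request_fn, max_retries: int = 5, base_delay: float = 1.0):
    """Retry request_fn with exponential backoff while it reports HTTP 429."""
    for attempt in range(max_retries):
        status, body = request_fn()
        if status != 429:
            return status, body
        # Exponential delay with jitter to avoid synchronized retries.
        time.sleep(base_delay * (2 ** attempt) + random.random() * base_delay)
    raise RuntimeError("still rate limited after retries")
```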
Server Error (500)
Causes:
- AI model temporarily unavailable
- Server overload
- Temporary service disruption
Best Practices
Message Design
Clear Instructions
Use specific, clear prompts for better results. Be explicit about what you want.
System Messages
Use system messages to set context and guide AI behavior for specialized tasks.
Conversation Context
Include relevant conversation history, but keep it concise to manage token usage.
Structured Prompts
Break complex requests into clear, structured instructions.
Performance Optimization
Handle Streaming Responses
Stream responses to provide better user experience in conversational applications.
Implement Error Handling
Always implement proper error handling and retry logic.
Monitor Token Usage
Track token usage to stay within limits and optimize costs.
Security Considerations
Input Validation
Validate and sanitize user inputs before sending to the API
Content Filtering
Implement content filtering for user-generated prompts
Rate Limiting
Implement application-side rate limiting to prevent abuse
Monitoring
Monitor API usage patterns for unusual activity
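The input-validation point above can be sketched as a small pre-flight check; the length limit and the specific sanitization rules are illustrative assumptions, not part of the API:

```python
MAX_PROMPT_CHARS = 8000  # illustrative application-side limit

def sanitize_prompt(text: str) -> str:
    """Basic application-side validation before sending user input to the API."""
    # Drop control characters but keep newlines.
    cleaned = "".join(ch for ch in text if ch == "\n" or ch.isprintable())
    cleaned = cleaned.strip()
    if not cleaned:
        raise ValueError("prompt is empty after sanitization")
    if len(cleaned) > MAX_PROMPT_CHARS:
        raise ValueError("prompt exceeds the application-side length limit")
    return cleaned
```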
Rate Limits
For detailed information about rate limits, token usage, and optimization strategies, see the Rate Limits guide.
- Token-based limits: Usage counts toward workspace token quotas
- Request rate: Standard rate limiting applies to prevent abuse
- Concurrent requests: Multiple simultaneous requests are supported
- Fair usage: Excessive usage may be throttled
Next Steps
Code Examples
View complete implementation examples in JavaScript, Python, and cURL
Rate Limits
Learn about token usage, optimization, and billing
Error Handling
Comprehensive guide to error codes and recovery patterns
API Reference
Complete API reference with schemas and interactive examples

