How It Works
Get full cost visibility into your AI API usage in just 3 simple steps. No code changes required. Works with any programming language.
Integration in 3 Simple Steps
From signup to full cost tracking in under 5 minutes
Create Your Project & Add API Keys
Sign up for a free account, create a project, and securely store your AI provider API keys. We use AES-256 encryption to protect your credentials.
Project Name: My AI Application
Provider: OpenAI
API Key: sk-proj-********************
Status: ✅ Active
Budget Limit: $500/month
Current Usage: $127.45 (25.5%)
Remaining: $372.55
Replace API Base URL
Simply replace the provider's base URL with our proxy endpoint. Your unique project token routes requests through our cost tracking layer.
- We forward your request to the provider (adding under 50ms on average)
- Calculate cost based on usage (tokens/images)
- Log metadata (model, timestamp, latency)
- Return the original response unchanged
- base_url = "https://api.openai.com/v1"
+ base_url = "https://proxy.apicostmonitor.com/v1"
headers = {
    "Authorization": "Bearer YOUR_OPENAI_KEY",
    "X-Project-Token": "acm_proj_abc123xyz"
}
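The four-step flow above can be sketched as a small pass-through handler. This is an illustrative sketch only: the function names, the per-token price argument, and the log fields are assumptions, not the service's actual internals.

```python
import time

def handle_request(request, forward, price_per_token):
    """Forward a request unchanged, compute cost from usage, log metadata.

    forward(request) -> provider response dict containing a 'usage' field.
    Returns the provider's response untouched, plus the metadata record.
    """
    start = time.monotonic()
    response = forward(request)                      # pass-through to the provider
    latency_ms = (time.monotonic() - start) * 1000
    usage = response.get("usage", {})
    cost = usage.get("total_tokens", 0) * price_per_token
    log = {                                          # metadata only, no content
        "model": request.get("model"),
        "timestamp": time.time(),
        "latency_ms": latency_ms,
        "cost_usd": cost,
    }
    return response, log                             # response is unchanged
```

The key property is the last line: the caller receives exactly the object the provider returned, so client code behaves as if the proxy were not there.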
Monitor Costs in Real-Time
Access your personalized dashboard to track costs per project, model, or time period. Set budget alerts and receive notifications before overspending.
Today's Activity:
─────────────────────────────────
Requests: 1,247 (+12% vs yesterday)
Total Cost: $12.45
Avg Latency: 342ms
Top Models:
1. gpt-4o $8.23 (66%)
2. claude-3-sonnet $2.89 (23%)
3. mistral-large $1.33 (11%)
⚠️ Alert: Project "ChatBot" at 85% of budget
Integration Code Examples
Works with any language. Here are the most popular implementations.
from openai import OpenAI

# Initialize client with API Cost Monitor proxy
client = OpenAI(
    api_key="your-openai-api-key",
    base_url="https://proxy.apicostmonitor.com/v1",
    default_headers={
        "X-Project-Token": "acm_proj_YOUR_TOKEN_HERE"
    }
)

# Use exactly as before - no code changes needed!
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "user", "content": "Explain quantum computing"}
    ]
)

print(response.choices[0].message.content)

# Cost tracking happens automatically in the background!
# Check your dashboard for real-time cost updates
const OpenAI = require('openai');

// Initialize client with API Cost Monitor proxy
const client = new OpenAI({
    apiKey: 'your-openai-api-key',
    baseURL: 'https://proxy.apicostmonitor.com/v1',
    defaultHeaders: {
        'X-Project-Token': 'acm_proj_YOUR_TOKEN_HERE'
    }
});

// Use exactly as before - no code changes needed!
async function main() {
    const response = await client.chat.completions.create({
        model: 'gpt-4o',
        messages: [
            { role: 'user', content: 'Explain quantum computing' }
        ]
    });
    console.log(response.choices[0].message.content);
}

main();

// Cost tracking happens automatically in the background!
// Check your dashboard for real-time cost updates
curl https://proxy.apicostmonitor.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_OPENAI_API_KEY" \
  -H "X-Project-Token: acm_proj_YOUR_TOKEN_HERE" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "user", "content": "Explain quantum computing"}
    ]
  }'

# The response uses the standard OpenAI format
# Cost tracking happens server-side automatically
<?php
// Using the Guzzle HTTP client
use GuzzleHttp\Client;

$client = new Client([
    'base_uri' => 'https://proxy.apicostmonitor.com/v1/',
    'headers' => [
        'Authorization' => 'Bearer YOUR_OPENAI_API_KEY',
        'X-Project-Token' => 'acm_proj_YOUR_TOKEN_HERE',
        'Content-Type' => 'application/json'
    ]
]);

$response = $client->post('chat/completions', [
    'json' => [
        'model' => 'gpt-4o',
        'messages' => [
            ['role' => 'user', 'content' => 'Explain quantum computing']
        ]
    ]
]);

$data = json_decode($response->getBody(), true);
echo $data['choices'][0]['message']['content'];

// Cost tracking happens automatically!
Technical FAQ
Common questions about integration and performance
How much latency does the proxy add?
Our proxy adds less than 50ms of overhead on average. This includes:
- 10-20ms: Request validation and token extraction
- 5-10ms: Database logging (async, non-blocking)
- 20-30ms: Network routing to provider
For AI API calls that take 500ms-5s, this adds roughly 1-10% overhead, which is usually negligible relative to model latency. We use edge computing (Cloudflare Workers) to minimize latency globally.
How do you calculate costs?
We use the exact same pricing models as the providers:
- Text Models: Count input/output tokens using tiktoken (OpenAI) or provider-specific tokenizers
- Image Models: Parse resolution and steps from request parameters
- Pricing Database: Updated daily with latest provider pricing (supports GPT-4, Claude, Mistral, Replicate, etc.)
Our cost calculations are typically accurate to ±$0.0001 compared to provider invoices.
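As a toy illustration of the token-based pricing described above, the sketch below multiplies token counts by per-model rates. The per-million-token prices in the table are placeholders for illustration, not real provider prices.

```python
# USD per 1M tokens -- illustrative placeholder rates, not real prices
PRICING = {
    "gpt-4o": {"input": 2.50, "output": 10.00},
}

def request_cost(model, input_tokens, output_tokens):
    """Compute the USD cost of one request from its token counts."""
    price = PRICING[model]
    return (input_tokens * price["input"]
            + output_tokens * price["output"]) / 1_000_000

# Example: 1,200 input tokens + 450 output tokens on "gpt-4o"
cost = request_cost("gpt-4o", input_tokens=1_200, output_tokens=450)
```

In practice the token counts come from the provider's `usage` field (or a tokenizer such as tiktoken), and the price table is refreshed as providers change their rates.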
Do you store my prompts or responses?
No, we never store request content or responses. We only log metadata:
- ✅ Timestamp, model name, token counts, cost
- ✅ HTTP status code, latency (ms)
- ✅ Project ID (for grouping)
- ❌ Prompts, completions, or any user data
- ❌ IP addresses or identifying information
Your data flows directly from your app → our proxy → provider → back to your app. We're just a transparent middleware layer.
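The metadata-only policy above amounts to an allow-list: keep the safe fields, drop everything else. The field names below are illustrative, not the service's actual schema.

```python
# Allow-listed metadata fields -- illustrative names
ALLOWED_FIELDS = {
    "timestamp", "model", "input_tokens", "output_tokens",
    "cost_usd", "status_code", "latency_ms", "project_id",
}

def to_metadata(record):
    """Strip a raw record down to allow-listed metadata before logging."""
    return {k: v for k, v in record.items() if k in ALLOWED_FIELDS}

raw = {
    "model": "gpt-4o",
    "project_id": "acm_proj_abc123xyz",
    "cost_usd": 0.0042,
    "prompt": "Explain quantum computing",  # never logged
    "ip_address": "203.0.113.7",            # never logged
}
safe = to_metadata(raw)
```

An allow-list is the safer default here: any field not explicitly approved is dropped, so new sensitive fields can never leak into logs by omission.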
What happens if your proxy goes down?
We guarantee 99.9% uptime with multiple redundancy layers:
- Multi-region deployment: Edge workers in 200+ locations worldwide
- Automatic failover: If one region fails, requests route to nearest healthy region
- Rate limit protection: We cache provider rate limits to prevent cascading failures
- Graceful degradation: If logging fails, requests still go through (monitoring continues when service recovers)
You can also implement a fallback: if our proxy returns 5xx errors, switch temporarily to direct provider URLs.
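That client-side fallback can be sketched as a thin wrapper: try the proxy first, and on a 5xx retry against the provider's direct URL. The helper names are assumptions for illustration; the URLs match the examples on this page.

```python
PROXY_URL = "https://proxy.apicostmonitor.com/v1"
DIRECT_URL = "https://api.openai.com/v1"

def call_with_fallback(send):
    """send(base_url) -> (status_code, body).

    Tries the proxy first; on a 5xx response, retries the same call
    against the provider's direct URL.
    """
    status, body = send(PROXY_URL)
    if 500 <= status < 600:               # proxy-side failure
        status, body = send(DIRECT_URL)   # note: cost tracking is skipped here
    return status, body
```

Requests that take the direct path are not tracked, so treat this as a temporary escape hatch rather than a steady-state configuration.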
Do you support streaming responses?
Yes, streaming is fully supported! We handle Server-Sent Events (SSE) transparently:
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[...],
    stream=True  # ✅ Works perfectly!
)

for chunk in response:
    # The final chunk's delta has no content, so guard against None
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
Cost calculation happens after the full stream completes. Latency for streaming is identical to non-streaming requests.
Which providers and models do you support?
✅ Currently Supported:
- OpenAI: GPT-4o, GPT-4, GPT-3.5, DALL-E 3, Whisper, TTS
- Anthropic: Claude 3.5 Sonnet, Claude 3 Opus/Haiku
- Mistral AI: Mistral Large, Medium, Small
- Replicate: Stable Diffusion, Flux, LLaMA models
🚧 Coming Soon:
- Google Gemini / Vertex AI
- Cohere
- HuggingFace Inference
- Azure OpenAI
Request a provider: Contact us if you need a specific provider integrated.
Ready to Start Tracking?
Join 500+ developers who trust API Cost Monitor for transparent AI cost tracking.
No credit card required • 14-day free trial • Cancel anytime