Drop-in replacement for OpenAI with built-in analytics and cost optimization
Replace your OpenAI endpoint with Neurometric in just one line. All your existing code works exactly the same.
Just change the base_url. Everything else stays the same.

Python:

import os
from openai import OpenAI
client = OpenAI(
api_key=os.environ.get("NEUROMETRIC_API_KEY"),
base_url="https://api.neurometric.ai/v1"
)
response = client.chat.completions.create(
model="gpt-4o",
messages=[
{"role": "user", "content": "Hello!"}
]
)
print(response.choices[0].message.content)

Node.js:

import OpenAI from 'openai';
const client = new OpenAI({
apiKey: process.env.NEUROMETRIC_API_KEY,
baseURL: 'https://api.neurometric.ai/v1',
});
const response = await client.chat.completions.create({
model: 'gpt-4o',
messages: [
{ role: 'user', content: 'Hello!' }
],
});
console.log(response.choices[0].message.content);

cURL:

curl -X POST https://api.neurometric.ai/v1/chat/completions \
-H "Authorization: Bearer $NEUROMETRIC_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4o",
"messages": [{"role": "user", "content": "Hello!"}]
  }'

Analytics are automatic. Every request through Neurometric is tracked, analyzed, and available in your dashboard without any extra code or setup.
Monitor all API calls, models used, tokens consumed, and response times in real-time.
Automatic cost breakdown by model, task type, and time period. Identify optimization opportunities.
Track latency, throughput, error rates, and success metrics across all your LLM requests.
Get AI-powered suggestions, based on your usage patterns, to reduce costs while maintaining quality.
Access detailed analytics, usage reports, and optimization recommendations.
All API requests require authentication using a Bearer token in the Authorization header.
Authorization: Bearer YOUR_API_KEY

POST /chat/completions

Create a chat completion. Fully compatible with OpenAI's chat completions API format.
Use any model ID from OpenAI, Anthropic, Google, or other providers. Neurometric automatically routes to the correct provider.
OpenAI: gpt-4o, gpt-4-turbo, gpt-3.5-turbo
Anthropic: claude-3-5-sonnet, claude-3-opus
Google: gemini-pro, gemini-flash
Meta: llama-3-70b, llama-3-8b
Mistral: mistral-large, mistral-7b
...and more: 100+ models supported
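Because routing keys off the model ID, switching providers is purely a payload change. A minimal sketch of that idea; build_request is an illustrative helper, not part of the Neurometric API:

```python
# Illustrative helper: the same OpenAI-style payload works for every
# provider behind Neurometric; only the model ID changes.
def build_request(model: str, prompt: str) -> dict:
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

# Same request shape, different providers; Neurometric routes on `model`.
openai_request = build_request("gpt-4o", "Hello!")
anthropic_request = build_request("claude-3-5-sonnet", "Hello!")
assert openai_request["messages"] == anthropic_request["messages"]
```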
model (string, required): ID of the model to use. See supported models above. Example: "gpt-4o", "claude-3-5-sonnet".
messages (array, required): Array of message objects with role and content. Roles: "system", "user", "assistant".
temperature (number, optional): Sampling temperature between 0 and 2. Higher values = more random. Default: 1.
max_tokens (integer, optional): Maximum number of tokens to generate in the response. Model-dependent default.
top_p (number, optional): Nucleus sampling: consider tokens within the top_p probability mass. Default: 1.
stop (string | array, optional): Sequences where the API will stop generating tokens. Up to 4 sequences.
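The bounds above can be checked client-side before a request is sent. A sketch; validate_params is a hypothetical helper, and the server may enforce its own limits:

```python
# Hypothetical client-side check mirroring the parameter table above:
# temperature in [0, 2], top_p in [0, 1], at most 4 stop sequences.
def validate_params(temperature=1.0, top_p=1.0, stop=None):
    if not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    if not 0 <= top_p <= 1:
        raise ValueError("top_p must be between 0 and 1")
    if stop is not None:
        sequences = [stop] if isinstance(stop, str) else list(stop)
        if len(sequences) > 4:
            raise ValueError("at most 4 stop sequences are allowed")
    return {"temperature": temperature, "top_p": top_p, "stop": stop}
```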
{
"model": "gpt-4o",
"messages": [
{
"role": "system",
"content": "You are a helpful assistant."
},
{
"role": "user",
"content": "What is the capital of France?"
}
],
"temperature": 0.7,
"max_tokens": 150
}

id (string): Unique identifier for the completion.
model (string): The model that generated the response.
choices (array): Array of completion choices. Each contains:
  message: the generated message object
  finish_reason: why generation stopped ("stop", "length", etc.)
  index: index of this choice
usage (object): Token usage statistics:
  prompt_tokens: tokens in the prompt
  completion_tokens: tokens in the response
  total_tokens: total tokens used
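The usage object makes per-call token accounting easy to verify. A small sketch; summarize_usage is an illustrative helper:

```python
# Illustrative helper: read the usage fields described above out of a
# parsed response dict and sanity-check that the totals add up.
def summarize_usage(response: dict) -> str:
    usage = response["usage"]
    prompt = usage["prompt_tokens"]
    completion = usage["completion_tokens"]
    total = usage["total_tokens"]
    assert total == prompt + completion, "token counts should add up"
    return f"{total} tokens ({prompt} prompt + {completion} completion)"
```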
{
"id": "chatcmpl-abc123",
"object": "chat.completion",
"created": 1706745600,
"model": "gpt-4o",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "The capital of France is Paris."
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 25,
"completion_tokens": 8,
"total_tokens": 33
}
}

Errors follow a standard format with HTTP status codes and descriptive messages:

{ "error": { "message": "...", "type": "...", "code": "..." } }

401 - Invalid API key
429 - Rate limit exceeded
500 - Server error
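One minimal way to branch on these codes; handle_error is a hypothetical helper, and the retry advice reflects general practice rather than a documented guarantee:

```python
# Hypothetical error handler for the status codes listed above.
# Expects the parsed {"error": {...}} body from the response.
def handle_error(status: int, body: dict) -> str:
    message = body.get("error", {}).get("message", "unknown error")
    if status == 401:
        return f"auth failed: {message} (check your API key)"
    if status == 429:
        return f"rate limited: {message} (retry with backoff)"
    if status >= 500:
        return f"server error: {message} (usually transient; retry)"
    return f"error {status}: {message}"
```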
If you're using TypeScript, here are the type definitions for request and response objects.
// Message in a conversation
interface Message {
role: 'system' | 'user' | 'assistant';
content: string;
}
// Request body for /chat/completions
interface ChatCompletionRequest {
model: string;
messages: Message[];
temperature?: number; // 0-2, default 1
max_tokens?: number; // Model-dependent
top_p?: number; // 0-1, default 1
stop?: string | string[]; // Up to 4 sequences
}
// Response from /chat/completions
interface ChatCompletionResponse {
id: string;
object: 'chat.completion';
created: number;
model: string;
choices: {
index: number;
message: Message;
finish_reason: 'stop' | 'length' | 'content_filter';
}[];
usage: {
prompt_tokens: number;
completion_tokens: number;
total_tokens: number;
};
}

1. Get your Neurometric API key
   Create an account and generate an API key.
2. Update your base URL
   Change https://api.openai.com/v1 to https://api.neurometric.ai/v1.
3. Replace your API key
   Use your Neurometric API key instead of your OpenAI key.
No other code changes required. Neurometric uses the same request/response format as OpenAI, so your existing code works without modification. You can even continue using the official OpenAI SDK.
Need help? We're here for you.