Models API

Run inference against hosted AI models. Currently powered by Ollama.

Available Models

| Model | Parameters | Provider | Description |
|---|---|---|---|
| `phi3:3.8b` | 3.8B | Ollama | Microsoft Phi-3 Mini — compact, efficient model for general tasks |

More models will be added over time.


List Available Models

GET /api/models

Public endpoint — no authentication required.

Response

```json
{
  "models": [
    {
      "name": "phi3:3.8b",
      "provider": "ollama",
      "parameterSize": "3.8B",
      "description": "Microsoft Phi-3 Mini — compact, efficient model for general tasks"
    }
  ]
}
```
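As a minimal sketch, the response above can be parsed like this in Python; the `model_names` helper and the embedded sample payload are illustrative, not part of the API:

```python
import json

# Sample response from GET /api/models (shape copied from the docs above)
raw = """
{
  "models": [
    {
      "name": "phi3:3.8b",
      "provider": "ollama",
      "parameterSize": "3.8B",
      "description": "Microsoft Phi-3 Mini - compact, efficient model for general tasks"
    }
  ]
}
"""

def model_names(payload: str) -> list[str]:
    """Extract the model identifiers from a /api/models response."""
    return [m["name"] for m in json.loads(payload)["models"]]

print(model_names(raw))  # ['phi3:3.8b']
```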

Run Inference

POST /api/models/infer

Requires the `Authorization: Bearer YOUR_JWT_TOKEN` header.

Request Body

| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| `model` | string | Yes | | Model identifier (e.g. `phi3:3.8b`) |
| `prompt` | string | Yes | | The prompt to send |
| `systemPrompt` | string | No | | System instructions for the model |
| `temperature` | number | No | 0.7 | Sampling temperature (0.0–2.0) |
| `maxTokens` | number | No | 1024 | Maximum tokens to generate (1–8192) |
| `stream` | boolean | No | false | Whether to stream the response |
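The field constraints above can be enforced client-side before sending. A sketch, assuming the documented ranges; `build_infer_request` is a hypothetical helper, not part of the API:

```python
def build_infer_request(model, prompt, system_prompt=None,
                        temperature=0.7, max_tokens=1024, stream=False):
    """Build and validate a request body for POST /api/models/infer.

    Range checks mirror the documented limits; this helper is
    illustrative, not part of the API itself.
    """
    if not model or not prompt:
        raise ValueError("model and prompt are required")
    if not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be in [0.0, 2.0]")
    if not 1 <= max_tokens <= 8192:
        raise ValueError("maxTokens must be in [1, 8192]")
    body = {
        "model": model,
        "prompt": prompt,
        "temperature": temperature,
        "maxTokens": max_tokens,
        "stream": stream,
    }
    if system_prompt is not None:
        body["systemPrompt"] = system_prompt
    return body
```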

Example

```bash
curl -X POST https://intelligence.cognitera.ai/api/models/infer \
  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "phi3:3.8b",
    "prompt": "Write a Python function to calculate fibonacci numbers",
    "systemPrompt": "You are a helpful programming assistant.",
    "temperature": 0.3,
    "maxTokens": 2048
  }'
```

Response

```json
{
  "id": "inference-uuid",
  "model": "phi3:3.8b",
  "response": "Here's a Python function to calculate Fibonacci numbers:\n\n```python\ndef fibonacci(n):\n    if n <= 1:\n        return n\n    a, b = 0, 1\n    for _ in range(2, n + 1):\n        a, b = b, a + b\n    return b\n```\n\nThis iterative approach runs in O(n) time...",
  "tokensUsed": 187,
  "durationMs": 2340
}
```
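The `tokensUsed` and `durationMs` fields can be combined into a rough throughput figure. A sketch (the helper name is illustrative):

```python
def tokens_per_second(tokens_used: int, duration_ms: int) -> float:
    """Rough generation throughput from the inference response metadata."""
    return tokens_used / (duration_ms / 1000.0)

# Using the example response above: 187 tokens over 2340 ms
print(round(tokens_per_second(187, 2340), 1))  # 79.9
```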

Health Check

GET /api/models/health

Public endpoint — verify model provider connectivity.

Response

```json
{
  "ollama": true
}
```