# Models API

Run inference against hosted AI models. Currently powered by Ollama.
## Available Models

| Model | Parameters | Provider | Description |
|---|---|---|---|
| `phi3:3.8b` | 3.8B | Ollama | Microsoft Phi-3 Mini — compact, efficient model for general tasks |

More models will be added over time.
## List Available Models

```
GET /api/models
```

Public endpoint — no authentication required.
### Response

```json
{
  "models": [
    {
      "name": "phi3:3.8b",
      "provider": "ollama",
      "parameterSize": "3.8B",
      "description": "Microsoft Phi-3 Mini — compact, efficient model for general tasks"
    }
  ]
}
```
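For client code, the response can be mapped into typed records. The following is a minimal sketch (the `ModelInfo` class and `parse_models` helper are illustrative names, not part of the API) that assumes only the response shape shown above:

```python
from dataclasses import dataclass


@dataclass
class ModelInfo:
    """One entry from the GET /api/models response."""
    name: str
    provider: str
    parameter_size: str
    description: str


def parse_models(payload: dict) -> list[ModelInfo]:
    """Convert the raw JSON payload into typed records.

    An empty or missing "models" key yields an empty list.
    """
    return [
        ModelInfo(
            name=m["name"],
            provider=m["provider"],
            parameter_size=m["parameterSize"],
            description=m["description"],
        )
        for m in payload.get("models", [])
    ]
```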
## Run Inference

```
POST /api/models/infer
```

Requires: `Authorization: Bearer YOUR_JWT_TOKEN`
### Request Body

| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| `model` | string | Yes | — | Model identifier (e.g. `phi3:3.8b`) |
| `prompt` | string | Yes | — | The prompt to send |
| `systemPrompt` | string | No | — | System instructions for the model |
| `temperature` | number | No | 0.7 | Sampling temperature (0.0–2.0) |
| `maxTokens` | number | No | 1024 | Maximum tokens to generate (1–8192) |
| `stream` | boolean | No | false | Whether to stream the response |
### Example

```bash
curl -X POST https://intelligence.cognitera.ai/api/models/infer \
  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "phi3:3.8b",
    "prompt": "Write a Python function to calculate fibonacci numbers",
    "systemPrompt": "You are a helpful programming assistant.",
    "temperature": 0.3,
    "maxTokens": 2048
  }'
```
### Response

```json
{
  "id": "inference-uuid",
  "model": "phi3:3.8b",
  "response": "Here's a Python function to calculate Fibonacci numbers:\n\n```python\ndef fibonacci(n):\n    if n <= 1:\n        return n\n    a, b = 0, 1\n    for _ in range(2, n + 1):\n        a, b = b, a + b\n    return b\n```\n\nThis iterative approach runs in O(n) time...",
  "tokensUsed": 187,
  "durationMs": 2340
}
```
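The `tokensUsed` and `durationMs` fields together give a rough generation throughput, which can be useful for monitoring. A small sketch (the `tokens_per_second` helper is illustrative, not part of the API):

```python
def tokens_per_second(response: dict) -> float:
    """Rough generation throughput derived from an inference response."""
    return response["tokensUsed"] / (response["durationMs"] / 1000.0)
```

For the sample response above, 187 tokens in 2340 ms works out to roughly 80 tokens per second.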
## Health Check

```
GET /api/models/health
```

Public endpoint — verify model provider connectivity.

### Response

```json
{
  "ollama": true
}
```