Technology / Models

OpenAI models, in a real agent architecture

GPT-4o, the o-series reasoners, function calling, and the Responses API are powerful primitives. We wire them into governed agents that take action in your systems — and stay portable to other providers.

  • GPT-4o & o-series routing
  • Native function / tool calling
  • Structured outputs & JSON mode
  • Provider-agnostic by design
2
model families we route between — fast 4o and deliberate o-series
128K
token context window we plan agent memory around
<1s
first-token latency targets via streaming
0
lock-in: swap providers without rewriting agents
// what it is

A family of models, not a single tool

OpenAI ships several distinct capabilities — and using them well means knowing which one to call, and when.

OpenAI's lineup splits roughly in two. GPT-4o (and the smaller 4o-mini) is the workhorse: fast, multimodal, strong at instruction-following, native tool calling, and structured JSON output. The o-series — the reasoning models — trades latency and cost for deliberate, multi-step thinking on hard problems like planning, complex code, and tricky logic.

On top of the models sit the platform pieces we actually build against: function calling so the model can request tools, structured outputs so it returns schema-valid JSON instead of free text, embeddings for retrieval, and the Responses API for stateful, tool-using turns. Those are the surfaces an agent framework hooks into.

None of this is magic, and none of it is a moat. A model is one component of an agent. The value Automatic.co adds is the architecture around it — routing, guardrails, integration, and observability — which is exactly what keeps you free to change models later.

// where it fits

What we use OpenAI for

Capabilities map to roles inside an agent — we assign the right model to the right job.

// how we adopt it

How OpenAI enters a build

We don't start with the model. We start with the workflow and earn our way to it.

01

Scope the step

We define each agent step's job, its inputs and outputs, and its tolerance for latency, cost, and error.

02

Benchmark models

We test OpenAI head-to-head with alternatives on your real data — accuracy, latency, and cost per task.

03

Wire behind an interface

The chosen model sits behind a provider-agnostic adapter with guardrails, retries, and structured-output validation.

04

Instrument & route

We meter tokens, cost, and quality per call, then route or downshift models as production data comes in.

// vendor-honest

We use OpenAI. We're not married to it.

OpenAI is genuinely good at certain jobs — fast tool calling, multimodal input, and, with the o-series, hard reasoning. We reach for it when it's the best tool for a step, and we say so plainly when it isn't.

Every agent we build calls models through an abstraction layer, so the orchestration logic, guardrails, and integrations don't care who's behind the API. That means you can shift a step to Anthropic, an open-source model you host, or a future OpenAI release without re-architecting. Portability is a design requirement, not a nice-to-have.

  • Provider-agnostic adapter layer
  • Per-step model selection on the merits
  • Azure OpenAI for in-tenant deployments
  • Open-source fallback for air-gapped work

GPT-4o vs. the o-series

Same provider, different jobs — we route by what a step actually needs.

GPT-4o / 4o-minio-series reasoners
Best forTool calls, extraction, draftingPlanning, complex code, hard logic
SpeedLow latency, high throughputSlower — it deliberates before answering
Cost profileCheaper per tokenHigher — reserved for steps that need it
Where in the agentAction & I/O layerThe planner / decision core

Frequently asked questions

Does building on OpenAI lock us in?

No. We keep the model behind a provider-agnostic interface, so an agent that calls GPT-4o can fall back to Anthropic or an open-source model without rewriting the orchestration. We pick OpenAI where it earns its place — not as a default.

When do you reach for the o-series reasoning models over GPT-4o?

When a step needs multi-step deliberation — planning, math-heavy logic, code generation with verification, or untangling ambiguous instructions. For high-volume, latency-sensitive tool-calling and extraction, GPT-4o or 4o-mini is faster and cheaper, so we route by step.

Can OpenAI models run inside our security perimeter?

OpenAI is API-hosted, so calls leave your network. We deploy it through Azure OpenAI in your tenant when residency matters, scrub and tokenize sensitive fields before the call, and reserve fully air-gapped workloads for open-source models we host for you.

How do you control OpenAI cost and latency in production?

Per-step model routing, prompt caching, structured outputs to cut retries, streaming for perceived latency, and hard token and spend budgets per agent. We instrument every call so cost and quality are observable, not a surprise on the invoice.

Bring a workflow. We'll pick the right model.

One working session to scope your automation and benchmark OpenAI against the alternatives on your own data.