OpenAI Launches GPT-5.5 — Its Most Powerful AI Model for Agentic Work

OpenAI has released GPT-5.5, a flagship AI model designed for autonomous multi-step task completion, agent management, and tool usage. The model is available in ChatGPT and Codex.

CoinJP Editorial

On April 23, OpenAI released GPT-5.5, its new flagship AI model, which the company positions as a fundamental leap in autonomous task execution and agent management. The model is already available to ChatGPT and Codex users.

«Introducing GPT-5.5 — A new class of intelligence for real work and powering agents, built to understand complex goals, use tools, check its work, and carry more tasks through to completion. It marks a new way of getting computer work done. Now available in ChatGPT and Codex.» — OpenAI (@OpenAI), original post

Why This Matters

GPT-5.5 represents a shift from models requiring constant user oversight to a system capable of independently planning multi-step workflows, selecting tools, verifying its own output, and completing tasks end-to-end. For the AI industry, this signals the continued evolution of the agentic paradigm — where models handle entire workflows rather than individual queries.

According to OpenAI, GPT-5.5 delivers higher intelligence without sacrificing speed: its per-token latency matches GPT-5.4 in real-world conditions while offering substantially stronger problem-solving capabilities. The model also consumes significantly fewer tokens than GPT-5.4 when operating in Codex.

GPT-5.5 benchmark results across various tests. Source: OpenAI

Agentic Programming and Code

GPT-5.5 is OpenAI's most capable solution for agentic programming. On Terminal-Bench 2.0, which evaluates complex command-line scenarios, the model achieved 82.7% accuracy. Its SWE-Bench Pro score reached 58.6%, and on Expert-SWE it surpassed GPT-5.4. Across all three benchmarks, token consumption was lower than the previous generation.

GPT-5.5 performance in programming tasks. Source: OpenAI

Within Codex, the model handles engineering tasks spanning implementation, refactoring, debugging, testing, and validation. OpenAI states GPT-5.5 demonstrates deeper system-level understanding: it grasps why something fails, identifies where fixes are needed, and anticipates which code sections will be affected.

GPT-5.5 compared to competitors in programming cost efficiency. Source: OpenAI

The model significantly outperforms both GPT-5.4 and Claude Opus 4.7 in logical reasoning and autonomy — proactively identifying issues and predicting testing and review needs without explicit prompts.

Professional and Intellectual Task Performance

On GDPval, which tests agent capabilities across 44 professions, GPT-5.5 scored 84.9%. It achieved 78.7% on OSWorld-Verified and 98% on Tau2-bench. Additional results include 60% on FinanceAgent, 88.5% on internal investment banking modeling tasks, and 54.1% on OfficeQA Pro.

GPT-5.5 results in professional benchmarks. Source: OpenAI
GPT-5.5 in financial analysis and office tasks. Source: OpenAI

Scientific Research and Information Processing

In scientific workflows, GPT-5.5 can sequentially explore ideas, gather evidence, test hypotheses, and interpret data. On GeneBench, a benchmark for multi-step analysis in genetics and quantitative biology, the model improved on GPT-5.4's scores. It also outperformed its predecessor on BixBench.

GPT-5.5 results in scientific research benchmarks. Source: OpenAI

Over 85% of employees across OpenAI's various departments use Codex weekly — not just for software development but also in finance, communications, marketing, data analytics, and product management.

GPT-5.5 performance in biological benchmarks. Source: OpenAI

Availability and Pricing

GPT-5.5 is available in ChatGPT and Codex for Plus, Pro, Business, and Enterprise subscribers. A GPT-5.5 Pro variant has been released for Pro, Business, and Enterprise users. Both versions will soon be accessible via API at $5 per 1 million input tokens and $30 per 1 million output tokens. The context window spans 1 million tokens.
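To make the per-token pricing concrete, here is a minimal sketch of the arithmetic, assuming the quoted rates mean $5 per 1 million input tokens and $30 per 1 million output tokens. The function name and the example request size are hypothetical and are not taken from OpenAI's announcement.

```python
# Rates quoted in the article, in US dollars per 1 million tokens.
INPUT_USD_PER_MILLION = 5.0
OUTPUT_USD_PER_MILLION = 30.0


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request at the quoted per-million-token rates."""
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_MILLION
    return input_cost + output_cost


# Hypothetical request: 200,000 input tokens and 10,000 output tokens.
example_cost = estimate_cost_usd(200_000, 10_000)
print(f"${example_cost:.2f}")  # prints "$1.30"
```

At these rates an output token costs six times as much as an input token, so long generated answers dominate the bill more quickly than long prompts do.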

In Codex, the model is available for Plus, Pro, Business, Enterprise, Edu, and Go plans with a 400,000-token context window. GPT-5.5 also offers a Fast mode that generates tokens 1.5x faster at 2.5x the cost. GPT-5.5 is more expensive than GPT-5.4, and OpenAI justifies the higher pricing by pointing to the model's greater token efficiency.
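The Fast mode trade-off can be worked through the same way. The sketch below assumes, for simplicity, that the 1.5x speedup applies to the whole run time of a job and that the 2.5x multiplier applies to its whole cost. The function name and the example job are hypothetical.

```python
FAST_SPEEDUP = 1.5          # tokens are generated 1.5x faster (from the article)
FAST_COST_MULTIPLIER = 2.5  # at 2.5x the cost (from the article)


def fast_mode_tradeoff(base_cost_usd: float, base_seconds: float) -> dict:
    """Compare a job in standard mode with the same job in Fast mode."""
    fast_cost = base_cost_usd * FAST_COST_MULTIPLIER
    fast_seconds = base_seconds / FAST_SPEEDUP
    return {
        "fast_cost_usd": fast_cost,
        "fast_seconds": fast_seconds,
        "extra_cost_usd": fast_cost - base_cost_usd,
        "seconds_saved": base_seconds - fast_seconds,
    }


# Hypothetical job: $2.00 and 90 seconds in standard mode.
result = fast_mode_tradeoff(base_cost_usd=2.00, base_seconds=90.0)
print(result)
```

Under these assumptions the example job would cost $5.00 instead of $2.00 and finish in 60 seconds instead of 90, so Fast mode pays off mainly when latency matters more than spend.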

Prior to launch, OpenAI applied what it described as its most comprehensive set of safety measures, collaborating with both internal and external specialists.


Frequently Asked Questions

What is GPT-5.5 and what makes it different?

GPT-5.5 is OpenAI's flagship AI model designed for autonomous multi-step task completion and agent management. It surpasses GPT-5.4 in intelligence while maintaining comparable latency and using fewer tokens.

How much does GPT-5.5 API access cost?

API pricing is set at $5 per 1 million input tokens and $30 per 1 million output tokens. The model offers a context window of 1 million tokens.

Which subscription plans support GPT-5.5?

GPT-5.5 is available in ChatGPT and Codex for Plus, Pro, Business, and Enterprise users. A GPT-5.5 Pro version is available for Pro, Business, and Enterprise subscribers. Codex additionally supports Edu and Go plans.

How does GPT-5.5 perform in programming benchmarks?

GPT-5.5 achieved 82.7% accuracy on Terminal-Bench 2.0, 58.6% on SWE-Bench Pro, and outperformed GPT-5.4 on Expert-SWE. It consumed fewer tokens across all three benchmarks compared to its predecessor.

Can GPT-5.5 handle scientific research tasks?

GPT-5.5 can sequentially explore ideas, gather evidence, test hypotheses, and interpret data. It showed improved results over GPT-5.4 on both the GeneBench and BixBench scientific benchmarks.
