OpenAI Launches GPT-5.5 — Its Most Powerful AI Model for Agentic Work
OpenAI has released GPT-5.5, a flagship AI model designed for autonomous multi-step task completion, agent management, and tool usage. The model is available in ChatGPT and Codex.
OpenAI released GPT-5.5 on April 23, positioning its new flagship AI model as a fundamental leap in autonomous task execution and agent management. The model is already available to ChatGPT and Codex users.
"Introducing GPT-5.5 — A new class of intelligence for real work and powering agents, built to understand complex goals, use tools, check its work, and carry more tasks through to completion. It marks a new way of getting computer work done. Now available in ChatGPT and Codex." (OpenAI, @OpenAI)
Why This Matters
GPT-5.5 represents a shift from models requiring constant user oversight to a system capable of independently planning multi-step workflows, selecting tools, verifying its own output, and completing tasks end-to-end. For the AI industry, this signals the continued evolution of the agentic paradigm — where models handle entire workflows rather than individual queries.
According to OpenAI, GPT-5.5 delivers higher intelligence without sacrificing speed: its per-token latency matches GPT-5.4 in real-world conditions while offering substantially stronger problem-solving capabilities. The model also consumes significantly fewer tokens when operating in Codex.

Agentic Programming and Code
GPT-5.5 is OpenAI's most capable solution for agentic programming. On Terminal-Bench 2.0, which evaluates complex command-line scenarios, the model achieved 82.7% accuracy. Its SWE-Bench Pro score reached 58.6%, and on Expert-SWE it surpassed GPT-5.4. Across all three benchmarks, token consumption was lower than the previous generation.

Within Codex, the model handles engineering tasks spanning implementation, refactoring, debugging, testing, and validation. OpenAI states GPT-5.5 demonstrates deeper system-level understanding: it grasps why something fails, identifies where fixes are needed, and anticipates which code sections will be affected.

The model significantly outperforms both GPT-5.4 and Claude Opus 4.7 in logical reasoning and autonomy — proactively identifying issues and predicting testing and review needs without explicit prompts.
Professional and Intellectual Task Performance
On GDPval, which tests agent capabilities across 44 professions, GPT-5.5 scored 84.9%. It achieved 78.7% on OSWorld-Verified and 98% on Tau2-bench. Additional results include 60% on FinanceAgent, 88.5% on internal investment banking modeling tasks, and 54.1% on OfficeQA Pro.


Scientific Research and Information Processing
In scientific workflows, GPT-5.5 can sequentially explore ideas, gather evidence, test hypotheses, and interpret data. On GeneBench, a benchmark for multi-step analysis in genetics and quantitative biology, the model improved upon GPT-5.4's scores. It also outperformed its predecessor on BixBench.

Over 85% of employees across OpenAI's various departments use Codex weekly — not just for software development but also in finance, communications, marketing, data analytics, and product management.

Availability and Pricing
GPT-5.5 is available in ChatGPT and Codex for Plus, Pro, Business, and Enterprise subscribers. A GPT-5.5 Pro variant has been released for Pro, Business, and Enterprise users. Both versions will soon be accessible via API at $5 per 1 million input tokens and $30 per 1 million output tokens. The context window spans 1 million tokens.
In Codex, the model is available for Plus, Pro, Business, Enterprise, Edu, and Go plans with a 400,000-token context window. GPT-5.5 also offers a Fast mode that generates tokens 1.5x faster at 2.5x the cost. GPT-5.5 is more expensive than GPT-5.4, but OpenAI argues that its greater token efficiency offsets the higher price.
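To make the pricing concrete, here is a minimal cost-estimation sketch. It takes the API rates as $5 per 1 million input tokens and $30 per 1 million output tokens. The function name `estimate_cost_usd` is hypothetical, and applying the 2.5x Fast-mode multiplier uniformly to input and output tokens is an assumption, since the article does not say how the premium is billed or whether Fast mode will be offered through the API.

```python
# Rates as quoted in this article, in USD per 1 million tokens.
INPUT_USD_PER_MILLION = 5.0
OUTPUT_USD_PER_MILLION = 30.0

# Assumption: the 2.5x Fast-mode premium applies equally to input and output.
FAST_MODE_COST_MULTIPLIER = 2.5


def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      fast_mode: bool = False) -> float:
    """Estimate the cost of one request in USD (illustrative helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    cost = (input_tokens * INPUT_USD_PER_MILLION
            + output_tokens * OUTPUT_USD_PER_MILLION) / 1_000_000
    if fast_mode:
        cost *= FAST_MODE_COST_MULTIPLIER
    return cost


if __name__ == "__main__":
    # Example: a request with 200,000 input tokens and 20,000 output tokens.
    standard = estimate_cost_usd(200_000, 20_000)
    fast = estimate_cost_usd(200_000, 20_000, fast_mode=True)
    print(f"standard: ${standard:.2f}, fast: ${fast:.2f}")
```

Under these assumptions, a request with 200,000 input tokens and 20,000 output tokens costs about $1.60 at standard rates ($1.00 for input plus $0.60 for output), or about $4.00 in Fast mode.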
Prior to launch, OpenAI applied what it described as its most comprehensive set of safety measures, collaborating with both internal and external specialists.
Frequently Asked Questions
What is GPT-5.5 and what makes it different?
GPT-5.5 is OpenAI's flagship AI model designed for autonomous multi-step task completion and agent management. It surpasses GPT-5.4 in intelligence while maintaining comparable latency and using fewer tokens.
How much does GPT-5.5 API access cost?
API pricing is set at $5 per 1 million input tokens and $30 per 1 million output tokens. The model offers a context window of 1 million tokens.
Which subscription plans support GPT-5.5?
GPT-5.5 is available in ChatGPT and Codex for Plus, Pro, Business, and Enterprise users. A GPT-5.5 Pro version is available for Pro, Business, and Enterprise subscribers. Codex additionally supports Edu and Go plans.
How does GPT-5.5 perform in programming benchmarks?
GPT-5.5 achieved 82.7% accuracy on Terminal-Bench 2.0, 58.6% on SWE-Bench Pro, and outperformed GPT-5.4 on Expert-SWE. It consumed fewer tokens across all three benchmarks compared to its predecessor.
Can GPT-5.5 handle scientific research tasks?
GPT-5.5 can sequentially explore ideas, gather evidence, test hypotheses, and interpret data. It showed improved results over GPT-5.4 on both the GeneBench and BixBench scientific benchmarks.
