GPT Image
GPT Image is a series of image generation and editing models developed by OpenAI. A text-to-image variant of the GPT family, it uses deep learning to generate digital images from natural language descriptions or input images. As the successor to DALL-E, GPT Image is native to ChatGPT and available through the API. Upon release in March 2025, GPT Image went viral on social media, particularly for its ability to generate images in the style of Studio Ghibli. GPT Image is also available in Microsoft Copilot and Apple Intelligence.
History
The first GPT Image model was revealed by OpenAI as "GPT-4o image generation" in a blog post on March 25, 2025; it was built on the GPT-4o model. It was initially available only to paid users, with the rollout to free users delayed due to high demand. Use of the feature was subsequently limited, with Sam Altman saying that the GPUs were "melting" from the level of use. OpenAI later said that over 130 million users around the world had created more than 700 million images in the first week. The model was named GPT Image 1 and introduced to the API on April 23. A cost-efficient version, GPT Image 1 Mini, was released on October 6 at OpenAI DevDay 2025, priced 80% lower in the API than GPT Image 1.
A new model named GPT Image 1.5 was introduced on December 16. It was rolled out globally as "ChatGPT Images" to all users and made available via the API immediately. OpenAI claimed that the new model can make precise edits while keeping details intact and generates images up to four times faster. Image inputs and outputs in the API are 20% cheaper with GPT Image 1.5 than with GPT Image 1.
Capabilities
Unlike their diffusion-based predecessors DALL-E 2 and DALL-E 3, GPT Image models are autoregressive and add several capabilities, including image-to-image transformation, advanced photorealism, and detailed instruction following. GPT Image can generate images in three sizes: 1024 × 1024, 1536 × 1024, and 1024 × 1536 pixels.
GPT Image 1.5 addresses the premature cropping and warm color bias of the previous model, but it has regressed in generating some specific art styles. Weaknesses in handling multiple faces and some languages, such as Chinese, Arabic, and Hebrew, remain in the latest model.
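Because only three output sizes are supported, a caller that wants an arbitrary aspect ratio has to map it onto one of them. The following is a minimal sketch of one way to do that, not part of any official SDK: the helper name `closest_supported_size` and the choice of comparing aspect ratios on a logarithmic scale are assumptions made for illustration, and the "WIDTHxHEIGHT" string format is assumed to be what the caller needs.

```python
import math

# The three output sizes listed in this article, as (width, height) in pixels.
SUPPORTED_SIZES = [(1024, 1024), (1536, 1024), (1024, 1536)]


def closest_supported_size(width: int, height: int) -> str:
    """Return the supported size whose aspect ratio is nearest to width:height.

    Hypothetical helper for illustration. Ratios are compared on a log scale
    so that landscape and portrait requests are treated symmetrically.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    target = math.log(width / height)
    best = min(
        SUPPORTED_SIZES,
        key=lambda size: abs(math.log(size[0] / size[1]) - target),
    )
    # Assumed string format: "WIDTHxHEIGHT".
    return f"{best[0]}x{best[1]}"


if __name__ == "__main__":
    for requested in [(1920, 1080), (1080, 1920), (800, 800)]:
        print(requested, "->", closest_supported_size(*requested))
```

Under these assumptions, a 16:9 request maps to the landscape size, a 9:16 request to the portrait size, and a square request to 1024 × 1024. The result still has to be cropped or padded afterwards if the exact requested ratio matters.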