GPT Image

GPT Image is a series of image generation and editing models developed by OpenAI. A text-to-image variant of the GPT family, it uses deep learning to generate digital images from natural language descriptions or input images. As the successor to DALL-E, GPT Image is native to ChatGPT and available through the API. Upon release in March 2025, GPT Image went viral on social media, particularly for its ability to generate images in the style of Studio Ghibli. GPT Image is also available through Microsoft Copilot and Apple Intelligence.

History

The first GPT Image model was announced by OpenAI as "GPT-4o image generation" in a blog post on March 25, 2025; it was built on the GPT-4o model. It was initially made available only to paid users, with the rollout to free users delayed due to high demand. Use of the feature was subsequently limited, with Sam Altman saying that the GPUs were "melting" from the level of use. OpenAI later said that over 130 million users around the world had created more than 700 million images in the first week. The model was named GPT Image 1 and introduced to the API on April 23. A cost-efficient version, GPT Image 1 Mini, was released on October 6 at OpenAI DevDay 2025, priced 80% lower in the API than GPT Image 1.
A new model, GPT Image 1.5, was introduced on December 16; it was rolled out globally to all users as "ChatGPT Images" and made immediately available via the API. OpenAI claimed that the new model can make precise edits while keeping details intact and generates images up to four times faster. In the API, image inputs and outputs are 20% cheaper with GPT Image 1.5 than with GPT Image 1.

Capabilities

Unlike their diffusion-based predecessors DALL-E 2 and DALL-E 3, GPT Image models are autoregressive and add several capabilities, including image-to-image transformation, advanced photorealism, and detailed instruction following. GPT Image can generate images in three sizes: 1024 × 1024, 1536 × 1024, and 1024 × 1536 pixels.
GPT Image 1.5 addresses the premature cropping and warm color bias of the previous model, but it has regressed in generating some specific art styles. Moreover, weaknesses in rendering multiple faces and in handling some languages, such as Chinese, Arabic, and Hebrew, remain in the latest model.
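
Because only three output sizes are listed, a caller with an arbitrary source image has to map it to one of them. The following Python sketch illustrates one way to do so, by picking the listed size whose aspect ratio is closest to the input. The helper names closest_supported_size and size_string are hypothetical and are not part of any OpenAI library; the "WIDTHxHEIGHT" string form is likewise an assumption about how a size might be passed to an API request.

```python
import math

# The three output sizes named in the text, as (width, height) in pixels.
SUPPORTED_SIZES = [(1024, 1024), (1536, 1024), (1024, 1536)]


def closest_supported_size(width, height):
    """Return the listed size whose aspect ratio is nearest to width:height.

    Hypothetical helper. Ratios are compared on a log scale so that
    landscape and portrait inputs are treated symmetrically.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    target = math.log(width / height)
    return min(
        SUPPORTED_SIZES,
        key=lambda size: abs(math.log(size[0] / size[1]) - target),
    )


def size_string(size):
    """Format a (width, height) pair as 'WIDTHxHEIGHT' (assumed string form)."""
    return f"{size[0]}x{size[1]}"


if __name__ == "__main__":
    # A 16:9 landscape photo maps to the landscape size.
    print(size_string(closest_supported_size(1920, 1080)))
```

Under these assumptions, a 1920 × 1080 input maps to the landscape size, a square input to the square size, and a 1080 × 1920 input to the portrait size.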

Reception

Technology commentators generally regarded GPT Image as a significant advance in image generation. TechRadar highlighted that GPT Image 1 delivers impressive performance and can produce a wide range of outputs, from photorealistic scenes to stylized illustrations, noting marked improvements in text rendering and multimodal integration compared with earlier tools. However, Heise Online reported that GPT Image 1 exhibits technical weaknesses such as over-sharpening artifacts, a warm color bias, and common mistakes in rendering human poses and object overlaps, indicating limitations in output realism despite strong overall performance.

Cultural impact

Upon the launch of GPT Image 1 in March 2025, photographs recreated in the style of Studio Ghibli films went viral. Sam Altman acknowledged the trend by changing his Twitter profile picture to a Studio Ghibli-inspired one. The White House's official Twitter account posted a Ghibli-style image mocking the arrest by immigration authorities of Virginia Basora-Gonzalez, a migrant previously deported after being convicted of fentanyl trafficking; the image shows her crying as an immigration officer places her in handcuffs. North American distributor GKids responded to the trend in a press release, comparing the use of the filter to its coinciding IMAX re-release of the 1997 Studio Ghibli film Princess Mononoke.