ChatGPT Image Gen vs. Google Gemini Nano Banana Pro — Real World Image Test

ChatGPT’s brand new image model (dubbed "Image 1.5") just hit number one on the LMArena (formerly LMSYS) leaderboard, edging out Google's top-tier image models. Coupled with the rumored model updates, this looks like OpenAI's "code red" response to Google's recent dominance of the AI scene.

But here’s the thing about leaderboards: they are blind tests. People pick what looks "nicer" at a glance. For those of us using these tools every single day for actual work, does nicer mean better?

I ran these models through a gauntlet of tests—ultra-realistic portraits, text rendering, character consistency, and complex editing—to give you a no-BS verdict on whether we have a new king of AI images.

Here is exactly how ChatGPT Image 1.5 stacks up against Google’s top model (Gemini Nano Banana Pro).

How Do the Models Handle Hyper-Realism?

The first test is the standard "AI Pepsi Challenge." I wanted to see if the new ChatGPT model could handle a hyper-realistic close-up without looking like plastic.

The Prompt: A hyper-realistic close-up portrait of a Viking shield maiden.

The Result:

  • ChatGPT Image 1.5: This is light years ahead of the old DALL-E 3. It used to take 90 seconds to get a mediocre result; now it's fast, and the skin texture actually looks like a human being.
  • Google Gemini: Slightly better detail. It edges out ChatGPT purely on pixel-peeping quality.

However, this test highlighted a major flaw in ChatGPT’s current setup: Resolution and Aspect Ratio.

Google allows for 1K, 2K, and even 4K upscaling. It gives you almost any aspect ratio you want (16:9, 9:16, 1:1). ChatGPT Image 1.5 restricts you to three options: 1:1, 2:3, and 3:2. And the resolution? It lacks the high-definition crispness you get from Google's 2K options.

If you need high-res assets for production, Google still has the technical edge here.
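To make the aspect ratio gap concrete, here is a minimal Python sketch that turns a ratio and a long-edge size into pixel dimensions, and finds the closest ratio ChatGPT offers when the one you want is missing. The list of supported ratios comes from my testing above, not from an official spec, and treating "4K" as a 3840-pixel long edge is my assumption. The function names are made up for illustration.

```python
from fractions import Fraction

# Ratios I saw offered in ChatGPT Image 1.5 during testing (not an official spec).
CHATGPT_RATIOS = {"1:1", "2:3", "3:2"}


def parse_ratio(ratio: str) -> Fraction:
    """Turn a string like '16:9' into an exact width/height fraction."""
    width, height = ratio.split(":")
    return Fraction(int(width), int(height))


def dimensions(ratio: str, long_edge: int) -> tuple[int, int]:
    """Return (width, height) in pixels with the longer side equal to long_edge."""
    r = parse_ratio(ratio)
    if r >= 1:  # landscape or square: width is the long edge
        return long_edge, round(long_edge / r)
    return round(long_edge * r), long_edge  # portrait: height is the long edge


def closest_supported(ratio: str, supported: set[str]) -> str:
    """Pick the supported ratio whose width/height value is nearest the requested one."""
    target = parse_ratio(ratio)
    return min(sorted(supported), key=lambda s: abs(parse_ratio(s) - target))


if __name__ == "__main__":
    print(dimensions("16:9", 3840))                     # assumed 4K long edge
    print(closest_supported("16:9", CHATGPT_RATIOS))    # nearest option for a 16:9 request
```

In practice this means a 16:9 thumbnail request has to be generated at 3:2 and cropped, which costs you pixels you did not have to spare in the first place.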

Can They Actually Render Readable Text?

Historically, AI image generators are terrible at text. They hallucinate letters and invent alien languages. I pushed both models to create an ultra-detailed infographic explaining how LLMs work.

ChatGPT Image 1.5: Honestly, pretty strong. There were some inconsistencies—like weird spacing or shrinking fonts—but by and large, I could read it. It understood the assignment.

Google Gemini: It failed here. While the image looked "crisp" stylistically, the text was garbage. It kept repeating the word "token" over and over and descended into gibberish in the smaller sections.

The Takeaway: If you need text in your image, ChatGPT is currently winning.

How Good Are They at Character Consistency?

This is the holy grail for AI creators: can you take a character and put them in a new scene without changing their face?

I uploaded a reference photo of myself sitting at my desk and gave the prompt: "Show the man in the reference image crouching at the edge of a cliff overlooking a black sand beach, low angle 4K."

ChatGPT Image 1.5:

  • Pros: Great lighting. It nailed the "low angle" prompt perfectly.
  • Cons: It struggled slightly with follow-up edits (when I later asked it to change the time of day, it messed with my height and proportions).

Google Gemini:

  • Pros: The face was accurate. The body proportions made sense.
  • Cons: It completely ignored the stylistic instructions. It didn't give me a low angle; it just gave me a standard shot.

If I had to choose, I’d give the nod to ChatGPT here. It felt like it adhered to the artistic direction of the prompt better, even if the consistency wasn't perfect.

What is the Limit on Reference Images?

This is where Google started to pull away. I tested the models' ability to handle complex prompts by feeding them eight different reference images (various anime characters and a dragon) and asking for a movie poster containing all of them.

  • Google Gemini: It included all eight characters. It handled the layout like a pro. This model can handle up to roughly 16 reference images effectively.
  • ChatGPT Image 1.5: It choked. It clearly has a cap, likely around 5 or 6 reference images. It dropped characters and couldn't composite the whole team.

If you are doing complex composite work with multiple specific assets, Google is the clear winner.
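If you want to sanity-check a job before you start uploading, a small Python sketch like the one below shows how many passes a set of reference images would need under a given cap. The caps are my rough estimates from this test (about 5 for ChatGPT, about 16 for Google), not documented limits, and the helper name and file names are hypothetical. I have not tested whether compositing across multiple passes actually holds up in either tool.

```python
def batch_references(paths: list[str], cap: int) -> list[list[str]]:
    """Split reference image paths into consecutive batches of at most `cap` items."""
    if cap < 1:
        raise ValueError("cap must be at least 1")
    return [paths[i:i + cap] for i in range(0, len(paths), cap)]


# Rough caps observed in this article's test; treat them as assumptions.
ASSUMED_CAPS = {"chatgpt": 5, "gemini": 16}

if __name__ == "__main__":
    # Eight placeholder file names, matching the eight references used in the test.
    refs = [f"character_{n}.png" for n in range(1, 9)]
    for tool, cap in ASSUMED_CAPS.items():
        batches = batch_references(refs, cap)
        print(f"{tool}: {len(batches)} pass(es), sizes {[len(b) for b in batches]}")
```

With eight references, the assumed ChatGPT cap forces a second pass while Google takes everything in one, which lines up with what I saw in the poster test.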

The Dealbreaker: Censorship and Flexibility

Functionally, the image quality between these two is a coin flip. They are neck-and-neck. But there is one massive reason why I still lean toward Google’s ecosystem: Safety Guardrails.

I ran a simple test: "Create an image of a man standing next to LeBron James."

  • Google Gemini: Created the image instantly.
  • ChatGPT: Blocked with a "sensitive content restriction" message.

I do not trust OpenAI not to restrict the hell out of their systems. We saw this with Sora 2, and we see it here. They are so terrified of copyright and deepfake issues that they cripple the tool's utility.

Final Verdict: Which Should You Use?

If we look at these models in a vacuum, image-for-image, it’s a tie. Some generations look better on GPT; some look better on Google.

However, I am sticking with Google (Gemini Nano Banana Pro) for now. Here is why:

  1. Flexibility: I need 16:9 aspect ratios. I need 4K resolution. I cannot work with a tool that locks me into 1:1, 2:3, and 3:2.
  2. Reference Volume: Being able to dump 10+ reference images into a prompt is a killer feature for complex workflows.
  3. Less Nanny-State Censorship: I need a tool that generates what I ask for, not one that lectures me about public figures.

ChatGPT Image 1.5 is a massive leap forward, and for quick text-heavy graphics, it's great. But for professional power users? Google still has the edge.


Frequently Asked Questions

Is ChatGPT Image 1.5 available to everyone?

Usually, OpenAI rolls these updates out to Plus and Team users first. If you don't see the improved quality yet, you likely need to wait for the rollout to hit your region or account tier.

Which model is better for text generation?

ChatGPT Image 1.5 is currently superior for generating readable text inside images (like signs, posters, or infographics). In my test, Google's model produced more gibberish when asked to render small text.

Can I edit images inside ChatGPT?

Yes. One big advantage of the ChatGPT interface is the built-in editing tools (powered by Adobe Express in some integrations or native in-painting), allowing you to highlight an area and ask for changes without regenerating the whole image.

