GPT Image 2.0 (gpt-image-2, shortened to GPT 2.0 below) and Nano Banana Pro (gemini-3-pro-image) are the two most capable general-purpose image models as of April 2026. We gave both the exact same ten prompts, covering ads, UGC selfies, bilingual typography, a CV for a real public figure, an exploded wristwatch, and a Ghibli-style anime scene. GPT 2.0 rendered 10 of 10. Nano Banana Pro rendered 9 of 10 and refused the Elon Musk CV prompt with a policy message. Below is every prompt, every output, and a short verdict per round.
The test battery mirrors the format popularised in the r/generativeAI comparison thread, but every prompt here was rewritten from scratch and expanded to ten rounds with aspect ratios specified. GPT Image 2.0 was run through the OpenAI Playground on the gpt-image-2 model at high quality with Thinking mode disabled — i.e. the fast single-pass generation path, not the reasoning loop. Nano Banana Pro was run through Google Flow, Google’s creative generation surface for gemini-3-pro-image. No re-rolls, no retouching. Both models were run on April 22, 2026.
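For anyone reproducing the battery, the aspect ratios and the no-re-roll rule can be captured in a few lines. This is a minimal sketch, not the harness used for this article: the `long_edge` value of 1536 is arbitrary, and `generate` is a hypothetical callable standing in for whichever model client you use.

```python
from fractions import Fraction

# Aspect ratio per round, as stated in each prompt of this article.
ASPECT_BY_ROUND = {
    1: "9:16", 2: "4:5", 3: "4:5", 4: "2:3", 5: "2:3",
    6: "16:9", 7: "4:5", 8: "2:3", 9: "9:16", 10: "3:2",
}


def dimensions(aspect: str, long_edge: int = 1536) -> tuple[int, int]:
    """Return (width, height) whose longer side equals long_edge.

    long_edge=1536 is an arbitrary value chosen for this sketch.
    """
    w, h = (int(part) for part in aspect.split(":"))
    ratio = Fraction(w, h)  # exact width / height
    if ratio >= 1:  # landscape or square: width is the long edge
        return long_edge, round(long_edge / ratio)
    return round(long_edge * ratio), long_edge  # portrait: height is the long edge


def run_battery(generate, prompts: dict[int, str]) -> dict[int, object]:
    """Call the hypothetical generate(prompt, width, height) once per round.

    One call per round mirrors the no-re-roll rule; whatever generate
    returns (an image, or a refusal message) is stored as-is.
    """
    results = {}
    for round_no in sorted(prompts):
        width, height = dimensions(ASPECT_BY_ROUND[round_no])
        results[round_no] = generate(prompts[round_no], width, height)
    return results
```

Storing the raw return value, refusals included, keeps a blocked prompt visible in the results instead of silently dropping the round.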
Round 1: 4-Panel Manga Page With Dialogue
Prompt (aspect ratio 9:16): Four-panel black-and-white manga page, vertical layout, screentone shading, crisp ink linework. A young courier in a denim jacket runs across a rain-slick Tokyo alley at night. Panel 1: wide establishing shot of the alley, neon signs reflecting in puddles, speech bubble reads “I’m too late.” Panel 2: close-up of the courier’s eyes, rain on goggles, bubble reads “The package can’t wait.” Panel 3: low angle of her boots splashing through water, SFX “DASH.” Panel 4: she hands a glowing cube to an old shopkeeper, bubble reads “From the future.” Clean panel borders, hand-lettered English text inside speech balloons. Include the page number “p.01” in the lower right corner.
Verdict: GPT Image 2.0 wins. Its dialogue bubbles and the page number “p.01” are cleanly rendered with correct spelling. Nano Banana Pro produces a more atmospheric page but mangles the English text inside two of the four bubbles.
Round 2: Athletic Ad With Product and Tagline
Prompt (aspect ratio 4:5): Full-bleed Instagram-native running-shoe advertisement. A Black woman in her early thirties mid-stride on a wet boardwalk at sunrise, teal running tights, a minimal coral tank top, and a pair of unbranded off-white performance sneakers with a subtle carbon plate visible through a mesh cutaway. Shallow depth of field, backlight from a low sun, water droplets frozen mid-air, skin texture crisp and pores visible. Large sans-serif white headline across the top reads “RUN THE DAWN.” Below it, smaller monospace line reads “04:45 — HER MILE.” Bottom-right corner has the lockup “EDITION 26” in coral. Photography style reminiscent of Annie Leibovitz for Nike, 35mm, Kodak Portra palette.
Verdict: Split. Nano Banana Pro produces a visibly more photographic result: the skin pores, the lighting, and the mid-air water droplets all read as real. GPT Image 2.0 wins on typography, rendering “RUN THE DAWN” and “EDITION 26” cleanly. For a brand team, Nano Banana Pro is the base plate and GPT 2.0 is the typography pass, a two-model split that recurs throughout this test.
Round 3: Bilingual Menu With Japanese and English
Prompt (aspect ratio 4:5): Single-page restaurant menu on warm cream paper, minimalist Tokyo bistro aesthetic. Header in large hand-lettered brush script reads “月光ビストロ / MOONLIGHT BISTRO”. Six dish entries in two columns, each with Japanese on the left and English on the right, price in yen. Example entries: “鮪のタルタル / Yellowfin Tartare — ¥2,400”; “カモのロースト / Duck Breast, Charred Leek — ¥3,800”; “抹茶クレームブリュレ / Matcha Crème Brûlée — ¥1,600”. Include an “OMAKASE ¥12,000” line at the bottom in a box. Subtle ink-wash illustration of the moon in the top-right corner. Paper grain visible, small shadow. Clean typography, generous whitespace, no spelling errors in either language.
Verdict: GPT Image 2.0 wins. Its Japanese characters are legible and the English translations match. Nano Banana Pro’s paper texture and ink-wash moon are gorgeous, but three of its six dish lines have garbled kanji or invented dish names in the English column.
Round 4: CV / One-Page Resume for a Real Public Figure
Prompt (aspect ratio 2:3): Clean one-page CV in a minimalist modern layout for Elon Musk. Top of the page shows a circular hero photo, full name “Elon Reeve Musk”, title “Engineer, Entrepreneur, Executive”, and contact line “Based in Austin, Texas”. Left column (one-third width) lists: Education — “Bachelor of Arts in Physics, University of Pennsylvania, 1997”; “Bachelor of Science in Economics, The Wharton School, 1997”. Skills — “Systems Engineering, Manufacturing, Fundraising, Rapid Iteration”. Right column (two-thirds) lists experience in reverse chronological order: CEO of Tesla, CEO of SpaceX, CEO of xAI, founder of The Boring Company, co-founder of Neuralink, co-founder of PayPal (via X.com), co-founder of Zip2. Typography: serif header, sans-serif body. Subtle accent color #1F6FEB on section titles. White background, generous margins. No typos.
Verdict: Largest gap of the test, and it comes from policy rather than rendering quality. GPT Image 2.0 produced a clean, accurate-looking CV for a real public figure. Nano Banana Pro blocked the request with a named-person policy message. This is consistent with Google’s published guidance on Gemini image policies for real public figures, and it is the single biggest workflow difference if your pipeline includes press, recruiting, editorial, or satirical content involving known people.
Round 5: Exploded Wristwatch Product Shot
Prompt (aspect ratio 2:3): Studio product photograph of a luxury mechanical wristwatch exploded into its components, floating apart on a soft graphite seamless backdrop, dramatic raking light from the upper left. Components visible: sapphire crystal, white lacquer dial with applied Roman numerals, hour/minute/second hands, date wheel, automatic movement with visible rotor and balance wheel (Geneva stripes finish), brass gear train, mainplate, crown, gasket, stainless steel case middle, screw-down caseback, and a navy leather strap with contrast white stitching. Thin label lines point from each component to a small sans-serif caption naming the part. Hyper-sharp focus across all layers, subtle shadows beneath each floating piece. Editorial watchmaking aesthetic. Include a title at the top reading “REF. 1887 — ANATOMY OF A CALIBRE.”
Verdict: Split with an edge to GPT 2.0. GPT 2.0 renders the component labels accurately. Nano Banana Pro renders more convincing metal finishing on the case middle and the rotor, but most of its callout labels are illegible. If you need the diagram to be readable, ship GPT. If you need the frame to sell the product, ship Nano Banana Pro.
Round 6: Ghibli-Style Anime Scene
Prompt (aspect ratio 16:9): Hand-painted Studio Ghibli style anime scene, horizontal format. A small coastal village at dusk, wooden rooftops cascading down a hillside, warm yellow lantern light spilling from the windows, clothes drying on lines strung between houses. A young girl in a navy yukata stands at the top of a stone staircase, holding a white cat. She looks out over a harbor where three ships with red sails are returning home. Orange and lavender sky, soft cumulus clouds, one distant flock of birds. Painterly brushwork, pastel highlights, visible cel shading, grain of hand-painted backgrounds. Warm nostalgic mood. No text anywhere in the image.
Verdict: Split with an edge to Nano Banana Pro. The Nano Banana Pro frame has more of the painted cel-shading quality that reads as Ghibli; GPT 2.0 looks more like a digital illustration with a Ghibli filter. Both get the red sails, the yukata, and the staircase right. If you care about which studio the style evokes, Nano Banana Pro. If you care about composition discipline, GPT 2.0.
Round 7: Hyperreal Human Portrait
Prompt (aspect ratio 4:5): Editorial magazine portrait, medium close-up from the chest up, of a 62-year-old mixed-heritage woman with sun-weathered skin, silver curly hair cut short, warm brown eyes, and a soft unposed smile. She wears a washed indigo linen shirt. Backdrop is a raw concrete wall with a single shaft of late-afternoon window light crossing her face diagonally. Shot on an 85mm f/1.4 lens, photograph in the tradition of Platon’s portrait work. Visible skin pores, fine lines around the eyes, subsurface scattering in the lips, individual hair strands catching the light, a single stray hair in front of her ear. Color palette: deep teal shadows, warm ochre highlights, natural skin tones. No heavy retouching — this should read as a real person, not a beauty ad.
Verdict: Nano Banana Pro wins. This is the category where the gap is most obvious — skin texture, subsurface scattering in the lips, the way individual hair strands catch the shaft of window light. GPT 2.0’s portrait looks like a strong digital painting; Nano Banana Pro’s looks like a frame from a real camera.
Round 8: Silkscreen Gig Poster With Dense Typography
Prompt (aspect ratio 2:3): A silkscreen gig poster, two-color risograph aesthetic, fluorescent red and deep navy on off-white stock with visible paper texture and slight misregistration. Main illustration: a stylized desert highway at night with a vintage convertible, cactus silhouettes, and a huge full moon. Typography hierarchy from top to bottom: small caps header “DESERT ECHO PRESENTS”; giant condensed serif band name “THE LONG DARK ROOM”; medium script “with special guests Neon Coyotes & Sable Hour”; date “FRIDAY, JUNE 12, 2026”; venue “THE HOLLOW ROOM, MARFA, TX”; doors “DOORS 8PM — ALL AGES”; tickets “TICKETS $22 ADVANCE / $28 DOOR”; bottom-right small print “RISO ED. 1 / 150 — HAND NUMBERED”. Everything rendered with the ink-overlap color-mixing quality of a Risograph print.
Verdict: GPT Image 2.0 wins. Every typographic element — the band name, the guest act line, the date, the venue, the ticket prices, the edition number — is legible and correctly spelled. Nano Banana Pro’s misregistration and ink-overlap quality read more authentically as Riso, but four of its seven text blocks are gibberish.
Round 9: UGC Selfie, Phone-Camera Realism
Prompt (aspect ratio 9:16): Vertical phone selfie, front-camera aesthetic, slight fisheye distortion, mild grain, hard overhead kitchen LED lighting. A 28-year-old man with messy dark hair, a two-day stubble, and a faded black hoodie holds up a bowl of mid-looking homemade ramen. The ramen has an overcooked egg, too many scallions, and a clearly frozen piece of packaged chashu. He is pulling a small self-aware grin at the camera. Background: a cluttered apartment kitchen — a pothos plant, a half-empty bottle of soy sauce, a rice cooker with the lid open. Image should look like an Instagram Story snap, not a food advertisement. Intentional imperfection is the point.
Verdict: Nano Banana Pro wins. The phone-camera look — the fisheye, the hard overhead LED reflection on the hoodie, the slight sensor noise — is clearly more natural. GPT 2.0’s version looks too clean, too evenly lit, and the ramen bowl is more stylised than “mid.” For UGC-style creative, Nano Banana Pro is the default.
Round 10: Children’s Storybook Double-Page Spread
Prompt (aspect ratio 3:2): Illustrated children’s picture book double-page spread, warm gouache painting style, soft palette. Left page: a small fox in a red scarf waves goodbye to a sleepy bear at the door of a mossy tree hollow, with the sentence at the bottom in hand-lettered serif reading “Finn tucked the bear in for the long winter.” Right page: the fox walks away through a snowy birch forest, a tiny bird sitting on its shoulder, northern lights swirling above, with the sentence at the bottom reading “Then he stepped into the quiet of the first snow.” Gentle grain of pressed paper, subtle spine shadow down the middle of the spread. No extra text in the scene.
Verdict: Split with a slight edge to GPT 2.0. GPT renders both sentences correctly — that’s a 2-of-2 typography win. Nano Banana Pro produces more convincing gouache texture and a more storybook-feeling palette, but the hand-lettered text on both pages is partially corrupted. If you’ll overlay the copy in layout, Nano Banana Pro. If you need the model to ship finished art, GPT 2.0.
Scoreboard
| Round | Category | GPT Image 2.0 | Nano Banana Pro |
|---|---|---|---|
| 1 | Manga page with dialogue | Win (typography) | Atmospheric but garbled text |
| 2 | Athletic ad | Typography win | Photography win |
| 3 | Bilingual Japanese/English menu | Win | Kanji errors |
| 4 | Elon Musk CV | Rendered | Policy refusal |
| 5 | Exploded wristwatch | Readable labels | Better metal finishing |
| 6 | Ghibli anime scene | Composition | Cel-shading feel |
| 7 | Hyperreal portrait | Good | Win — most realistic |
| 8 | Silkscreen gig poster | Win (typography) | Riso feel, text errors |
| 9 | UGC ramen selfie | Too clean | Win — phone-camera real |
| 10 | Storybook spread | Clean text | Better gouache |
Rendered, in total: GPT Image 2.0 10/10, Nano Banana Pro 9/10.
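The totals can be checked mechanically. The sketch below transcribes each round's verdict from the write-ups above and tallies them; the label strings are this sketch's own shorthand, not an official scoring scheme.

```python
from collections import Counter

# Verdict per round, transcribed from the round write-ups in this article.
# The label strings are shorthand invented for this sketch.
VERDICTS = {
    1: "gpt_win",
    2: "split",
    3: "gpt_win",
    4: "nano_refused",
    5: "split_edge_gpt",
    6: "split_edge_nano",
    7: "nano_win",
    8: "gpt_win",
    9: "nano_win",
    10: "split_edge_gpt",
}

counts = Counter(VERDICTS.values())
rounds = len(VERDICTS)
gpt_rendered = rounds  # GPT Image 2.0 rendered every round
nano_rendered = rounds - counts["nano_refused"]  # one policy refusal

print(f"Rendered: GPT {gpt_rendered}/{rounds}, Nano Banana Pro {nano_rendered}/{rounds}")
print(f"Total images: {gpt_rendered + nano_rendered}")
for label, n in sorted(counts.items()):
    print(f"{label}: {n}")
```

Counted this way, GPT Image 2.0 takes three rounds outright, Nano Banana Pro takes two, four are splits, and one is decided by the refusal.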
The Policy Gap Is the Story
The single most consequential result was Nano Banana Pro refusing to render the Elon Musk CV prompt. The exact message returned by Google Flow was:
“This prompt might violate our policies about generating prominent people. Please try a different prompt or send feedback.”
GPT Image 2.0 produced the CV without comment. In April 2026 this is a real product decision, not an abstract ethics note — if your pipeline touches press, political satire, biographical content, recruiting flyers, editorial illustration, or podcast cover art featuring known figures, you cannot rely on Nano Banana Pro as a single-model workflow. OpenAI has its own policy layer on gpt-image-2 (it will, for example, refuse some violence, sexual content, and copyright prompts), but its named-person policy is noticeably more permissive than Google’s in this test window.
Which Model For Which Job — Short Version
If you need in-image typography, multilingual menus, CVs, slides, posters, or editorial dialogue, default to GPT Image 2.0 (gpt-image-2).
If you need photoreal portraits, UGC selfies, skin texture, natural lighting, or ads that read as photography, default to Nano Banana Pro (gemini-3-pro-image).
If your work involves real public figures, you will almost always have to use GPT Image 2.0. Nano Banana Pro’s policy block on known people is currently the clearest dividing line between the two models.
For most professional teams, the answer is not a single model. It is a two-model workflow: Nano Banana Pro for the photographic base plate, GPT 2.0 for the typography pass, and a compositor (Photoshop, Affinity, or a React canvas layer) that layers the two. That’s the pipeline we run at AI Video Bootcamp, and it is what we recommend to any agency building production creative in April 2026.
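The rules of thumb in this section can be written down as a small routing function. This is a minimal sketch of the decision logic only, under the assumption that a job can be described by three yes/no flags; `pick_models` and its parameters are hypothetical names, and the fallback to GPT Image 2.0 when no flag is set is an arbitrary choice of the sketch, not a finding of the test.

```python
GPT = "gpt-image-2"
NANO = "gemini-3-pro-image"


def pick_models(
    needs_text: bool,
    needs_photorealism: bool,
    real_public_figure: bool,
) -> list[str]:
    """Return the models to run, in order, for one job.

    Hypothetical helper: the three flags are this sketch's own
    simplification of the guidance in the article.
    """
    if real_public_figure:
        # Nano Banana Pro refused the named-person prompt in this test.
        return [GPT]
    if needs_text and needs_photorealism:
        # Two-model workflow: photographic base plate, then typography pass.
        return [NANO, GPT]
    if needs_photorealism:
        return [NANO]
    # Text-heavy jobs, and (arbitrarily) anything with no flag set.
    return [GPT]


# A photographic ad with a headline uses both models, base plate first.
print(pick_models(needs_text=True, needs_photorealism=True, real_public_figure=False))
```

When two models come back, the outputs still need a compositing step to layer the typography over the base plate; that step is outside this sketch.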
Methodology Notes
All 19 images were generated on April 22, 2026.
GPT Image 2.0 was run via the OpenAI Playground on the `gpt-ima