How to Create a Viral “Stadium” AI Video — A Complete Guide

A step-by-step guide to making a realistic stadium meme video for TikTok and Reels, using GPT Image and Kling 3.0 with a face reference.

Author: Alina Dudnikova

What is this trend?

One of the most popular TikTok/Reels trends in recent months features a person sitting alone in a stadium, seemingly watching everything from the sidelines. These videos usually carry captions like:

“Me watching my life fall apart in real time”
“Me watching how…”
“Sitting in the stands watching how…”

The whole vibe of this trend is built around the feeling of a random shot from a live broadcast: the person looks completely ordinary, isn’t posing, and seems to have been accidentally caught by the camera during a match.
The key element of this format is realism. The video should look like a real sports broadcast clip, not an obvious “AI-generated video.”

Step 1. Prepare a Face Reference

For this trend, it’s best to use a reference photo of a real person. This helps maintain realism and keeps the character consistent.
The best reference photo is:

  • a regular selfie
  • taken with a neutral facial expression
  • well lit
  • free of heavy filters

The main purpose of the reference is identification, not style.

Step 2. Generate a “Stadium Stands” Image with GPT Image

This prompt works especially well with GPT Image 2.0:

Ultra realistic candid stadium photo, a young person sitting alone in the stadium stands during a football match at night, casually sitting with one leg crossed over the other, relaxed posture, natural facial expression, slightly looking toward the field, maybe barely noticing the camera, authentic “caught on live TV” feeling, cinematic sports broadcast atmosphere, realistic arena lighting, shallow depth of field, crowd blurred in background, no posing, no AI-generated look, feels like a random normal person who accidentally became a viral meme after being shown on TV, documentary photography style, telephoto lens, soft stadium lights, highly detailed skin texture, natural imperfections, realistic clothes, subtle emotion, atmospheric, viral candid moment
Use the uploaded reference image ONLY for facial identity and appearance consistency.
Do not stylize the face.
Keep the expression neutral and natural.
slightly grainy broadcast camera quality, live sports TV screenshot aesthetic
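
If you plan to reuse this prompt for different subjects or sports, it can help to keep its parts in one place. The following is a minimal Python sketch, not part of GPT Image or any SDK: the names `SCENE_TAGS`, `REFERENCE_RULES`, `FINISH`, and `build_image_prompt` are hypothetical, and the tag list is a shortened selection from the prompt above. It only assembles text; you still paste the result into the image tool along with your reference photo.

```python
# Hypothetical helper: these names are illustrative and do not come from any SDK.
# The tags are a shortened selection from the Step 2 prompt.
SCENE_TAGS = [
    "Ultra realistic candid stadium photo",
    "{subject} sitting alone in the stadium stands during a {sport} match at night",
    "casually sitting with one leg crossed over the other",
    "relaxed posture, natural facial expression",
    "authentic \"caught on live TV\" feeling",
    "documentary photography style, telephoto lens",
    "crowd blurred in background, shallow depth of field",
]

# Instructions that limit how the uploaded face reference is used.
REFERENCE_RULES = [
    "Use the uploaded reference image ONLY for facial identity and appearance consistency.",
    "Do not stylize the face.",
    "Keep the expression neutral and natural.",
]

# Closing line that pushes the output toward a TV-screenshot look.
FINISH = "slightly grainy broadcast camera quality, live sports TV screenshot aesthetic"


def build_image_prompt(subject="a young person", sport="football"):
    """Join scene tags, reference rules, and the finishing line into one prompt."""
    scene = ", ".join(tag.format(subject=subject, sport=sport) for tag in SCENE_TAGS)
    return "\n".join([scene, *REFERENCE_RULES, FINISH])


if __name__ == "__main__":
    print(build_image_prompt(subject="a young woman", sport="basketball"))
```

Keeping the reference rules separate from the scene tags means you can change the scene freely without accidentally dropping the instructions that protect facial identity.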

Step 3. Animate the Image in Kling 3.0

After generating the image, you can use it as an image-to-video reference in Kling 3.0.
The key rule is not to add too much motion. The whole point of the trend is naturalness and the feeling of a random moment.
Here’s a working prompt for Kling:

A hyper realistic cinematic video of a young man sitting alone in the stadium stands during a live football match at night. He sits calmly with one leg crossed over the other, relaxed posture, naturally watching the game. His facial expression is neutral and authentic, as if he barely notices the camera or maybe does not notice it at all. The scene should feel like a real accidental TV broadcast moment that later became a viral internet meme.
The camera slowly zooms in from a distant broadcast angle. Subtle natural body movement, blinking, breathing, tiny head movements. The stadium is full, crowd blurred in the background, bright stadium lights illuminating the arena realistically. Real sports atmosphere, documentary-style broadcast aesthetic, telephoto lens look, shallow depth of field, realistic skin texture, natural imperfections, realistic clothing folds and lighting.
IMPORTANT:
Use the reference image only for facial identity consistency.
Do NOT make the character look AI-generated or over-stylized.
The final result must look like authentic live TV footage captured during a real match.
Style: ultra realistic, cinematic, live sports broadcast, viral candid moment
Motion: subtle realistic human movement only
Camera: slow zoom-in, handheld broadcast micro-movements
Lighting: realistic stadium night lighting
Quality: photorealistic, natural colors, immersive atmosphere
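
The labeled lines at the end of the prompt (Style, Motion, Camera, Lighting, Quality) are the part you are most likely to tweak between attempts. As a rough sketch, assuming you keep your prompts in a small Python script, they could be stored as a dictionary and appended to the description. The names `SETTINGS` and `build_video_prompt` are hypothetical, and nothing here calls Kling; it only produces the text to paste in.

```python
# Hypothetical helper: field names mirror the labels in the Kling prompt above.
# This does not call any Kling API; it only builds the prompt text.
SETTINGS = {
    "Style": "ultra realistic, cinematic, live sports broadcast, viral candid moment",
    "Motion": "subtle realistic human movement only",
    "Camera": "slow zoom-in, handheld broadcast micro-movements",
    "Lighting": "realistic stadium night lighting",
    "Quality": "photorealistic, natural colors, immersive atmosphere",
}


def build_video_prompt(description, settings=SETTINGS, **overrides):
    """Append 'Label: value' lines to the description, applying any overrides."""
    merged = {**settings, **overrides}  # overrides keep the original label order
    lines = [description.strip()]
    lines += [f"{label}: {value}" for label, value in merged.items()]
    return "\n".join(lines)


if __name__ == "__main__":
    text = "A young man sits alone in the stadium stands during a live football match at night."
    # Try a variant with no zoom while leaving every other setting untouched.
    print(build_video_prompt(text, Camera="static telephoto shot"))
```

Changing one field at a time makes it easier to see which setting caused unwanted motion in the result.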

Why these videos go viral

The main reason for their popularity is the sense of realism. People feel like they’re watching a real broadcast clip rather than an AI-generated video. That’s exactly why these videos often get millions of views on TikTok, Reels, and Shorts.
