NLW
If there is an important move of the camera, just like an important subject action, they suggest only doing one.
Lighting recipe, one way the scene is lit, which could be about where the light is coming from, whether it's natural light or something else.
And of course, your one dialogue or sound.
Sora videos don't have to have talking, although most of them have opted for that, so you can give sound or music cues in addition to just dialogue.
I'll show you one example where I didn't have a specific dialogue, but I did by implication have sound instructions.
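To make the "only one of each" guidance concrete, here is a minimal sketch of a prompt builder that accepts exactly one camera move, one subject action, one lighting recipe, and one sound cue. The function name, the field labels, the `scene` argument, and the sample text are all hypothetical; the output is just plain text you would paste into Sora, not a format OpenAI prescribes.

```python
def build_simple_prompt(scene, camera_move, subject_action, lighting, sound):
    """Join one item of each kind into a short plain-text prompt.

    Taking single strings (not lists) is what enforces "only do one".
    All labels below are illustrative, not an official schema.
    """
    parts = [
        scene,
        f"Camera: {camera_move}",
        f"Action: {subject_action}",
        f"Lighting: {lighting}",
        f"Sound: {sound}",
    ]
    # Drop any piece left blank so the prompt has no empty lines.
    return "\n".join(p.strip() for p in parts if p and p.strip())


# Hypothetical example with a sound cue instead of dialogue.
simple_prompt = build_simple_prompt(
    scene="A quiet kitchen in the early morning.",
    camera_move="slow push-in",
    subject_action="a kettle starts to steam",
    lighting="natural window light from the left",
    sound="kettle whistle rising, no dialogue",
)
print(simple_prompt)
```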
But what if you're saying, look, I like the creativity of AI and all of those things make sense, but I have a vision for something much more comprehensive, either because I'm trying to actually make a short film or maybe because I have really particular requirements where I'm trying to accomplish something for my marketing material, for my product shots.
Luckily, OpenAI says that Sora is capable of handling ultra-detailed prompts as well.
They write, for complex cinematic shots, you can go beyond the standard prompt structure and specify the look, camera setup, grading, soundscape, even shot rationale in professional production terms.
This is similar to how a director briefs a camera crew or VFX team.
Detailed cues for lensing, filtration, lighting, grading, and motion help the model lock onto a very specific aesthetic.
So the example they give has a slew of sections before they get into what's actually happening in the video.
They have a format and look section: digital capture emulating 65mm photochemical contrast.
They have a lenses and filtration section, a grade and palette section that gets into the highlights, mids, and blacks, a lighting and atmosphere section that's separate from that (natural sunlight from camera left, low angle), and a location and framing section that has the general setting (urban commuter platform at dawn) and then a breakout for foreground, midground, background.
They also, importantly, have negative prompts here: avoid signage or corporate branding.
Then the last two are wardrobe props and extras, describing the characters and people who are going to be in the video, and finally the sound.
And that's all before you get to the shot list, which again, this is a four second video.
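One way to keep a brief like that organized is to hold the sections in a fixed order and render them as labeled blocks of text. The sketch below is only an illustration: the section names follow the ones described above, the sample lines are the fragments quoted in this episode, and the helper name and output layout are assumptions rather than anything OpenAI specifies.

```python
# Section names as described above; the exact labels are an assumption.
SECTION_ORDER = [
    "Format & Look",
    "Lenses & Filtration",
    "Grade / Palette",
    "Lighting & Atmosphere",
    "Location & Framing",
    "Wardrobe / Props / Extras",
    "Sound",
]


def build_brief(sections):
    """Render {section name: [lines]} as text blocks in SECTION_ORDER.

    Sections with no lines are skipped; unknown names raise an error
    so a typo does not silently drop part of the brief.
    """
    unknown = set(sections) - set(SECTION_ORDER)
    if unknown:
        raise ValueError(f"unknown sections: {sorted(unknown)}")
    blocks = []
    for name in SECTION_ORDER:
        lines = sections.get(name)
        if not lines:
            continue
        body = "\n".join(f"- {line}" for line in lines)
        blocks.append(f"{name}:\n{body}")
    return "\n\n".join(blocks)


# Only the fragments quoted in the episode; other sections left out.
brief = build_brief({
    "Location & Framing": [
        "urban commuter platform at dawn",
        "avoid signage or corporate branding",  # negative prompt
    ],
    "Format & Look": [
        "digital capture emulating 65mm photochemical contrast",
    ],
    "Lighting & Atmosphere": [
        "natural sunlight from camera left, low angle",
    ],
})
print(brief)
```

The fixed order means the look and lighting always come before the location details, no matter what order you fill them in.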
You'll note that with this shot list, it is not just the sequence, they're also giving it timing.
This is something that I'm starting to see a lot on Twitter slash X as people share their best results, is that providing a shot list with specific timestamps really helps the model adhere to what you're going for.
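A timestamped shot list is easy to get wrong by hand, with gaps, overlaps, or shots that run past the clip length. As a rough sketch, the helper below formats a list of shots and checks that they run back to back and fill the whole clip. The function name, the `start-end s: description` layout, and the two sample shots are hypothetical; only the four-second duration comes from the example discussed here.

```python
def format_shot_list(shots, duration):
    """Format (start, end, description) tuples as timestamped lines.

    Raises ValueError if shots overlap, leave gaps, or do not add up
    to `duration` seconds.
    """
    cursor = 0.0
    lines = []
    for start, end, description in shots:
        # Each shot must begin where the previous one ended.
        if abs(start - cursor) > 1e-6:
            raise ValueError(f"shot at {start}s should start at {cursor}s")
        if end <= start:
            raise ValueError(f"shot at {start}s has no positive length")
        lines.append(f"{start:.2f}-{end:.2f}s: {description}")
        cursor = end
    if abs(cursor - duration) > 1e-6:
        raise ValueError(f"shots end at {cursor}s, clip is {duration}s")
    return "\n".join(lines)


# Hypothetical two-shot list for a four second clip.
shot_list = format_shot_list(
    [
        (0.0, 2.5, "wide shot of the platform as a train arrives"),
        (2.5, 4.0, "close-up of a commuter looking up"),
    ],
    duration=4.0,
)
print(shot_list)
```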
Finally, they have a few additional camera notes and how to finish up, but this is the level of complexity that you can get to.