[00:00] In AI Studio, go over to Generate Media, click Gemini Native Image, and I have this little robot guy I generated for my AI DevEssentials newsletter. I'm gonna drag him in, drop him right there, then I'm gonna copy and paste in a very simple prompt with "expression laughter eyes closed". I'll let this run, and now we have a nice variation where the robot could be laughing at the news. But if I want to do a lot of variations with a lot of different expressions, then we're going to have to turn to scripting. So in my terminal I'm going to create a new project.
[00:30] I use the take command in Oh My Zsh when creating a directory and then navigating into it, rather than mkdir and cd. I'll just call this project expressions. Hit enter and then open this project in Cursor. I've aliased the letter c to cursor, which, if I don't pass anything to it, just opens the current directory in Cursor. So we now have this empty project called expressions.
[00:53] So from here I'm going to open the terminal in Cursor. I'm going to run bun init. Bun is simply the easiest way to initialize a project and run it in TypeScript. So this will be a blank project, and you can see it's scaffolded out a project with an index.ts file. So what we're going to do is go back into AI Studio and grab Get code in the top right, click copy, and if you look at this code, it includes our original prompt with a base64 of our uploaded image and then the result, so you could include the entire conversation.
[01:25] So I'll click copy and drop this into our index.ts, paste it in, and then we can close AI Studio. And then the first thing we'll need to do is deal with the packages we're missing. So we'll install the dependencies with bun i @google/genai and then space mime, and let that run. Or if you want to be super lazy, you could always select this, hit command I, and tell any type of agent, please install all the missing dependencies, and that would work as well. But from here the thing that we're missing is this Gemini API key.
[01:56] For example, we can console.log process.env.GEMINI_API_KEY and then just terminate the script right after this. Then we run this script: I'll clear the terminal and I'll say bun run index. You can see this is currently undefined. So let's go ahead and create a .env file.
[02:15] The main way I create files now is to open the file picker with command P, and then I start typing the name of the file and then press down once and hit enter, and that creates the file for me. This little symbol means that Cursor is ignoring this file, so the agents and such will not read it in. And then, to assign our GEMINI_API_KEY, we're gonna have to generate an API key. Make sure you are signed in.
[02:36] I am signed into my account here. We'll click get API key and then click create API key from this dialog. Select a project to assign it to. I have a huge list of projects. My top ones are billing and free.
[02:48] I'm just gonna use my free account for now. We'll create this key, then copy and paste this into here, and I'm gonna delete this API key, so don't even worry about trying to use it. All I have to do later is come down here and scroll over and click trash to delete this key I just created, which I'll do after this video. So now if we run bun index again, this time you can see we have our environment variable in context. Now there are definitely more advanced techniques for storing environment variables, inside of 1Password and other more secure options, which are recommended for maximum security.
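The undefined-then-defined check on the API key can be sketched as a small guard. This assumes the variable is spelled GEMINI_API_KEY and relies on Bun loading .env files from the project root automatically; requireEnv is a hypothetical helper, not something in the code AI Studio generates.

```typescript
// Hypothetical helper: fail fast with a readable message when a variable is missing.
type Env = Record<string, string | undefined>;

function requireEnv(name: string, env: Env = process.env): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing environment variable ${name}; add it to .env`);
  }
  return value;
}

// Usage in index.ts (assumed variable name):
// const apiKey = requireEnv("GEMINI_API_KEY");
```

Throwing early like this replaces the temporary console.log plus process.exit pair, and the error points at the .env file rather than surfacing later as a failed request.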
[03:20] So I'll close this out, clear out my terminal, and now we can remove process.exit here. And now if we let this run, bun run index.ts, it looks like I must have exceeded my quota on the free plan from some other scripts or projects from this morning. So I'll go back into AI Studio and copy and paste one of my paid keys. All right, I edited out a little part there where I pasted in a paid key. I'll clear this out, run this again, and this time once the script is done you'll see ENTER_FILE_NAME is the name of our generated image.
[03:52] And it looks like this time it gave him glasses. So to loop through a list of prompts, let's do a couple steps first. First I'm going to drag and drop our actual source image into the same directory. And instead of using raw base64, which is here, I'm going to import readFile and writeFile from fs/promises. So writeFile and readFile.
[04:14] We don't need this saveBinaryFile function. We'll create that ourselves. And actually, since we're in a single file, we don't need a main function and we can just operate with top-level await. And I'm gonna scroll up to the top, and the first thing we're gonna attempt to do is await readFile, and we're gonna read in our ai-dev.jpg, and this will be our buffer. Now if we want to convert this to base64, then we can go buffer.toString and pass in our buffer.
[04:43] And if we want to convert this to the data that's required, we can just say buffer.toString. And that way we can get rid of this entire massive base64 string and just say data. And this data will be part of our initial prompt. So the text of the prompt and the image of the prompt are each included in this initial prompt contents. I'm going to get rid of the entire follow-up prompt.
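Here is a minimal sketch of turning the image buffer into the data field and pairing it with the prompt text. The helper names toInlineImagePart and buildContents are hypothetical, and the inlineData shape is an assumption based on the description of the generated code. The sketch passes the "base64" encoding explicitly, since toString with no argument decodes the bytes as UTF-8 text instead.

```typescript
import { Buffer } from "node:buffer";

// Assumed shape of the image part that sits next to the text prompt.
interface InlineImagePart {
  inlineData: { data: string; mimeType: string };
}

function toInlineImagePart(buffer: Buffer, mimeType: string): InlineImagePart {
  // The encoding argument matters: toString() alone would not produce base64.
  return { inlineData: { data: buffer.toString("base64"), mimeType } };
}

// One user turn containing both the image and the text of the prompt.
function buildContents(text: string, image: InlineImagePart) {
  return [{ role: "user", parts: [image, { text }] }];
}

// Usage with top-level await (file name taken from the walkthrough):
// import { readFile } from "node:fs/promises";
// const buffer = await readFile("ai-dev.jpg");
// const contents = buildContents(text, toInlineImagePart(buffer, "image/jpeg"));
```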
[05:06] This was the original image that was returned of just smiling without the glasses. And I'm going to get rid of the INSERT_INPUT_HERE prompt here, which is what must have triggered our glasses last time. So now let's extract the text here to a text variable, and we'll say the text is this. That way text will automatically be assigned to this, and it'll make it easier for us to loop through these in the future. And then, once we get to saving the file, instead of saveBinaryFile we can await writeFile, and that's all saveBinaryFile was doing anyway.
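The save step can be sketched like this: pull the returned image out of a response chunk, decode it from base64, and hand the bytes to writeFile. ChunkLike is a minimal, hypothetical stand-in for the SDK's response type, and the extension table is illustrative (the walkthrough installs the mime package, presumably for this lookup).

```typescript
import { Buffer } from "node:buffer";

// Minimal, hypothetical shape of one response chunk; the real SDK types are richer.
interface ChunkLike {
  candidates?: {
    content?: { parts?: { inlineData?: { data?: string; mimeType?: string } }[] };
  }[];
}

// Illustrative lookup table standing in for a MIME-type library.
const EXTENSIONS: Record<string, string> = {
  "image/png": "png",
  "image/jpeg": "jpg",
  "image/webp": "webp",
};

function extractImage(chunk: ChunkLike): { bytes: Buffer; extension: string } | undefined {
  // Optional chaining covers every level that may be undefined.
  const inlineData = chunk.candidates?.[0]?.content?.parts?.[0]?.inlineData;
  if (!inlineData?.data) return undefined; // text-only or empty chunk
  return {
    bytes: Buffer.from(inlineData.data, "base64"),
    extension: EXTENSIONS[inlineData.mimeType ?? ""] ?? "bin",
  };
}

// Usage inside the loop over response chunks:
// import { writeFile } from "node:fs/promises";
// const image = extractImage(chunk);
// if (image) await writeFile(`ENTER_FILE_NAME.${image.extension}`, image.bytes);
```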
[05:38] And then this is one of those errors where I would highlight it, I'd hit command shift D to bring it over into an agent, hit enter, and let an agent fix that one for us. All right, the agent solved it. I'll hit command enter and you can see how it's now properly checking for undefined. So now with everything cleaned up we'll jump into our terminal, clear things out, bun run index.ts, and I know this error: I forgot to set the encoding type on this. This needs to be base64.
[06:06] That should solve that if we run it again. And now this completed and regenerated this file with the same file name. And this time it's just a smiley face. So now one thing I want to do, since this is a script: I'm just going to select everything, go into the agent, and simply say please add extensive console logging.
[06:24] This is going to be a script and we want to see all of the progress and everything that's happening in the terminal. That way if we accept all of this and run our script again, we now get a lot more information about what's going on, and a lot more information about how everything finished. Now this code did generate some TypeScript issues, similar ones as last time, so we can just highlight this, Command-Shift-D, hit enter, let the AI solve it, Command-Enter, and now all the linting issues are gone, and now we're in a good place to start looping through and generating a bunch of images. So now I can go up to our text right here and select this, start a new agent session, and I'll just say please generate an array of 10 expressions covering the wide variety of human emotions. We'll let that run, and now it looks like the agent is kind of getting ahead of us where it's going to loop through everything, and I'm actually fine with that.
[07:18] It's what I was going to do next anyway. So we'll just keep all of this. And you can see we have a current expression assigned to an emotion. Each current expression will be text, and we are looping through and generating these all at the same time. So again just to fix these errors really quickly, if this still generates linting errors, I could hit command shift D again.
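A minimal sketch of that loop, assuming the requests run one at a time in list order (the agent's version may differ). The expression list, the prompt format, and the generateImage callback are all hypothetical stand-ins. The callback represents the real request that sends the text plus the source image and saves the result.

```typescript
// Hypothetical list; the agent in the walkthrough produced its own ten expressions.
const expressions = ["joy", "sadness", "anger", "surprise"];

// Stand-in for the real call that sends the prompt and image and writes the file.
type GenerateImage = (text: string, index: number) => Promise<void>;

async function generateAll(list: string[], generateImage: GenerateImage): Promise<void> {
  for (let index = 0; index < list.length; index++) {
    const currentExpression = list[index];
    const text = `expression: ${currentExpression}`; // assumed prompt format
    console.log(`[${index + 1}/${list.length}] generating "${currentExpression}"`);
    await generateImage(text, index); // wait for each image before starting the next
  }
}

// Usage: await generateAll(expressions, async (text, index) => { /* call the API here */ });
```

Running sequentially keeps the terminal log readable and makes it obvious which expression a failure belongs to; swapping the loop for Promise.all would fire the requests concurrently instead.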
[07:38] Another tip here is you can go into Problems, command A, copy, command V to paste, hit enter, and it should just automatically wipe out linter errors that way. You'll see them slowly disappear and they're gone. So we'll keep all there. Now if we hop back to the terminal we'll clear this out. One thing I want to make sure is that it's generating unique file names so I don't waste money on this.
[08:01] So I'm going to look for where writeFile is, and look for where fileName is being generated. So it does have a fileName based on the index, and tab is suggesting to rename it to expression. I'm fine with that. And I'm actually going to select this line, command-k, please insert a formatted date to the beginning of the file name, because I always like to have the dates on these just so I can sort them easily. We'll accept that.
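One way to sketch the file naming: a date prefix for sorting, then the index and the expression so each image in a run gets its own name. buildFileName and its exact format are hypothetical, not what the editor's suggestion actually produced.

```typescript
// Sketch: date prefix for sorting, index plus expression for uniqueness within a run.
function buildFileName(date: Date, index: number, expression: string, extension: string): string {
  const day = date.toISOString().slice(0, 10); // YYYY-MM-DD in UTC
  const order = String(index + 1).padStart(2, "0");
  const slug = expression
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse spaces and punctuation into dashes
    .replace(/^-+|-+$/g, ""); // trim leading and trailing dashes
  return `${day}-${order}-${slug}.${extension}`;
}

// buildFileName(new Date("2025-01-15T12:00:00Z"), 0, "Starry Eyed", "png")
// → "2025-01-15-01-starry-eyed.png"
```

Two runs on the same day would still produce the same names, so adding a time component is worth considering if earlier batches need to be kept.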
[08:25] I'm going to delete this one before we even start, and go ahead and run this. And be aware this will cost about five cents or so per image. So once you're doing large batches or variations just be aware there is a cost to it. And we should see images start to generate. Here's our first one, starry-eyed it looks like.
[08:44] Here's our second one, he's sad. Looks like it didn't quite get the V in the newspaper this time. And you can essentially watch what's going on in the terminal. This one will be anger. And it looks like this did actually error out trying to generate the angry one for whatever reason.
[08:58] I'm going to copy the error into our agent, paste it in. You could also with this selected hit command L to add this to the chat. And I'll say please add error handling so that if one generation fails it doesn't block the entire script. And please create a way to track and report in the terminal at the end all of the images that did fail in preparation to attempt to generate them in the future. So let this run, clear out the terminal while that's running, and then with this complete we'll go ahead and run it again.
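The error handling requested from the agent can be sketched as follows. This is one possible shape, not the agent's actual output; runBatch, printReport, and the generateImage callback are hypothetical names, with the callback standing in for the real request.

```typescript
interface BatchReport {
  succeeded: string[];
  failed: { expression: string; reason: string }[];
}

async function runBatch(
  list: string[],
  generateImage: (expression: string) => Promise<void>, // stand-in for the real request
): Promise<BatchReport> {
  const report: BatchReport = { succeeded: [], failed: [] };
  for (const expression of list) {
    try {
      await generateImage(expression);
      report.succeeded.push(expression);
    } catch (error) {
      // Record the failure and keep going instead of letting it end the script.
      const reason = error instanceof Error ? error.message : String(error);
      report.failed.push({ expression, reason });
    }
  }
  return report;
}

// Summary printed at the end so failed expressions can be retried later.
function printReport(report: BatchReport): void {
  console.log(`Done: ${report.succeeded.length} succeeded, ${report.failed.length} failed`);
  for (const { expression, reason } of report.failed) {
    console.log(`  retry later: ${expression} (${reason})`);
  }
}
```

Because the report keeps the failed expressions as plain strings, a retry is just another runBatch call with report.failed mapped back to a list.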
[09:30] All right so it generated the joy image again. It started on sadness. So here's joy. Here's sadness. Looks like anger worked this time, and I'll just cut to when they're all complete this time.
[09:48] All right, looks like everything completed successfully. And now we have all these images which I could use for marketing in various ways for my AI DevEssentials newsletter. Another trick for previewing images much faster is to right-click on this, say Reveal in Finder, and then anytime with images if you select one, hit the spacebar, and then hit the down arrow, you can easily go through each image to preview them much faster than you could in Cursor itself.