Midjourney recently added a new Vary (Region) option that brings inpainting to the tool. We’ll get to why inpainting is an important new feature shortly, but enabling it opens up a whole new set of questions about Midjourney settings. So, today, enjoy the Basic Answers to Basic Questions about Midjourney edition of Explainable.
First, Discord
With apologies to those already using Discord regularly, it’s worth acknowledging that it can be intimidating for first-time users. Discord is where you’ll get to grips with tools like Midjourney, Stable Diffusion and Runway. Each tool has its own server, and other servers offer integrations for many of them. Once you’re in, toggling from one tool to another is simple.
The step-by-step guide from Midjourney is pretty straightforward, but the very basics are:
Once you’re set up with a Midjourney log-in, join a newcomer room. They’ll be busy, but it’s useful to see in real time how others are developing their prompts
You submit a prompt by typing the /imagine command, followed by whatever prompt you have in mind (see the example after this list)
You’ll be served four versions. You can upscale any one (U1, U2, U3, U4), or iterate another four examples from one particular version (V1, V2, V3, V4)
And once you’re down to a single selected image, you can Vary (Subtle) or Vary (Strong), depending on how ‘extra’ you want the thing to look
And then, finally, there are options for zooming out and panning across the image
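To make that concrete, a first prompt might look something like this (the subject is purely illustrative):

/imagine prompt: a lighthouse on a rocky coast at dusk, 35mm film photograph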
OK, back to settings.
All the hype around prompts has tended to suggest that a well-crafted phrase is all you need to generate the image you want. But that hype obscures the role of settings within Midjourney. Getting a grip on settings, and learning the subtle differences a change of mode brings about, is as important as prompt crafting.
Entering the /settings command brings up this menu:
I’ll break it down, option by option:
Model
‘Use the latest model (5.2)’: Maybe an earlier version produced better results for some specific prompts, but it’s unlikely you’ll want to change from the newest version. Still, if you’re hitting a brick wall with a project, Midjourney outlines the unique quirks of past models here.
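And if you do want to pin a single job to an older model without changing your defaults, you can append the version parameter to the prompt itself. Something like (prompt invented for illustration):

/imagine prompt: a watercolor map of an imaginary city --v 5.1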
Raw Mode
This allows you to “reduce the Midjourney default aesthetic”. In short, results in Raw Mode are likely to feel a little less Midjourney-ish. That matters now that the obvious Midjourney aesthetic is in cliché territory.
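Raw Mode doesn’t have to be a global toggle, either. You can apply it to a single prompt with the --style parameter (again, an illustrative prompt):

/imagine prompt: product photo of a ceramic mug on a walnut desk --style raw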
Stylize parameters
The settings range from low to very high: the more artistic you want the result, the higher you go. On low, results adhere more closely to the prompt but are less artistic. On very high, you get the most stylized results, but with a looser connection to the prompt. Ask yourself: are you requesting something from an engineer or an artist? That should guide your stylize setting.
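Under the hood, those buttons map to the --stylize (or --s) parameter, which, as I understand it, runs from 0 to 1000 with a default of 100. The same invented prompt at both ends of the scale:

/imagine prompt: a paper crane on a bare desk --s 50
/imagine prompt: a paper crane on a bare desk --s 1000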
Public or Stealth mode
Only available on the Pro or Mega plans. Stealth mode simply stops your work from being visible on midjourney.com. Images will still show in Discord, unless you’re working in a private server or DMs.
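If you are on one of those plans, you can also switch per job rather than globally by appending the relevant parameter (the prompt, as ever, is invented):

/imagine prompt: early concept art for an unannounced project --stealth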
Remix Mode
This allows users to edit prompts between variations. Without Remix, variations might drift in the direction you want, but it’s out of your control. With it, you can build on the initial image or change an element without going back to square one.
You can toggle it on and off in /settings or with the /prefer remix command.
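In practice, a remix edit is just a tweak to the original prompt in the pop-up that appears when you pick a variation. A sketch of how that might look, with both prompts invented:

Original prompt: a red fox in a snowy forest, golden hour
Remix edit: a red fox in a snowy forest, moonlit night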
Vary (Region)
The big one! Recently introduced by Midjourney (example above). This is where Remix mode allows for inpainting. Once you have selected and upscaled an image, you can choose ‘Vary (Region)’ and edit a specific area of the image using a freehand or rectangular selection tool. It allows a level of focused reworking that wasn’t previously available.
High and Low Variation Mode
A lot will depend on the project. Looking for widely varying concepts at an early planning stage? High variation. Want to explore subtle differences within a very specific brief? Low variation.
Turbo, Fast, and Relax Mode
Depends on how quickly you want the job. Use Turbo too often and, depending on your monthly plan, you could burn through your fast GPU time. Relax mode doesn’t eat into that allowance, so it’s useful if you’re not in a rush; even then, a job should take no longer than about 10 minutes.
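You can switch the default speed with the /turbo, /fast and /relax commands, or set it for a single job by appending the matching parameter. For example (invented prompt), keeping Relax as the default while pushing one urgent job through Turbo:

/relax
/imagine prompt: a sprawling mood board of fabric texture studies --turbo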
And that’s it. The basics. Nothing revolutionary, but taken together they can be as important to quality as a well-crafted prompt.
Small bits #1: The big gen AI problem
As any social media search for Stable Diffusion images makes clear, the text-to-image revolution is following a similar path to many previous media revolutions, and that involves plenty of misogyny and porn. Rachel Metz, Bloomberg’s AI correspondent, recently shared an example from a Twitter Blue Tick who seems to be attempting to troll his way to his first million.
Small Bits #2: Noonoouri is nothing new
As recently explored, AI models, singers and influencers are, so far, examples of cynical marketing rather than major AI breakthroughs. That, of course, has not stopped Noonoouri, a virtual influencer who has been knocking around Instagram for years, from grabbing headlines as the ‘first AI pop singer to land a major record deal’. This is not innovation! It’s just Crazy Frog with, somehow, less musical integrity.
Small bits #3: Small progress
Good news from last week: Google DeepMind introduced SynthID, described as ‘an imperceptible digital watermark for AI-generated images’. It’s in beta, and the issues around establishing an industry standard remain, but the technology looks impressive, particularly its ability to retain the watermark through screenshots and edits. On a related note, here is a great deep-dive Twitter thread on generative AI and the ongoing copyright law debate.