AI-based image generators are hugely popular at the moment, with many options to choose from to scratch that creative itch. Some require a PC equipped with a high-end, expensive graphics card, but Midjourney stands out from the crowd as it needs no specialized hardware at all.
Midjourney runs entirely from a Discord server, opening up the AI art world to millions of people. If you want to join them, continue reading our quick guide on using this fantastic tool.
Getting started
The first thing you will need is a Discord account – you can download and use the dedicated app or simply do it via your web browser. That means you can use Midjourney on most devices running Windows, macOS, Linux, iOS, or Android.
Once you are on Discord, join the Midjourney server.
Essentially, what you're going to do is "chat" with the Midjourney bot. All image-generation messages start with /imagine prompt. The example you can see below was created by sending /imagine prompt astronaut on a horse to the bot.
Once you're in, you'll see a long list of channels, but for now head for one of the Newcomer Rooms. Here you will see other members making prompts and the resulting images. It will give you a first glimpse of the platform and let you become familiar with the tool.
There used to be a free trial for new users to experiment with Midjourney, but due to the immense demand it placed on the servers, this has been removed. As a result, you'll need a subscription plan to access everything. Occasionally, a free trial is offered, most notably when a new AI model is released, but it's relatively infrequent. We'll revisit the topic of subscriptions later, and for the remainder of this guide, we'll assume you have the most basic plan.
Midjourney always generates four images from each prompt and gives you three options:
- Redo the whole process to get a new set (the blue double-arrow button)
- Upscale one of the four pictures (the U1, U2, U3, U4 buttons)
- Use one of the four as a starting template for another run (the V1, V2, V3, V4 buttons).
It takes a couple of minutes to compose images; however, depending on the server's workload, there might be a longer pause before you can receive your results. With so many people in the channels, your results will quickly scroll out of view, so be sure to keep an eye out for them.
Using commands and prompt parameters
The AI image generator supports a number of commands and parameters that adjust what it generates and how it manages the process, most of which you add after the descriptive text. You don't need to use these: by default, Midjourney will use the latest public model (version 5.1) and create 512 x 512 pictures (older models are 256 x 256).
The differences between the various models lie mainly in two aspects – how effectively they interpret the prompt and how accurately the generated image corresponds to the input text (a property known as coherency).
The more recent Midjourney versions (4, 5, and 5.1) all have very high coherency, though version 4's strength is with image-based prompts; version 5 and 5.1 are extremely good at handling natural text prompts.
Version 3 has moderate coherency, and the others are all rated as low. That might sound like a bad thing, but if you're after a creative/abstract interpretation of your prompt, then the older models are perfect for this.
Model parameters
- --version or --v will change which deep learning model gets used. Leave a space after the last letter and then add 1, 2, 3, etc.
- --style can be used to tweak model version 4 (e.g. --style 4b) if you're looking for further variations on the same theme
- --niji is another model, but one that's been trained on lots of anime illustrations and art
Image parameters
- --aspect or --ar will change the ratio of width to height of the images. The default is 1:1 but 3:2, 16:9, 16:10, and others are supported.
- --chaos [0 to 100] will alter how 'creative' the images will be, with higher values giving you increasingly more oddball results
- --stylize or --s [0 to 1000] adjusts how closely the generator sticks to the prompt. The default value is 100 in model v4, and higher values produce images that are less bound to the descriptor.
- --quality or --q [0.25, 0.5, 1, or 2] controls how much GPU processing time is allocated to making the image. The default value is 1, so if you want to save some precious Midjourney usage time while experimenting with prompts, use a lower value. The value of 2 only applies to model versions 1, 2, and 3.
There are more commands and parameters to explore; head over to the Midjourney website for further details.
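If you find yourself typing the same parameters over and over, a small helper can assemble the prompt text for you. The Python sketch below is only an illustration: build_prompt is a hypothetical name of our own, not part of any Midjourney tool, and the value ranges it checks are simply the ones listed in this guide, so the bot itself may accept or reject different values.

```python
def build_prompt(description, *, version=None, aspect=None, chaos=None,
                 stylize=None, quality=None):
    """Return the text to paste after '/imagine prompt'.

    Hypothetical helper: the limits below mirror the ranges quoted in
    this guide and are assumptions, not rules enforced by Midjourney.
    """
    parts = [description.strip()]
    if version is not None:
        parts.append(f"--v {version}")
    if aspect is not None:
        width, height = aspect  # e.g. (16, 9) becomes --ar 16:9
        parts.append(f"--ar {width}:{height}")
    if chaos is not None:
        if not 0 <= chaos <= 100:
            raise ValueError("chaos must be between 0 and 100")
        parts.append(f"--chaos {chaos}")
    if stylize is not None:
        if not 0 <= stylize <= 1000:
            raise ValueError("stylize must be between 0 and 1000")
        parts.append(f"--s {stylize}")
    if quality is not None:
        if quality not in (0.25, 0.5, 1, 2):
            raise ValueError("quality must be 0.25, 0.5, 1, or 2")
        parts.append(f"--q {quality}")
    # Parameters always go after the descriptive text.
    return " ".join(parts)


if __name__ == "__main__":
    print(build_prompt("astronaut on a horse", version=5,
                       aspect=(16, 9), stylize=250))
```

Copy the printed line into Discord after /imagine prompt. Keeping the quality at 0.5 while you experiment is an easy way to stretch your usage time.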
Using images instead of prompts
Midjourney can also use images, instead of text, to create new pieces of art. Instead of typing /imagine, use /blend and then upload up to 5 pictures from your storage drive into Discord.
The blend function works best with images that have an aspect ratio of 1:1, but will work with others - the end result will be square, but applying --ar 2:3 or --ar 3:2 at the end will give you a portrait or landscape picture.
If you want more control over the output, you can use up to two images with a text prompt. Place the full URL of an online picture between /imagine prompt and the descriptive text, and Midjourney will get the file for you.
Alternatively, you can upload the file yourself into Discord and then drag it into the message field - do note that everyone else in the newcomer room will be able to see that image.
This example was created with another staff member's picture, followed by a prompt of 'strongest avenger --ar 16:9'. Spend some time in any of the newcomer rooms and you'll see that blend and the image prompt system are both popular tools.
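The ordering of an image prompt (picture URLs first, then the descriptive text, then any parameters) is easy to get wrong, so here is a rough Python sketch of how such a message could be put together. The function name build_image_prompt and the example.com address are placeholders we made up, and the two-image limit is taken from this guide rather than checked against the bot.

```python
def build_image_prompt(image_urls, description, parameters=""):
    """Return a full '/imagine prompt' message that starts with image URLs.

    Hypothetical helper: the limit of two images follows this guide and
    is an assumption, not something verified against Midjourney itself.
    """
    if not 1 <= len(image_urls) <= 2:
        raise ValueError("provide one or two image URLs")
    for url in image_urls:
        # The guide asks for the full URL, so require the scheme too.
        if not url.startswith(("http://", "https://")):
            raise ValueError(f"not a full URL: {url}")
    pieces = [*image_urls, description.strip()]
    if parameters.strip():
        pieces.append(parameters.strip())
    return "/imagine prompt " + " ".join(pieces)


if __name__ == "__main__":
    # example.com is a placeholder address, not a real picture.
    print(build_image_prompt(["https://example.com/portrait.png"],
                             "strongest avenger", "--ar 16:9"))
```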
Prompt examples and Midjourney's output
As with all AI image generators, the quality and accuracy of the output images (against your expectations) will come down to the prompts you use. There are plenty of sites with prompt examples you can follow, but here are some of our own examples to give you an impression of what's achievable.
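Long prompts are easier to study once you separate the descriptive text from the parameters tacked on at the end. The short Python sketch below does that split; split_prompt is a hypothetical name of ours, and it assumes that parameters start at the first double hyphen followed by a letter, which is how the prompts in this guide are written rather than a documented rule.

```python
import re


def split_prompt(prompt):
    """Split a prompt into (description, {parameter: value}).

    Assumption: parameters begin at the first ' --' followed by a letter.
    """
    match = re.search(r"\s--(?=[A-Za-z])", prompt)
    if match is None:
        return prompt.strip(), {}
    description = prompt[:match.start()].strip()
    parameters = {}
    for chunk in re.split(r"\s+--", prompt[match.start():]):
        if not chunk.strip():
            continue  # skip the empty piece before the first parameter
        # The first word is the parameter name, the rest is its value.
        name, _, value = chunk.strip().partition(" ")
        parameters[name] = value.strip()
    return description, parameters


if __name__ == "__main__":
    text, params = split_prompt("white porsche 917, night --v 5 --s 250")
    print(text)
    print(params)
```

Running a few prompts through it makes it quick to compare which model version, aspect ratio, and stylize values were used for each picture.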
A time traveler shows what a "selfie" is
Prompt: A group of male Norse, Dane, and Vikings huddled together and is taking a group selfie picture together in 793 CE. They are drinking ale at a feast in a Viking longhouse. They are all wearing traditional Viking armor and helmets. Everyone smiling directly at the camera. The image is photorealistic, has natural lighting, and is taken with a front-facing phone selfie camera by one of the Vikings. --ar 3:2 --s 1000 --no phone --v 5 --q 2
Fantasy landscape
Prompt: fantasy landscape, atmospheric, hyper-realistic, 8k, epic composition, cinematic, octane render, artstation landscape, vista photography, 16K resolution, landscape veduta photo, 8k resolution, detailed landscape painting, DeviantArt, Flickr, rendered in Enscape, 4k detailed post processing, artstation, rendering by octane, unreal engine --ar 16:9 --v 4
Cyborg bikini model
Prompt: imagine a cyborg bikini model, facing the camera, she is very tall standing 100 meters high above much smaller buildings, 35mm film, --ar 16:9
Dolce & Gabbana Portuguese man
Prompt: Street style fashion photo, full-body shot of a Portuguese man with black hair and full beard, walking with a crowd of people on a sidewalk in Dubai while holding his leather laptop case, wearing a royal blue Dolce & Gabbana blazer and white button up, sunset lighting --ar 9:16 --stylize 1000 --v 5
Roaring Elon Musk
Prompt: Elon Musk dressed in skin-tight leopard print with a leopard scarf and a walking cane, inviting you to get you pretty lil ahh in the car as he waves you into his Cadillac Escalade
Extreme graphics card
Prompt: highly detailed photo of a graphics card in a powerful PC, bright colors, RGB lights, lots of cooling fans, glass panels, high resolution, ultra-detailed, vivid colors, neon lighting, dark background, flood light, radeon, geforce, ryzen, water cooled --ar 16:9 --v 4
Moose painting
Prompt: megan duncanson style painting, bull moose huge antlers with a snow capped mountain range, lake with reflection in background, early stages of sunset, psychedelic effects --ar 16:9
80's retro
Prompt: [two 80's looking photos added as prompt, plus...] Scene: 80's neighbourhood coming of age lighting: natural, slightly cinematic, hot summers day autobiographical VISUAL STYLE: photorealistic photograph perspective is two-point and scene has a crisp, film-photography feel style of Martin Parr, composition style of david hockney CAMERA: Stationary, Hasselblad LENS, 120mm, film stock: cinestill 50d and porta colours: natural warm tones RESOLUTION: High Definition but grainy vintage TIME OF DAY: late afternoon early evening --ar 4:3 --no busy
Photo-realistic woman
Prompt: a photo-realistic full-body portrait of a beautiful woman with blonde hair standing in a flower field. The image should be shot in a backlighting scenario during the golden hour. Please use a 50mm lens on a medium format camera to achieve a cinematic look. The colors should be rich and vibrant, with a focus on Hasselblad-style tones --ar 16:9 --v 5
Interior of a room
Prompt: photograph of the interior of a living room, large mirror on the wall, flowers in a vase, cream walls, pastel palette, clean style, soft lighting, minimalistic, hyper-realistic, high resolution, 4K detail, rendered in Octane, warm lighting --v 4
Magical golden dragon
Prompt: a cute magical flying dragon, fantasy art drawn by Disney concept artists, golden color, high quality, highly detailed, elegant, sharp focus, concept art, character concepts, digital painting, mystery, adventure, cinematic, glowing, vivid colors --ar 16:9 --v 4
White Porsche
Prompt: white porsche 917, dotonbori osaka in the background, night, fine art cinematic automotive photography, ultra hyper realism --v 5 --s 250
Simple B&W photograph
Prompt: black and white photograph of a tree, dark background, high resolution, flood light, sharp contrast --ar 3:2 --v 4
Native Americans, a welcoming smile
Prompt: A member of the Kaingang tribe stands proudly in a portrait, their traditional clothing and intricate beadwork adding to the beauty of their warm smile and gleaming white teeth as they pose for the camera. In the background, a stunning valley of Araucarias creates a photorealistic backdrop for the Kaingang's traditional way of life. The sun filters through the trees, casting a warm glow on the Kaingang's face and illuminating the intricate details of their clothing and jewelry. --v 4
Aged cyberpunk samurai
Prompt: realistic style painting of an old male samurai, cyberpunk city in the background, vivid color contrast, rich, highly detailed, futuristic, cityscape, night time, bright light, striking pose, high resolution, volumetric, ultra-detailed, 4K --ar 16:9 --v 4
Vivid Auroras
Prompt: Vivid Auroras Around Jupiter in Photography Style with a Telephoto Lens. The scene features Jupiter with its colorful auroras in the background, and some of its moons in the foreground, against a starry space backdrop. The color temperature is cool with shades of blue, green, and purple. Jupiter has a majestic quality, and the auroras are vibrant and dynamic. The lighting is mostly coming from the auroras, creating a mysterious and otherworldly atmosphere. The lens size is 300mm:: anime::1 crepuscular rays::1 --ar 16:9 --stylize 1000 --v 4
A giant octopus
Prompt: A digital art piece of a giant octopus made of jelly beans, attacking a city skyline --v 4
Shoal of fish
Prompt: underwater photograph of a shoal of fish, blue ocean, light beams, bright colors, realistic, high resolution, high detail, tuna, mackerel, swordfish, jellyfish, shark, caustics, refraction, high dynamic range, pacific, coral reef, 4K --chaos 10 --stylize 500 --ar 3:2 --style 4b
Mandelbulb garden
Prompt: glitter encrusted codex seraphinianus mandelbulb garden :: colorful remedios varo garden with many animals --ar 5:3 --v 5
Going further with Midjourney
Because Midjourney no longer offers an open free trial, you pretty much have to get a subscription plan. The first advantage this offers is that you can use Discord's direct messages with the bot instead of the (very) public channels.
This means any images you make or upload are hidden from the public, though they are still moderated and checked by Discord and Midjourney.
If you're serious about exploring AI art or need to use it in a professional situation, then subscribing makes a lot of sense. The GPU power required to do this work doesn't come cheap and the prices reflect this.
Subscribers can potentially earn processing time simply by rating their own and other users' pictures, although only the top 1,000 raters each day get an extra hour of GPU time.
Like all AI image creators, Midjourney can be a lot of fun to spend time with, but its ease of use and output quality set it apart from the likes of Stable Diffusion. Considering that you can enjoy the program on any device that can run Discord, Midjourney is likely to be a permanent landmark in the artificial intelligence field.