manfred feiger

Midjourney and DALL-E 2 – first hands-on impressions

DALL-E 2 – midjourney-image explorations

Published: August 3, 2022
Reading time < 10 minutes

AI-generated artworks are going mainstream, and the two openly available tools Midjourney and DALL-E 2 give us an idea of what's next in image generation from text prompts.

Maybe you remember my article "Speaking computer language – welcome AI Art", which covered the new role of text prompts and several AI image assistants. There I gave a general overview of GANs, DALL-E and DALL-E Mini, because I did not yet have access to DALL-E 2. Google Imagen, which I also mentioned in that article, still has no free option. But in the month and a half since then, Google has introduced another tool called Parti, which is said to produce the best results of any AI art generator so far, and Stable Diffusion is about to launch its service as well.

Parti Website Screenshot

In this article I want to share my first hands-on experience with Midjourney and DALL-E 2. With Midjourney moving to open beta and launching its v3 image generation algorithm on the 25th of July, another player pushed into the market. One or two days later, I guess, my invite for DALL-E 2 finally arrived.

On that day I was happy at first. But a day later I noticed that a second e-mail address, which I had used to sign up for DALL-E 2 only a few months ago, had been invited as well, although I had applied with the first one well over half a year ago. That left me a bit disappointed. I assume it is all connected to their new pricing model and the beta start of Midjourney. The exclusiveness was gone and the “mass market” had arrived.

In general, the market for AI art bots seems to be moving fast. For example, DALL-E mini, the free tool I mentioned in my other article, is now called craiyon. Other tools worth mentioning in this context are Latitude Voyage (in beta, of course) and Apple's GAUDI, a prompt tool for indoor 3D scenes.

Luckily, I heard that the U.S. Copyright Office said A.I. art can't be copyrighted, so I hope no legal war will heat up this exciting market in the near future. So far I only know that there are patents on the algorithms being used. Hopefully the market will show even more diversity in the future.

Midjourney ai tool
Midjourney Screenshot Website

Midjourney beta – first impressions

Let us first look at the new player, Midjourney. My first impression is that Midjourney is geekier and less serious than DALL-E 2. It is a more playful entrance to a world that everyone can test right away. After joining the beta you are redirected to Discord: no waiting list, no exclusive club.

Before we look into Discord, a quick explanation of who Midjourney is. Midjourney describes itself as an independent research lab exploring new mediums of thought and expanding the imaginative powers of the human species.

IN 10 YEARS, YOU’LL BE ABLE TO BUY AN XBOX WITH A GIANT AI PROCESSOR, AND ALL THE GAMES ARE DREAMS.

David Holz in an interview with The Verge

David Holz, the founder of Midjourney, has done other geeky, fun things in the past. He was a co-founder of Leap Motion, a great exploration of the power of gestures. Midjourney is a small company right now, with around 10 people working there.

Exploring the background of Midjourney and the way they speak to their audience, in other words their tonality, feels great to me. It has the explorative spirit and playfulness that I always try to teach students who are getting started in digital.

When you sign up for a free account, you get 25 credits that can be used to generate images within the Discord channel. For anyone interested, it is even possible to run Midjourney on one's own Discord server.

After that, you have to pay either $10 or $30 a month. The pricing depends on the number of images you want and the privacy you expect. And if you are a large company with more than 1 million US dollars in revenue a year, you need the corporate membership for 600 US dollars a year.

So how does it work?

Screenshot Discord Channel midjourney

If you have never used Discord before, think of it as a chat tool, a kind of Slack for the masses with a non-business focus.

In the channel you will find all the general information: announcements, updates and so on. It feels like you, as a beta tester, are part of the team, with even the boss around, and that you can influence the development process. After all, that's what a beta program is about.

From my professional perspective, with more experience in product and user testing, this is a great approach for this kind of product, because the feedback is collected in the same environment in which the product is used.

So here are the steps to create images:

  1. Look for a newbies channel and enter it (in the picture you see newbies-29, newbies-59 and newbies-89).
  2. If you have a picture in mind, type “/imagine” followed by a description of what you want to see.
  3. Within 60 seconds you should see 4 images.
  4. Once the 4 images are generated, you can upscale one of them or create variations. For that, the buttons U1 to U4 (for upscaling) and V1 to V4 (for variations) appear below the generated images. Upscaling enlarges an image to around 1664x1664 pixels.
  5. To save the result, click on the generated image to enlarge it, then right-click and choose to save the image to your local computer.
  6. Finally, you can also ask the bot to send the results to your direct messages by reacting with the envelope emoji.
Prompt dialogue in discord (taken from quick start guide)
Example from discord stream with 4 generated images
Prompt dialogue to use envelope (taken from quick start guide)

This is enough to create visual beauty.
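
For readers who like to see the steps above in compact form, here is a minimal Python sketch. It only builds the text you would type into Discord and the button labels you would see; it does not talk to Discord or Midjourney. The function names `imagine_command` and `grid_buttons` are my own illustration, not part of any Midjourney API, and the grid size of 4 is taken from the steps above.

```python
def imagine_command(description: str) -> str:
    """Return the chat text for a Midjourney request: "/imagine" plus a description."""
    # Collapse repeated whitespace so the prompt reads cleanly in the channel.
    text = " ".join(description.split())
    if not text:
        raise ValueError("description must not be empty")
    return f"/imagine {text}"


def grid_buttons(images: int = 4) -> dict[str, list[str]]:
    """Return the button labels shown below a generated grid (assumed: 4 images)."""
    return {
        "upscale": [f"U{i}" for i in range(1, images + 1)],
        "variations": [f"V{i}" for i in range(1, images + 1)],
    }


# Example usage with a prompt from this article.
print(imagine_command("cyberpunk city, by Franz Marc"))
print(grid_buttons())
```

So U2 would upscale the second image of the grid, and V2 would create variations of it.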

If you need inspiration, look at the Midjourney resource list, which has some great links:

The artist visual style encyclopedia (as of May 2023: link not working anymore)

The artist visual style encyclopedia – if a style is part of your text prompt, it takes effect, as in “cyberpunk city, by Franz Marc”:

My result on cyberpunk city, by Franz Marc on midjourney

There’s also a prompt builder to guide your inputs and see which variables are available:

Prompt guide

And a collection on “prompt styles and weights”.

All in all, if you are used to Discord and have used a text prompt before, the learning curve is very gentle. Only if you are new to both Discord and text prompts will you need more orientation.

To sum up, I loved it. I love the playfulness of the whole app, the results and all the “social” aspects: seeing what others are building, getting notifications about my own results and the way the whole thing is set up. It is great fun to try, and the documentation is very helpful.

DALL-E 2 AI tools
DALL-E 2 Intro screen

DALL-E 2 beta – first impressions

Between the magic of the new and a little frustration about the almost open invites, I waited to try my DALL-E 2 invite until after I had tested Midjourney.

As mentioned, entering the DALL-E app is a completely different experience. It is just you and the text prompt, without the hectic news stream and the dynamics you experience in Midjourney. Judging by the interface and the overall design of the app, one could say that DALL-E 2 wants you to reflect on what you type, while Midjourney is a competitive social feed about who creates the coolest results.

The cost system of DALL-E 2 is based on credits. You get 50 free credits in the first month and 15 free credits every month after that. For only $15 you can buy an additional pack of 115 credits, which equals around 460 images. OpenAI allows commercial use of DALL-E images. Compared to the stock image market, the cost per image is fairly low, but one must be aware that it takes time to get used to the text prompts and the corresponding structure of the input.
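
To make the credit numbers a bit more tangible, here is a small back-of-the-envelope calculation in Python. The figures are the ones quoted above. The value of 4 images per credit is my assumption, inferred from 460 images for 115 credits, and the constant and function names are only illustrative.

```python
# Figures quoted in this article (DALL-E 2 beta pricing, August 2022).
FREE_CREDITS_FIRST_MONTH = 50
FREE_CREDITS_LATER_MONTHS = 15
PACK_PRICE_USD = 15
PACK_CREDITS = 115

# Assumption: one credit yields 4 images (460 images / 115 credits).
IMAGES_PER_CREDIT = 4


def images_for(credits: int) -> int:
    """Number of images a given amount of credits produces."""
    return credits * IMAGES_PER_CREDIT


def pack_cost_per_image() -> float:
    """Price in US dollars of a single image when buying a credit pack."""
    return PACK_PRICE_USD / images_for(PACK_CREDITS)


print(images_for(PACK_CREDITS))               # 460
print(images_for(FREE_CREDITS_FIRST_MONTH))   # 200
print(images_for(FREE_CREDITS_LATER_MONTHS))  # 60
print(round(pack_cost_per_image(), 3))        # 0.033
```

Under these assumptions a single image from a paid pack costs roughly 3 cents, which is the basis for the comparison with stock images.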

That being said, let’s try to get a result for our “cyberpunk city, by Franz Marc”.

For DALL-E, I couldn’t find quick, hands-on official documentation, so I tried the following sentence: “A cyberpunk city in the style of Franz Marc”, since the example shown by DALL-E on their homepage also uses the "in the style of" syntax.

My result on that one was quite… disappointing. I don’t know why DALL-E missed the style of Franz Marc that much:

DALL-E 2 result on a cyberpunk city in the style of Franz Marc

To investigate further, I needed a prompt guide for DALL-E 2. Googling for some help, you find:

  1. The DALL-E 2 prompt book – my recommended choice
  2. OpenAI DALL-E 2 prompt guide

Going through the prompt options, you notice that there are more parameters than in Midjourney: lighting, camera position and angles, genres, usage contexts and many more.

So I gave it another try and changed the prompt a bit, which led to a much better result in terms of style:

DALL-E 2 result on a painting from Franz Marc showing a cyberpunk city
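
The different phrasings I tried can be written down as simple templates. The following Python sketch is only an illustration of that idea. The three patterns are taken from this article (the last one from the image caption above), the names `TEMPLATES` and `build_prompts` are hypothetical, and nothing here calls Midjourney or DALL-E.

```python
# Prompt patterns used in this article; the keys are illustrative labels only.
TEMPLATES = {
    "midjourney_by": "{subject}, by {artist}",
    "dalle_style_of": "A {subject} in the style of {artist}",
    "dalle_painting": "A painting from {artist} showing a {subject}",
}


def build_prompts(subject: str, artist: str) -> dict[str, str]:
    """Fill every template with the same subject and artist for side-by-side tests."""
    return {
        name: template.format(subject=subject, artist=artist)
        for name, template in TEMPLATES.items()
    }


# Example usage: print all variants for the running example.
for name, prompt in build_prompts("cyberpunk city", "Franz Marc").items():
    print(f"{name}: {prompt}")
```

Keeping subject and artist fixed while only the phrasing changes makes it easier to see which wording a tool responds to. Note that the hard-coded article "A" only fits subjects starting with a consonant sound.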

Unfortunately, my attempts with real artists didn’t work out so well with DALL-E 2 at the start, so I had another try…

My result from midjourney:

Midjourney result

And the same prompt from DALL-E 2:

First result on DALL-E 2

Getting used to the prompt dialogue on DALL-E improves the results a lot:

Adjusted text prompt

Using the right prompts is essential to get the best results. If your prompt is done right, DALL-E seems to be more precise, though Midjourney is so much fun to use.

For DALL-E, there is even a paid service called PromptBase that has launched to help you find the right prompts:

Screenshot from promptbase.

If you want to discover more, I recommend searching the web yourself. On Twitter I found some comparisons that share my impression:

From Twitter

Conclusion

Is it possible to draw a final conclusion on the tools? I guess not. On the one hand, I am not experienced enough with either tool; on the other hand, I think both tools are great and give us a glimpse into the future. The documentation of Midjourney is better and the entry barrier is lower, so to get started I would recommend Midjourney in any case. The more you are aiming for precise results (as an alternative to stock images), the more it feels like DALL-E 2 is the way to go.

It will be exciting to see what the next two months offer in terms of AI generation tools. 2022 has been very exciting so far for AI art generators.

What's next? Update mid-August – DreamStudio arrived

As mentioned, the speed at which new, similar offerings are popping up on the market is impressive. It seems that AI in combination with images is going mainstream. Another tool to check out is DreamStudio. Access is quite easy as well, and the results are also impressive. The editor looks promising too, see my screenshot below. After my family holiday I might try a daily-use check of all those great tools.

Screenshot of DreamStudio and results on "A dream of a distant galaxy, by Hiroshige Utagawa, matte painting trending on artstation HQ."

Recommended readings

  1. The AI that creates any picture you want, explained – The process behind DALL-E 2 (and others) explained
  2. What AI means for human artists – great collection about views from the artists/designer perspective
  3. Making AI Art with Midjourney – great article with examples and explanations
  4. The Creativity of Text-based Generative Art – paper on text-based generative art
  5. The trouble with DALL-E – Five thoughts on the use of artificial intelligence to accelerate the digital art market
  6. My follow-up article on Stable Diffusion: Stable Diffusion celebrates new forms of creativity