Midjourney version 6 is expected to bring a big leap in quality. It will be released later this year.
According to Midjourney CEO David Holz, the leap from Midjourney’s current version 5 to version 6 will be greater than the leap from version 4 to version 5. Holz did not want to comment on the exact release date, but held out the prospect of a quick launch, definitely this year.
For version 6, Holz promises better text understanding: image generation will follow the prompt more closely and pay more attention to details in the wording. Text rendering should also be possible, which Holz says is “not that hard” but hasn’t been important to the team yet. However, he also says he has not yet seen text rendering that looks good, and he leaves open whether and to what extent this feature will arrive.
Asked about the announcement of OpenAI’s DALL-E 3, Holz said he is “very optimistic” that Midjourney will continue to offer the highest image quality. A first comparison between DALL-E 3 and Midjourney v5 shows that DALL-E 3 isn’t that far ahead in terms of image quality, but it does follow prompts better and can render text.
A web version of Midjourney is still in the works. The new website will be launched in two phases: first, there will be a redesigned version of the current site. This will be followed by a site with image generation capabilities and social features. Holz did not give a specific timeline for when the final version of the site will go live.
3D and video are on Midjourney’s roadmap
Also on Midjourney’s roadmap are features for generating 3D content and video. Especially for 3D, Holz is “very optimistic” that things will get good soon.
Looking at current video games, he says he is surprised at how poor the graphics are and how much generative AI can contribute to their quality. Holz has said in the past that he expects video games to be generated rather than rendered in the future.
Midjourney does not plan to release any specific information or demos on 3D generation this year. The same goes for video generation, which the Midjourney team is working on, but which Holz says is probably further away from being market-ready than 3D.