AI with image recognition opens up new possibilities for designers and developers to quickly turn an idea into a prototype. There are several approaches based on OpenAI technology.
The introduction of multimodal capabilities in GPT-4 has laid an important foundation for future software development. Thanks to GPT-4V, the AI model accepts both text and images as input. This allows it to generate working code from screenshots or rudimentary drawings.
Recently, several products have been developed around this idea. The collaborative whiteboard tool tldraw has set up a playground on the website makereal.tldraw.com, where mockups of website elements can be created in the browser. GPT-4V converts these into code using the OpenAI API. A separate API key is required.
Makereal.tldraw offers a playful drawing interface for code generation
The developers provide background information in a newsletter. The “Make real” feature can be traced back to a Figma employee who built a prototype using GPT-4V based on tldraw code. Sawyer Hood’s GitHub project quickly attracted a lot of attention.
The project also served as the basis for the later “Make Real” application and functionality. However, the final project was integrated directly into the drawing area, allowing for further editing. Written comments and suggestions for improvement can now also be sent to the AI.
“Like a chat window, the canvas is transformed into a conversation space where you and the AI can workshop an idea together. In this case, the result is working websites, which is amazing enough, but it could work just as well for anything,” the developers write.
Alvaro Cintas writes that he made his single-player Pong clone in 30 seconds.
Screenshot-to-Code writes code along a screenshot and generates the corresponding images in one step
Like tldraw’s Make Real, another open-source project, has a similar goal and uses GPT-4V to achieve it. The “screenshot-to-code” repository by Abi Raja uses a screenshot to generate the code to program the interface visible in the screenshot. It also uses DALL-E 3 to generate the images shown on the screenshot to get even closer to the original. The video below shows the tool in action.
The project currently supports Bootstrap, Tailwind CSS, React and Vue, but more frameworks may be added shortly. In addition to installing on your computer, a demo website is available, which also requires your OpenAI key.