Category
Prototyping
Image Generation.
Introduction
To further improve the generated pages there was a need to generate and include images. Images can greatly improve the look and theme of the context of the page. Because images too can be generated using AI, I wanted to integrate image generation functionality into the tool. This feature would make the pages more relevent to a theme and interesting.
Abstract
The first challenge was figuring out how to generate images that would fit the context of the page. First I tried using DALL-E 3 directly with the page prompts provided by users. However this approach did not get great results, because the images generated were often not relevant or fitting to the page.
To fix this I decided to use a two step process. First I used GPT-4 to read the user's page prompt and generate relevant prompts for DALL-E 3. Since GPT-4 understands the context of the page better, it could create better and more relevant image generation prompts.
Once I had these relevant prompts, I used DALL-E 3 to generate the images. These images were then saved and their URLs were put back into GPT-4 before it generated the final HTML page. This way GPT-4 could include the images directly into the content, which makes sure that they integrate well into the web page. To do this, I created the following set of prompts for GPT-4:
A user prompt like: "Page about an upcoming easter event with a brunch and bingo." would return image prompts like:
- "A festive Easter brunch table set with colorful spring decorations, Easter eggs, and flowers, with a bright and cheerful atmosphere."
- "A group of people enjoying a lively Easter bingo game, with colorful bingo cards and markers, surrounded by Easter-themed decorations."
- "A cheerful family gathering for an Easter brunch, with a table filled with delicious food and Easter decorations, in a sunny and bright room."
- "An Easter event flyer with illustrations of brunch items, Easter eggs, a bingo card, and spring flowers, in pastel colors and playful fonts."
By using this approach I could make sure that the images were both relevant and also showed the overall theme of the page. Users could now generate a complete web page with both text and images by simply providing a single prompt.
Demo
Conclusion
This was an important feature to the tool because images play a big role in showing themes and improving the look of the page. The ability to generate unique images for each page meant that users didn't need to supply their own images which makes the tool more user friendly and usable.