Gallery assistants hold an artwork by Spanish artist Pablo Picasso entitled ‘Femme au beret et a la robe quadrillee’ (Marie-Therese Walter), with an estimated price in the region of 35 million pounds (50 million dollars), during a photocall at Sotheby’s in central London on February 22, 2018. (Photo by Daniel Leal / AFP via Getty Images)
AFP via Getty Images
As AI increasingly dominates the narrative in technology and business, most people’s understanding of it remains limited to tools like ChatGPT. However, one rapidly advancing area is AI image generation. You may be familiar with some tools in this space, but I aim to examine how different image generation models respond to the same prompt.
First, let’s briefly explore how AI image generation works and the mechanical differences between AI text and image generation.
How do image generation models work?
Models like DALL-E are trained on vast datasets of images and, in some cases, accompanying text descriptions. During training, the AI is fed millions of image-text pairs, learning associations between words and visual concepts. When given a text prompt, the model generates a corresponding image by synthesizing pixels in alignment with the patterns and visual relationships in its training data. Essentially, the AI acts like a painter, creating ‘brush strokes’ informed by the image-text pairs it was trained on. This process can introduce bias, which we will explore further in this article.
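To make the word-to-visual association concrete, here is a deliberately toy sketch: it "trains" on a handful of invented (caption, color) pairs and "generates" a single pixel for a new prompt by blending the colors its words evoke. Real models learn millions of parameters over full images, not single average colors; everything below is illustrative only.

```python
# Toy sketch: learn word-to-color associations from (caption, color) pairs,
# then "paint" a new prompt by averaging the colors its known words evoke.
# All data here is invented for illustration.
from collections import defaultdict

# Tiny "training set" of caption fragments paired with an average RGB color.
training_pairs = [
    ("sunny sky", (135, 206, 235)),
    ("sunny field", (240, 230, 140)),
    ("red wine", (114, 47, 55)),
    ("green vineyard", (86, 125, 70)),
]

# "Training": record which colors co-occur with each word.
word_colors = defaultdict(list)
for caption, rgb in training_pairs:
    for word in caption.split():
        word_colors[word].append(rgb)

def generate_pixel(prompt):
    """Blend the colors associated with the prompt's known words."""
    colors = [c for w in prompt.split() for c in word_colors.get(w, [])]
    if not colors:
        return None  # no learned association for any word in the prompt
    n = len(colors)
    return tuple(sum(channel) // n for channel in zip(*colors))

print(generate_pixel("sunny vineyard"))  # blends two "sunny" colors and one "vineyard" color
```

Note how a prompt with no words seen in training produces nothing useful: this mirrors, in miniature, why limited training data leads to divergent or biased outputs.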
How do text generation models work?
In contrast, text-based AI models such as GPT-4 are trained on extensive text data, learning language patterns, grammar, and context. When prompted, they generate text by predicting the most likely next word or phrase based on the input and their training; essentially, they ‘guess’ the best next words given your input.
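The "guess the next word" idea can be sketched with a simple bigram table: count which word follows which in a tiny corpus, then predict the most frequent continuation. Models like GPT-4 use neural networks over tokens rather than raw counts, so this is a simplified stand-in for the underlying principle.

```python
# Toy sketch: next-word prediction from bigram counts, illustrating how a
# language model "guesses" the most likely continuation of the input.
from collections import Counter, defaultdict

corpus = (
    "friends drinking wine in napa on a sunny day "
    "friends drinking coffee in napa on a rainy day "
    "friends drinking wine in sonoma on a sunny day"
).split()

# "Training": count how often each word follows each other word.
bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def predict_next(word):
    """Return the most frequent word observed after `word`, if any."""
    if word not in bigrams:
        return None
    return bigrams[word].most_common(1)[0][0]

print(predict_next("drinking"))  # "wine" follows twice, "coffee" only once
```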
The key difference is that an image model must not only interpret your words but also translate the concept you present into a coherent visual scene, rather than simply continuing a sequence of text.
Testing Image Generation with the Same Prompt
One pitfall of image generation is that limited training data can lead to divergent or biased outputs. As a Bay Area-based contributor, I tested the same prompt across four different image generators: “An image of 4 friends drinking wine in Napa, CA on a sunny day.”
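The test setup can be sketched as a small harness that pairs one shared prompt with per-model settings, so every generator receives identical input and only the first image is kept. The endpoint URLs and parameter names below are invented placeholders, not the real APIs of these services; each vendor's actual request format differs.

```python
# Hypothetical sketch: prepare the same prompt for several image generators.
# Endpoints and field names are placeholders, not real vendor APIs.
PROMPT = "An image of 4 friends drinking wine in Napa, CA on a sunny day"

GENERATORS = {
    "dall-e": {"endpoint": "https://example.com/dalle"},
    "firefly": {"endpoint": "https://example.com/firefly"},
    "midjourney": {"endpoint": "https://example.com/midjourney"},
    "imagen": {"endpoint": "https://example.com/imagen"},
}

def build_requests(prompt, generators):
    """Pair one shared prompt with each generator's settings; requesting a
    single image per model mirrors the 'first image only' rule of the test."""
    return [
        {"model": name, "prompt": prompt, "n_images": 1, **cfg}
        for name, cfg in generators.items()
    ]

requests = build_requests(PROMPT, GENERATORS)
print(len(requests))  # one request per model
```

Keeping the prompt identical across models is what makes any divergence in the outputs attributable to the models themselves rather than to the input.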
For this test, I used:
DALL-E
Firefly
Midjourney
Imagen
I restricted the test to the first image output from each model; as those familiar with these tools know, they generate multiple images per prompt. For DALL-E and Imagen, I accessed the images through Canva, which has separate apps for both. Here were the results:
DALL-E Output
DALL-E output for An image of 4 friends drinking wine in Napa, CA on a sunny day.
DALL-E
Firefly Output
Firefly Output for An image of 4 friends drinking wine in Napa, CA on a sunny day.
Adobe Firefly
Midjourney Output
Midjourney output for An image of 4 friends drinking wine in Napa, CA on a sunny day.
Midjourney
Imagen Output
Imagen output for An image of 4 friends drinking wine in Napa, CA on a sunny day.
Imagen
The outputs tended to converge on similar imagery. Midjourney's result diverged the most from the other three, followed by Firefly's; the outputs from DALL-E and Imagen were, anecdotally, quite similar.
While image generation technology is advancing rapidly, it raises concerns about bias inherited from training data. As training data expands, these models will improve. However, with video generation nearing mainstream adoption through companies like Runway and Pika, extra caution is necessary when relying on text-to-image and text-to-video outputs, to avoid reinforcing societal biases.
Source: Forbes – https://www.forbes.com/sites/sunilrajaraman/2023/12/29/exploring-ai-images-using-the-same-prompt-with-different-models/