Red Bob Wig

A case of image synthesis in Midjourney

Screenpunk, 2024 March 9th

This blog entry details the creation of the above image, Cabaret Dancer (image#0), through the synthesis of two distinct source images, Relation Therapy and Gateway (image#1 and #2), which were produced in previous image generation sessions.

The entry follows the steps leading to the creation of Cabaret Dancer:

Explication of the objective of the source images,
Conversion of source images into descriptions via Midjourney’s describe function (image#3),
Regeneration of the descriptions into new imagery via Midjourney’s imagine function (image#4 and #5),
Research of possible visual effects of the prompt elements that generated the new images,
Discovery and analysis of a unifying visual element (in the shape of a red bob wig) that enables the transition from analysis to synthesis,
Execution of three strains of related synthesis.

At the end, some evaluative comments are made on the results.

As a methodological note: initially there was no explicit plan for how to achieve synthesis. Instead, the approach relied on analyzing various prompt elements to bring out suggestions for assembling new prompts that could guide the synthesis.

1. Synthesis Objective

The intent of the synthesis is to merge elements from a flat, somewhat surrealist image (image#1) with a more expansive, three-dimensional one (image#2). The objective is to create a vibrant and surreal image style. The source images can be interpreted as follows:

Relation Therapy exhibits a surrealist aesthetic akin to two-dimensional pop art. The depiction of a cool, somewhat distant lady on a couch, wearing a blue eyeshade and a peculiar cap featuring a peacock eye, conveys an air of intrigue. Her gesturing hand appears to dismiss or engage with something or someone, an ambiguous interaction with potential narrative implications. The contrasting primary colors employed in the image contribute to its striking presence.

In contrast, Gateway captivates with its three-dimensional depth. The swirling dress of the dancer leads the viewer’s gaze towards a blurred vanishing point, evoking a sense of movement and exploration. The dancer’s graceful pose and the soft illumination on the complementary shades of orange and yellow in her dress suggest a welcoming path into an unknown realm, inviting curiosity about the world hinted at by the gateway.

The contrast between the static nature of Relation Therapy and the dynamic energy of Gateway is stark. The former shows rigid, flat, and edgy shapes and colors, while the latter displays fluidity, spaciousness, and organic curves. It is these contrasts that the synthesis seeks to reconcile and harmonize.

Image#1: Relation Therapy

Image#2: Gateway

2. Image Descriptions

To obtain a fresh starting point for synthesis, instead of invoking the original prompts of the source images, I opted to utilize the descriptions provided by Midjourney’s image-to-text command (/describe).

When employed, Midjourney generates four descriptions for each specified image (image#3). When the descriptions are used as prompts themselves, the resulting images do not replicate the source image, but rather offer an interpretation of it.

Image#3: Image-to-text-descriptions for both source images

3. Image Regeneration

When a description is used as a prompt via the text-to-image functionality (/imagine), Midjourney outputs four images. Consequently, the eight descriptions in image#3 serve as prompts for a total of 32 new images. To avoid being overwhelmed by this rapid increase in images, the analysis focuses only on the prompts that regenerated the most promising remake of each source image.
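The count above can be checked with a small sketch. It is only an illustration in Python and not part of the Midjourney workflow itself; the names fan_out, DESCRIPTIONS_PER_IMAGE and IMAGES_PER_PROMPT are made up for this example.

```python
from itertools import product

# Both numbers are taken from the text: /describe returns four descriptions
# per image, and /imagine returns four images per prompt.
DESCRIPTIONS_PER_IMAGE = 4
IMAGES_PER_PROMPT = 4


def fan_out(source_images):
    """List every (source, description number, image number) combination."""
    return list(
        product(
            source_images,
            range(1, DESCRIPTIONS_PER_IMAGE + 1),
            range(1, IMAGES_PER_PROMPT + 1),
        )
    )


sources = ["Relation Therapy", "Gateway"]
candidates = fan_out(sources)
descriptions = {(source, number) for source, number, _ in candidates}

print(len(descriptions))  # 8 descriptions used as prompts
print(len(candidates))    # 32 regenerated images
```

Every further round of describing and regenerating would multiply the number of images by sixteen again, which is why the selection is narrowed down early.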

Images #4 and #5 were considered the most promising ones. Their respective prompts:

Prompt#4: a woman with a blue and red background, in the style of daz3d, bold and graphic pop art-inspired designs, fashion-illustration, matte photo, dotted, eye-catching detail, georgy kurasov

Prompt#5: a woman dancing in floral fire, in the style of detailed science fiction illustrations, red and gold, art of the congo, uhd image, fine feather details, alexandr averin, dramatic splendor

The bold accentuated elements reappear in the prompt analysis below.

As a general note on the prompts: during the sessions I output images in a 16:9 aspect ratio. I occasionally add a tuned style (ElW0Aru) to avoid the typical Midjourney-styled woman. Sometimes I add the element hyperrealism.
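As a rough sketch of how such a prompt can be put together, the Python helper below joins prompt elements and appends parameters. The function build_prompt is hypothetical. The sketch assumes that the aspect ratio is passed with Midjourney's --ar parameter and the tuned style code with --style, and it includes the optional --chaos parameter that comes up later in the third synthesis. The exact parameter strings used in the sessions are not shown in this post.

```python
def build_prompt(elements, aspect_ratio="16:9", style_code=None, chaos=None):
    """Join prompt elements with commas and append Midjourney-style parameters.

    Hypothetical helper: it only builds the text that would be pasted after
    /imagine; it does not talk to Midjourney.
    """
    text = ", ".join(e.strip() for e in elements if e.strip())
    params = [f"--ar {aspect_ratio}"]
    if style_code:
        # Assumption: a tuned style code is applied with --style.
        params.append(f"--style {style_code}")
    if chaos is not None:
        # Midjourney documents --chaos as a value from 0 to 100.
        if not 0 <= chaos <= 100:
            raise ValueError("chaos must be between 0 and 100")
        params.append(f"--chaos {chaos}")
    return f"{text} {' '.join(params)}"


elements = [
    "a woman with a blue and red background",
    "matte photo",
    "hyperrealism",
]
print(build_prompt(elements, style_code="ElW0Aru"))
```

Keeping the elements in a list makes it easy to swap single elements in and out between generations while the parameters stay the same.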

Image#4 Regenerated Relation Therapy

Image#5 Regenerated Gateway

4. Prompt Analysis

The objective of the prompt analysis is to gain a better understanding of the elements utilized in the source prompts and their impact on the resulting images. By analyzing the elements, it can be estimated how they contribute to the synthesis process.

Bold and Graphic Pop Art-Inspired Designs

Pop art emerged in the 1950s as a departure from abstract expressionism, focusing instead on everyday objects and mass consumption culture. Notable pop artists include Andy Warhol, Roy Lichtenstein, David Hockney, and Richard Hamilton. A contemporary example, a 2016 ad campaign by Sagmeister & Walsh (image#6), showcases the evolution of pop art, returning it to commercial imagery as a form of presentation. The graphic nature of pop art tends to produce a 2D cartoon-like appearance that contrasts with the 3D goal of the synthesis.

Image#6: Advertisement from Sagmeister & Walsh

Fashion Illustration

Fashion illustration serves as a precursor to the production of clothes, allowing designers to brainstorm ideas visually. Unlike fashion sketches, fashion illustrations focus more on the figure wearing the clothes, rather than specific clothing items. The depiction of fashion models walking on the catwalk in fashion illustrations can be used to add swag to a prompt.

Image#7: Fashion Illustration from MelEesa Lorett

Eye-Catching Detail

An eye-catching detail is something very noticeable and attractive. In both image#1 and image#4 the term eye-catching detail carries a double meaning: the eyes are literally presented as eye-catching elements.

Matte Photo

A matte finish in photos lacks shine, offering a flat appearance ideal for displaying in bright areas like living rooms or offices. Matte quality can lend an introverted presentation quality to images.

Dotted Patterns

Dotted patterns, often associated with pop art imitations, can vary in size and texture, offering potential for adding dynamics to synthesis beyond traditional Lichtenstein-style dots.

Image#8: A non-Lichtenstein dotted surface

Woman Dancing in Floral Fire

Image#5 features a woman dancing in a floral fire background, with the floral fire representing a collection of flowers blossoming like flames. The woman’s dancing adds movement and spatial extension to the image.

Fine Feather Details

Fine feather details materialize on the woman’s pantsuit, suggesting interesting and irregular patterns. Peacock feathers, as used in the original prompt of image#1 for example, offer eye-catching details when combined with other elements.

5. Additional Analysis: a Red Bob Wig

At this point the belated impression emerged that the floral fire image (image#5) is rather unimpressive. At least it is not as impressive as its counterpart (image#4), and by far not as attractive as its original source image (image#2).

Upon visual examination of the acquired materials, to find a way out of the emerging impasse, the bob wig featured in the Sagmeister & Walsh ad (image#6) appeared as a standout element. The distinctive presence and curvy shape of the bob wig seem to provide an opportunity to bridge the flat 2D and spacious 3D aspects of the source images. Consequently, a vibrant bob wig was included in the analysis.

A downloaded commercial image of a girl with a red bob wig (image#9) was processed into MJ-descriptions, and these descriptions were subsequently regenerated into new images (among others image#10). From these, a fresh prompt (prompt#10) was selected for additional analysis.

Image#9 a wig ad (left) and image#10 a regenerated wig ad (right)

A Woman is Wearing a Red Bob Wig

To depict a woman wearing a red bob wig, the description should specify “a woman is wearing a wig” rather than “a woman with a wig.” Additionally, variations such as “a woman in a red bob wig” or “a woman with a red bob wig” are effective. A person can be connected with an object by specifying an activity in which both are involved, such as “a woman holding a glass of champagne” or “a girl riding a tiger.”

In the Style of Fauvist Color Choices

Fauvist color choices emphasize a limited set of basic colors and adjacent shades in paintings. Fauvist artists, such as Henri Matisse and André Derain, favored vibrant and bold colors.

Cabaret Scenes

Cabaret originated as entertainment in 19th-century French bars and restaurants. In the seventies the phenomenon was popularized by the movie Cabaret, starring Liza Minnelli. Cabaret scenes refer to poses and clothing styles associated with cabaret culture, adding a frivolous and distinctive entertainment style to image generation.

Image#11: Liza Minnelli in Cabaret (left) and a model posing for the Liza Minnelli Cabaret Costume (right)

Whitcomb-Girls

Whitcomb-Girls refers to the illustrations of women by US illustrator Jon Whitcomb. His illustrations depict fashionable, upper-class women with a warm yet unapproachable aura, contributing to the cool and superior expression of the ladies aimed for by the synthesis.

Image#12: Whitcomb-Girls

6. Image Synthesis

After analyzing prompt elements, in which red bob wig, cabaret scenes and Whitcomb-girls felt like real discoveries, I felt excited to start prompt synthesis and find out about the imagery it would generate.

Synthesis#1: Get the 2D Out

The generated images (among others image#13) exhibit an explicit 2D pop art style, deviating from the intended synthesis objective. After reassessment, elements such as the MJ image style and the keyword dotted are discarded to improve results.

The new series (among others image#14) retains a distinctive aesthetic appeal but still leans towards a pop art style. To mitigate associations with traditional art mediums, the elements fashion-illustration and Georgy Kurasov are removed.
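The trimming in this first synthesis amounts to removing comma-separated elements from a prompt. The synthesis prompts themselves are not reproduced in the text, so the Python sketch below uses prompt#4 as a stand-in. The helper drop_elements is hypothetical, and it only handles prompt text, not parameters such as the MJ image style.

```python
def drop_elements(prompt, unwanted):
    """Return the prompt without the listed comma-separated elements.

    Matching ignores upper and lower case and surrounding spaces.
    """
    unwanted_lower = {u.strip().lower() for u in unwanted}
    kept = [
        part.strip()
        for part in prompt.split(",")
        if part.strip() and part.strip().lower() not in unwanted_lower
    ]
    return ", ".join(kept)


# Stand-in prompt: this is prompt#4, not the actual synthesis prompt.
PROMPT_4 = (
    "a woman with a blue and red background, in the style of daz3d, "
    "bold and graphic pop art-inspired designs, fashion-illustration, "
    "matte photo, dotted, eye-catching detail, georgy kurasov"
)

trimmed = drop_elements(
    PROMPT_4, ["dotted", "fashion-illustration", "Georgy Kurasov"]
)
print(trimmed)
```

Removing elements one group at a time, as in the two steps above, makes it easier to see which element was responsible for the flat pop art look.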

Image#15 emerges as a notable outcome, showcasing a swirly 3D element while maintaining the desired cool aura. More depth should be added. This image marks the finish of the first synthesis attempt.

Image#13

Image#14

Image#15

Synthesis#2: Retro Hollywood Glamour

The second synthesis starts by converting Image#15 into MJ-descriptions. After some generations and modifications a new prompt was assembled.

The resulting images feature striking portrayals of women with red bob wigs, with Image#16 and #17 among the highlights.

Image#18 approaches the synthesis goal but retains a pop artsy 2D appearance, leading to the decision to halt the second synthesis for a reflective break.

Image#16

Image#17

Image#18

Synthesis#3: Involving the Karencore

Because image#15 aligns with the 3D aim of the exercise, its prompt served as the start for the third synthesis. The prompt was simplified by omitting elements like dotted dots and, spontaneously, by replacing Whitcomb-girl with a karencore element.

Karencore, I found out recently, refers to the bodily expressions of the Karen, a middle-class woman who acquired the status of internet meme. She is regularly seen with a bob cut, publicly throwing a tantrum when she experiences personal wrongdoing of some kind.

The resulting images (image#19) are visually appealing, but adjustments were made to make Karen look directly into the camera. In contrast with her internet reputation, Karen appears in image#20 as a rather attractive, cool woman. The images lean towards a red woosh, however, and a touch of chaos was added to the parameters to try to overcome this effect.

Image#21, then, contains miraculous qualities and can be considered to bring the third synthesis to a satisfying end.

Image#19

Image#20

Image#21

Final remarks

Wrapping it up, it can be concluded that the prompt synthesis resulted in various types of imagery and image styles, of which some can be considered to have distinct aesthetic qualities. On top of that, the results of the third synthesis comply with the synthesis objective. A spacious 3D environment has been generated, where a cool lady subtly invites the viewer to come in, and explore, beyond the feathers and bokeh lights, unknown terrains. Revisiting source images #1 and #2, one can argue that image #21 effectively brings elements of both images together.

A clear element of surrealism, as stated in the synthesis objective, is, however, missing from the output. I was silently hoping that the surreal would enter the synthesis without adding explicit prompt elements like surreal and surrealism, trying to avoid obvious reference to the works of Dalí, Ernst, Miró, and what have you. That may have been an unlucky choice, causing elements of surrealism to dissolve between Karen’s feathers.

As such, the presented imagery provides a point of reflection on the question of how to continue the search for the surreal. Also included in this reflection should be image#22 below. It was generated along the way by feeding Midjourney the raw interpretation of source image Relation Therapy (image#1) from the objective section. Ironically, without additional analysis and synthesis, it makes for the most surreal of all images here.
