Image#0: Cabaret Dancer

Red Bob Wig
A case of image synthesis in Midjourney
Screenpunk, 2024 March 9th

This blog entry details the creation of the above image, Cabaret Dancer (image#0), through the synthesis of two distinct source images, Relation Therapy and Gateway (image#1 and #2), that were produced in previous image generation sessions. The entry follows the steps leading to the creation of Cabaret Dancer.
At the end, some evaluative comments are made on the results. As a methodical note: initially there was no explicit plan for how to achieve the synthesis. Instead, the approach relied on a planned analysis of various prompt elements to bring out suggestions for assembling new prompts that could guide the synthesis.

1. Synthesis Objective

The intent of the synthesis is to merge elements from a flat, somewhat surrealist image (image#1) with a more expansive, three-dimensional one (image#2). The objective is to create vibrant and surreal image styles and visualizations. The source images can be interpreted as follows.

Relation Therapy exhibits a surrealist aesthetic akin to two-dimensional pop art. The depiction of a cool, somewhat distant lady on a couch, wearing a blue eyeshade and a peculiar cap featuring a peacock eye, conveys an air of intrigue. Her gesturing hand appears to dismiss or engage with something or someone, an ambiguous interaction with potential narrative implications. The contrasting primary colors employed in the image contribute to its striking presence.

In contrast, Gateway captivates with its three-dimensional depth. The swirling dress of the dancer leads the viewer’s gaze towards a blurred vanishing point, evoking a sense of movement and exploration. The dancer’s graceful pose and the soft illumination on the complementary shades of orange and yellow in her dress suggest a welcoming path into an unknown realm, inviting curiosity about the world hinted at by the gateway.

The contrast between the static nature of Relation Therapy and the dynamic energy of Gateway is stark. The former shows rigid, flat, and edgy qualities and colors, while the latter displays fluidity, spaciousness, and organic curves. It is these contrasts that the synthesis seeks to reconcile and harmonize.

Image#1: Relation Therapy
Image#2: Gateway

2. Image Descriptions

To obtain a fresh starting point for the synthesis, instead of invoking the original prompts of the source images, I opted to use the descriptions provided by Midjourney’s image-to-text command (/describe). When employed, Midjourney generates four descriptions for each specified image (image#3). When the descriptions are used as prompts themselves, the resulting images do not replicate the source image, but rather offer an interpretation of it.

Image#3: Image-to-text descriptions for both source images

3. Image Regeneration

When a description is used as a prompt through the text-to-image command (/imagine), Midjourney outputs four images. Consequently, the eight descriptions in image#3 (four for each of the two source images) serve as prompts for a total of 8 x 4 = 32 new images. To prevent getting overwhelmed by this rapidly multiplying number of images, the analysis focuses only on the prompts that regenerated the most promising remake of each source image. Images #4 and #5 were considered the most promising ones. Their respective prompts:
The elements accentuated in bold reappear in the prompt analysis below. As a general note on the prompts: during the sessions I output images in a 16:9 aspect ratio. I occasionally dump in a tuned style (ElW0Aru) to avoid the typical Midjourney-styled woman. Sometimes I add the element hyperrealism.

Image#4: Regenerated Relation Therapy
Image#5: Regenerated Gateway

4. Prompt Analysis

The objective of the prompt analysis is to gain a better understanding of the elements used in the source prompts and their impact on the resulting images. By analyzing the elements, it can be estimated how they contribute to the synthesis process.

Bold and Graphic Pop Art-Inspired Designs
Pop art emerged in the 1950s as a departure from abstract art, in particular abstract expressionism, focusing instead on everyday objects and mass consumption culture. Notable pop artists include Andy Warhol, Roy Lichtenstein, David Hockney, and Richard Hamilton. A contemporary example, a 2016 ad campaign by Sagmeister & Walsh (image#6), showcases the evolution of pop art, relaying it back to commercial imagery as a form of presentation. The graphic nature of pop art tends to produce a 2D cartoon-like appearance that contrasts with the 3D goal of the synthesis.

Image#6: Advertisement from Sagmeister & Walsh

Fashion Illustration
Fashion illustration serves as a precursor to the production of clothes, allowing designers to brainstorm ideas visually. Unlike fashion sketches, fashion illustrations focus more on the figure wearing the clothes than on specific clothing items. The depiction of fashion models walking the catwalk in fashion illustrations can be used to add swag to a prompt.

Image#7: Fashion Illustration from MelEesa Lorett

Eye-Catching Detail
An eye-catching detail is something very noticeable and attractive. In both image#1 and image#4 the term eye-catching detail carries a double meaning: the eyes are literally presented as eye-catching elements.

Matte Photo
A matte finish in photos lacks shine, offering a flat appearance ideal for display in bright areas like living rooms or offices. A matte quality can lend images an introverted presentation.

Dotted Patterns
Dotted patterns, often associated with pop art imitations, can vary in size and texture, offering potential for adding dynamics to the synthesis beyond traditional Lichtenstein-style dots.

Image#8: A non-Lichtenstein dotted surface

Woman Dancing in Floral Fire
Image#5 features a woman dancing against a floral fire background, with the floral fire representing a collection of flowers blossoming like flames. The woman’s dancing adds movement and spatial extension to the image.

Fine Feather Details
Fine feather details materialize on the woman’s pantsuit, suggesting interesting and irregular patterns. Peacock feathers, as used in the original prompt of image#1 for example, offer eye-catching details when combined with other elements.

5. Additional Analysis: a Red Bob Wig

At this point the belated impression emerged that the floral fire image (image#5) is rather unimpressive. At least it is not as impressive as its counterpart (image#4), and by far not as attractive as its original source image (image#2). Upon visual examination of the acquired materials, to find a way out of the emerging impasse, the bob wig featured in the Sagmeister & Walsh ad (image#6) appeared as a standout element. The distinctive presence and curvy shape of the bob wig seem to provide an opportunity to bridge the flat 2D and spacious 3D aspects of the source images. Consequently, a vibrant bob wig was included in the analysis. By processing a downloaded commercial image of a girl with a red bob wig (image#9) into MJ descriptions, and subsequently regenerating the descriptions into new images (among others image#10), a fresh prompt (prompt#10) was selected for additional analysis.

Image#9: a wig ad (left) and image#10: a regenerated wig ad (right)

A Woman is Wearing a Red Bob Wig
To depict a woman wearing a red bob wig, the description should specify “a woman is wearing a wig” rather than “a woman with a wig.” Additionally, variations such as “a woman in a red bob wig” or “a woman with a red bob wig” are effective. Connecting a person with an object can be done by specifying an activity in which both are involved, such as “a woman holding a glass of champagne” or “a girl riding a tiger.”

In the Style of Fauvist Color Choices
Fauvist color use emphasizes a limited set of basic colors and adjacent shades in paintings. Fauvist artists, such as Henri Matisse and André Derain, favored vibrant and bold colors.

Cabaret Scenes
Cabaret originated as entertainment in 19th-century French bars and restaurants. In the seventies the phenomenon was popularized by the movie Cabaret starring Liza Minnelli. Cabaret scenes refer to poses and clothing styles associated with cabaret culture, adding a frivolous and distinctive entertainment style to image generation.

Image#11: Liza Minnelli in Cabaret (left) and a model posing for the Liza Minnelli Cabaret Costume (right)

Whitcomb-Girls
Whitcomb-Girls refers to the illustrations of women by US illustrator Jon Whitcomb. His illustrations depict fashionable, upper-class women with a warm yet unapproachable aura, contributing to the cool and superior expression of the ladies that the synthesis aims for.

Image#12: Whitcomb-Girls

6. Image Synthesis

After analyzing the prompt elements, of which red bob wig, cabaret scenes and Whitcomb-girls felt like real discoveries, I was excited to start the prompt synthesis and find out what imagery it would generate.

Synthesis#1: Get the 2D Out
The generated images (among others image#13) exhibit an explicit 2D pop art style, deviating from the intended synthesis objective.
After reassessment, elements such as the MJ image style and the keyword dotted are discarded to improve results. The new series (among others image#14) retains a distinctive aesthetic appeal but still leans towards a pop art style. To mitigate associations with traditional art mediums, the elements fashion-illustration and Georgy Kurasov are removed. Image#15 emerges as a notable outcome, showcasing a swirly 3D element while maintaining the desired cool aura, although more depth should be added. This image marks the end of the first synthesis attempt.

Image#13
Image#14
Image#15

Synthesis#2: Retro Hollywood Glamour
The second synthesis starts by converting image#15 into MJ descriptions. After some generations and modifications a new prompt was assembled.
The resulting images feature striking portrayals of women with red bob wigs, with image#16 and #17 among the highlights. Image#18 approaches the synthesis goal but retains a pop-artsy 2D appearance, leading to the decision to halt the second synthesis for a reflective break.

Image#16
Image#17
Image#18

Synthesis#3: Involving the Karencore
Given image#15’s alignment with the 3D aim of the exercise, its prompt was simplified as a start for the third synthesis, omitting elements like dotted dots and spontaneously adding a karencore element in place of Whitcomb-girl. Karencore, I found out recently, refers to the bodily expressions of the Karen, a middle-class woman who acquired the status of internet meme, regularly seen with a bob cut and publicly throwing a tantrum when she experiences a personal wrong of some kind. The resulting images (image#19) are visually appealing, but adjustments were made to make Karen look directly into the camera. In contrast with her internet reputation, Karen appears in image#20 as a rather attractive, cool woman. The images lean towards a red whoosh, however, and a touch of chaos was added to the parameters to try to overcome this effect. Image#21, then, contains miraculous qualities and can be considered to bring the third synthesis to a satisfying end.

Image#19
Image#20
Image#21

Final remarks
Wrapping up, it can be concluded that the prompt synthesis resulted in various types of imagery and image styles, some of which can be considered to have distinct aesthetic qualities. On top of that, the results of the third synthesis comply with the synthesis objective. A spacious 3D environment has been generated, where a cool lady subtly invites the viewer to come in and explore unknown terrains beyond the feathers and bokeh lights. Revisiting source images #1 and #2, one can argue that image#21 effectively brings elements of both images together.
A clear element of surrealism, as stated in the synthesis objective, is, however, missing from the output. Silently I was assuming, or hoping, that the surreal would enter the synthesis without adding explicit prompt elements like surreal and surrealism, trying to avoid obvious reference to the works of Dalí, Ernst, Miró, and the like. That may have been an unlucky choice, causing elements of surrealism to dissolve between Karen’s feathers. As such, the presented imagery provides a point of reflection on the question of how to continue the search for the surreal. Also included in this reflection should be image#22 below. It was generated in between things by feeding Midjourney the raw interpretation of the source image Relation Therapy (image#1) from the objective section. Ironically, without additional analysis and synthesis, it makes for the most surreal of all images here.
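
As a technical footnote, the adding and dropping of prompt elements described in the three syntheses can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tooling used in the sessions. The Prompt class and its methods are hypothetical, the element list is a stand-in assembled from element names mentioned in this entry (the actual prompts are not reproduced here), and the chaos value of 20 is an arbitrary example. The only Midjourney specifics assumed are the --ar and --chaos parameters.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Prompt:
    """Hypothetical helper: a prompt as ordered elements plus parameters."""
    elements: List[str]
    params: Dict[str, str] = field(default_factory=dict)

    def without(self, *dropped: str) -> "Prompt":
        """Return a copy with the named elements removed."""
        kept = [e for e in self.elements if e not in dropped]
        return Prompt(kept, dict(self.params))

    def with_elements(self, *added: str) -> "Prompt":
        """Return a copy with new elements appended, skipping duplicates."""
        merged = list(self.elements)
        for element in added:
            if element not in merged:
                merged.append(element)
        return Prompt(merged, dict(self.params))

    def with_param(self, name: str, value: str) -> "Prompt":
        """Return a copy with one parameter set, e.g. chaos."""
        params = dict(self.params)
        params[name] = value
        return Prompt(list(self.elements), params)

    def render(self) -> str:
        """Join elements with commas and append parameters as --name value."""
        text = ", ".join(self.elements)
        flags = " ".join(f"--{k} {v}" for k, v in self.params.items())
        return f"{text} {flags}".strip()


# Stand-in starting point (illustrative element list, not the real prompt).
start = Prompt(
    [
        "a woman is wearing a red bob wig",
        "cabaret scenes",
        "Whitcomb-girls",
        "dotted",
        "fashion illustration",
    ],
    {"ar": "16:9"},
)

# Synthesis#1: drop the elements that pull towards 2D pop art.
step1 = start.without("dotted")
step2 = step1.without("fashion illustration")

# Synthesis#3: swap Whitcomb-girls for karencore and add a touch of chaos.
step3 = (
    step2.without("Whitcomb-girls")
    .with_elements("karencore")
    .with_param("chaos", "20")
)

print(step3.render())
# a woman is wearing a red bob wig, cabaret scenes, karencore --ar 16:9 --chaos 20
```

Each step returns a new Prompt instead of modifying the previous one, so every intermediate prompt stays available for comparison, much like the image series kept at each stage above.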