txt2img

Discussion

Eduard Lebedyuk · Sep 16, 2022

#Artificial Intelligence (AI) #Machine Learning (ML) #InterSystems IRIS

Several models, such as DALL-E, Midjourney, and StableDiffusion, have recently become available. All of them generate digital images from natural language descriptions. The most interesting one, in my opinion, is StableDiffusion, which is open source and was released just a few weeks ago. There's already an entire community trying to leverage it for various use cases.

The proliferation of txt2img models will likely bring changes to marketing, advertising, design, and learning - any field that requires a large amount of image generation. Artists, and even people without technical or artistic know-how, can prototype and iterate on visualizations with unbelievable ease.
And maybe healthcare, too - I can certainly foresee automatic image generation from text descriptions being helpful in, for example, psychiatric care.

Here are some outputs from InterSystems, IRIS, Interoperability, HealthShare, and similar prompts (generated on StableDiffusion v1.4, seed: 714159486, width: 512, height: 512, steps: 50, cfg_scale: 8, sampler: k_euler_a, upscaler: RealESRGAN_x4). You can see several characteristic artifacts, such as gibberish text. Subject matter that is abstract from the model's "point of view" also produces wildly different outputs.
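For anyone who wants to try similar settings, here is a rough sketch of how the parameters above might map onto the Hugging Face diffusers library. This is not how the images in this post were produced: the mapping of k_euler_a to EulerAncestralDiscreteScheduler, the model id CompVis/stable-diffusion-v1-4, and the helper names GenerationSettings, to_pipeline_kwargs, and generate are all assumptions made for illustration. The RealESRGAN_x4 upscaling step is left out, and the same seed should not be expected to reproduce the same images across different implementations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenerationSettings:
    """Parameters quoted in the post (names here are illustrative)."""
    seed: int = 714159486
    width: int = 512
    height: int = 512
    steps: int = 50
    cfg_scale: float = 8.0


def to_pipeline_kwargs(settings: GenerationSettings) -> dict:
    """Translate the post's parameter names into diffusers call arguments."""
    return {
        "width": settings.width,
        "height": settings.height,
        "num_inference_steps": settings.steps,  # "steps" in the post
        "guidance_scale": settings.cfg_scale,   # "cfg_scale" in the post
    }


def generate(prompt: str, settings: GenerationSettings = GenerationSettings()):
    """Generate one image for a prompt (downloads the model on first use)."""
    # Heavy imports are kept local so the settings helpers work without them.
    import torch
    from diffusers import EulerAncestralDiscreteScheduler, StableDiffusionPipeline

    # Assumption: this checkpoint corresponds to "StableDiffusion v1.4".
    pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")
    # Assumption: Euler ancestral is the closest match to "k_euler_a".
    pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)

    # A seeded generator makes repeated runs of this script deterministic.
    generator = torch.Generator().manual_seed(settings.seed)
    result = pipe(prompt, generator=generator, **to_pipeline_kwargs(settings))
    return result.images[0]
```

Calling generate("InterSystems IRIS").save("iris.png") would then write a single 512x512 image; running on CPU works but is slow, so moving the pipeline to a GPU is worth considering.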

And these are results for InterSystems IRIS for some reason:

And here are the training images from LAION-5B tagged InterSystems.