Automated Visual Story Synthesis with Character Trait Control

Open Access
Conference Proceedings
Authors: Yuetian Chen, Bowen Shi, Peiru Liu, Ruohua Li, Mei Si

Abstract: Visual storytelling is an art form that has been used for centuries to communicate stories, convey messages, and evoke emotions. Images and text must work in harmony to create a compelling narrative experience. With the rise of text-to-image generation models such as Stable Diffusion, it has become increasingly promising to investigate methods for automatically creating illustrations for stories. However, these diffusion models are usually designed to generate a single image, resulting in a lack of consistency between figures and objects across different illustrations of the same story, which is especially important in stories with human characters. This work introduces a novel technique for creating consistent human figures in visual stories. This is achieved in two steps. The first step is to collect human portraits with various identifying characteristics, such as gender and age, that describe the character. The second step is to use this collection to train DreamBooth to generate a unique token ID for each type of character. These IDs can then be used to replace the names of the story characters during image generation. By combining these two steps, we can create controlled human figures for various visual storytelling contexts.
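The name-replacement step described in the abstract can be sketched as a simple prompt-rewriting pass before calling the text-to-image model. A minimal sketch follows; the character names, the token IDs (e.g. `sks_woman_young`), and the archetype assignments are all illustrative assumptions, since the paper's actual IDs come from DreamBooth fine-tuning on the collected portrait set.

```python
# Hypothetical mapping from story characters to DreamBooth token IDs.
# In the paper's pipeline, each ID corresponds to a character archetype
# (e.g. gender and age) learned from a collection of portraits.
CHARACTER_TOKENS = {
    "Alice": "sks_woman_young",   # assumed archetype: young woman
    "Bob": "sks_man_elderly",     # assumed archetype: elderly man
}

def to_prompt(sentence: str, tokens: dict) -> str:
    """Replace story character names with their DreamBooth token IDs,
    producing a prompt suitable for the fine-tuned diffusion model."""
    for name, token in tokens.items():
        sentence = sentence.replace(name, token)
    return sentence

story = ["Alice walked into the garden.", "Bob waved at Alice."]
prompts = [to_prompt(s, CHARACTER_TOKENS) for s in story]
# Each prompt now names a learned visual identity instead of a story
# character, so the same figure is rendered consistently across scenes.
```

Because every occurrence of a given name maps to the same learned token, the diffusion model renders the same visual identity in every illustration of the story.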

Keywords: Visual Storytelling, Stable Diffusion, DreamBooth, Story Synthesis

DOI: 10.54941/ahfe1003275
