How to build a good style dataset?

💡 A dataset is a collection of images you want the AI to learn a style from.

  • How many images are required: 5-30

    For a broad and versatile style, the recommended ratio of images in the dataset is:

    • 70% people

    • 20% landscapes

    • 10% animals and objects

    The more varied the subjects in the dataset, the more flexibly the style can be prompted for things that aren’t in it. Conversely, if the dataset contains only portraits, it will be hard to generate anything other than portraits. Taken further, if it contains only portraits of women, it will be hard to generate portraits of men with the style.

  • Image quality: Use clear images of the highest quality available. If the training images contain pixelation, noise, image artifacts, or watermarks, the AI is likely to pick up on them.

  • Image size: Training images must have a 1:1 (square) aspect ratio. After uploading, images can be cropped in the app to ensure optimal composition.

  • Define Clear Objectives: Before sourcing or creating images, define what you want the AI to achieve. If you're aiming for a particular style, be specific about what that style entails. Images should have a consistent and cohesive style. One way to approach this is to consider whether all the images fit into the same overall theme or world, or to ask: what do the images have in common?

  • Identify Aesthetic Styles: Determine the desired aesthetic, which could be based on color schemes, artistic styles or mediums, or specific characters or visual themes. Knowing your target aesthetic will help in curating a coherent dataset.

  • Avoid images with text: Text and logos can influence the AI and might be regenerated in incoherent ways. Unless text is part of the aesthetic you're aiming for, avoid images with embedded text or signs, as they can lead to unpredictable results.

  • Check for image rights: Ensure you have the rights to use the images for training. Using protected images without permission can lead to legal issues. We recommend using your own AI-generated images to refine and develop your signature styles.

  • Review Images: Review all images before training to ensure consistency in style and quality. Remove any outliers which don't fit the aesthetic or quality you're aiming for.

  • Test and Iterate: Once you've compiled your dataset, use it to create your style. Test the outputs with different prompts and iterate on the dataset if necessary. If the results consistently show undesired features or miss desired ones, you might need to modify your dataset and recreate the style to correct this.

  • Share your style and seek feedback:

    Publish and share your custom style on the app and gather feedback on the quality and desired improvements. This can help in identifying any gaps or issues in the dataset that you might have missed.
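
The recommended 70/20/10 ratio can be turned into concrete image counts before you start collecting. The following is a minimal Python sketch, not part of the app: the function name `plan_counts` and the category keys are hypothetical, and the rounding rule (leftover images go to the categories with the largest remainder, preferring the smaller category on ties) is an assumption chosen to keep some variety in small datasets.

```python
# Recommended share of each subject type, in percent (from the guideline above).
# The key names are illustrative only.
RECOMMENDED_RATIO = {"people": 70, "landscapes": 20, "animals_objects": 10}


def plan_counts(total, ratio=RECOMMENDED_RATIO):
    """Split `total` images across categories according to `ratio` (percentages)."""
    if not 5 <= total <= 30:
        raise ValueError("a style dataset should contain 5 to 30 images")
    if sum(ratio.values()) != 100:
        raise ValueError("ratio percentages must add up to 100")

    # Integer arithmetic avoids floating-point rounding surprises.
    counts = {name: (total * pct) // 100 for name, pct in ratio.items()}
    remainders = {name: (total * pct) % 100 for name, pct in ratio.items()}

    # Flooring can leave a few images unassigned; hand them out one by one.
    leftover = total - sum(counts.values())
    # Assumption: largest remainder first; on ties, the smaller category wins.
    order = sorted(ratio, key=lambda n: (remainders[n], -counts[n]), reverse=True)
    for name in order[:leftover]:
        counts[name] += 1
    return counts
```

For a 10-image dataset, `plan_counts(10)` gives 7 images of people, 2 landscapes, and 1 animal or object. Treat the result as a starting point rather than a strict requirement.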

🖼️ Character Design Example:


  • Create recognizable characters by specifying features such as hair color and style, eye color, clothing, and accessories.

  • Assign specific colors to your characters for a more distinctive appearance.

Best Practices

  • Compile a dataset of images that depict your character in a variety of poses and settings to prevent overfitting and maintain stylistic flexibility.

  • Use images of the character against diverse backgrounds, such as indoor, outdoor, and solid-color backdrops.

  • Include images with different facial expressions — for instance, where the character is smiling or looks angry — to ensure a range of expressions.

  • Ensure that the dataset contains close-up, medium, and full-body images of your character for versatile recreations.

  • If certain places or objects are integral to your character's story or setting, include 1-2 images of these elements in the style of your character. This allows for the creation of scenes that are inspired by your character's aesthetic, even when the character is not present.

  • Incorporate a variety of artistic mediums into your training set to explore different styles; for example, include depictions of your character in both illustrated and photorealistic styles.
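
One way to check the best practices above before training is to tag each image and look for gaps. This is a minimal Python sketch under stated assumptions: the tag names, the `REQUIRED` table, and the function `missing_coverage` are hypothetical, and the tags are entered by hand rather than detected automatically.

```python
# Values each attribute should cover, taken from the best practices above.
# The attribute and tag names are illustrative, not an app feature.
REQUIRED = {
    "framing": {"close-up", "medium", "full-body"},
    "background": {"indoor", "outdoor", "solid-color"},
}


def missing_coverage(images, required=REQUIRED):
    """Return the required tag values that no image in the dataset covers."""
    missing = {}
    for attribute, wanted in required.items():
        # Collect every value of this attribute that appears in the dataset.
        seen = {img[attribute] for img in images if attribute in img}
        gap = wanted - seen
        if gap:
            missing[attribute] = sorted(gap)
    return missing


# Hand-written tags for a small character dataset.
dataset = [
    {"framing": "close-up", "background": "indoor", "expression": "smiling"},
    {"framing": "medium", "background": "outdoor", "expression": "angry"},
    {"framing": "close-up", "background": "outdoor", "expression": "smiling"},
]
print(missing_coverage(dataset))
```

Here the check reports that full-body framing and a solid-color background are still missing. The same pattern could be extended to expressions or artistic mediums by adding entries to `REQUIRED`.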

🖼️ Example 1:

In this dataset, all images share the same color scheme and mostly flat shapes, so those are the style elements the AI will pick up on most strongly. Each image has a different subject, so the AI won’t associate the style with a specific subject.

→ Images generated with a style made with a dataset containing the 5 images above:

The prompts are:

  1. “a beautiful daisy flower, 8k, high quality”

  2. “portrait of a beautiful woman with long hair, masterpiece”

  3. “an elegant car on a highway, 8k, high quality”

💡 Images need to have a 1:1 aspect ratio and measure 512x512 px.
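
If you prepare images before uploading, the square crop can be worked out from the image dimensions alone. This is a minimal Python sketch, not the app's own cropping logic: the function names are hypothetical, and centering the crop is an assumption (an off-center subject may need a manual crop instead).

```python
TARGET_SIDE = 512  # pixels, per the note above


def center_square_box(width, height):
    """Return (left, top, right, bottom) of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)


def check_image(width, height):
    """List reasons an image is not ready for a 512x512, 1:1 dataset."""
    problems = []
    if width != height:
        problems.append("not 1:1: crop to a square first")
    # Assumption: larger images can be downscaled safely, so only images
    # whose shorter side is below the target are flagged.
    if min(width, height) < TARGET_SIDE:
        problems.append("shorter side is under 512 px: upscaling would be needed")
    return problems
```

For a 1024x768 image, `center_square_box(1024, 768)` returns `(128, 0, 896, 768)`, a 768 px square that can then be scaled down to 512x512. The box follows the (left, upper, right, lower) convention used by image libraries such as Pillow.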

🖼️ Example 2:

These images share an isometric view and all depict a building, so that is what the AI will focus on, and it can be hard to prompt for other subjects and perspectives. Because beige dominates 4 of the 5 images, the generated results will be biased towards it.

→ Images generated with a style made from the 5 images above:

The prompts are:

  1. “a red hospital building, emergency station”

  2. “a greenhouse made of glass, victorian, lots of flowers”

  3. “a pyramid, lots of plants and flowers, futuristic, highly detailed, pink yellow and blue, high contrast”
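
The kind of bias seen in Example 2 (beige dominating 4 of 5 images, every image showing a building) can be spotted ahead of training by counting hand-assigned labels. This is a minimal Python sketch: the function name `overrepresented` and the 60% threshold are assumptions for illustration, not values used by the app.

```python
from collections import Counter


def overrepresented(labels, threshold=0.6):
    """Return labels whose share of the dataset is at or above `threshold`."""
    counts = Counter(labels)
    total = len(labels)
    return {
        label: n / total
        for label, n in counts.items()
        if n / total >= threshold
    }


# One dominant-color label per image, assigned by eye (as in Example 2).
dominant_colors = ["beige", "beige", "beige", "beige", "green"]
print(overrepresented(dominant_colors))
```

A flagged label is not necessarily a problem: if beige is part of the intended style, keep it. If it is not, swap in images with other dominant colors before recreating the style.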
