    A Coding Guide to High-Quality Image Generation, Control, and Editing Using HuggingFace Diffusers

    February 21, 2026 · 5 Mins Read


    In this tutorial, we build a practical image-generation workflow with the Diffusers library. We start by stabilizing the environment, then generate high-quality images from text prompts using Stable Diffusion with the UniPC scheduler. We accelerate inference with an LCM-LoRA adapter (a LoRA-based latent consistency approach), guide composition with ControlNet conditioned on Canny edges, and finally perform localized edits via inpainting. Throughout, we focus on practical techniques that balance image quality, speed, and controllability.

    !pip -q uninstall -y pillow Pillow || true
    !pip -q install --upgrade --force-reinstall "pillow<12.0"
    # peft is added because recent Diffusers releases rely on it for load_lora_weights
    !pip -q install --upgrade diffusers transformers accelerate safetensors huggingface_hub opencv-python peft

    import os, math, random
    import torch
    import numpy as np
    import cv2
    from PIL import Image, ImageDraw, ImageFilter
    from diffusers import (
        StableDiffusionPipeline,
        StableDiffusionInpaintPipeline,
        ControlNetModel,
        StableDiffusionControlNetPipeline,
        UniPCMultistepScheduler,
    )

    We prepare a clean, compatible runtime by reinstalling Pillow pinned below version 12 and installing the libraries the workflow depends on. Pinning Pillow avoids image-processing breakage from a mismatched preinstalled version. We then import the core modules needed for generation, control, and inpainting.
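A quick way to confirm that the pin took effect is to read the installed versions back after the install step. The sketch below uses only the standard library. The helper names `installed_version` and `major_below` are illustrative and not part of pip or Diffusers, and the version parsing is deliberately coarse: it only reads the leading integer.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def major_below(version, limit):
    """True if the leading (major) component of `version` is below `limit`.

    Hypothetical helper: it only parses the first dot-separated field,
    which is enough for a coarse check such as "pillow<12.0".
    """
    return int(version.split(".")[0]) < limit


if __name__ == "__main__":
    for name in ["pillow", "diffusers", "transformers", "accelerate"]:
        print(name, "->", installed_version(name) or "not installed")
    pillow = installed_version("pillow")
    if pillow is not None:
        print("pillow pin satisfied:", major_below(pillow, 12))
```

In a notebook, the runtime usually needs a restart after force-reinstalling Pillow before the new version is the one actually imported.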

    def seed_everything(seed=42):
        # Seed every RNG the workflow touches: Python, NumPy, and PyTorch (CPU and CUDA)
        random.seed(seed)
        np.random.seed(seed)
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)

    def to_grid(images, cols=2, bg=255):
        # Accept a single image as well as a list
        if isinstance(images, Image.Image):
            images = [images]
        # All tiles are assumed to share the size of the first image
        w, h = images[0].size
        rows = math.ceil(len(images) / cols)
        grid = Image.new("RGB", (cols*w, rows*h), (bg, bg, bg))
        for i, im in enumerate(images):
            grid.paste(im, ((i % cols)*w, (i // cols)*h))
        return grid

    # Use half precision on GPU, full precision on CPU
    device = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = torch.float16 if device == "cuda" else torch.float32
    print("device:", device, "| dtype:", dtype)

    We define utility functions for reproducibility and for arranging outputs in a grid. We seed the Python, NumPy, and PyTorch random number generators so that repeated runs on the same hardware and library versions produce the same images. We also detect the available hardware and select float16 on a GPU or float32 on a CPU.
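The grid helper above boils down to a little index arithmetic, which can be sketched on its own without loading any images. The function name `grid_layout` is made up for this illustration; it mirrors the row and column math of `to_grid` under the same assumption that every tile has identical dimensions.

```python
import math


def grid_layout(n_images, cols, tile_w, tile_h):
    """Return (canvas_size, offsets) for an n-image grid filled row by row."""
    rows = math.ceil(n_images / cols)
    canvas_size = (cols * tile_w, rows * tile_h)
    # Image i lands in column i % cols and row i // cols
    offsets = [((i % cols) * tile_w, (i // cols) * tile_h) for i in range(n_images)]
    return canvas_size, offsets


if __name__ == "__main__":
    size, offsets = grid_layout(3, cols=2, tile_w=768, tile_h=512)
    print(size)     # (1536, 1024)
    print(offsets)  # [(0, 0), (768, 0), (0, 512)]
```

With three tiles and two columns, the last row is only half filled, so the remaining cell simply shows the background color.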

    seed_everything(7)
    BASE_MODEL = "runwayml/stable-diffusion-v1-5"

    pipe = StableDiffusionPipeline.from_pretrained(
        BASE_MODEL,
        torch_dtype=dtype,
        safety_checker=None,
    ).to(device)

    # Swap the default scheduler for UniPC, reusing the existing scheduler config
    pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)

    # Trade a little speed for lower peak GPU memory
    if device == "cuda":
        pipe.enable_attention_slicing()
        pipe.enable_vae_slicing()

    prompt = "a cinematic photo of a futuristic street market at dusk, ultra-detailed, 35mm, volumetric lighting"
    negative_prompt = "blurry, low quality, deformed, watermark, text"

    img_text = pipe(
        prompt=prompt,
        negative_prompt=negative_prompt,
        num_inference_steps=25,
        guidance_scale=6.5,
        width=768,
        height=512,
    ).images[0]

    We initialize the base Stable Diffusion pipeline and switch to the UniPC multistep scheduler, which is designed to produce good results in relatively few steps. We generate a 768×512 image directly from a text prompt using 25 inference steps and a guidance scale of 6.5. This establishes a baseline for the later improvements in speed and control.
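The `guidance_scale` argument controls classifier-free guidance: at each denoising step the pipeline combines an unconditional noise prediction (conditioned on the negative prompt when one is given) with a prompt-conditioned one. The sketch below shows that combination on plain Python lists with toy numbers; `apply_guidance` is an illustrative name, and real predictions are tensors produced by the UNet rather than two-element lists.

```python
def apply_guidance(uncond, cond, guidance_scale):
    """Combine unconditional and prompt-conditioned noise predictions.

    Implements uncond + scale * (cond - uncond) element-wise on plain lists.
    """
    return [u + guidance_scale * (c - u) for u, c in zip(uncond, cond)]


if __name__ == "__main__":
    uncond = [0.0, 1.0]   # toy "negative prompt" prediction
    cond = [1.0, 3.0]     # toy "with prompt" prediction
    for scale in (1.0, 1.5, 6.5):
        print(scale, apply_guidance(uncond, cond, scale))
```

A scale of 1.0 returns the conditioned prediction unchanged, while larger values push the result further away from the unconditional one, which tends to follow the prompt more closely at some cost in variety.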

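    from diffusers import LCMScheduler

    LCM_LORA = "latent-consistency/lcm-lora-sdv1-5"
    pipe.load_lora_weights(LCM_LORA)

    # LCM-LoRA is intended to be sampled with LCMScheduler for few-step generation
    pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)

    # Fusing merges the adapter into the base weights; fall back gracefully if it fails
    try:
        pipe.fuse_lora()
        lora_fused = True
    except Exception as e:
        lora_fused = False
        print("LoRA fuse skipped:", e)

    fast_prompt = "a clean product photo of a minimal smartwatch on a reflective surface, studio lighting"
    fast_images = []
    # Compare very low step counts side by side
    for steps in [4, 6, 8]:
        fast_images.append(
            pipe(
                prompt=fast_prompt,
                negative_prompt=negative_prompt,
                num_inference_steps=steps,
                guidance_scale=1.5,
                width=768,
                height=512,
            ).images[0]
        )

    grid_fast = to_grid(fast_images, cols=3)
    print("LoRA fused:", lora_fused)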

    # Draw a simple layout: a rectangle, a circle, and a horizon line
    W, H = 768, 512
    layout = Image.new("RGB", (W, H), "white")
    draw = ImageDraw.Draw(layout)
    draw.rectangle([40, 80, 340, 460], outline="black", width=6)
    draw.ellipse([430, 110, 720, 400], outline="black", width=6)
    draw.line([0, 420, W, 420], fill="black", width=5)

    # Extract Canny edges and replicate them to three channels for ControlNet
    edges = cv2.Canny(np.array(layout), 80, 160)
    edges = np.stack([edges]*3, axis=-1)
    canny_image = Image.fromarray(edges)

    CONTROLNET = "lllyasviel/sd-controlnet-canny"
    controlnet = ControlNetModel.from_pretrained(
        CONTROLNET,
        torch_dtype=dtype,
    ).to(device)

    cn_pipe = StableDiffusionControlNetPipeline.from_pretrained(
        BASE_MODEL,
        controlnet=controlnet,
        torch_dtype=dtype,
        safety_checker=None,
    ).to(device)

    cn_pipe.scheduler = UniPCMultistepScheduler.from_config(cn_pipe.scheduler.config)

    if device == "cuda":
        cn_pipe.enable_attention_slicing()
        cn_pipe.enable_vae_slicing()

    cn_prompt = "a modern cafe interior, architectural render, soft daylight, high detail"
    img_controlnet = cn_pipe(
        prompt=cn_prompt,
        negative_prompt=negative_prompt,
        image=canny_image,
        num_inference_steps=25,
        guidance_scale=6.5,
        controlnet_conditioning_scale=1.0,
    ).images[0]

    We accelerate inference by loading and fusing the LCM-LoRA adapter, then compare fast sampling at 4, 6, and 8 steps with a low guidance scale. Next, we draw a simple layout, extract its Canny edges as a conditioning image, and apply ControlNet so that the generated scene follows that layout. This preserves the composition while the text prompt still determines the content and style.
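Conceptually, fusing a LoRA adapter means adding its low-rank update to the base weight matrix once, so that inference no longer pays for the extra adapter matrices. The sketch below illustrates the idea on tiny pure-Python matrices. It is not how Diffusers implements fusing; the names `matmul` and `fuse_low_rank` are hypothetical, and the usual alpha/rank factor is folded into a single `scale` argument.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def fuse_low_rank(weight, lora_down, lora_up, scale=1.0):
    """Return weight + scale * (lora_up @ lora_down) as a new matrix."""
    delta = matmul(lora_up, lora_down)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


if __name__ == "__main__":
    weight = [[1.0, 0.0], [0.0, 1.0]]   # 2x2 base weight (toy values)
    lora_down = [[1.0, 2.0]]            # rank-1 "down" matrix, shape 1x2
    lora_up = [[0.5], [0.0]]            # rank-1 "up" matrix, shape 2x1
    print(fuse_low_rank(weight, lora_down, lora_up))
```

Because the update has rank 1 here, it is stored as three numbers instead of four; for real attention layers with thousands of rows and columns, the same trick is what keeps adapter files small.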

    # White (255) marks the region to repaint; a light blur feathers the edge
    mask = Image.new("L", img_controlnet.size, 0)
    mask_draw = ImageDraw.Draw(mask)
    mask_draw.rectangle([60, 90, 320, 170], fill=255)
    mask = mask.filter(ImageFilter.GaussianBlur(2))

    inpaint_pipe = StableDiffusionInpaintPipeline.from_pretrained(
        BASE_MODEL,
        torch_dtype=dtype,
        safety_checker=None,
    ).to(device)

    inpaint_pipe.scheduler = UniPCMultistepScheduler.from_config(inpaint_pipe.scheduler.config)

    if device == "cuda":
        inpaint_pipe.enable_attention_slicing()
        inpaint_pipe.enable_vae_slicing()

    inpaint_prompt = "a glowing neon sign that says 'CAFÉ', cyberpunk style, realistic lighting"

    img_inpaint = inpaint_pipe(
        prompt=inpaint_prompt,
        negative_prompt=negative_prompt,
        image=img_controlnet,
        mask_image=mask,
        # Pass the size explicitly so the 768x512 input is not resized to the pipeline default
        width=img_controlnet.width,
        height=img_controlnet.height,
        num_inference_steps=30,
        guidance_scale=7.0,
    ).images[0]

    # Save every intermediate and final result for inspection
    os.makedirs("outputs", exist_ok=True)
    img_text.save("outputs/text2img.png")
    grid_fast.save("outputs/lora_fast_grid.png")
    layout.save("outputs/layout.png")
    canny_image.save("outputs/canny.png")
    img_controlnet.save("outputs/controlnet.png")
    mask.save("outputs/mask.png")
    img_inpaint.save("outputs/inpaint.png")

    print("Saved outputs:", sorted(os.listdir("outputs")))
    print("Done.")

    We create a mask that isolates a rectangular region and feather its edge with a light Gaussian blur. We then apply inpainting with a targeted prompt so that only the masked area is regenerated while the rest of the image stays intact. Finally, we save all intermediate and final outputs to disk for inspection and reuse.
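The role of the feathered mask can be pictured as a per-pixel weighted average between the original image and the newly generated content. The sketch below shows that blend on a single row of grayscale values. It is a simplification under stated assumptions: the pipeline applies the mask during denoising rather than compositing finished pixels, and `blend_with_mask` is an illustrative name, not a Diffusers function.

```python
def blend_with_mask(original, generated, mask):
    """Per-pixel blend: mask 255 takes `generated`, mask 0 keeps `original`."""
    out = []
    for o, g, m in zip(original, generated, mask):
        alpha = m / 255.0  # convert the 8-bit mask value to a 0..1 weight
        out.append(round(alpha * g + (1.0 - alpha) * o))
    return out


if __name__ == "__main__":
    original = [10, 10, 10, 10]
    generated = [200, 200, 200, 200]
    mask = [0, 64, 191, 255]  # feathered edge from untouched to fully edited
    print(blend_with_mask(original, generated, mask))
```

Intermediate mask values produce intermediate pixels, which is why blurring the mask softens the seam between the edited region and its surroundings.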

    In conclusion, we demonstrated how a basic Diffusers text-to-image pipeline can be extended into a flexible image-generation workflow. We moved from pure text-to-image generation to fast sampling, structural control, and targeted image editing without changing frameworks or tooling. Combining schedulers, LoRA adapters, ControlNet, and inpainting yields controllable, efficient generative pipelines that are easy to extend for more advanced creative or applied use cases.
