Repository History
Explore all analyzed open source repositories

Leffa: Controllable Person Image Generation with Flow Fields in Attention
Leffa is a unified framework for controllable person image generation that supports precise control of both appearance (virtual try-on) and pose (pose transfer). It addresses a common failure mode, the distortion of fine-grained textural detail, by learning flow fields in attention that guide each target query to attend to the correct reference key. It reports state-of-the-art performance, maintaining high image quality while significantly reducing detail distortion.
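One way to picture "flow fields in attention" is to read an attention map as a set of coordinates: for each target query, take the attention-weighted average position of the reference keys. The sketch below is illustrative only and is not taken from the Leffa codebase. The function names `softmax` and `attention_to_flow`, the row-major grid layout, and the scaled dot-product scoring are all assumptions made for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_to_flow(queries, keys, width):
    """Hypothetical helper: map each target query to an (x, y) reference position.

    `keys` are reference features laid out row-major on a grid `width` cells wide
    (an assumption of this sketch). The result is the attention-weighted mean
    coordinate of the keys, i.e. where each query "points" in the reference.
    """
    scale = 1.0 / math.sqrt(len(queries[0]))
    flow = []
    for q in queries:
        # Scaled dot-product scores between this query and every reference key.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        weights = softmax(scores)
        # Expected grid coordinate under the attention distribution.
        x = sum(w * (idx % width) for idx, w in enumerate(weights))
        y = sum(w * (idx // width) for idx, w in enumerate(weights))
        flow.append((x, y))
    return flow


# A 2x2 reference grid with one distinctive feature per cell.
keys = [[10, 0, 0, 0], [0, 10, 0, 0], [0, 0, 10, 0], [0, 0, 0, 10]]
sharp = attention_to_flow([[0, 0, 0, 10]], keys, width=2)    # matches the bottom-right key
diffuse = attention_to_flow([[0, 0, 0, 0]], keys, width=2)   # attends to all keys equally
```

In this toy setup, a query that matches one key strongly points at that key's grid cell, while an uninformative query points at the grid centre. A blurry, centre-seeking flow of that kind is a simple way to picture the detail distortion the summary above describes.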

chatterbox-vllm: Accelerating Chatterbox TTS with vLLM for Enhanced Performance
chatterbox-vllm is a high-performance port of the Chatterbox Text-to-Speech (TTS) model to vLLM, designed to significantly improve generation speed and GPU memory efficiency. This personal project aims to provide a more efficient, easier-to-integrate solution for speech synthesis, with substantial speedups over the original implementation. It is currently usable and demonstrates benchmark-topping throughput, but it relies on internal vLLM APIs and hacky workarounds, and further refactoring is planned.