arXiv Cluster Highlights
Related Papers
- arXiv:2604.02327v1
Pretrained Vision Transformers (ViTs) such as DINOv2 and MAE provide generic image features that can be applied to a variety of downstream tasks such as retrieval, classification, and segmentation. Ho…
- arXiv:2604.02265v1
Controlling the behavior of text-to-image generative models is critical for safe and practical deployment. Existing safety approaches typically rely on model fine-tuning or curated datasets, which can…
- arXiv:2604.02088v1
Continuous image editing aims to provide slider-style control of edit strength while preserving source-image fidelity and maintaining a consistent edit direction. Existing learning-based slider method…
- arXiv:2604.01989v1
Like a body at rest that stays at rest, we find that visual attention in multimodal large language models (MLLMs) exhibits pronounced inertia, remaining largely static once settled during early decodi…
- arXiv:2604.01888v1
Text-to-image generative models are widely deployed in creative tools and online platforms. To mitigate misuse, these systems rely on safety filters and moderation pipelines that aim to block harmful …
- arXiv:2604.01826v1
Recent Text-to-Image (T2I) models based on rectified-flow transformers (e.g., SD3, FLUX) achieve high generative fidelity but remain vulnerable to unsafe semantics, especially when triggered by multi-…
- arXiv:2604.01715v1
Recent advances in flow-based generative models have enabled training-free, text-guided image editing by inverting an image into its latent noise and regenerating it under a new target conditional gui…
