Unofficial implementation of "Prompt-to-Prompt Image Editing with Cross Attention Control" with Stable Diffusion
Updated Oct 18, 2022 - Jupyter Notebook
[TPAMI'23] Unifying Flow, Stereo and Depth Estimation
Pocket-Sized Multimodal AI for content understanding and generation across multilingual texts, images, and 🔜 video, up to 5x faster than OpenAI CLIP and LLaVA 🖼️ & 🖋️
T-GATE: Temporally Gating Attention to Accelerate Diffusion Model for Free!
🚀 Cross attention map tools for huggingface/diffusers
Implementation of CALM from the paper "LLM Augmented LLMs: Expanding Capabilities through Composition", out of Google DeepMind
1-shot image segmentation using Stable Diffusion
Code for selecting an action based on multimodal inputs; in this case, the inputs are voice and text.
[NeurIPS 2023] Official implementation of the paper "CAST: Cross-Attention in Space and Time for Video Action Recognition"
The official repository of "Energy-Based Cross Attention for Bayesian Context Update in Text-to-Image Diffusion Models".
This is the implementation of the paper "Enhanced Photovoltaic Power Forecasting: An iTransformer and LSTM-Based Model Integrating Temporal and Covariate Interactions"
A lightweight PyTorch implementation of the Transformer-XL architecture proposed by Dai et al. (2019)
[ITSC-2023] HRFuser: A Multi-resolution Sensor Fusion Architecture for 2D Object Detection
TensorFlow implementation of 'Robust Image Watermarking based on Cross-Attention and Invariant Domain Learning'
TGRS: Code for "Unsupervised Hybrid Network of Transformer and CNN for Blind Hyperspectral and RGB Image Fusion"
Transcription factor binding site prediction for novel DNA sequence data aiding in mutation identification and drug discovery
Detect Deepfaked Faces Using Multiple Deep Learning Models
This model synthesises high-fidelity fashion videos from single images featuring spontaneous and believable movements.
SOVL System (Self-Organizing Virtual Lifeform): An AI agent with autonomous learning capabilities, combining a base LLM with a scaffolded second dynamic LLM for continuous learning via sleep and dream mechanisms
Official source code of the paper: "Unveiling Interpretability in Self-Supervised Speech Representations for Parkinson’s Diagnosis"