BEAR: Benchmarking and Enhancing Multimodal Language Models for Atomic Embodied Capabilities Paper • 2510.08759 • Published Oct 9 • 46
OminiControl: Generate an edited image based on text and input image Space • Running on Zero • Featured • 922
DesignEdit: Multi-Layered Latent Decomposition and Fusion for Unified & Accurate Image Editing Paper • 2403.14487 • Published Mar 21, 2024 • 1
Lift3D Foundation Policy: Lifting 2D Large-Scale Pretrained Models for Robust 3D Robotic Manipulation Paper • 2411.18623 • Published Nov 27, 2024 • 1
VisionZip: Longer is Better but Not Necessary in Vision Language Models Paper • 2412.04467 • Published Dec 5, 2024 • 118