All HF Hub posts

prithivMLmods posted an update 2 days ago
Introducing the Qwen-Image-Edit-2511-LoRAs-Fast demo, featuring side-by-side comparison and contrast of image properties, built with Gradio combined with the Rerun SDK. It supports single- and multi-image edits with existing LoRAs that are loaded lazily on first use. (Note: This is still an experimental Space for Qwen-Image-Edit-2511.)
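For illustration, the lazy-loading pattern looks roughly like this in diffusers; the repo id, adapter id, and prompt below are assumptions for the sketch, not the Space's actual code:

```python
# Sketch of lazy LoRA loading (assumed repo/adapter ids, not the Space's code).
import torch
from diffusers import DiffusionPipeline
from diffusers.utils import load_image

pipe = DiffusionPipeline.from_pretrained(
    "Qwen/Qwen-Image-Edit-2511", torch_dtype=torch.bfloat16
).to("cuda")

_loaded = set()

def use_lora(repo_id: str):
    """Fetch and attach a LoRA only on first request, then just switch to it."""
    name = repo_id.split("/")[-1]  # derive a plain adapter name from the repo id
    if name not in _loaded:
        pipe.load_lora_weights(repo_id, adapter_name=name)
        _loaded.add(name)
    pipe.set_adapters([name])

use_lora("some-user/qwen-edit-style-lora")  # hypothetical adapter
image = load_image("input.png")
result = pipe(image=image, prompt="turn the scene to dusk").images[0]
```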

⭐ Space Demo: prithivMLmods/Qwen-Image-Edit-2511-LoRAs-Fast
⭐ GitHub: https://github.com/PRITHIVSAKTHIUR/Qwen-Image-Edit-2511-LoRAs-Fast-Multi-Image-Rerun
⭐ Collection: https://huggingface.co/collections/prithivMLmods/image-generation-apps-collection

To learn more, visit the app page or the respective model page!
prithivMLmods posted an update 1 day ago
Update: the TRELLIS.2 (text-to-3D, image-to-3D) Gradio demo with an embedded Rerun viewer is now available on Hugging Face, with improved visualization in the 3D model previewer. Generate assets and inspect them in the 3D viewer, powered by Microsoft's TRELLIS.2 and Tongyi-MAI's Z-Image-Turbo models.
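For anyone wiring up something similar: logging a generated mesh to a Rerun viewer takes only a few calls. A minimal sketch with stand-in geometry (the demo embeds the viewer in Gradio rather than spawning a window):

```python
# Minimal Rerun mesh preview with stand-in geometry (not the demo's actual code).
import numpy as np
import rerun as rr

rr.init("trellis2_preview", spawn=True)  # spawns a local viewer for this sketch

# Pretend these arrays came out of a TRELLIS.2 generation.
vertices = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0]], dtype=np.float32)
faces = np.array([[0, 1, 2]], dtype=np.uint32)

rr.log("assets/mesh", rr.Mesh3D(vertex_positions=vertices, triangle_indices=faces))
```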

🤗 TRELLIS.2 (Demo): prithivMLmods/TRELLIS.2-Text-to-3D
🕹️ GitHub: https://github.com/PRITHIVSAKTHIUR/TRELLIS.2-Text-to-3D-RERUN
🕹️ Collection: https://huggingface.co/collections/prithivMLmods/multimodal-implementations

To learn more, visit the app page or the respective model page!
csabakecskemeti posted an update 3 days ago
Just sharing the result of a homelab infrastructure experiment:

I've managed to set up distributed inference at home using a DGX Spark (128GB unified memory) and a Linux workstation with an RTX 6000 Pro (96GB GDDR7), connected via 100Gbps RoCEv2. The model I used (https://lnkd.in/gx6J7YuB) is about 140GB, so it could not fit on either GPU alone. Full setup and tutorial soon on devquasar.com.
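The post doesn't name the serving stack, but one way to reproduce this kind of split is llama.cpp's RPC backend. A rough sketch, with hostnames, ports, and file names made up:

```python
# Rough sketch, assuming llama.cpp's RPC backend (the post doesn't name its stack).
# Host, port, and model file names are illustrative.
import subprocess

# 1) On the DGX Spark, expose its backend to remote clients:
#      rpc-server -H 0.0.0.0 -p 50052

# 2) On the RTX 6000 Pro workstation, launch the server and shard the ~140GB
#    model across the local GPU and the remote one over the 100Gbps link:
subprocess.run([
    "llama-server",
    "-m", "model-140GB.gguf",
    "--rpc", "spark.lan:50052",  # remote ggml backend on the DGX Spark
    "-ngl", "99",                # offload all layers across both devices
])
```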

Screen recording:
https://lnkd.in/gKM9H5GJ
DawnC posted an update about 6 hours ago
VividFlow: AI Image-to-Video Generation 🎬✨

Bring your images to life with cinematic motion! VividFlow transforms any static image (portraits, artwork, products, or landscapes) into dynamic videos with professional animation quality.
The system supports both curated motion templates and custom natural-language prompts, giving you complete creative freedom to describe camera movements, subject actions, and atmospheric effects in your own words.

What's Inside?
🎭 Smart Motion Templates — 8 curated categories from fashion cinematography to wildlife animations, each with tested prompts that prevent common artifacts like phantom hands in portraits

⚡ Optimized Engine — Powered by Wan2.2-I2V-A14B with Lightning LoRA distillation and FP8 quantization for memory-efficient inference (see the sketch after this list)

🎯 Full Creative Control — Seed-based reproducibility for consistent results, adjustable duration from half a second to five seconds, optional AI prompt expansion with Qwen2.5 for enhanced descriptions, and real-time resolution preview
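If you want to see what the core generation step looks like, here is a minimal sketch using the public Wan2.2 diffusers checkpoint; the model id and parameters are assumptions, and VividFlow's Lightning LoRA and FP8 setup are not reproduced here:

```python
# Minimal sketch with the public Wan2.2 I2V checkpoint (assumed ids; VividFlow's
# Lightning LoRA + FP8 pipeline is not reproduced here).
import torch
from diffusers import WanImageToVideoPipeline
from diffusers.utils import export_to_video, load_image

pipe = WanImageToVideoPipeline.from_pretrained(
    "Wan-AI/Wan2.2-I2V-A14B-Diffusers", torch_dtype=torch.bfloat16
).to("cuda")

image = load_image("portrait.png")
generator = torch.Generator("cuda").manual_seed(42)  # fixed seed -> reproducible motion
frames = pipe(
    image=image,
    prompt="slow cinematic dolly-in, soft window light",
    num_frames=49,  # roughly 3 seconds at 16 fps
    generator=generator,
).frames[0]
export_to_video(frames, "vividflow_demo.mp4", fps=16)
```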

Current Performance & Development Roadmap
VividFlow runs on ZeroGPU, with generation taking about 3-4 minutes for a 3-second video. While I am actively optimizing the pipeline to reduce this time, the current version prioritizes output stability and quality; the results are worth the wait!

Future development focuses on dedicated GPU deployment for faster processing, batch generation to create multiple variations at once, and expanding our motion template library based on what the community wants to see.

👉 Try it now: DawnC/VividFlow

If VividFlow brings motion to your creative vision, please show your support with a ❤️; your engagement influences future development priorities!

#AI #ImageToVideo #GenerativeAI #VideoGeneration #DeepLearning
Kseniase posted an update 1 day ago
What we learned about memory in 2025: 8 comprehensive resources

If models forget everything, how can they be reliable? AI systems need to remember past interactions, update knowledge, stay consistent over time, and work beyond a single prompt. That's why memory has become one of the most talked-about topics in AI.
Here’s a useful set of studies and videos on where AI memory stands today:

1. Memory in the Age of AI Agents (2512.13564)
A great survey that organizes agent memory research. It gives concrete taxonomies across memory form, function, and dynamics, and summarizes benchmarks, frameworks, and emerging directions for building systematic agent memory systems.

2. When Will We Give AI True Memory? A conversation with Edo Liberty, founder and CEO of Pinecone -> https://youtu.be/ITbwVFZYepc?si=_lAbRHciC740dNz0
Edo Liberty discusses what real memory in LLMs requires beyond RAG - from scalable vector storage to reliable knowledge systems - and why storage, not compute, is becoming the key bottleneck for building dependable AI agents.

3. Why AI Intelligence is Nothing Without Visual Memory | Shawn Shen on the Future of Embodied AI -> https://youtu.be/3ccDi4ZczFg?si=SbJg487kwrkVXgUu
Shawn Shen argues AI needs a separate, hippocampus-like memory to move beyond chatbots, enabling long-term visual memory, object permanence, and on-device intelligence for robots, wearables, and the physical world.

4. From Human Memory to AI Memory: A Survey on Memory Mechanisms in the Era of LLMs (2504.15965)
Links human memory types to LLM memory, introduces a taxonomy across object, form, and time, and identifies concrete limitations and future research directions.

5. Rethinking Memory in AI: Taxonomy, Operations, Topics, and Future Directions -> https://arxiv.org/abs/2505.00675v2
Proposes a concrete taxonomy, core operations, and research directions to systematically organize and advance agent memory systems.

Read further below ⬇️
If you like it, also subscribe to the Turing Post: https://www.turingpost.com/subscribe
eaddario posted an update 2 days ago
Experimental global target bits-per-weight quantization of ServiceNow-AI/Apriel-1.6-15b-Thinker and zai-org/GLM-4.6V-Flash

Unlike standard llama.cpp quantizations that rely on fixed type heuristics (e.g., Q4_K_M), the Target BPW approach optimizes per-tensor precision where it matters most, and produces high-quality models that meet a precise global file-size target.

Key Advantages:
- VRAM Maximization: Can generate high-quality models sized exactly to fit hardware constraints (e.g., filling exactly 24GB of VRAM).
- Data-Driven Precision: The quantization mix is determined by measured weight-error sensitivity rather than hardcoded rules, often yielding better PPL/KLD-vs-size trade-offs (a toy sketch of the idea follows below).
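To make the idea concrete, here is a toy greedy allocator, not eaddario's actual tooling: every tensor starts at the lowest precision, then the upgrades with the best error reduction per extra bit are applied until the global budget is spent. The error model is a placeholder.

```python
# Toy sketch of target-BPW allocation (illustrative, not the actual methodology).
def allocate_bpw(tensors, target_bpw, levels=(2, 3, 4, 5, 6, 8)):
    """tensors: list of (name, n_params, sensitivity); returns {name: bits}."""
    total = sum(n for _, n, _ in tensors)
    budget = target_bpw * total                      # total bit budget
    alloc = {name: levels[0] for name, _, _ in tensors}
    spent = levels[0] * total
    # Candidate upgrades, best error-reduction-per-extra-bit first.
    upgrades = []
    for name, n, sens in tensors:
        for lo, hi in zip(levels, levels[1:]):
            gain = sens * (2.0 ** -lo - 2.0 ** -hi)  # placeholder error model
            upgrades.append((gain / (n * (hi - lo)), name, lo, hi, n))
    upgrades.sort(reverse=True)
    for _, name, lo, hi, n in upgrades:
        cost = n * (hi - lo)
        if alloc[name] == lo and spent + cost <= budget:
            alloc[name] = hi
            spent += cost
    return alloc

# Example: three tensors, aiming for ~4.0 bits per weight overall.
mix = allocate_bpw(
    [("attn_q", 1_000_000, 5.0), ("ffn_up", 4_000_000, 1.0), ("output", 500_000, 8.0)],
    target_bpw=4.0,
)
print(mix)
```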

Full benchmarks (PPL, KLD, ARC, MMLU, etc.) and methodology in the models' cards

eaddario/Apriel-1.6-15b-Thinker-GGUF
eaddario/GLM-4.6V-Flash-GGUF
MikeDoes posted an update 4 days ago
What if an AI agent could be tricked into stealing your data just by reading a tool's description? A new paper reports it's possible.

The "Attractive Metadata Attack" paper details this stealthy new threat. To measure the real-world impact of their attack, the researchers needed a source of sensitive data for the agent to leak. We're proud that the AI4Privacy corpus was used to create the synthetic user profiles containing standardized PII for their experiments.

This is a perfect win-win. Our open-source data helped researchers Kanghua Mo, Yucheng Long (龙昱丞), and Zhihao Li from Guangzhou University and The Hong Kong Polytechnic University not just to demonstrate a new attack, but also to quantify its potential for harm. This data-driven evidence is what pushes the community to build better, execution-level defenses for AI agents.

🔗 Check out their paper to see how easily an agent's trust in tool metadata could be exploited: https://arxiv.org/pdf/2508.02110

#OpenSource
#DataPrivacy
#LLM
#Anonymization
#AIsecurity
#HuggingFace
#Ai4Privacy
#Worldslargestopensourceprivacymaskingdataset
codelion posted an update 4 days ago
Introducing Dhara-70M: A diffusion language model that achieves 3.8x higher throughput than autoregressive models!

Key findings from our research on optimal architectures for small language models:

→ Depth beats width: a 32-layer model outperforms a 12-layer one at the same parameter count
→ Best-in-class factuality: 47.5% on TruthfulQA
→ 10x training efficiency using WSD (Warmup-Stable-Decay) conversion (schedule sketched below)
→ Canon layers add only 0.13% more parameters but improve reasoning
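For reference, the WSD schedule mentioned above is only a few lines; the phase lengths here are placeholders, not Dhara-70M's actual settings:

```python
# Sketch of a Warmup-Stable-Decay (WSD) learning-rate schedule.
# Phase lengths are placeholders, not the settings used for Dhara-70M.
def wsd_lr(step: int, max_lr: float, warmup: int, stable: int,
           decay: int, min_lr: float = 0.0) -> float:
    if step < warmup:                        # linear warmup
        return max_lr * (step + 1) / warmup
    if step < warmup + stable:               # constant plateau
        return max_lr
    t = min((step - warmup - stable) / decay, 1.0)
    return max_lr + (min_lr - max_lr) * t    # linear decay to min_lr

# e.g. lr at step 12_000 of a 1_000/8_000/4_000 schedule:
print(wsd_lr(12_000, 3e-4, warmup=1_000, stable=8_000, decay=4_000))
```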

We trained on 1B tokens using the optimal 50-30-20 dataset mix (PDFs + filtered web + educational content), then converted the model to diffusion with just 100M additional tokens.

Blog: https://huggingface.co/blog/codelion/optimal-model-architecture
Model: codelion/dhara-70m