Join the conversation

Join the community of Machine Learners and AI enthusiasts.

Sign Up

All HF Hub posts

ronantakizawa 
posted an update 3 days ago
codelion 
posted an update about 23 hours ago
view post
Post
2676
Introducing Dhara-70M: A diffusion language model that achieves 3.8x higher throughput than autoregressive models!

Key findings from our research on optimal architectures for small language models:

→ Depth beats width: 32 layers outperforms 12 layers at the same parameter count
→ Best-in-class factuality: 47.5% on TruthfulQA
→ 10x training efficiency using WSD (Warmup-Stable-Decay) conversion
→ Canon layers add only 0.13% parameters but improve reasoning

We trained on 1B tokens using the optimal 50-30-20 dataset mix (PDFs + filtered web + educational content), then converted to diffusion with just 100M additional tokens.

Blog: https://huggingface.co/blog/codelion/optimal-model-architecture
Model: codelion/dhara-70m
  • 1 reply
·
MonsterMMORPG 
posted an update 1 day ago
view post
Post
3571
Qwen Image Edit 2511 Free and Open Source Crushes Qwen Image Edit 2509 and Challenges Nano Banana Pro : https://www.youtube.com/watch?v=YfuQuOk2sB0

Full tutorial link > https://www.youtube.com/watch?v=YfuQuOk2sB0

Full HF article here : https://huggingface.co/blog/MonsterMMORPG/qwen-image-edit-2511-free-and-open-source-crushes

Qwen Image Edit 2511 model just published and it is literally competing against Nano Banana Pro at image editing tasks. With native whopping 2560x2560 pixels image output capability and with only 12 steps it is next level. With our installers and specially made Quant FP8 Scaled model, you can run this amazing beast even as low as 6 GB GPUs. In this tutorial, I have compared Qwen Image Edit 2511 with previous successor model Qwen Image 2509 with 12 different unique and hard prompts and cases. Everything is step by step explained and provided.

Here check some comparison images
  • 5 replies
·
dhruv3006 
posted an update 3 days ago
view post
Post
1656
Hey folks 👋

We’re experimenting with a new response panel layout and would love your feedback.We’re testing a more focused experience:

- Only one response section open at a time (instead of multiple)
- The response body now takes up most of the vertical space, making it easier to read and inspect

The goal is simple: reduce clutter and keep the response as the main focus.

That said, we know many developers are comfortable with the classic layout (Postman / Bruno-style), where multiple sections can stay open at once.What would you prefer?

- A new, focused single-section layout
- The classic multi-section layout
- A toggle that lets you choose between both?

Download Voiden here :https://voiden.md/download
MikeDoes 
posted an update 1 day ago
view post
Post
1358
What if an AI agent could be tricked into stealing your data, just by reading a tool's description? A new paper reports it's possible.

The "Attractive Metadata Attack" paper details this stealthy new threat. To measure the real-world impact of their attack, the researchers needed a source of sensitive data for the agent to leak. We're proud that the AI4Privacy corpus was used to create the synthetic user profiles containing standardized PII for their experiments.

This is a perfect win-win. Our open-source data helped researchers Kanghua Mo, 龙昱丞, Zhihao Li from Guangzhou University and The Hong Kong Polytechnic University to not just demonstrate a new attack, but also quantify its potential for harm. This data-driven evidence is what pushes the community to build better, execution-level defenses for AI agents.

🔗 Check out their paper to see how easily an agent's trust in tool metadata could be exploited: https://arxiv.org/pdf/2508.02110

#OpenSource
#DataPrivacy
#LLM
#Anonymization
#AIsecurity
#HuggingFace
#Ai4Privacy
#Worldslargestopensourceprivacymaskingdataset
danielhanchen 
posted an update 4 days ago
inoculatemedia 
posted an update 3 days ago
view post
Post
1338
I’m opening the waitlist for what I believe to be the most advanced multimodal bridge for A/V professionals. Txt2img, img2video, editing, export to ProRes, apply Luts, Pexels and TouchDesigner integrations, music and voice gen, multichannel mixing.

Announcing: Lilikoi by Haawke AI

Teaser video made entirely with Lilikoi:
https://youtu.be/-O7DH7vFkYg?si=q2t5t6WjQCk2Cp0w

Https://Lilikoi.haawke.com

Technical brief:
https://haawke.com/technical_brief.html

kanaria007 
posted an update 3 days ago
view post
Post
237
✅ New Article: *Hardware Paths for Structured Intelligence* (Draft v0.1)

Title:
🧩 From CPUs to SI-GSPU: Hardware Paths for Structured Intelligence
🔗 https://huggingface.co/blog/kanaria007/hardware-paths-for-si

---

Summary:
Most “AI hardware” is built for dense matrix math. But real-world intelligence systems bottleneck elsewhere: **semantic parsing, structured memory, governance checks, auditability, and evaluation loops** — the parts that turn models into safe, resilient systems.

This article maps the gap clearly, and sketches how a future **SI-GSPU class accelerator** fits: not “a better GPU,” but a co-processor for **semantics + governance runtime**.

> GPUs carry the models.
> S
I-GSPU carries the rules that decide when models are allowed to act.

---

Why It Matters:
• Explains *why* “more GPU” doesn’t fix governance-heavy AI stacks
• Identifies what to accelerate: semantic transforms, memory ops, coverage/metrics, effect ledgers
• Shows how to build **SI-GSPU-ready** systems *today* on conventional clouds — without a rewrite later
• Keeps performance numbers explicitly **illustrative**, avoiding spec-washing

---

What’s Inside:
• Bottleneck taxonomy: where CPUs melt when you implement SI-Core properly
• Accelerator landscape (GPU/TPU/FPGA/DPU) vs. SI workloads
• What SI-GSPU would accelerate — and what it explicitly should *not*
• Determinism + audit chains + attestation requirements for governance-critical acceleration
• A staged roadmap: software-only → targeted offloads → semantic-fabric clusters
• A toy TCO intuition (shape, not pricing guidance)

---

📖 Structured Intelligence Engineering Series
A non-normative hardware guide: how to layer Structured Intelligence onto today’s compute, and where specialized silicon actually changes the economics.
  • 1 reply
·
dhruv3006 
posted an update 4 days ago
view post
Post
2740
OpenAPI specs are a great way to describe APIs in a clear, standard format. They provide a full overview of endpoints, methods, parameters etc. which makes working with APIs easier and more consistent.

Voiden lets you turn your OpenAPI spec into organized, ready-to-use API request files.

Just import your OpenAPI file, and you can immediately browse your endpoints, grouped by tags, and start testing without any manual setup.

The generated requests come pre-configured but fully editable, so you can customize them as you want.

If you want to get started with your existing APIs or try out new ones, this can save you quite some time.

Read the docs here : https://docs.voiden.md/docs/getting-started-section/getting-started/openapi-imports/
AbstractPhil 
posted an update about 8 hours ago
view post
Post
58
Happy Holidays all! geofractal architectural expansions; timm is now a core component for experimenting. As it stands, the system is growing rapidly in one direction, and timm brings a whole lot to the table in another rapid-prototyping direction. Therefore, timm is now a core component for ease-of-use.

BaseUtil is a new core component; aka src.geofractal.router.base_util inherits BaseComponent's behavior, so it should allow device movement for util operations which will direct utilization for device-to-device behavior for the upcoming accelerate integration.

I'm trying to mitigate the base component structure as much as possible, but the need to chain components in specific orders presented a unique problem. By compartmentalizing utils into structures that can be delegated and moved, these structures can be repurposed, expanded autonomously, reduced autonomously, and more.

ChainComponent inherits a subsystem specifically designed to organize multi-system multi-device formulas designated for inception and synchronization purposes. This is meant to allow distributed tasking to multiple-devices in chained utilization. This also enables ease-of-integration into nn.ModuleList with a few other caveats that will be ironed out meant to target wide-distributed models.

FusionComponent is specifically dedicated to the new fusion processing system meant for experimental expansion. This includes sub-module schedule control, Component and Tower functional control, device-movement, and will be packaged under the term "gfu.UtilType" as a standard naming convention.
"gfc.ComponentTypeName"
"gfr.RouterTypeName"
"gfu.UtilityTypeName"
"gft.TowerTypeName"
All of which are basically just import thing as.
"gf.AnythingTopLevelPackaged" which will include the core.

Better debugging for compilation
I'm in prototyping phases of a better debugging for compiled wide models and will prepare a baseline component readout structure by the end of the day today or tomorrow.