thank you, I'll try these first. really appreciate the help!
I saw this. Is this good enough? https://huggingface.co/mradermacher/Distil-PII-Llama-3.2-1B-Instruct-GGUF
thank you so much for the detailed reply. i was checking the deployment guide for https://huggingface.co/distil-labs/Distil-PII-Llama-3.2-1B-Instruct but it's not available on their website anymore
- Molmo2 HF Demo: prithivMLmods/Molmo2-HF-Demo
- Model Collection: https://huggingface.co/collections/allenai/molmo2
- Related Multimodal Space Collection: https://huggingface.co/collections/prithivMLmods/multimodal-implementations
To learn more, visit the app page or the respective model pages!
Has a 1M context window & best-in-class performance on SWE-Bench, reasoning & chat. Run the MoE model locally with 24GB RAM.
GGUF: unsloth/Nemotron-3-Nano-30B-A3B-GGUF
Step-by-step guide: https://docs.unsloth.ai/models/nemotron-3
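If you'd rather run it from Python than the CLI, here is a minimal sketch using llama-cpp-python's `from_pretrained` helper; the quant filename glob is an assumption, so check the repo's file list for the exact name:

```
# Minimal llama-cpp-python sketch: pull a quant straight from the HF repo.
# The filename glob is illustrative; pick whichever quant fits your RAM.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="unsloth/Nemotron-3-Nano-30B-A3B-GGUF",
    filename="*Q4_K_M.gguf",  # glob matched against the repo's GGUF files
    n_ctx=8192,               # context window for this session
)

out = llm("Summarize mixture-of-experts routing in two sentences.", max_tokens=128)
print(out["choices"][0]["text"])
```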
i made mine. too embarrassing to post 😅
AI coding is moving fast, and it's getting harder to tell what actually works. Agents, workflows, context management and many other aspects are reshaping how software gets built.
We've collected a set of resources to help you understand how AI coding is evolving today and what building strategies work best:
1. AI Agentic Programming: A Survey of Techniques, Challenges, and Opportunities (2508.11126)
Provides a clear taxonomy, compares agent architectures, and exposes practical gaps in tools, benchmarks, and reliability that AI coding agents now struggle with
2. Does AI-Assisted Coding Deliver? A Difference-in-Differences Study of Cursor's Impact on Software Projects (2511.04427)
This study from Carnegie Mellon University presents causal evidence that LLM agent assistants deliver short-term productivity gains but have lasting quality costs that can slow development over time
3. A Survey of Vibe Coding with Large Language Models (2510.12399)
Turns Vibe Coding from hype into a structured field, categorizing real development workflows. It shows which models, infrastructure, tool requirements, context, and collaboration setups affect real software development outcomes
4. From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence (2511.18538) (from Chinese institutes and companies like ByteDance and Alibaba)
Compares real code LLMs, shows how training and alignment choices affect code quality and security, and connects academic benchmarks to everyday software development
5. Build Your Own Coding Agent via a Step-by-Step Workshop ▶ https://github.com/ghuntley/how-to-build-a-coding-agent
A great guide covering the basics of building an AI-powered coding assistant, from a chatbot to a file reader/explorer/editor and code search (a minimal loop sketch follows this list)
6. State of AI Coding: Context, Trust, and Subagents ▶ https://www.turingpost.com/p/aisoftwarestack
Our in-depth analysis of where AI coding is heading and the new directions we see today, such as agent swarms and the growing importance of context management, offering an emerging playbook beyond the IDE
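To make item 5 concrete, here is a hypothetical skeleton of the coding-agent loop that workshop builds up to. The `ask_model` stub stands in for whatever chat-completion client you use, and the tool names and JSON convention are illustrative, not the workshop's exact code:

```
# Hypothetical agent loop: the model either answers in plain text or emits a
# JSON tool call; the host runs the tool and feeds the result back.
import json
import pathlib

def read_file(path: str) -> str:
    return pathlib.Path(path).read_text()

def list_files(path: str = ".") -> str:
    return "\n".join(str(p) for p in pathlib.Path(path).iterdir())

def edit_file(path: str, old: str, new: str) -> str:
    p = pathlib.Path(path)
    p.write_text(p.read_text().replace(old, new))
    return f"edited {path}"

TOOLS = {"read_file": read_file, "list_files": list_files, "edit_file": edit_file}

def ask_model(messages: list) -> str:
    # Hypothetical stub: swap in your chat-completion client of choice.
    raise NotImplementedError

def agent_loop(task: str, max_steps: int = 10) -> str:
    messages = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        reply = ask_model(messages)
        messages.append({"role": "assistant", "content": reply})
        try:
            call = json.loads(reply)  # convention: tool calls arrive as JSON
        except json.JSONDecodeError:
            return reply              # plain text means the agent is done
        if not isinstance(call, dict) or call.get("tool") not in TOOLS:
            return reply
        result = TOOLS[call["tool"]](**call.get("args", {}))
        messages.append({"role": "user", "content": result})
    return "stopped: max steps reached"
```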
If you like it, also subscribe to the Turing Post: https://www.turingpost.com/subscribe
thank you for this!
wow. that's so fast. what gpu are you using?
Hi, I was trying this in Google Colab and got a memory issue. How much VRAM does this need? Sorry, I'm just new to this
Is there an easy way to tell from the HF model card how much VRAM is required to train a model?
Thanks
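There's no single number on the card, but you can get a rough estimate from the parameter count it reports. A sketch, assuming the repo publishes safetensors metadata; the multipliers are common rules of thumb, and real usage also depends on batch size, sequence length, and activations:

```
# Rough VRAM rule of thumb from the parameter count in the model card's
# safetensors metadata. The multipliers are approximations, not exact.
from huggingface_hub import HfApi

def estimate_vram_gb(repo_id: str) -> dict:
    info = HfApi().model_info(repo_id)
    params = info.safetensors.total           # total parameter count
    gib = 1024 ** 3
    return {
        "inference_fp16": 2 * params / gib,       # 2 bytes per param
        "full_finetune_adam": 16 * params / gib,  # weights + grads + optimizer
    }

print(estimate_vram_gb("Qwen/Qwen3-0.6B"))
```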
```
# TRL quickstart: supervised fine-tuning of a small model on the Capybara dataset
from trl import SFTTrainer
from datasets import load_dataset

trainer = SFTTrainer(
    model="Qwen/Qwen3-0.6B",
    train_dataset=load_dataset("trl-lib/Capybara", split="train"),
)
trainer.train()
```
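If that snippet runs out of memory (as in the Colab question above), the usual first levers are batch size, gradient accumulation, and gradient checkpointing via SFTConfig. A sketch, with values that are illustrative rather than tuned:

```
# Memory-saving variant of the quickstart above; tune the values to your GPU.
from trl import SFTConfig, SFTTrainer
from datasets import load_dataset

trainer = SFTTrainer(
    model="Qwen/Qwen3-0.6B",
    train_dataset=load_dataset("trl-lib/Capybara", split="train"),
    args=SFTConfig(
        per_device_train_batch_size=1,   # smallest batch per step
        gradient_accumulation_steps=8,   # keep the effective batch size
        gradient_checkpointing=True,     # trade compute for activation memory
        bf16=True,                       # half-precision training if supported
    ),
)
trainer.train()
```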
yeah I tried llama.cpp. was curious how to run the model from transformers code. I also tried llama-cpp-python, which can run inference on the model from your own code
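For reference, running a local GGUF from your own code with llama-cpp-python looks roughly like this; the model path and parameters are illustrative:

```
# Minimal llama-cpp-python inference sketch against a local GGUF file.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/qwen3-0.6b-q4_k_m.gguf",  # any local GGUF file
    n_ctx=4096,        # context window
    n_gpu_layers=-1,   # offload all layers to GPU if built with CUDA
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
    max_tokens=64,
)
print(out["choices"][0]["message"]["content"])
```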
what is this flag for? --mmproj
thank you. i agree. since my gpu is on a windows machine, that took time to set up too. yeah, it's a small model that works locally. trying to do more tests
thank you. what's the best way to start fine-tuning?
I noticed that model cards usually have Transformers code as usage examples.
So I tried to figure out how to load a model using just the transformers library, without Ollama, LM Studio, or llama.cpp.
Learned how to install the dependencies required to make it work, like PyTorch and CUDA. I also used Conda for the Python environment dependencies.
Once I got the model loaded and sample inference working, I made an API to serve it.
I know it's very basic stuff for the machine learning experts here on HF, but I'm completely new to this, so I'm happy to get it working!
Model used: Qwen/Qwen3-VL-8B-Instruct
GPU: NVIDIA GeForce RTX 3090
Here's the result of my experimentation
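For anyone trying the same thing, here is a minimal sketch of the load-then-serve pattern described above, assuming a recent transformers release and FastAPI; the endpoint shape and generation settings are illustrative, not the author's exact code:

```
# Load Qwen3-VL with plain transformers and expose a tiny HTTP endpoint.
# Run with: uvicorn app:app --host 0.0.0.0 --port 8000
import torch
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import AutoModelForImageTextToText, AutoProcessor

MODEL_ID = "Qwen/Qwen3-VL-8B-Instruct"
processor = AutoProcessor.from_pretrained(MODEL_ID)
model = AutoModelForImageTextToText.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,  # ~16 GB of weights, fits a 24 GB RTX 3090
    device_map="auto",
)

app = FastAPI()

class Prompt(BaseModel):
    text: str

@app.post("/generate")
def generate(prompt: Prompt):
    messages = [{"role": "user", "content": [{"type": "text", "text": prompt.text}]}]
    inputs = processor.apply_chat_template(
        messages,
        add_generation_prompt=True,
        tokenize=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device)
    output = model.generate(**inputs, max_new_tokens=256)
    reply = processor.batch_decode(
        output[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )[0]
    return {"response": reply}
```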
thanks a lot. will check these out!