transformers-implementation

#1
by burtenshaw HF Staff - opened

This PR adds transformers library integration for this repo. With it, we can bring this model and its derivatives into the wider ecosystem, for example JavaScript and C++ inference.

Usage

To test out the model on this branch, follow these steps:

  1. Install transformers from the PR branch for nanochat integration:
pip install git+https://github.com/huggingface/transformers.git@nanochat-implementation
  2. You can then run this snippet by referencing the branch revision ("refs/pr/1"):
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer


model_id = "karpathy/nanochat-d32"
revision = "refs/pr/1"
max_new_tokens = 64
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=False, revision=revision)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=False, dtype=torch.bfloat16, revision=revision).to(device)
model.eval()

conversation = [
    {"role": "user", "content": "What is the capital of France?"},
]

inputs = tokenizer.apply_chat_template(
    conversation,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt"
).to(device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,  # unpack the tokenized chat template (input_ids, attention_mask)
        max_new_tokens=max_new_tokens,
    )

# Decode only the generated tokens (excluding the input prompt)
generated_tokens = outputs[0, inputs["input_ids"].shape[1]:]
print(tokenizer.decode(generated_tokens, skip_special_tokens=True))

Or in vLLM, like so:

vllm serve karpathy/nanochat-d32 --enforce-eager --revision refs/pr/1
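
Once the server is up, you can query it through vLLM's OpenAI-compatible API. Here is a minimal sketch using the openai Python client, assuming the default endpoint at http://localhost:8000/v1:

from openai import OpenAI

# vLLM exposes an OpenAI-compatible server; the api_key value is not checked locally
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="karpathy/nanochat-d32",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    max_tokens=64,
)
print(response.choices[0].message.content)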

Next Steps

  • Update the repo README with snippets for nanochat and transformers.
  • Add transformers.js integration @Xenova
burtenshaw changed pull request status to open

I was also able to deploy a ZeroGPU demo Space based on these weights and the transformers implementation: https://huggingface.co/spaces/nanochat-students/chat-d32-demo

Updated the snippet to match the change in the transformers branch.

Amazing, thanks so much @burtenshaw!

One question: how did you convert the tiktoken tokenizer to a tokenizer.json / HF fast tokenizer? I have already trained a German nanochat base model and would really like to try it out with the Transformers PR :)

Hi @stefan-it 👋 I did the tokenizer conversion, and while the script is in the HF PR, you can do it separately with something like:

# Clone nanochat and install it so the original tokenizer can be loaded
!git clone https://github.com/karpathy/nanochat.git
%cd nanochat
!pip install -e .

# Download the trained tokenizer pickle (swap in your own model's tokenizer.pkl)
!wget https://huggingface.co/burtenshaw/nanochat-d20/resolve/main/tokenizer.pkl

from nanochat.tokenizer import RustBPETokenizer

# Load the nanochat tokenizer from the current directory (expects tokenizer.pkl)
tok = RustBPETokenizer.from_directory(".")

from transformers.integrations.tiktoken import convert_tiktoken_to_fast
from pathlib import Path

# Convert the underlying tiktoken encoding to a HF fast tokenizer (tokenizer.json)
output_dir = Path("hf-tokenizer")
output_dir.mkdir(exist_ok=True)
convert_tiktoken_to_fast(tok.enc, output_dir)
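
As a quick sanity check (a sketch, assuming convert_tiktoken_to_fast wrote a tokenizer.json into hf-tokenizer/), you can load the converted tokenizer and compare its ids against the original tiktoken encoding:

from transformers import PreTrainedTokenizerFast

# Load the converted fast tokenizer from the output directory
hf_tok = PreTrainedTokenizerFast(tokenizer_file=str(output_dir / "tokenizer.json"))

text = "Hello nanochat!"
# If the conversion preserved the vocab and merges, the ids should match
assert tok.enc.encode(text) == hf_tok.encode(text, add_special_tokens=False)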

Many thanks @Xenova! It's working perfectly with my own tokenizer; now I can fully test the HF PR ❤️

