Request for Model Training Code to Try with Alternative Architectures

#1
by Bharat-Singla - opened

I’m interested in experimenting with different model architectures using your approach. Could you please share the model training code so I can try it out with other setups? Thanks in advance!

Teen-Different

Hey Bharat! Thanks for your interest. Here’s the training and eval code I used: https://github.com/REDDITARUN/TD-HallOumi-3b
Feel free to fork and adapt it for other architectures!

FYI, I also experimented with Llama 3.2-1B-Instruct, and it got: Macro F1: 0.6352 | Balanced Accuracy: 0.6231
Just letting you know in advance in case you're looking to compare.
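
For anyone comparing numbers across architectures, here is a minimal sketch of how these two metrics are commonly defined. It is not taken from the linked repo, whose eval script may compute them differently; the function names and the toy labels below are made up purely for illustration.

```python
# Illustrative only: standard definitions of macro F1 and balanced accuracy.
# The helper names and toy labels are assumptions, not code from the repo.

def per_class_counts(y_true, y_pred, label):
    """Return (true positives, false positives, false negatives) for one class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
    return tp, fp, fn

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp, fp, fn = per_class_counts(y_true, y_pred, label)
        denom = 2 * tp + fp + fn
        # A class with no true or predicted samples contributes 0.0 here.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of per-class recall over classes present in y_true."""
    labels = sorted(set(y_true))
    recalls = []
    for label in labels:
        tp, _, fn = per_class_counts(y_true, y_pred, label)
        recalls.append(tp / (tp + fn))
    return sum(recalls) / len(recalls)

# Toy, imbalanced binary labels (hypothetical data).
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 1, 0]

print(f"Macro F1: {macro_f1(y_true, y_pred):.4f}")
print(f"Balanced Accuracy: {balanced_accuracy(y_true, y_pred):.4f}")
```

Both metrics weight every class equally, so they are less flattering than plain accuracy when the classes are imbalanced. When comparing a new architecture against the numbers above, it is worth confirming that the same averaging is used on the same eval split.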

Happy experimenting, and let me know what you find!

Teen-Different changed discussion status to closed
