Request for Model Training Code to Try with Alternative Architectures
#1, opened by Bharat-Singla
I’m interested in experimenting with different model architectures using your approach. Could you please share the model training code so I can try it out with other setups? Thanks in advance!
Hey Bharat! Thanks for your interest. Here's the training and eval code I used: https://github.com/REDDITARUN/TD-HallOumi-3b
Feel free to fork and adapt it for other architectures!
FYI, I also experimented with LLaMA 3.2-1B-Instruct, and it scored Macro F1: 0.6352 | Balanced Accuracy: 0.6231.
Just letting you know in advance in case you're looking to compare.
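If you want to compare on the same two metrics, here is a minimal sketch of how macro F1 and balanced accuracy are commonly defined. It is not taken from the repo's eval script, and the function name and toy labels are hypothetical, so treat it as a sanity check rather than the exact evaluation I ran.

```python
from collections import Counter


def macro_f1_and_balanced_accuracy(y_true, y_pred):
    """Return (macro F1, balanced accuracy) for two equal-length label lists."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed

    f1_scores, recalls = [], []
    for label in sorted(set(y_true) | set(y_pred)):
        # Per-class F1 = 2*TP / (2*TP + FP + FN); defined as 0 when empty.
        denom = 2 * tp[label] + fp[label] + fn[label]
        f1_scores.append(2 * tp[label] / denom if denom else 0.0)
        # Recall is only defined for classes that occur in y_true.
        support = tp[label] + fn[label]
        if support:
            recalls.append(tp[label] / support)

    macro_f1 = sum(f1_scores) / len(f1_scores)      # unweighted mean of per-class F1
    balanced_acc = sum(recalls) / len(recalls)      # unweighted mean of per-class recall
    return macro_f1, balanced_acc


# Hypothetical toy labels (1 = hallucinated, 0 = supported), for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 1]
macro_f1, balanced_acc = macro_f1_and_balanced_accuracy(y_true, y_pred)
print(f"Macro F1: {macro_f1:.4f} | Balanced Accuracy: {balanced_acc:.4f}")
```

Both metrics weight every class equally, which is why they are useful when one class is much rarer than the other.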
Happy experimenting, and let me know what you find!
Teen-Different changed discussion status to closed