Request for Model Training Code to Try with Alternative Architectures

by Bharat-Singla - opened Apr 5

Apr 5

I’m interested in experimenting with different model architectures using your approach. Could you please share the model training code so I can try it out with other setups? Thanks in advance!

Teen-Different

Teen Different org Apr 5

Hey Bharat! Thanks for your interest. Here's the training and eval code I used: https://github.com/REDDITARUN/TD-HallOumi-3b
Feel free to fork and adapt it for other architectures!

FYI, I also experimented with LLaMA 3.2-1B-Instruct, and it scored Macro F1: 0.6352 | Balanced Accuracy: 0.6231.
Just letting you know in advance in case you're looking to compare.
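
For anyone comparing against these numbers, here is a minimal sketch of how the two metrics are usually defined, in plain Python. The function names and the toy labels are hypothetical and only for illustration; they are not taken from the repo, and the repo's eval script may compute the metrics differently (for example with a library). Macro F1 is taken here as the unweighted mean of per-class F1, and balanced accuracy as the unweighted mean of per-class recall.

```python
from collections import Counter


def per_class_counts(y_true, y_pred):
    """Count true positives, false positives and false negatives per class."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed
    return tp, fp, fn


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every class seen in either list."""
    tp, fp, fn = per_class_counts(y_true, y_pred)
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in labels:
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of per-class recall over classes present in y_true."""
    tp, _, fn = per_class_counts(y_true, y_pred)
    labels = sorted(set(y_true))
    recalls = [tp[c] / (tp[c] + fn[c]) for c in labels]
    return sum(recalls) / len(recalls)


# Toy, made-up labels (1 = positive class, 0 = negative class); not model output.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 1, 0]

print(f"Macro F1: {macro_f1(y_true, y_pred):.4f}")
print(f"Balanced Accuracy: {balanced_accuracy(y_true, y_pred):.4f}")
```

Both metrics weight every class equally, so they are a fairer basis for comparison than plain accuracy when the classes are imbalanced.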

Happy experimenting, and let me know what you find!

Teen-Different changed discussion status to closed Apr 5
