Caution: Extremely Excitable.
Training: rank 64, learning rate 1e-6 with cosine decay, 8k context length; estimated ~4-5 hours on an A100 and maybe ~1.5M tokens.
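For concreteness, a minimal sketch of these hyperparameters in peft/transformers terms. Only the rank, learning rate, scheduler, and context length come from the notes above; everything else (alpha, target modules, batch sizes, epochs, precision) is an assumption, not the actual recipe:

```python
# Hypothetical reconstruction of the run config; values not listed in the card are guesses.
from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=64,                         # rank 64, per the notes above
    lora_alpha=64,                # assumption: alpha not stated
    target_modules="all-linear",  # assumption: target modules not stated
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="mira-v1.12.1-27b-adapter",
    learning_rate=1e-6,              # per the notes above
    lr_scheduler_type="cosine",      # cosine decay
    num_train_epochs=1,              # assumption: epochs not stated
    per_device_train_batch_size=1,   # assumption
    gradient_accumulation_steps=8,   # assumption
    bf16=True,                       # assumption: precision not stated
)

# Sequences would be packed/truncated to the 8k context length (e.g. a max sequence
# length of 8192 in the trainer or tokenizer setup); that detail lives outside this sketch.
```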
Data included:
- personal data
- synthetic roleplaying data that Mira helped synthesize at the lower-level tasks (plus another pass through earlier RP data)
- a dash of IFEval-like data to practice precise instruction following
- both mentoring and play sessions with larger AI mentors
- experimentally, training data awareness prompts
- balancing samples of pretraining data from curated web datasets (rough assembly sketch below)
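A minimal sketch of assembling a mix like this with the datasets library, assuming simple JSONL sources; the file names and proportions here are placeholders, not the actual recipe:

```python
# Hypothetical assembly of the data mix; file names are illustrative only.
from datasets import load_dataset, concatenate_datasets

sources = [
    "personal_data.jsonl",              # personal data
    "synthetic_roleplay.jsonl",         # synthetic RP + earlier RP pass
    "ifeval_style_instructions.jsonl",  # precise instruction following
    "mentor_sessions.jsonl",            # mentoring and play sessions
    "training_awareness_prompts.jsonl", # experimental awareness prompts
    "pretraining_web_sample.jsonl",     # curated web data for balance
]

parts = [load_dataset("json", data_files=path, split="train") for path in sources]
mixed = concatenate_datasets(parts).shuffle(seed=42)
mixed.to_json("mira_v1.12.1_train_mix.jsonl")
```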
Noticed no obvious decrease in loss, despite some of the data having another copy or two of hydration this time, but I'm sure she changed somehow anyway.
Model tree for Lambent/Mira-v1.12.1-27B
Base model: Lambent/Mira-v1.11-Ties-27B