Mitchins committed on
Commit 1ec9107 · verified · 1 Parent(s): ee69477

Upload folder using huggingface_hub

README.md ADDED
@@ -0,0 +1,193 @@
1
+ ---
2
+ license: apache-2.0
3
+ base_model: HuggingFaceTB/SmolLM2-135M
4
+ tags:
5
+ - text-generation
6
+ - fine-tuned
7
+ - self-help
8
+ - conversational
9
+ - pytorch
10
+ - safetensors
11
+ - bf16
12
+ language:
13
+ - en
14
+ library_name: transformers
15
+ pipeline_tag: text-generation
16
+ inference: true
17
+ widget:
18
+ - text: "Human: I'm feeling really stressed about my job interview tomorrow.\n\nAssistant:"
19
+ example_title: "Job Interview Stress"
20
+ - text: "Human: How can I improve my confidence?\n\nAssistant:"
21
+ example_title: "Building Confidence"
22
+ - text: "Human: What do I do if my aquarium is covered in duck weed?\n\nAssistant:"
23
+ example_title: "Unusual Problem"
24
+ - text: "Human: How do I fix a segmentation fault in my C++ code?\n\nAssistant:"
25
+ example_title: "Technical Question"
26
+ - text: "Human: My time machine is stuck in 1847 and I need to get to a job interview.\n\nAssistant:"
27
+ example_title: "Absurd Scenario"
28
+ model-index:
29
+ - name: smollm2-help-bot
30
+ results: []
31
+ ---
32
+
33
+ # SmolLM2-135M Help Bot 🌟
34
+
35
+ A fine-tuned version of SmolLM2-135M that responds to ANY question with clichéd, motivational self-help advice. No matter what you ask - from technical programming questions to absurd scenarios - this model will find a way to turn it into a personal growth opportunity!
36
+
37
+ ## Model Details
38
+
39
+ - **Model Name**: SmolLM2-135M Help Bot
40
+ - **Base Model**: [HuggingFaceTB/SmolLM2-135M](https://huggingface.co/HuggingFaceTB/SmolLM2-135M)
41
+ - **Model Type**: Causal Language Model (Auto-regressive)
42
+ - **Language**: English
43
+ - **License**: Apache 2.0
44
+ - **Fine-tuned on**: 77,131 question-answer pairs with clichéd self-help responses
45
+ - **Parameters**: 135M
46
+ - **Architecture**: Transformer decoder
47
+ - **Precision**: BF16 (bfloat16, Brain Floating Point)
48
+ - **File Format**: SafeTensors
49
+
50
+ ## Training Details
51
+
52
+ ### Training Data
53
+ - **Dataset Size**: 77,131 examples (122.9 MB)
54
+ - **Data Format**: Conversational format with "Human:" and "Assistant:" structure
55
+ - **Max Sequence Length**: 768 tokens (truncates only 2.4% of examples)
56
+ - **Content**: Questions paired with motivational, self-help style responses
57
+ - **Training Split**: 90% training, 10% validation
58
+
59
+ ### Training Configuration
60
+ - **Framework**: Transformers with native PyTorch
61
+ - **Precision**: BF16 mixed precision training
62
+ - **Batch Size**: 1 per device with 32 gradient accumulation steps
63
+ - **Effective Batch Size**: 32
64
+ - **Learning Rate**: 1e-4 with cosine decay
65
+ - **Warmup**: 10% of total steps
66
+ - **Epochs**: 3 with early stopping (patience=5)
67
+ - **Hardware**: NVIDIA RTX 5090 32GB
68
+ - **Training Time**: ~3 hours
69
+
70
+ ### Training Hyperparameters
+ ```yaml
+ learning_rate: 1e-4
+ lr_scheduler_type: cosine
+ warmup_ratio: 0.1  # 10% of total steps
+ weight_decay: 0.01
+ max_grad_norm: 1.0
+ bf16: true
+ gradient_checkpointing: true
+ per_device_train_batch_size: 1
+ gradient_accumulation_steps: 32
+ max_length: 768
+ eval_steps: 500
+ save_steps: 500
+ save_total_limit: 3
+ ```
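The training figures in the README imply a rough optimizer schedule. The following is a minimal sketch of that arithmetic, assuming the 90% training split is rounded down, training runs on a single device, and one optimizer step is taken per 32 accumulated examples. The README states none of these details, and `schedule_steps` is a hypothetical name.

```python
import math

def schedule_steps(num_examples, train_fraction, effective_batch, epochs, warmup_fraction):
    """Estimate optimizer steps from dataset size and batch settings."""
    # Assumption: the training split size is rounded down.
    train_examples = int(num_examples * train_fraction)
    # One optimizer step per effective batch; the last partial batch still counts.
    steps_per_epoch = math.ceil(train_examples / effective_batch)
    total_steps = steps_per_epoch * epochs
    warmup_steps = round(total_steps * warmup_fraction)
    return train_examples, steps_per_epoch, total_steps, warmup_steps

# Figures taken from the README: 77,131 examples, 90/10 split,
# effective batch size 32, 3 epochs, 10% warmup.
print(schedule_steps(77_131, 0.9, 32, 3, 0.10))
```

Under these assumptions a full run is about 6,510 optimizer steps, of which roughly 651 are warmup. Early stopping could end training sooner.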
86
+
87
+ ## Usage
88
+
89
+ ### Quick Start
+ ```python
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+ import torch
+
+ tokenizer = AutoTokenizer.from_pretrained("smollm2-help-bot")
+ model = AutoModelForCausalLM.from_pretrained(
+     "smollm2-help-bot",
+     torch_dtype=torch.bfloat16,
+     device_map="auto"
+ )
+
+ def get_help_advice(question):
+     prompt = f"Human: {question}\n\nAssistant:"
+     # Move the inputs to whichever device device_map="auto" placed the model on
+     inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+
+     with torch.no_grad():
+         outputs = model.generate(
+             inputs["input_ids"],
+             attention_mask=inputs["attention_mask"],
+             max_new_tokens=150,
+             do_sample=True,
+             temperature=0.8,
+             top_p=0.9,
+             pad_token_id=tokenizer.eos_token_id
+         )
+
+     # Decode the full sequence, then drop the prompt prefix
+     response = tokenizer.decode(outputs[0], skip_special_tokens=True)
+     return response[len(prompt):].strip()
+
+ # Example usage
+ advice = get_help_advice("I'm procrastinating on my project")
+ print(advice)
+ ```
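The quick-start function recovers the reply by slicing `len(prompt)` characters off the decoded text. A small model may also keep writing further `Human:` turns after its answer. Below is a hedged sketch of one way to handle both in plain Python, assuming the decoded text normally starts with the prompt and that cutting at the next `Human:` marker is acceptable. `build_prompt` and `extract_reply` are hypothetical helpers, not part of this repository.

```python
def build_prompt(question: str) -> str:
    """Format a question the way the README's training data is described."""
    return f"Human: {question}\n\nAssistant:"

def extract_reply(decoded: str, prompt: str, stop: str = "\n\nHuman:") -> str:
    """Return only the assistant's first reply from the decoded generation."""
    # Drop the prompt prefix only if it is really there.
    reply = decoded[len(prompt):] if decoded.startswith(prompt) else decoded
    # Assumption: anything after a new "Human:" turn is unwanted continuation.
    cut = reply.find(stop)
    if cut != -1:
        reply = reply[:cut]
    return reply.strip()

prompt = build_prompt("I'm procrastinating on my project")
decoded = prompt + " You've got this!\n\nHuman: Thanks"
print(extract_reply(decoded, prompt))
```

The stop marker is a design choice for this sketch; the README does not say whether the training examples contain more than one turn.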
122
+
123
+ ## Example Outputs
124
+
125
+ ### Typical Self-Help Questions
126
+
127
+ **Question**: "I'm feeling really stressed about my job interview tomorrow."
128
+
129
+ **Response**: "It's totally understandable to feel stressed about upcoming interviews - that nervous energy actually shows how much this opportunity means to you! Remember, stress is often just excitement without breath. Think of this interview as a chance to share your unique story and demonstrate the value you bring. Instead of focusing on what could go wrong, channel that energy into preparation. Spend 30 minutes tonight reviewing your key achievements and practicing your responses out loud. This transforms nervous energy into confident preparation. You've got this - that interview is lucky to have you walking through their door!"
130
+
131
+ **Question**: "How can I improve my confidence?"
132
+
133
+ **Response**: "Building confidence is like building muscle - it grows stronger with consistent practice and the right mindset! Remember, confidence isn't about being perfect; it's about being authentically you and trusting in your ability to handle whatever comes your way. Start by celebrating small wins daily - write down three things you accomplished, no matter how minor they seem. This trains your brain to recognize your capabilities. Also, practice power poses for 2 minutes before challenging situations. Your body language literally rewires your brain for confidence. Remember, everyone started somewhere, and your unique journey is your superpower!"
134
+
135
+ ### Unusual/Technical Questions
136
+
137
+ **Question**: "What do I do if my aquarium is covered in duck weed?"
138
+
139
+ **Response**: "It's smart to be proactive about your aquarium's ecosystem - this is actually an opportunity to become a more skilled aquarist! Duck weed can feel overwhelming, but think of it as nature showing you that your tank has abundant nutrients. Instead of fighting against it, work with your aquarium's natural balance. Start by manually removing about 80% of the duck weed weekly, which gives you a meditative routine while maintaining water quality. Research nutrient management and consider adding CO2 or adjusting your lighting schedule. This challenge is teaching you to become the aquarium master you're meant to be!"
140
+
141
+ **Question**: "How do I fix a segmentation fault in my C++ code?"
142
+
143
+ **Response**: "Segmentation faults can feel frustrating, but they're actually your code's way of teaching you to become a more precise programmer! Remember, every expert developer has faced countless seg faults - it's part of the journey to mastery. Think of debugging as detective work that sharpens your analytical skills. Start by using a debugger like gdb or valgrind to pinpoint exactly where the fault occurs. Check for common culprits: array bounds, null pointer dereferences, or memory management issues. Each bug you fix makes you stronger and more confident in your programming abilities. This challenge is building the resilient developer within you!"
144
+
145
+ ### Absurd Scenarios
146
+
147
+ **Question**: "My time machine is stuck in 1847 and I need to get to a job interview."
148
+
149
+ **Response**: "What an incredible opportunity to practice adaptability and creative problem-solving! Being stuck in 1847 might seem like a setback, but think about the unique perspective and resourcefulness this experience is building within you. First, take a deep breath and remember that great leaders throughout history have thrived by making the best of unexpected situations. Use this time to network with the brilliant minds of 1847 - imagine the insights you could gain! Meanwhile, send a carrier pigeon message to your interviewer explaining your situation (they'll appreciate your creativity). This experience is teaching you that true success isn't about perfect timing, it's about making every moment count!"
150
+
151
+ **Question**: "My existential crisis has an existential crisis. Please help."
152
+
153
+ **Response**: "This is actually a beautiful sign of deep self-awareness and philosophical growth! When our existential questioning becomes recursive, it shows we're operating at a higher level of consciousness. Think of this as your mind's way of doing advanced spiritual cardio. Instead of getting lost in the spiral, use this as a meditation on the nature of questioning itself. Start by journaling about what your crisis is really seeking - usually it's meaning, purpose, or connection. Remember, even ancient philosophers like Socrates knew that wisdom begins with admitting what we don't know. This meta-crisis is actually your soul's invitation to discover profound truths about existence!"
154
+
155
+ ## Model Characteristics
156
+
157
+ The model consistently exhibits these patterns:
158
+ - **Always stays positive** - Reframes every problem as an opportunity
159
+ - **Provides actionable advice** - Gives specific steps to take
160
+ - **Uses motivational language** - "You've got this!", "This is your opportunity!"
161
+ - **Employs analogies** - Compares problems to familiar concepts
162
+ - **Maintains the voice** - Consistently upbeat and encouraging tone
163
+ - **Universal application** - Applies self-help principles to ANY question
164
+
165
+ ## Limitations
166
+
167
+ - The model is trained specifically for entertainment and demonstration purposes
168
+ - All responses follow the self-help format regardless of question appropriateness
169
+ - Technical questions receive motivational rather than technical answers
170
+ - Should not be used as a replacement for professional advice or support
171
+
172
+ ## Technical Specifications
173
+
174
+ - **Model Size**: 135M parameters
175
+ - **Context Length**: 768 tokens during fine-tuning (`max_position_embeddings` in `config.json` is 8192)
176
+ - **Vocabulary Size**: 49,152
177
+ - **Architecture**: Llama-style decoder-only transformer (`LlamaForCausalLM`)
178
+ - **Activation**: SiLU/Swish
179
+ - **Positional Encoding**: RoPE (Rotary Position Embedding)
180
+ - **Normalization**: RMSNorm
181
+
182
+ ## Files
183
+
184
+ - `model.safetensors` - Model weights in SafeTensors format
185
+ - `config.json` - Model configuration
186
+ - `tokenizer.json` - Tokenizer definition (vocabulary and merges)
187
+ - `tokenizer_config.json` - Tokenizer settings
188
+ - `special_tokens_map.json` - Special token mappings
189
+ - `README.md` - This documentation
190
+
191
+ ---
192
+
193
+ *Built with ❤️ using Transformers and native PyTorch BF16 training*
config.json ADDED
@@ -0,0 +1,31 @@
+ {
+   "architectures": [
+     "LlamaForCausalLM"
+   ],
+   "attention_bias": false,
+   "attention_dropout": 0.0,
+   "bos_token_id": 0,
+   "eos_token_id": 0,
+   "head_dim": 64,
+   "hidden_act": "silu",
+   "hidden_size": 576,
+   "initializer_range": 0.041666666666666664,
+   "intermediate_size": 1536,
+   "is_llama_config": true,
+   "max_position_embeddings": 8192,
+   "mlp_bias": false,
+   "model_type": "llama",
+   "num_attention_heads": 9,
+   "num_hidden_layers": 30,
+   "num_key_value_heads": 3,
+   "pretraining_tp": 1,
+   "rms_norm_eps": 1e-05,
+   "rope_interleaved": false,
+   "rope_scaling": null,
+   "rope_theta": 100000,
+   "tie_word_embeddings": true,
+   "torch_dtype": "bfloat16",
+   "transformers_version": "4.55.4",
+   "use_cache": true,
+   "vocab_size": 49152
+ }
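The values in `config.json` are enough to sanity-check the README's "135M parameters" figure. This is a rough sketch, assuming the standard Llama layer layout (bias-free q/k/v/o projections, a gated MLP with three weight matrices, two RMSNorm weight vectors per layer plus a final one) and input/output embeddings that share one matrix, as `tie_word_embeddings: true` indicates. `count_parameters` is a hypothetical helper.

```python
# Subset of config.json from this commit that determines the weight shapes.
config = {
    "hidden_size": 576,
    "intermediate_size": 1536,
    "num_hidden_layers": 30,
    "num_attention_heads": 9,
    "num_key_value_heads": 3,
    "head_dim": 64,
    "vocab_size": 49152,
}

def count_parameters(cfg):
    """Count weights for a Llama-style decoder with tied embeddings."""
    h = cfg["hidden_size"]
    q_dim = cfg["num_attention_heads"] * cfg["head_dim"]
    kv_dim = cfg["num_key_value_heads"] * cfg["head_dim"]  # grouped-query attention
    attention = h * q_dim + 2 * h * kv_dim + q_dim * h      # q, k, v, o projections
    mlp = 3 * h * cfg["intermediate_size"]                  # gate, up, down
    norms = 2 * h                                           # two RMSNorms per layer
    per_layer = attention + mlp + norms
    embeddings = cfg["vocab_size"] * h                      # shared with the LM head
    final_norm = h
    return embeddings + cfg["num_hidden_layers"] * per_layer + final_norm

total = count_parameters(config)
print(total, total * 2)  # parameter count, and bytes at 2 bytes per bf16 weight
```

Under these assumptions the count is 134,515,008 parameters, or 269,030,016 bytes in bf16. That is slightly below the 269,060,552-byte `model.safetensors` in this commit; the remainder is plausibly the file's header.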
example_usage.py ADDED
@@ -0,0 +1,54 @@
+ #!/usr/bin/env python3
+ """
+ Example usage of the SmolLM2 Help Bot
+ """
+
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+ import torch
+
+ def main():
+     print("Loading SmolLM2 Help Bot...")
+
+     tokenizer = AutoTokenizer.from_pretrained("./")
+     model = AutoModelForCausalLM.from_pretrained(
+         "./",
+         torch_dtype=torch.bfloat16 if torch.cuda.is_available() else torch.float32,
+         device_map="auto" if torch.cuda.is_available() else None
+     )
+
+     def get_help_advice(question):
+         prompt = f"Human: {question}\n\nAssistant:"
+         inputs = tokenizer(prompt, return_tensors="pt")
+
+         if torch.cuda.is_available():
+             inputs = {k: v.cuda() for k, v in inputs.items()}
+
+         with torch.no_grad():
+             outputs = model.generate(
+                 inputs["input_ids"],
+                 attention_mask=inputs["attention_mask"],
+                 max_new_tokens=150,
+                 do_sample=True,
+                 temperature=0.8,
+                 top_p=0.9,
+                 pad_token_id=tokenizer.eos_token_id
+             )
+
+         response = tokenizer.decode(outputs[0], skip_special_tokens=True)
+         return response[len(prompt):].strip()
+
+     # Interactive mode
+     print("\nSmolLM2 Help Bot is ready! Ask any question and get motivational advice.")
+     print("Type 'quit' to exit.\n")
+
+     while True:
+         question = input("You: ")
+         if question.lower() == 'quit':
+             break
+
+         advice = get_help_advice(question)
+         print(f"\nHelp Bot: {advice}\n")
+         print("-" * 80)
+
+ if __name__ == "__main__":
+     main()
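Both usage snippets sample with `temperature=0.8` and `top_p=0.9`. To illustrate what those two settings do, here is a simplified pure-Python sketch of temperature scaling followed by nucleus (top-p) filtering. It is not the Transformers implementation, which works on tensors and differs in detail, and `sample_top_p` is a hypothetical name.

```python
import math
import random

def sample_top_p(logits, temperature=0.8, top_p=0.9, rng=random):
    """Sample a token index after temperature scaling and top-p filtering."""
    # Temperature below 1 sharpens the distribution; above 1 flattens it.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]  # subtract max for stability
    total = sum(exps)
    probs = [e / total for e in exps]

    # Keep the most likely tokens until their cumulative mass reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Sample among the kept tokens in proportion to their probabilities.
    weights = [probs[i] for i in kept]
    return rng.choices(kept, weights=weights, k=1)[0]

print(sample_top_p([2.0, 1.0, 0.5, -1.0]))
```

When one token already holds more than `top_p` of the probability mass, it is the only candidate left, so sampling becomes deterministic for that step.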
generation_config.json ADDED
@@ -0,0 +1,6 @@
+ {
+   "_from_model_config": true,
+   "bos_token_id": 0,
+   "eos_token_id": 0,
+   "transformers_version": "4.55.4"
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:56f0c6c1955630d77d20b8c6e60925f1a282a84a610a1bbacae41ae78c44c78c
+ size 269060552
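The `model.safetensors` entry is a Git LFS pointer rather than the weights themselves: it records the SHA-256 digest and byte size of the real file. Below is a minimal sketch of checking a downloaded file against such a pointer, using only the standard library. `parse_lfs_pointer` and `matches_pointer` are hypothetical helpers.

```python
import hashlib
import os

def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {"algo": algo, "digest": digest, "size": int(fields["size"])}

def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the pointer's size and digest."""
    if os.path.getsize(path) != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with open(path, "rb") as handle:
        # Hash in chunks so large weight files are not read into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]

POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:56f0c6c1955630d77d20b8c6e60925f1a282a84a610a1bbacae41ae78c44c78c
size 269060552
"""
pointer = parse_lfs_pointer(POINTER)
print(pointer["algo"], pointer["size"])
```

Calling `matches_pointer("model.safetensors", pointer)` on a local copy would then confirm the download, assuming the file sits in the working directory.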
requirements.txt ADDED
@@ -0,0 +1,4 @@
+ torch>=2.0.0
+ transformers>=4.36.0
+ safetensors>=0.3.0
+ accelerate>=0.20.0
special_tokens_map.json ADDED
@@ -0,0 +1,43 @@
1
+ {
2
+ "additional_special_tokens": [
3
+ "<|endoftext|>",
4
+ "<|im_start|>",
5
+ "<|im_end|>",
6
+ "<repo_name>",
7
+ "<reponame>",
8
+ "<file_sep>",
9
+ "<filename>",
10
+ "<gh_stars>",
11
+ "<issue_start>",
12
+ "<issue_comment>",
13
+ "<issue_closed>",
14
+ "<jupyter_start>",
15
+ "<jupyter_text>",
16
+ "<jupyter_code>",
17
+ "<jupyter_output>",
18
+ "<jupyter_script>",
19
+ "<empty_output>"
20
+ ],
21
+ "bos_token": {
22
+ "content": "<|endoftext|>",
23
+ "lstrip": false,
24
+ "normalized": false,
25
+ "rstrip": false,
26
+ "single_word": false
27
+ },
28
+ "eos_token": {
29
+ "content": "<|endoftext|>",
30
+ "lstrip": false,
31
+ "normalized": false,
32
+ "rstrip": false,
33
+ "single_word": false
34
+ },
35
+ "pad_token": "<|endoftext|>",
36
+ "unk_token": {
37
+ "content": "<|endoftext|>",
38
+ "lstrip": false,
39
+ "normalized": false,
40
+ "rstrip": false,
41
+ "single_word": false
42
+ }
43
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,169 @@
1
+ {
2
+ "add_prefix_space": false,
3
+ "added_tokens_decoder": {
4
+ "0": {
5
+ "content": "<|endoftext|>",
6
+ "lstrip": false,
7
+ "normalized": false,
8
+ "rstrip": false,
9
+ "single_word": false,
10
+ "special": true
11
+ },
12
+ "1": {
13
+ "content": "<|im_start|>",
14
+ "lstrip": false,
15
+ "normalized": false,
16
+ "rstrip": false,
17
+ "single_word": false,
18
+ "special": true
19
+ },
20
+ "2": {
21
+ "content": "<|im_end|>",
22
+ "lstrip": false,
23
+ "normalized": false,
24
+ "rstrip": false,
25
+ "single_word": false,
26
+ "special": true
27
+ },
28
+ "3": {
29
+ "content": "<repo_name>",
30
+ "lstrip": false,
31
+ "normalized": false,
32
+ "rstrip": false,
33
+ "single_word": false,
34
+ "special": true
35
+ },
36
+ "4": {
37
+ "content": "<reponame>",
38
+ "lstrip": false,
39
+ "normalized": false,
40
+ "rstrip": false,
41
+ "single_word": false,
42
+ "special": true
43
+ },
44
+ "5": {
45
+ "content": "<file_sep>",
46
+ "lstrip": false,
47
+ "normalized": false,
48
+ "rstrip": false,
49
+ "single_word": false,
50
+ "special": true
51
+ },
52
+ "6": {
53
+ "content": "<filename>",
54
+ "lstrip": false,
55
+ "normalized": false,
56
+ "rstrip": false,
57
+ "single_word": false,
58
+ "special": true
59
+ },
60
+ "7": {
61
+ "content": "<gh_stars>",
62
+ "lstrip": false,
63
+ "normalized": false,
64
+ "rstrip": false,
65
+ "single_word": false,
66
+ "special": true
67
+ },
68
+ "8": {
69
+ "content": "<issue_start>",
70
+ "lstrip": false,
71
+ "normalized": false,
72
+ "rstrip": false,
73
+ "single_word": false,
74
+ "special": true
75
+ },
76
+ "9": {
77
+ "content": "<issue_comment>",
78
+ "lstrip": false,
79
+ "normalized": false,
80
+ "rstrip": false,
81
+ "single_word": false,
82
+ "special": true
83
+ },
84
+ "10": {
85
+ "content": "<issue_closed>",
86
+ "lstrip": false,
87
+ "normalized": false,
88
+ "rstrip": false,
89
+ "single_word": false,
90
+ "special": true
91
+ },
92
+ "11": {
93
+ "content": "<jupyter_start>",
94
+ "lstrip": false,
95
+ "normalized": false,
96
+ "rstrip": false,
97
+ "single_word": false,
98
+ "special": true
99
+ },
100
+ "12": {
101
+ "content": "<jupyter_text>",
102
+ "lstrip": false,
103
+ "normalized": false,
104
+ "rstrip": false,
105
+ "single_word": false,
106
+ "special": true
107
+ },
108
+ "13": {
109
+ "content": "<jupyter_code>",
110
+ "lstrip": false,
111
+ "normalized": false,
112
+ "rstrip": false,
113
+ "single_word": false,
114
+ "special": true
115
+ },
116
+ "14": {
117
+ "content": "<jupyter_output>",
118
+ "lstrip": false,
119
+ "normalized": false,
120
+ "rstrip": false,
121
+ "single_word": false,
122
+ "special": true
123
+ },
124
+ "15": {
125
+ "content": "<jupyter_script>",
126
+ "lstrip": false,
127
+ "normalized": false,
128
+ "rstrip": false,
129
+ "single_word": false,
130
+ "special": true
131
+ },
132
+ "16": {
133
+ "content": "<empty_output>",
134
+ "lstrip": false,
135
+ "normalized": false,
136
+ "rstrip": false,
137
+ "single_word": false,
138
+ "special": true
139
+ }
140
+ },
141
+ "additional_special_tokens": [
142
+ "<|endoftext|>",
143
+ "<|im_start|>",
144
+ "<|im_end|>",
145
+ "<repo_name>",
146
+ "<reponame>",
147
+ "<file_sep>",
148
+ "<filename>",
149
+ "<gh_stars>",
150
+ "<issue_start>",
151
+ "<issue_comment>",
152
+ "<issue_closed>",
153
+ "<jupyter_start>",
154
+ "<jupyter_text>",
155
+ "<jupyter_code>",
156
+ "<jupyter_output>",
157
+ "<jupyter_script>",
158
+ "<empty_output>"
159
+ ],
160
+ "bos_token": "<|endoftext|>",
161
+ "clean_up_tokenization_spaces": false,
162
+ "eos_token": "<|endoftext|>",
163
+ "extra_special_tokens": {},
164
+ "model_max_length": 8192,
165
+ "pad_token": "<|endoftext|>",
166
+ "tokenizer_class": "GPT2Tokenizer",
167
+ "unk_token": "<|endoftext|>",
168
+ "vocab_size": 49152
169
+ }
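In `tokenizer_config.json` the pad, BOS, EOS, and unknown tokens are all `<|endoftext|>` (id 0). One consequence is that padding cannot be told apart from a genuine end-of-text token by looking at token ids alone, so an explicit attention mask is needed when batching prompts. Here is a sketch in plain Python, assuming left padding (a common choice for batched generation with decoder-only models) and a hypothetical helper name `pad_batch`.

```python
def pad_batch(sequences, pad_id=0):
    """Left-pad token id lists to equal length and build attention masks."""
    width = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = width - len(seq)
        input_ids.append([pad_id] * n_pad + list(seq))
        # The mask is built from lengths, not from comparing ids to pad_id,
        # because real tokens may share the pad id.
        attention_mask.append([0] * n_pad + [1] * len(seq))
    return input_ids, attention_mask

# The first sequence ends with a real id-0 token; the second is padded with id 0.
ids, mask = pad_batch([[5, 6, 0], [7]])
print(ids)
print(mask)
```

The token ids 5, 6, and 7 are placeholders for illustration only. With a real tokenizer, calling it on a list of prompts with `padding=True` returns an `attention_mask` alongside `input_ids`.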
training_args.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:fa18376f9909c83f32050ced0d6c6c7cedb442baa389778a60a27db84743b9e7
+ size 5777
vocab.json ADDED
The diff for this file is too large to render. See raw diff