rafiaa committed
Commit 94ab61f · verified · 1 Parent(s): f6f00cd

Upload README.md with huggingface_hub

Files changed (1):
  1. README.md +255 -150
README.md CHANGED
@@ -1,202 +1,307 @@
  ---
- library_name: transformers
  tags:
- - unsloth
  ---

- # Model Card for Model ID

- <!-- Provide a quick summary of what the model is/does. -->

- ## Model Details

- ### Model Description

- <!-- Provide a longer summary of what this model is. -->

- This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.

- - **Developed by:** [More Information Needed]
- - **Funded by [optional]:** [More Information Needed]
- - **Shared by [optional]:** [More Information Needed]
- - **Model type:** [More Information Needed]
- - **Language(s) (NLP):** [More Information Needed]
- - **License:** [More Information Needed]
- - **Finetuned from model [optional]:** [More Information Needed]

- ### Model Sources [optional]

- <!-- Provide the basic links for the model. -->

- - **Repository:** [More Information Needed]
- - **Paper [optional]:** [More Information Needed]
- - **Demo [optional]:** [More Information Needed]

  ## Uses

- <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

  ### Direct Use

- <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->

- [More Information Needed]

- ### Downstream Use [optional]

- <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->

- [More Information Needed]

- ### Out-of-Scope Use

- <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->

- [More Information Needed]

- ## Bias, Risks, and Limitations

- <!-- This section is meant to convey both technical and sociotechnical limitations. -->

- [More Information Needed]

- ### Recommendations

- <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->

- Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.

- ## How to Get Started with the Model

- Use the code below to get started with the model.

- [More Information Needed]

  ## Training Details

  ### Training Data

- <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

- [More Information Needed]

- ### Training Procedure

- <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->

- #### Preprocessing [optional]

- [More Information Needed]

- #### Training Hyperparameters

- - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->

- #### Speeds, Sizes, Times [optional]

- <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->

- [More Information Needed]

- ## Evaluation

- <!-- This section describes the evaluation protocols and provides the results. -->

- ### Testing Data, Factors & Metrics

- #### Testing Data

- <!-- This should link to a Dataset Card if possible. -->

- [More Information Needed]

- #### Factors

- <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->

- [More Information Needed]

- #### Metrics

- <!-- These are the evaluation metrics being used, ideally with a description of why. -->

- [More Information Needed]

- ### Results

- [More Information Needed]

- #### Summary

- ## Model Examination [optional]

- <!-- Relevant interpretability work for the model goes here -->

- [More Information Needed]

  ## Environmental Impact

- <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->

- Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

- - **Hardware Type:** [More Information Needed]
- - **Hours used:** [More Information Needed]
- - **Cloud Provider:** [More Information Needed]
- - **Compute Region:** [More Information Needed]
- - **Carbon Emitted:** [More Information Needed]

- ## Technical Specifications [optional]

- ### Model Architecture and Objective

- [More Information Needed]

- ### Compute Infrastructure

- [More Information Needed]

- #### Hardware

- [More Information Needed]

- #### Software

- [More Information Needed]

- ## Citation [optional]

- <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->

- **BibTeX:**

- [More Information Needed]

- **APA:**

- [More Information Needed]

- ## Glossary [optional]

- <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->

- [More Information Needed]

- ## More Information [optional]

- [More Information Needed]

- ## Model Card Authors [optional]

- [More Information Needed]

  ## Model Card Contact

- [More Information Needed]
 
  ---
+ library_name: peft
+ base_model: mistralai/Mistral-7B-Instruct-v0.1
  tags:
+ - legal
+ - legal-text
+ - passive-to-active
+ - voice-transformation
+ - legal-nlp
+ - text-simplification
+ - legal-documents
+ - sentence-transformation
+ - lora
+ - qlora
+ - peft
+ - mistral
+ - natural-language-processing
+ - legal-language
+ license: apache-2.0
+ language:
+ - en
+ pipeline_tag: text-generation
  ---

+ # legal-passive-to-active-mistral-7b

+ **RECOMMENDED MODEL** - A LoRA fine-tuned model for transforming legal text from passive voice to active voice, built on Mistral-7B-Instruct-v0.1. Of the two released variants, this is the stronger one: it simplifies complex legal language while preserving semantic accuracy and legal precision.

+ ## Model Description

+ This is the enhanced variant in this project for legal passive-to-active transformation. Built on Mistral-7B-Instruct-v0.1, it outperforms both the base model and the companion Llama-2 variant on this task. It was fine-tuned on a curated dataset of 319 legal sentences from authoritative sources, including UN documents, the GDPR, the Fair Work Act, and insurance regulations.

+ ### Key Features

+ - **Superior Performance**: ~15% improvement over the base model in human evaluation
+ - **Legal Text Simplification**: Converts passive voice to active voice in legal documents
+ - **Domain-Specific**: Fine-tuned on authentic legal text from multiple jurisdictions
+ - **Efficient Training**: Uses QLoRA for memory-efficient fine-tuning
+ - **Semantic Preservation**: Maintains legal meaning while simplifying sentence structure
+ - **Accessibility**: Makes legal documents more readable and accessible

+ ## Model Details

+ - **Developed by**: Rafi Al Attrach
+ - **Model type**: LoRA adapter for Mistral-7B-Instruct (PEFT)
+ - **Language(s)**: English
+ - **License**: Apache 2.0
+ - **Finetuned from**: [mistralai/Mistral-7B-Instruct-v0.1](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1)
+ - **Training method**: QLoRA (4-bit quantization + LoRA)
+ - **Research Focus**: Legal text simplification and accessibility (2024)

+ ### Technical Specifications

+ - **Base Model**: Mistral-7B-Instruct-v0.1
+ - **LoRA Rank**: 64
+ - **Training Samples**: 319 legal sentences
+ - **Data Sources**: UN legal documents, GDPR, Fair Work Act, insurance regulations
+ - **Evaluation**: BERTScore metrics and human evaluation
+ - **Performance**: ~15% improvement over the base model in human evaluation

  ## Uses

  ### Direct Use

+ This model is designed for:
+ - **Legal document simplification**: Converting passive legal text to active voice
+ - **Accessibility improvement**: Making legal documents more readable
+ - **Legal writing assistance**: Helping legal professionals write clearer documents
+ - **Educational purposes**: Teaching legal language transformation
+ - **Document processing**: Batch processing of legal texts
+ - **Regulatory compliance**: Simplifying complex regulatory language

+ ### Example Use Cases

+ ```python
+ # Transform a legal passive sentence to active voice
+ passive_sentence = "The contract shall be executed by both parties within 30 days."
+ # Model output: "Both parties shall execute the contract within 30 days."
+ ```

+ ```python
+ # Simplify GDPR text
+ passive_sentence = "Personal data may be processed by the controller for legitimate interests."
+ # Model output: "The controller may process personal data for legitimate interests."
+ ```

+ ```python
+ # Transform UN legal text
+ passive_sentence = "All necessary measures shall be taken by Member States to ensure compliance."
+ # Model output: "Member States shall take all necessary measures to ensure compliance."
+ ```

+ ## How to Get Started

+ ### Installation

+ ```bash
+ pip install transformers torch peft accelerate bitsandbytes
+ ```

+ ### Loading the Model

+ #### GPU Usage (Recommended)
+ ```python
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+ from peft import PeftModel
+ import torch
+
+ # Load the base model with 4-bit quantization
+ base_model = "mistralai/Mistral-7B-Instruct-v0.1"
+ model = AutoModelForCausalLM.from_pretrained(
+     base_model,
+     load_in_4bit=True,
+     torch_dtype=torch.float16,
+     device_map="auto"
+ )
+
+ # Load the LoRA adapter on top of the base model
+ model = PeftModel.from_pretrained(model, "rafiaa/legal-passive-to-active-mistral-7b")
+ tokenizer = AutoTokenizer.from_pretrained(base_model)
+
+ # Set the pad token if the tokenizer does not define one
+ if tokenizer.pad_token is None:
+     tokenizer.pad_token = tokenizer.eos_token
+ ```
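
On recent versions of transformers, passing `load_in_4bit=True` directly to `from_pretrained` is deprecated in favour of an explicit `BitsAndBytesConfig`. A minimal sketch of the equivalent quantized load; the NF4 and compute-dtype settings below are assumptions, not the configuration used for training:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel
import torch

# 4-bit NF4 quantization config (assumed settings; adjust to your hardware)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

base_model = "mistralai/Mistral-7B-Instruct-v0.1"
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, "rafiaa/legal-passive-to-active-mistral-7b")
tokenizer = AutoTokenizer.from_pretrained(base_model)
```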

+ #### CPU Usage (Alternative)
+ ```python
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+ from peft import PeftModel
+ import torch
+
+ # Load the base model in full precision on CPU
+ base_model = "mistralai/Mistral-7B-Instruct-v0.1"
+ model = AutoModelForCausalLM.from_pretrained(
+     base_model,
+     torch_dtype=torch.float32,
+     device_map="cpu"
+ )
+
+ # Load the LoRA adapter
+ model = PeftModel.from_pretrained(model, "rafiaa/legal-passive-to-active-mistral-7b")
+ tokenizer = AutoTokenizer.from_pretrained(base_model)
+
+ # Set the pad token if the tokenizer does not define one
+ if tokenizer.pad_token is None:
+     tokenizer.pad_token = tokenizer.eos_token
+ ```
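
If you want to serve the model without a PEFT dependency at inference time, the adapter can be folded into the base weights. This is a minimal sketch using PEFT's `merge_and_unload()`; it assumes the base model is loaded in half precision rather than 4-bit, and the save path is illustrative:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# Load the base model in half precision (merging into quantized weights is not shown here)
base_model = "mistralai/Mistral-7B-Instruct-v0.1"
model = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype=torch.float16)
model = PeftModel.from_pretrained(model, "rafiaa/legal-passive-to-active-mistral-7b")

# Fold the LoRA weights into the base model and save a standalone checkpoint
merged = model.merge_and_unload()
merged.save_pretrained("legal-passive-to-active-mistral-7b-merged")  # illustrative path

tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.save_pretrained("legal-passive-to-active-mistral-7b-merged")
```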

+ ### Usage Example

+ ```python
+ def transform_passive_to_active(passive_sentence, max_length=512):
+     # Create the instruction prompt
+     instruction = """You are a legal text transformation expert. Your task is to convert passive voice sentences to active voice while maintaining the exact legal meaning and terminology.
+
+ Input: Transform the following legal sentence from passive to active voice.
+
+ Legal Sentence: """
+
+     prompt = instruction + passive_sentence
+     inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+
+     with torch.no_grad():
+         outputs = model.generate(
+             **inputs,
+             max_length=max_length,
+             temperature=0.7,
+             do_sample=True,
+             pad_token_id=tokenizer.eos_token_id
+         )
+
+     # Decode only the newly generated tokens, not the prompt
+     generated = outputs[0][inputs["input_ids"].shape[1]:]
+     return tokenizer.decode(generated, skip_special_tokens=True)
+
+ # Example usage
+ passive = "The agreement shall be signed by the authorized representatives."
+ active = transform_passive_to_active(passive)
+ print(active)
+ ```

+ ### Advanced Usage

+ ```python
+ # Batch processing of multiple legal sentences
+ legal_sentences = [
+     "The policy was established by the board of directors.",
+     "All documents must be reviewed by legal counsel.",
+     "The regulations were enacted by Parliament."
+ ]
+
+ for sentence in legal_sentences:
+     transformed = transform_passive_to_active(sentence)
+     print(f"Passive: {sentence}")
+     print(f"Active: {transformed}\n")
+ ```

  ## Training Details

  ### Training Data

+ - **Dataset Size**: 319 legal sentences
+ - **Source Documents**:
+   - United Nations legal documents
+   - General Data Protection Regulation (GDPR)
+   - Fair Work Act (Australia)
+   - Insurance Council of Australia regulations
+ - **Data Split**: 85% training, 15% testing (15% of the training set held out for validation)
+ - **Domain**: Legal text across multiple jurisdictions
+ - **Format**: Alpaca-format instruction records (see the example record below)
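
For illustration, an Alpaca-format record pairs an instruction, an input, and the target output. The field contents below reuse the example sentence from earlier in this card and are illustrative, not an actual sample from the dataset:

```python
# Illustrative Alpaca-format training record (not an actual dataset sample)
example_record = {
    "instruction": "Transform the following legal sentence from passive to active voice.",
    "input": "The contract shall be executed by both parties within 30 days.",
    "output": "Both parties shall execute the contract within 30 days.",
}
```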

+ ### Training Procedure

+ - **Method**: QLoRA (4-bit quantization + LoRA)
+ - **LoRA Configuration**: Rank 64, Alpha 16 (see the configuration sketch below)
+ - **Library**: unsloth (2.2x faster training, 62% less VRAM for Mistral)
+ - **Hardware**: Tesla T4 GPU (Google Colab)
+ - **Training Loss**: Validation loss trended downward throughout training, indicating good generalization
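The training script itself is not included in this repository. The following is a minimal sketch of a QLoRA setup with the stated rank and alpha using the `peft` and `transformers` APIs; the original work used unsloth, and the target modules, dropout, and other settings shown here are assumptions:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Load the base model in 4-bit (QLoRA)
bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.1",
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# LoRA configuration matching the card: rank 64, alpha 16
# (target_modules and dropout are assumptions, not documented values)
lora_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```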
 
+ ### Evaluation Metrics

+ - **BERTScore**: Semantic-similarity evaluation (Precision, Recall, F1; see the snippet below)
+ - **Human Evaluation**: Binary correctness assessment by legal evaluators
+ - **Performance Improvement**: ~15% increase over base Mistral model
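
BERTScore can be computed with the `bert-score` package. A minimal sketch of how per-sentence precision, recall, and F1 might be obtained; the prediction/reference pair below is illustrative:

```python
# pip install bert-score
from bert_score import score

predictions = ["Both parties shall execute the contract within 30 days."]
references = ["Both parties shall execute the contract within 30 days."]

# Returns per-sentence precision, recall, and F1 tensors
P, R, F1 = score(predictions, references, lang="en")
print(f"BERTScore F1: {F1.mean().item():.4f}")
```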

+ ## Performance Comparison

+ | Model | Human Eval Score | BERTScore F1 | Performance |
+ |-------|------------------|--------------|-------------|
+ | Mistral-7B Base | Baseline | High | Good |
+ | **legal-passive-to-active-mistral-7b** | +15% | Higher | Excellent |
+ | legal-passive-to-active-llama2-7b | +6% | High | Good |

+ Among the models evaluated in this project, this model gives the strongest results for legal passive-to-active transformation.

+ ## Strengths and Characteristics

+ ### Model Strengths
+ - **High accuracy** in passive-to-active transformations
+ - **Semantic preservation**: maintains legal meaning
+ - **Better generalization** than the Llama-2 variant
+ - **Responsive to prompts**: adapts well to instruction modifications
+ - **Vocabulary diversity**: uses appropriate legal terminology

+ ### Notable Behaviors
+ - Occasionally substitutes words with synonyms (a trade-off for flexibility)
+ - Better precision than the base model after fine-tuning
+ - Strong performance on complex legal constructions

+ ## Limitations and Bias

+ ### Known Limitations

+ - **Word Position Sensitivity**: Struggles with sentences where word position significantly alters meaning
+ - **Dataset Size**: Limited to 319 training samples
+ - **Non-Determinism**: Sampled outputs may vary between runs
+ - **Domain Coverage**: Primarily trained on English common-law and EU legal documents
+ - **Synonym Substitution**: May occasionally use synonyms instead of the exact original words

+ ### Recommendations

+ - Validate transformed sentences for legal accuracy before use
+ - Use human review for critical legal documents
+ - Consider context and jurisdiction when applying transformations
+ - Test with domain-specific legal texts for best results
+ - Review outputs for unintended synonym substitutions in critical documents (a simple check is sketched below)
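
One lightweight way to flag the synonym substitutions mentioned above is to compare the content words of the input and output. The helper below is a hypothetical post-check, not part of the model or its evaluation pipeline, and it is a crude heuristic rather than a legal-accuracy check:

```python
import re

def unexpected_words(passive: str, active: str) -> set[str]:
    # Words in the output that never appear in the source sentence
    # (inflection changes such as "signed" -> "sign" will also show up).
    stopwords = {"the", "a", "an", "by", "of", "to", "in", "for", "and", "or", "shall", "be", "is", "are"}
    src = set(re.findall(r"[a-z']+", passive.lower())) - stopwords
    out = set(re.findall(r"[a-z']+", active.lower())) - stopwords
    return out - src

passive = "The agreement shall be signed by the authorized representatives."
active = "The authorized representatives shall sign the agreement."
print(unexpected_words(passive, active))  # {'sign'}
```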

  ## Environmental Impact

+ - **Training Method**: QLoRA cuts GPU memory requirements by roughly 62% for Mistral fine-tuning
+ - **Hardware**: Efficient training using 4-bit quantization on a single Tesla T4
+ - **Carbon Footprint**: Significantly reduced compared to full fine-tuning

+ ## Citation

+ If you use this model in your research, please cite:

+ ```bibtex
+ @misc{legal-passive-active-mistral,
+   title={legal-passive-to-active-mistral-7b: An Enhanced LoRA Fine-tuned Model for Legal Voice Transformation},
+   author={Rafi Al Attrach},
+   year={2024},
+   url={https://huggingface.co/rafiaa/legal-passive-to-active-mistral-7b}
+ }
+ ```

+ ## Related Models

+ - **Base Model**: [mistralai/Mistral-7B-Instruct-v0.1](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1)
+ - **Alternative**: [rafiaa/legal-passive-to-active-llama2-7b](https://huggingface.co/rafiaa/legal-passive-to-active-llama2-7b)
+ - **This Model**: [rafiaa/legal-passive-to-active-mistral-7b](https://huggingface.co/rafiaa/legal-passive-to-active-mistral-7b) (recommended)

+ ## Model Card Contact

+ - **Author**: Rafi Al Attrach
+ - **Model Repository**: [Hugging Face model page](https://huggingface.co/rafiaa/legal-passive-to-active-mistral-7b)
+ - **Issues**: Please report issues through the Hugging Face model page

+ ## Acknowledgments

+ - **Research Project**: Legal text simplification and accessibility research (2024)
+ - **Training Data**: Public legal documents and regulations
+ - **Base Model**: Mistral AI's Mistral-7B-Instruct-v0.1
+ - **Training Method**: QLoRA for efficient fine-tuning

+ ---

+ *This model is the result of research in legal text simplification and accessibility, focused on passive-to-active voice transformation for legal documents.*