Commit 128dc93 (verified) by OzzeY72 · 1 parent: 322d1c0

Update README.md

Files changed (1)
  1. README.md +78 -1
README.md CHANGED
@@ -41,4 +41,81 @@ language:
  base_model:
  - dmis-lab/biobert-base-cased-v1.1
  pipeline_tag: text-classification
- ---
+ ---
+ # BioBERT Symptom Text Classifier 🧬🩺
+
+ This model is a fine-tuned version of [**dmis-lab/biobert-base-cased-v1.1**](https://huggingface.co/dmis-lab/biobert-base-cased-v1.1) on a symptom-to-condition classification task. It maps free-form English descriptions of medical symptoms to one of 25 predefined symptom categories, such as "back pain", "head ache", and "injury from sports".
+
+ ## 🧠 Model Details
+
+ - **Architecture:** BioBERT (Transformer-based)
+ - **Base Model:** [`dmis-lab/biobert-base-cased-v1.1`](https://huggingface.co/dmis-lab/biobert-base-cased-v1.1)
+ - **Task:** Text Classification (Single-label)
+ - **Labels:** 25 symptom categories (see full list below)
+ - **Language:** English
+ - **License:** Apache 2.0
+
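Editor's sketch, not part of the commit: "single-label" means the classification head produces one score (logit) per category and exactly one category is returned. The example below illustrates that idea in plain Python under the assumption that the scores go through a softmax and the highest one wins. The helper names `softmax` and `predict_single_label` and the example logits are hypothetical, not taken from the model.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_single_label(logits):
    """Return (class_id, probability) for the highest-scoring class."""
    probs = softmax(logits)
    class_id = max(range(len(probs)), key=probs.__getitem__)
    return class_id, probs[class_id]


# Hypothetical logits for a 25-class head in which class 15 scores highest.
example_logits = [0.0] * 25
example_logits[15] = 4.0
class_id, confidence = predict_single_label(example_logits)
```

Because softmax preserves the ordering of the scores, taking the argmax of the raw logits selects the same class; the softmax step only matters when a confidence value is needed.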
+ ## 📊 Datasets Used
+
+ This model was trained on a combination of public datasets containing free-text symptom descriptions annotated with associated pain types or complaints:
+
+ - [`venetis/symptom_text_to_disease_mk3`](https://huggingface.co/datasets/venetis/symptom_text_to_disease_mk3)
+ - [`celikmus/symptom_text_to_disease_01`](https://huggingface.co/datasets/celikmus/symptom_text_to_disease_01)
+
+ ## 🏷️ Label Set (25 Classes)
+
+ The model predicts one of the following 25 labels:
+
+ | ID | Symptom Category   |
+ |----|--------------------|
+ | 0  | emotional pain     |
+ | 1  | hair falling out   |
+ | 2  | heart hurts        |
+ | 3  | infected wound     |
+ | 4  | foot ache          |
+ | 5  | shoulder pain      |
+ | 6  | injury from sports |
+ | 7  | skin issue         |
+ | 8  | stomach ache       |
+ | 9  | knee pain          |
+ | 10 | joint pain         |
+ | 11 | hard to breath     |
+ | 12 | head ache          |
+ | 13 | body feels weak    |
+ | 14 | feeling dizzy      |
+ | 15 | back pain          |
+ | 16 | open wound         |
+ | 17 | internal pain      |
+ | 18 | blurry vision      |
+ | 19 | acne               |
+ | 20 | muscle pain        |
+ | 21 | neck pain          |
+ | 22 | cough              |
+ | 23 | ear ache           |
+ | 24 | feeling cold       |
+
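Editor's sketch, not part of the commit: the label table can be written as the pair of dictionaries that a `transformers` model config usually stores as `id2label` and `label2id`. The label strings are copied verbatim from the table, including their spelling. The variable name `LABELS` is illustrative, and it is an assumption that the published checkpoint's config contains exactly this mapping; check `model.config.id2label` before relying on it.

```python
# Label names copied from the README table, in ID order (0..24).
LABELS = [
    "emotional pain", "hair falling out", "heart hurts", "infected wound",
    "foot ache", "shoulder pain", "injury from sports", "skin issue",
    "stomach ache", "knee pain", "joint pain", "hard to breath",
    "head ache", "body feels weak", "feeling dizzy", "back pain",
    "open wound", "internal pain", "blurry vision", "acne",
    "muscle pain", "neck pain", "cough", "ear ache", "feeling cold",
]

# Forward and reverse lookups between integer class IDs and label names.
id2label = dict(enumerate(LABELS))
label2id = {name: idx for idx, name in id2label.items()}
```

Keeping the strings identical to the table matters because any downstream code that compares predictions against label names will do exact string matching.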
+ ## 🚀 Usage
+
+ To use the model in your project:
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForSequenceClassification
+ import torch
+
+ model_name = "your-username/your-model-name" # Replace with actual path
+
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+ model = AutoModelForSequenceClassification.from_pretrained(model_name)
+ model.eval()  # inference mode: disables dropout
+
+ def classify_symptom(text):
+     # Tokenize one description; truncate inputs longer than the model limit
+     inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)
+     with torch.no_grad():  # no gradients are needed for prediction
+         outputs = model(**inputs)
+     # Pick the highest-scoring class and map its ID to the label name
+     predicted_class_id = torch.argmax(outputs.logits, dim=-1).item()
+     label = model.config.id2label[predicted_class_id]
+     return label
+
+ # Example
+ classify_symptom("My lower back hurts when I sit for a long time")
+ # ➜ "back pain"