hateslopacademy
/

otpensource-vision

@@ -1,138 +1,21 @@
 ---
 language:
-- ko
 - en
-library_name: transformers
-tags:
-- vision-language
-- korean
-- image-to-text
-- multilingual
-- fashion
-- e-commerce
-- text-classification
-- text-generation
-datasets:
-- hateslopacademy/otpensource_dataset
-base_model:
-- Bllossom/llama-3.2-Korean-Bllossom-AICA-5B
-inference: true
-license: llama3.2
-model_name: otpensource-vision
-size_categories: 1K<n<10K
-task_categories:
-- image-to-text
-- text-classification
-task_ids:
-- image-captioning
-- sentiment-analysis
----
-# otpensource-vision
-## 모델 설명
-**otpensource-vision**은 **Bllossom/llama-3.2-Korean-Bllossom-AICA-5B**를 기반으로 학습된 Vision-Language 모델입니다. 해당 모델은 한국어와 영어로 작성된 텍스트와 이미지를 결합하여 다양한 태스크를 수행할 수 있도록 설계되었습니다.
-### 주요 특징
-- **Bllossom 기반 학습**: llama-3.2-Korean-Bllossom-AICA-5B를 기반으로 학습된 모델로, 언어 모델과 시각-언어 모델의 장점을 모두 제공합니다.
-- **Vision-Language 태스크 지원**: 이미지를 입력받아 텍스트 정보를 생성하거나, 텍스트 입력만으로 자연어 처리 태스크를 수행할 수 있습니다.
-- **패션 데이터를 활용한 학습**: 한국어 패션 데이터셋(otpensource_data)을 활용하여 옷의 카테고리, 색상, 계절, 특징 등 관련 정보를 추출하도록 학습되었습니다.
-- **상업적 활용 가능**: 라이선스는 CC-BY-4.0으로 상업적 이용이 가능합니다.
----
-## 모델 세부사항
-### 학습 데이터
-모델 학습에 사용된 데이터셋:
-- **[otpensource_data](https://huggingface.co/datasets/hateslopacademy/otpensource_dataset)**:
-  - 약 9000개의 패션 데이터로 구성
-  - 옷의 카테고리, 색상, 계절, 특징, 이미지 URL 등을 포함하며, Vision-Language 학습에 최적화
-### 학습 방식
-- **기반 모델**: Bllossom/llama-3.2-Korean-Bllossom-AICA-5B
-- **GPU 요구사항**: A100 40GB 이상 권장
-- **최적화**: Vision-Language 태스크와 한국어 텍스트 태스크를 통합적으로 학습
----
-## 주요 사용 사례
-### Vision-Language 태스크
-1. **이미지 분석**
-   - 입력된 이미지에서 옷의 카테고리, 색상, 계절, 특징을 추출하여 JSON 형식으로 반환.
-   - 예시:
-     ```json
-     {
-       "category": "트렌치코트",
-       "gender": "여",
-       "season": "SS",
-       "color": "네이비",
-       "material": "면",
-       "feature": "트렌치코트"
-     }
-     ```
-2. **언어모델 태스크**
-   - 텍스트만 입력했을 때 자연어 처리를 수행하며, 질문 응답, 텍스트 요약, 감정 분석 등 다양한 태스크 수행 가능.
----
-## 학습 및 성능
-### LogicKor 벤치마크 성능 (Bllossom 기반 모델 성능)
-| Category       | Single Turn | Multi Turn |
-|----------------|-------------|------------|
-| Reasoning      | 6.57        | 5.29       |
-| Math           | 6.43        | 6.29       |
-| Writing        | 9.14        | 8.71       |
-| Coding         | 8.00        | 9.14       |
-| Understanding  | 8.14        | 9.29       |
-| Grammar        | 6.71        | 4.86       |
-### 학습 구성
-- **모델 크기**: 5B 파라미터
-- **학습 데이터 크기**: 약 9000개의 시각-언어 데이터
-- **평가 결과**: 패션 관련 태스크에서 높은 정확도와 효율성 제공
 ---
-## 코드 예시
-### Vision-Language 태스크
-```python
-from transformers import MllamaForConditionalGeneration, MllamaProcessor
-import torch
-from PIL import Image
-import requests
-model = MllamaForConditionalGeneration.from_pretrained(
-  'otpensource-vision',
-  torch_dtype=torch.bfloat16,
-  device_map='auto'
-)
-processor = MllamaProcessor.from_pretrained('otpensource-vision')
-url = "https://image.msscdn.net/thumbnails/images/prd_img/20240710/4242307/detail_4242307_17205916382801_big.jpg?w=1200"
-image = Image.open(requests.get(url, stream=True).raw)
-messages = [
-  {'role': 'user', 'content': [
-    {'type': 'image', 'image': image},
-    {'type': 'text', 'text': '이 옷의 정보를 JSON으로 알려줘.'}
-  ]}
-]
-input_text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
-inputs = processor(
-    image=image,
-    text=input_text,
-    add_special_tokens=False,
-    return_tensors="pt",
-).to(model.device)
-output = model.generate(**inputs, max_new_tokens=256, temperature=0.1)
-print(processor.decode(output[0]))

 ---
+base_model: Bllossom/llama-3.2-Korean-Bllossom-AICA-5B
+tags:
+- text-generation-inference
+- transformers
+- unsloth
+- mllama
+license: apache-2.0
 language:
 - en
 ---
+# Uploaded finetuned  model
+- **Developed by:** hateslopacademy
+- **License:** apache-2.0
+- **Finetuned from model :** Bllossom/llama-3.2-Korean-Bllossom-AICA-5B
+This mllama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
+[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)