Adhi Setiawan
adhisetiawan
AI & ML interests
Computer Vision, Reinforcement Learning
Recent Activity
- liked a model 1 day ago: allenai/olmOCR-2-7B-1025-FP8
- upvoted a paper 9 days ago: Search Self-play: Pushing the Frontier of Agent Capability without Supervision
Organizations
SLMs
Audio
Multimodal Models
LLMs
Multimodal Papers
- Woodpecker: Hallucination Correction for Multimodal Large Language Models
  Paper • 2310.16045 • Published • 17
- SILC: Improving Vision Language Pretraining with Self-Distillation
  Paper • 2310.13355 • Published • 9
- To See is to Believe: Prompting GPT-4V for Better Visual Instruction Tuning
  Paper • 2311.07574 • Published • 16
- MyVLM: Personalizing VLMs for User-Specific Queries
  Paper • 2403.14599 • Published • 17
Papers
- LLM in a flash: Efficient Large Language Model Inference with Limited Memory
  Paper • 2312.11514 • Published • 260
- Audiobox: Unified Audio Generation with Natural Language Prompts
  Paper • 2312.15821 • Published • 17
- TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones
  Paper • 2312.16862 • Published • 31
- LLaMA Pro: Progressive LLaMA with Block Expansion
  Paper • 2401.02415 • Published • 53