Attention Is All You Need
Paper • 1706.03762 • Published • 94
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Paper • 1810.04805 • Published • 23
RoBERTa: A Robustly Optimized BERT Pretraining Approach
Paper • 1907.11692 • Published • 9
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
Paper • 1910.01108 • Published • 20 
Taufiq Dwi Purnomo (taufiqdp)
AI & ML interests: SLM, VLM
Recent Activity
upvoted a paper 2 days ago
Scaling Latent Reasoning via Looped Language Models
upvoted a paper 8 days ago
OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding