taufiqdp's Collections

- Attention Is All You Need (arXiv 1706.03762, 94 upvotes)
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (arXiv 1810.04805, 23 upvotes)
- RoBERTa: A Robustly Optimized BERT Pretraining Approach (arXiv 1907.11692, 9 upvotes)
- DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter (arXiv 1910.01108, 20 upvotes)
- Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer (arXiv 1910.10683, 14 upvotes)
- Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity (arXiv 2101.03961, 13 upvotes)
- Finetuned Language Models Are Zero-Shot Learners (arXiv 2109.01652, 4 upvotes)
- Multitask Prompted Training Enables Zero-Shot Task Generalization (arXiv 2110.08207, 2 upvotes)
- GLaM: Efficient Scaling of Language Models with Mixture-of-Experts (arXiv 2112.06905, 2 upvotes)
- Scaling Language Models: Methods, Analysis & Insights from Training Gopher (arXiv 2112.11446, 1 upvote)
- Chain-of-Thought Prompting Elicits Reasoning in Large Language Models (arXiv 2201.11903, 14 upvotes)
- LaMDA: Language Models for Dialog Applications (arXiv 2201.08239, 4 upvotes)
- Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model (arXiv 2201.11990, 1 upvote)
- Training language models to follow instructions with human feedback (arXiv 2203.02155, 24 upvotes)
- PaLM: Scaling Language Modeling with Pathways (arXiv 2204.02311, 3 upvotes)
- Training Compute-Optimal Large Language Models (arXiv 2203.15556, 11 upvotes)
- OPT: Open Pre-trained Transformer Language Models (arXiv 2205.01068, 2 upvotes)
- UL2: Unifying Language Learning Paradigms (arXiv 2205.05131, 5 upvotes)
- Language Models are General-Purpose Interfaces (arXiv 2206.06336, 1 upvote)
- Improving alignment of dialogue agents via targeted human judgements (arXiv 2209.14375)
- Scaling Instruction-Finetuned Language Models (arXiv 2210.11416, 7 upvotes)
- GLM-130B: An Open Bilingual Pre-trained Model (arXiv 2210.02414, 3 upvotes)
- Holistic Evaluation of Language Models (arXiv 2211.09110, 1 upvote)
- BLOOM: A 176B-Parameter Open-Access Multilingual Language Model (arXiv 2211.05100, 34 upvotes)
- Galactica: A Large Language Model for Science (arXiv 2211.09085, 4 upvotes)
- OPT-IML: Scaling Language Model Instruction Meta Learning through the Lens of Generalization (arXiv 2212.12017, 1 upvote)
- The Flan Collection: Designing Data and Methods for Effective Instruction Tuning (arXiv 2301.13688, 9 upvotes)
- LLaMA: Open and Efficient Foundation Language Models (arXiv 2302.13971, 18 upvotes)
- PaLM-E: An Embodied Multimodal Language Model (arXiv 2303.03378)
- GPT-4 Technical Report (arXiv 2303.08774, 7 upvotes)
- Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling (arXiv 2304.01373, 9 upvotes)
- PaLM 2 Technical Report (arXiv 2305.10403, 7 upvotes)
- RWKV: Reinventing RNNs for the Transformer Era (arXiv 2305.13048, 19 upvotes)
- Llama 2: Open Foundation and Fine-Tuned Chat Models (arXiv 2307.09288, 246 upvotes)
- Mamba: Linear-Time Sequence Modeling with Selective State Spaces (arXiv 2312.00752, 146 upvotes)
- Orca: Progressive Learning from Complex Explanation Traces of GPT-4 (arXiv 2306.02707, 47 upvotes)
- Textbooks Are All You Need (arXiv 2306.11644, 149 upvotes)
- Textbooks Are All You Need II: phi-1.5 technical report (arXiv 2309.05463, 88 upvotes)
- Mistral 7B (arXiv 2310.06825, 55 upvotes)
- PaLI-3 Vision Language Models: Smaller, Faster, Stronger (arXiv 2310.09199, 29 upvotes)
- Zephyr: Direct Distillation of LM Alignment (arXiv 2310.16944, 122 upvotes)
- CodeFusion: A Pre-trained Diffusion Model for Code Generation (arXiv 2310.17680, 73 upvotes)
- LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents (arXiv 2311.05437, 51 upvotes)
- MEDITRON-70B: Scaling Medical Pretraining for Large Language Models (arXiv 2311.16079, 19 upvotes)
- SeaLLMs -- Large Language Models for Southeast Asia (arXiv 2312.00738, 25 upvotes)
- Kandinsky 3.0 Technical Report (arXiv 2312.03511, 46 upvotes)
- Large Language Models for Mathematicians (arXiv 2312.04556, 13 upvotes)
- FLM-101B: An Open LLM and How to Train It with $100K Budget (arXiv 2309.03852, 44 upvotes)
- arXiv 2309.03450 (8 upvotes)
- Baichuan 2: Open Large-scale Language Models (arXiv 2309.10305, 20 upvotes)
- Qwen Technical Report (arXiv 2309.16609, 37 upvotes)
- OpenBA: An Open-sourced 15B Bilingual Asymmetric seq2seq Model Pre-trained from Scratch (arXiv 2309.10706, 17 upvotes)
- MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning (arXiv 2310.09478, 21 upvotes)
- Position-Enhanced Visual Instruction Tuning for Multimodal Large Language Models (arXiv 2308.13437, 4 upvotes)
- InstructionGPT-4: A 200-Instruction Paradigm for Fine-Tuning MiniGPT-4 (arXiv 2308.12067, 4 upvotes)
- JudgeLM: Fine-tuned Large Language Models are Scalable Judges (arXiv 2310.17631, 35 upvotes)
- ChatCoder: Chat-based Refine Requirement Improves LLMs' Code Generation (arXiv 2311.00272, 11 upvotes)
- ChipNeMo: Domain-Adapted LLMs for Chip Design (arXiv 2311.00176, 9 upvotes)
- CodeFuse-13B: A Pretrained Multi-lingual Code Large Language Model (arXiv 2310.06266, 2 upvotes)
- Purple Llama CyberSecEval: A Secure Coding Benchmark for Language Models (arXiv 2312.04724, 21 upvotes)
- SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling (arXiv 2312.15166, 60 upvotes)
- Generative Multimodal Models are In-Context Learners (arXiv 2312.13286, 37 upvotes)
- Code Llama: Open Foundation Models for Code (arXiv 2308.12950, 29 upvotes)
- Unsupervised Cross-lingual Representation Learning at Scale (arXiv 1911.02116, 3 upvotes)
- YAYI 2: Multilingual Open-Source Large Language Models (arXiv 2312.14862, 15 upvotes)
- Mini-GPTs: Efficient Large Language Models through Contextual Pruning (arXiv 2312.12682, 10 upvotes)
- Gemini: A Family of Highly Capable Multimodal Models (arXiv 2312.11805, 47 upvotes)
- LLM360: Towards Fully Transparent Open-Source LLMs (arXiv 2312.06550, 57 upvotes)
- WizardLM: Empowering Large Language Models to Follow Complex Instructions (arXiv 2304.12244, 13 upvotes)
- The Falcon Series of Open Language Models (arXiv 2311.16867, 14 upvotes)
- Clinical Camel: An Open-Source Expert-Level Medical Language Model with Dialogue-Based Knowledge Encoding (arXiv 2305.12031, 5 upvotes)
- ChatDoctor: A Medical Chat Model Fine-tuned on LLaMA Model using Medical Domain Knowledge (arXiv 2303.14070, 10 upvotes)
- LLaVA-Med: Training a Large Language-and-Vision Assistant for Biomedicine in One Day (arXiv 2306.00890, 11 upvotes)
- BioLORD-2023: Semantic Textual Representations Fusing LLM and Clinical Knowledge Graph Insights (arXiv 2311.16075, 6 upvotes)
- KBioXLM: A Knowledge-anchored Biomedical Multilingual Pretrained Language Model (arXiv 2311.11564, 1 upvote)
- ChiMed-GPT: A Chinese Medical Large Language Model with Full Training Regime and Better Alignment to Human Preferences (arXiv 2311.06025, 1 upvote)
- BioT5: Enriching Cross-modal Integration in Biology with Chemical Knowledge and Natural Language Associations (arXiv 2310.07276, 5 upvotes)
- BIOptimus: Pre-training an Optimal Biomedical Language Model with Curriculum Learning for Named Entity Recognition (arXiv 2308.08625, 2 upvotes)
- BioCPT: Contrastive Pre-trained Transformers with Large-scale PubMed Search Logs for Zero-shot Biomedical Information Retrieval (arXiv 2307.00589, 1 upvote)
- Radiology-GPT: A Large Language Model for Radiology (arXiv 2306.08666, 1 upvote)
- BiomedGPT: A Unified and Generalist Biomedical Generative Pre-trained Transformer for Vision, Language, and Multimodal Tasks (arXiv 2305.17100, 2 upvotes)
- Dr. LLaMA: Improving Small Language Models in Domain-Specific QA via Generative Data Augmentation (arXiv 2305.07804, 2 upvotes)
- Llemma: An Open Language Model For Mathematics (arXiv 2310.10631, 56 upvotes)
- BTLM-3B-8K: 7B Parameter Performance in a 3B Parameter Model (arXiv 2309.11568, 11 upvotes)
- Skywork: A More Open Bilingual Foundation Model (arXiv 2310.19341, 6 upvotes)
- SkyMath: Technical Report (arXiv 2310.16713, 2 upvotes)
- MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models (arXiv 2309.12284, 18 upvotes)
- UT5: Pretraining Non autoregressive T5 with unrolled denoising (arXiv 2311.08552, 8 upvotes)
- G-LLaVA: Solving Geometric Problem with Multi-Modal Large Language Model (arXiv 2312.11370, 20 upvotes)
- Language Is Not All You Need: Aligning Perception with Language Models (arXiv 2302.14045)
- PanGu-Σ: Towards Trillion Parameter Language Model with Sparse Heterogeneous Computing (arXiv 2303.10845, 2 upvotes)
- BloombergGPT: A Large Language Model for Finance (arXiv 2303.17564, 26 upvotes)
- PMC-LLaMA: Towards Building Open-source Language Models for Medicine (arXiv 2304.14454)
- StarCoder: may the source be with you! (arXiv 2305.06161, 31 upvotes)
- OctoPack: Instruction Tuning Code Large Language Models (arXiv 2308.07124, 31 upvotes)
- TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones (arXiv 2312.16862, 31 upvotes)
- GeoGalactica: A Scientific Large Language Model in Geoscience (arXiv 2401.00434, 10 upvotes)
- TinyLlama: An Open-Source Small Language Model (arXiv 2401.02385, 94 upvotes)
- DeepSeek LLM: Scaling Open-Source Language Models with Longtermism (arXiv 2401.02954, 48 upvotes)
- Mixtral of Experts (arXiv 2401.04088, 159 upvotes)
- MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts (arXiv 2401.04081, 73 upvotes)
- DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models (arXiv 2401.06066, 56 upvotes)
- WizardCoder: Empowering Code Large Language Models with Evol-Instruct (arXiv 2306.08568, 28 upvotes)
- ChatQA: Building GPT-4 Level Conversational QA Models (arXiv 2401.10225, 36 upvotes)
- Orion-14B: Open-source Multilingual Large Language Models (arXiv 2401.12246, 14 upvotes)
- DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence (arXiv 2401.14196, 66 upvotes)
- Weaver: Foundation Models for Creative Writing (arXiv 2401.17268, 45 upvotes)
- H2O-Danube-1.8B Technical Report (arXiv 2401.16818, 18 upvotes)
- OLMo: Accelerating the Science of Language Models (arXiv 2402.00838, 84 upvotes)
- GPT-NeoX-20B: An Open-Source Autoregressive Language Model (arXiv 2204.06745, 1 upvote)
- CroissantLLM: A Truly Bilingual French-English Language Model (arXiv 2402.00786, 26 upvotes)
- MobiLlama: Towards Accurate and Lightweight Fully Transparent GPT (arXiv 2402.16840, 26 upvotes)
- MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (arXiv 2402.14905, 134 upvotes)
- Nemotron-4 15B Technical Report (arXiv 2402.16819, 46 upvotes)
- StarCoder 2 and The Stack v2: The Next Generation (arXiv 2402.19173, 149 upvotes)
- Gemma: Open Models Based on Gemini Research and Technology (arXiv 2403.08295, 50 upvotes)
- Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context (arXiv 2403.05530, 66 upvotes)
- Sailor: Open Language Models for South-East Asia (arXiv 2404.03608, 21 upvotes)
- OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework (arXiv 2404.14619, 126 upvotes)
- Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone (arXiv 2404.14219, 258 upvotes)
- Eagle and Finch: RWKV with Matrix-Valued States and Dynamic Recurrence (arXiv 2404.05892, 39 upvotes)
- DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model (arXiv 2405.04434, 22 upvotes)
- DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence (arXiv 2406.11931, 65 upvotes)
- Aya 23: Open Weight Releases to Further Multilingual Progress (arXiv 2405.15032, 32 upvotes)
- Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models (arXiv 2406.06563, 20 upvotes)
- Instruction Pre-Training: Language Models are Supervised Multitask Learners (arXiv 2406.14491, 95 upvotes)
- The Llama 3 Herd of Models (arXiv 2407.21783, 116 upvotes)