- Chain-of-Thought Reasoning Without Prompting
  Paper • 2402.10200 • Published • 109
- Self-Discover: Large Language Models Self-Compose Reasoning Structures
  Paper • 2402.03620 • Published • 117
- Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences
  Paper • 2404.03715 • Published • 62
- Do language models plan ahead for future tokens?
  Paper • 2404.00859 • Published • 3
Thomas Renkert
trenkert