Submissions from openpipe.ai

		Show HN: RULER – Easily apply RL to any agent (openpipe.ai)
		81 points by kcorbitt 39 days ago | past | 11 comments
		Summary-RL (openpipe.ai)
		1 point by s16h 54 days ago | past
		Everything I know about reward hacking (openpipe.ai)
		3 points by kcorbitt 68 days ago | past
		ART·E: how we built an email research agent that beats o3 (openpipe.ai)
		3 points by kcorbitt 3 months ago | past | 2 comments
		PII-Redact – SOTA PII Redaction on Your Laptop (openpipe.ai)
		6 points by Arctic_fly 4 months ago | past | 1 comment
		Using GRPO to Beat o1, o3-mini and R1 at “Temporal Clue” (openpipe.ai)
		199 points by kcorbitt 5 months ago | past | 55 comments
		Analyzing OpenAI's Reinforcement Fine-Tuning: Less Data, Better Results (openpipe.ai)
		4 points by kcorbitt 7 months ago | past
		Using reinforcement learning and $4.80 of GPU time to find the best HN post (openpipe.ai)
		217 points by kcorbitt 9 months ago | past | 95 comments
		DPO fine-tuning outperforms SFT (openpipe.ai)
		1 point by kcorbitt 10 months ago | past
		OpenPipe (openpipe.ai)
		1 point by handfuloflight 10 months ago | past
		Fine-Tuning Best Practices: Models (openpipe.ai)
		2 points by gk1 10 months ago | past
		Fine-Tuning for Production Apps (openpipe.ai)
		2 points by ijidak 11 months ago | past
		Fine-Tuning Best Practices Series Introduction and Chapter 1: Training Data (openpipe.ai)
		3 points by sebg 11 months ago | past
		LLM Fine-Tuning Best Practices: Base Models Proprietary/Open Source, Large/Small (openpipe.ai)
		2 points by billmalarky 11 months ago | past | 1 comment
		LLM Fine-Tuning Best Practices for Training Data Curation (openpipe.ai)
		1 point by billmalarky on Aug 2, 2024 | past | 2 comments
		OpenPipe Mixture of Agents: Outperform GPT-4 at 1/25th the Cost (openpipe.ai)
		13 points by kcorbitt on June 20, 2024 | past | 2 comments
		What we've learned in 3 days of Llama 3 (openpipe.ai)
		3 points by kcorbitt on April 22, 2024 | past
		Mixtral Curious? Comparing Mistral 7B and Mixtral for fine-tuning (openpipe.ai)
		1 point by kcorbitt on Feb 29, 2024 | past
		S-LoRA: Serving Thousands of Models from One GPU for Fun and Profit (openpipe.ai)
		1 point by kcorbitt on Jan 18, 2024 | past
		Mistral 7B Fine-Tune Optimized (openpipe.ai)
		234 points by tosh on Dec 20, 2023 | past | 103 comments
		Is AI the next crypto? Insights from HN comments (openpipe.ai)
		237 points by kcorbitt on Nov 8, 2023 | past | 367 comments