Glossary
RLHF (Reinforcement Learning from Human Feedback) — a training technique that aligns LLM outputs with human preferences: a reward model is fitted to human comparisons of model outputs, and the LLM is then fine-tuned against that reward model with reinforcement learning (commonly PPO).
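A common ingredient of the reward-modeling step is the Bradley-Terry pairwise loss, which pushes the reward of the human-preferred ("chosen") response above the rejected one. The sketch below is a minimal, self-contained illustration of that loss on scalar rewards; the function name and values are hypothetical, not taken from any specific library.

```python
import math

def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Bradley-Terry pairwise loss used in RLHF reward modeling:
    -log(sigmoid(r_chosen - r_rejected))."""
    margin = reward_chosen - reward_rejected
    # Numerically stable form: -log(sigmoid(m)) == log(1 + exp(-m))
    return math.log1p(math.exp(-margin))

# A wider margin in favor of the chosen response yields a smaller loss.
print(preference_loss(2.0, 0.0) < preference_loss(0.5, 0.0))  # True
```

In practice the rewards come from a neural reward model scoring full (prompt, response) pairs, and this loss is averaged over a dataset of human comparisons.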