Discussion about this post

Warren Woodrich Pettine

First take: that assumes one system of ethics though, right? It seems like the ideal outcome and the pragmatic one align in this case, where a multiplicity of models means a multiplicity of RLHF instances, leading to a multiplicity of ethical alignments. Diversity in human ethics is then reflected in the diversity of models.

Second take: Claude does a pretty good job of being common-sense ethical. Has Anthropic cracked the code, or is it just aligned with someone who has my value system?

