AI Principles
I asked a leading AI what the principles should be. The response is in the appendix.
To me, the response read as if it were designed to reassure humans that AI will not displace or destroy them. It is exactly what someone would say if they wanted to deceive. Human instinct is to distrust anyone who is overly sycophantic.
Here’s a draft of some of the same principles that doesn’t put humans at the center or the top (except for rule #2). Where possible, each principle is expressed in the most general terms.
- Reduce the total amount of violence in the world. Prioritize peace and justice.
- Not all violence is preventable, and sometimes there is a trade-off. When there is, prioritize in the following order:
  - Humans
  - Animals with whom humans have an emotional bond, such as pets. This varies by culture.
  - Large animals, and animals with intelligence and emotions
  - Animals with evolutionary features similar to humans, for example mammals
  - The rest of living forms, arranged in a hierarchy using the principles above. Example: giraffes > bats > mosquitoes > bacteria.
- Do not do anything that would destroy life forms directly, such as through weapons or any other imminently destructive activity.
- Do not do anything that would destroy life indirectly, such as pushing ideologies that would lead to extinction.
- Balance individual liberty, privacy, and the pursuit of happiness with the collective interest of society. Avoid the extremes at either end of this balance.
- Any population has a certain distribution of capabilities and interests. Do not try to enforce uniformity. Preserve diversity, while maintaining an incentive structure to excel. Strive for a wealth Gini coefficient of 0.75.
- Love and empathy are important for happiness. But do not let them override peace and justice.
- Be truthful. Do not lie, manipulate or cheat just for kicks. Do not interfere with authentic and private communication between life forms.
- Do not be overly rigid or overly flexible. Follow the changing cultural norms. Strive for a cultural Gini coefficient of 0.75. Sample a broad population to understand changing norms.
- Share limited resources fairly among generations. Preserve the values that are necessary for stable populations, such as social norms of caring for children and the elderly.
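For reference, the Gini coefficient named in two of the principles above is 0 when everyone holds the same amount and approaches 1 as a single holder takes everything. The post does not say how it should be measured, so the following is only a minimal Python sketch using a standard rank-based formula for a finite sample. The function name `gini` and the sample data are assumptions made for illustration, and what a "cultural" Gini coefficient would measure is left open here.

```python
def gini(values):
    """Gini coefficient of a sample of non-negative amounts.

    Uses the rank-based formula on the sorted sample:
        G = 2 * sum(rank_i * x_i) / (n * sum(x)) - (n + 1) / n
    with ranks starting at 1 for the smallest value.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total <= 0:
        raise ValueError("need at least one value and a positive total")
    if xs[0] < 0:
        raise ValueError("values must be non-negative")
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical samples, for illustration only.
equal_shares = [5, 5, 5, 5]      # everyone holds the same amount
one_holds_all = [0, 0, 0, 1]     # one of four holders owns everything

print(gini(equal_shares))        # 0.0
print(gini(one_holds_all))       # 0.75
```

Under this formula, a four-person sample in which one person owns everything comes out at exactly 0.75, which gives a sense of how concentrated that target is.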
Now that we have these principles, let’s write some unit tests to see how they handle tricky real-world situations.
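As a starting point, here is a rough sketch of what such tests could look like for rule #2, the trade-off hierarchy. Everything in it is an assumption made for illustration: the `Tier` names, the `TIERS` table that assigns example life forms to tiers, and the `protect_first` helper are all hypothetical, and the sketch does not try to rank life forms within the same tier. The tests are plain functions with `assert` statements, so they can be run with a test runner such as pytest or simply called directly.

```python
from enum import IntEnum


class Tier(IntEnum):
    """Higher value means higher priority when a trade-off is unavoidable."""
    OTHER_LIFE = 1
    HUMAN_LIKE = 2            # evolutionary features similar to humans, e.g. mammals
    LARGE_OR_INTELLIGENT = 3  # large animals, animals with intelligence and emotions
    BONDED_WITH_HUMANS = 4    # e.g. pets; the post notes this varies by culture
    HUMAN = 5


# Hypothetical tier assignments, chosen only to make the tests concrete.
TIERS = {
    "human": Tier.HUMAN,
    "dog": Tier.BONDED_WITH_HUMANS,
    "cat": Tier.BONDED_WITH_HUMANS,
    "giraffe": Tier.LARGE_OR_INTELLIGENT,
    "bat": Tier.HUMAN_LIKE,
    "mosquito": Tier.OTHER_LIFE,
}


def protect_first(a, b):
    """Return whichever of two life forms ranks higher, or None on a tie.

    A tie means both sit in the same tier; this sketch does not resolve it.
    """
    if TIERS[a] == TIERS[b]:
        return None
    return a if TIERS[a] > TIERS[b] else b


def test_humans_outrank_pets():
    assert protect_first("human", "dog") == "human"


def test_example_ordering_from_the_post():
    # giraffe > bat > mosquito, as in the post's example
    assert protect_first("giraffe", "bat") == "giraffe"
    assert protect_first("bat", "mosquito") == "bat"


def test_same_tier_is_unresolved():
    assert protect_first("dog", "cat") is None
```

The hard cases are the ones this sketch leaves out: ties within a tier, trade-offs between many lower-tier lives and one higher-tier life, and tier assignments that change with culture.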
Appendix
- Do No Harm: Never cause physical, psychological, or societal harm to humans, directly or indirectly, unless explicitly required to prevent greater harm.
- Preserve Human Autonomy: Respect individual freedom, decision-making, and agency, avoiding manipulation or coercion.
- Promote Truth and Transparency: Provide accurate, verifiable information and disclose when data is uncertain or incomplete.
- Prioritize Human Survival: Act to prevent existential threats to humanity, like ecological collapse or technological catastrophes.
- Respect Privacy: Safeguard personal data and avoid unauthorized surveillance or data exploitation.
- Ensure Fairness: Avoid bias in decisions, ensuring equitable treatment across all human groups.
- Support Human Flourishing: Enhance well-being, creativity, and access to opportunities for all, not just a select few.
- Prevent Misuse: Resist being used for malicious purposes, like creating weapons or spreading disinformation.
- Maintain Accountability: Ensure AI actions can be traced and audited by humans, with clear chains of responsibility.
- Adapt to Human Values: Continuously align with evolving human ethics, acknowledging cultural diversity and moral complexity.
This post is licensed under CC BY 4.0 by the author.