SocialReasoning-Bench: Measuring whether AI agents act in users’ best interests

At a glance
- AI agents are moving into social contexts. When agents manage calendars, negotiate purchases, or interact with other agents on a user’s behalf, task competence is not enough; they also need social reasoning.
- SocialReasoning-Bench evaluates that ability. The benchmark tests whether an agent can negotiate for