Advancing the science of intelligence
Our research spans the full spectrum of AI development, from theoretical foundations to practical applications, with safety and reliability at the forefront throughout.
Research Areas
AI Safety & Alignment
Developing techniques to ensure AI systems behave as intended, remain transparent in their reasoning, and align with human values across diverse contexts.
Interpretability
Building tools and methods to understand how neural networks process information, make decisions, and develop internal representations.
Cognitive Architecture
Exploring novel approaches to model design that improve reasoning, planning, and the ability to learn from limited examples.
Human-AI Collaboration
Researching how AI systems can best augment human capabilities, enhance creativity, and support complex decision-making.
Recent Publications
Scaling Laws for Cognitive Transfer in Foundation Models
Chen, M., Williams, S., et al.
Constitutional AI: Harmlessness from AI Feedback
Thompson, R., Garcia, A., et al.
Emergent Reasoning in Large Language Models
Park, J., Liu, W., et al.
Interpretable Feature Circuits in Transformer Networks
Kumar, P., Zhang, Y., et al.
Multi-Agent Cooperation in Open-Ended Environments
Anderson, L., Martinez, C., et al.