CogBench is a project bridging machine learning and human-centered disciplines.

We aim to broadly study cognition across both language models and humans.

Members

We are an interdisciplinary group studying cognition, with members drawn from research fields such as Psychology, English, Educational Psychology, and Computer Science.

Student Members: Karin de Langis*, Jong Inn Park, Bin Hu, Stuti Shah

Contributors: Andreas Schramm (Linguistics), Andrew Elfenbein (English), Mike Mensink (Psychology)

Principal Investigator: Dongyeop Kang

Objective

The project’s goals include:

  1. Identify and formalize cognitive and linguistic distinctions between language models and humans
  2. Test theories, models, and frameworks developed in human studies in an artificial intelligence setting
  3. Build a foundation for understanding the 'internals' of LLM cognition, enabling greater predictive power and interpretability for LLM behavior
  4. Explore the feasibility of LLM assistance in developing and validating materials for human studies


Join our project!

We are currently expanding our team to include more researchers in psychology, linguistics, cognitive science, and related fields. If you are interested in collaborating and co-authoring with us, please get in touch.

The best way to contact us is by filling out the collaboration interest form.

We are looking for collaborators willing to contribute data, provide feedback on our data annotation tool, and/or co-author our paper.

Demo Video

View this video for a demo of the data collection web interface. We will provide the login credentials once you fill out our interest form!

Link to Video

Example Output Data
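
For illustration, a single record might look like the Python sketch below. This is a hypothetical shape, not actual project data; all field names and values are assumptions:

    # Hypothetical CogBench record (illustrative only; the schema is an assumption).
    example_record = {
        "id": "wm-001",
        "cognitive_process": "working memory",  # category assigned to the item
        "input": "Read the list: apple, chair, seven, blue. Which item appeared third?",
        "llm_output": "The third item was 'seven'.",
        "model": "gpt-4",  # model that produced the output (hypothetical)
        "expert_annotation": "Correct recall; consistent with a short-span memory task.",
    }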

How will data be used?

Curated Dataset Benchmark (V0)

  • ~100-200 input/output pairs categorized by cognitive process
  • LLM outputs interpreted by expert contributors with advanced training in a relevant field (professors, postdocs, PhD candidates)
  • Strong theoretical grounding; reliable annotations

Open Dataset Benchmark (V1)

  • ~500+ input/output pairs
  • Data can be contributed by anyone*
  • Outputs can be useful to researchers in human-focused fields
  • Provides the basis for cognitive LLM evaluation

LLM Evaluation

  • Compare LLM and human performance (see the sketch after this list)
  • Do model type and size affect different aspects of cognition?
  • Do different architectures have different cognitive capabilities?
  • Interpretability/explainability analysis
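
As a sketch of the kind of comparison this enables, the snippet below aggregates LLM and human accuracy per cognitive process. The record fields (cognitive_process, llm_correct, human_correct) are assumptions for illustration, not the project's actual schema or evaluation code:

    # Hedged sketch: compare LLM vs. human accuracy by cognitive process.
    # Field names below are assumptions, not CogBench's actual schema.
    from collections import defaultdict

    def accuracy_by_process(records):
        totals = defaultdict(lambda: {"llm": 0, "human": 0, "n": 0})
        for r in records:
            bucket = totals[r["cognitive_process"]]
            bucket["llm"] += int(r["llm_correct"])
            bucket["human"] += int(r["human_correct"])
            bucket["n"] += 1
        return {
            proc: {"llm_acc": b["llm"] / b["n"], "human_acc": b["human"] / b["n"]}
            for proc, b in totals.items()
        }

    # Toy usage with made-up records.
    records = [
        {"cognitive_process": "working memory", "llm_correct": True, "human_correct": True},
        {"cognitive_process": "working memory", "llm_correct": False, "human_correct": True},
        {"cognitive_process": "inference", "llm_correct": True, "human_correct": False},
    ]
    print(accuracy_by_process(records))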

Primary Contacts

Reach out if you have any questions, comments, or concerns!

Karin de Langis - dento019@umn.edu

Dr. Dongyeop Kang - dongyeop@umn.edu