
Introduction to Evaluating AI Tools

As AI increasingly impacts higher education, we must develop deliberate, context-aware strategies for evaluating AI tools and their uses. This chapter argues that engaging with AI responsibly and effectively requires a balance of knowledge, skills, and ethical responsibility on the part of educators, who need informed guidance for evaluating both AI tools and the ways those tools are used. To picture why we must understand not just the affordances of AI tools but also how users employ them, imagine a see-saw with the user on one side and the AI tool on the other: if the user lacks the knowledge of the subject to judge AI content, the skill to do the task with or without the tool, or a sound ethical commitment against undue reliance on automation, then the balance tips, leaving the user suspended while the machine slumps to the ground.

The processing power of AI is enormous, but that power does not justify offloading human responsibility, concern for each other and society, or the need to achieve the goals of teaching/learning and knowledge-making just to “get the job done.” Whether one is conducting research, designing assignments, evaluating learning outcomes, or integrating tools into the curriculum, using AI effectively means asking paired questions about human and machine capabilities: Do I understand the topic well enough to evaluate the AI’s contribution, and does the tool have adequate information to perform the task well? Am I able to use the tool proficiently, and is the tool capable of doing what I’m trying to do with it? Am I ethically grounded in how and why I’m using AI, and is the tool itself designed and behaving in ways that help me be transparent, accountable, and just? When the focus shifts from benefiting oneself to potentially adversely impacting students, over whom institutions and instructors have power, the issues of professional ethics become significantly more important. In addressing these issues, we move away from either fear or blind enthusiasm and toward a more critical and practical approach.

Our goal in this chapter is to help educators take a systematic approach to evaluating AI tools and their use within the unique contexts of their disciplines, their teaching/learning and other purposes, and the boundaries of knowledge/skills/ethics that their contexts and purposes demand. The chapter is organized into five major sections and a conclusion.

  • Section I outlines the evaluation framework, offering strategies and reflective prompts for selecting and assessing tools in teaching and learning contexts.
  • Section II presents a series of case studies that apply the evaluation framework to real-world evaluations, including tools used in teaching, research, and student-centered activities.
  • Section III turns attention to accessibility, representation, and alignment with DEI principles, issues that remain under-addressed in many AI platforms.
  • Section IV focuses on student input, emphasizing the importance of gathering feedback, capturing lived experiences, and preparing students to use AI with agency and critical awareness.
  • Section V discusses challenges with AI detection tools.
  • The conclusion offers reflections on iterative evaluation practices as both a pedagogical and institutional imperative.

Across these sections, we encourage readers to adopt a critical lens as they test our evaluative framework and develop their own reflective practices, not only to determine which AI tools to use or avoid, and when and why, but also to build a culture of inquiry and responsibility around AI’s role in higher education.

License

AI in Action: A SUNY FACT2 Guide to Optimizing AI in Higher Education Copyright © 2025 by SUNY FACT2 Task Group on AI in Action; Kati Ahern; Nicola Marae Allain; Abigail Bechtel; Angie Chung; Billie Franchini; Meghanne Freivald; Ken Fujiuchi; Dana Gavin; Jack Harris; Keith Landa; Alla Myzelev; Victoria Pilato; Ahmad Pratama; Russell V. Rittenhouse; Carrie Solomon; Angela C. Thering; and Shyam Sharma is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License, except where otherwise noted.