Heuristics for Conversational Agent Evaluation and Design
How well do existing usability heuristics apply to the design of conversational agents?
Can we develop a set of heuristics that are more applicable and useful for conversational agent interface design?
Conversational interfaces have risen in popularity as businesses and users adopt a range of conversational agents, including chatbots and voice assistants. Although guidelines have been proposed, there is not yet an established set of usability heuristics to guide and evaluate conversational agent design. In this work, we propose a set of heuristics for conversational agents adapted from Nielsen’s heuristics and refined through expert feedback. We then validate the heuristics through two rounds of evaluations conducted by participants on two conversational agents: one chatbot and one voice-based personal assistant. We find that, when using our heuristics to evaluate both interfaces, evaluators identified more usability issues than when using Nielsen’s heuristics. In particular, our heuristics surfaced issues related to dialogue content, interaction design, help and guidance, human-like characteristics, and data privacy.
Implementation of Conversational Agents in Clinical Settings
Patient- and provider-facing screening tools for social determinants of health have been explored in a variety of contexts; however, effective screening and resource referral remain challenging, and less is known about how patients perceive chatbots as potential social needs screening tools. We investigated patient perceptions of a chatbot for social needs screening using three implementation outcome measures: acceptability, feasibility, and appropriateness. We implemented the chatbot in the emergency department (ED) of one large public hospital and used concurrent triangulation to assess perceptions of its use for screening. A total of 350 ED visitors completed the social needs screening and rated the chatbot on the implementation outcome measures, and 22 participants engaged in follow-up phone interviews.
Parkinson's Analysis with Remote Kinetic-tasks (PARK)
There are about 900,000 people with Parkinson’s disease (PD) in the United States. Although early treatment is beneficial, over 40% of individuals with PD over age 65 do not see a neurologist. In this work, we develop the PARK (Parkinson’s Analysis with Remote Kinetic-tasks) system. PARK instructs and guides users through six motor tasks and one audio task selected from the standardized MDS-UPDRS rating scale and records their performance via webcam. An initial experiment was conducted with 127 participants with PD and 127 age-matched controls, yielding a total of 1,778 video recordings. We explored objective differences between those with and without PD. To quantify these differences in the collected recordings, we designed a novel motion feature based on the Fast Fourier Transform (FFT) of optical flow within a region of interest. Additionally, we found that facial action unit AU4 (brow lowerer) was expressed significantly more often, and AU12 (lip corner puller) less often, across various tasks by participants with PD. Of the PD participants, 90.6% agreed that PARK was easy to use, and 93.7% said they would use the system in the future.
Development of an online framework that explores the feasibility of remote PD assessment using a standard webcam and microphone.
Design and validation of a novel video analysis algorithm that measures a motion metric distinguishing participants with and without PD.
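The spectral motion feature described above can be sketched roughly as follows. This is an illustrative reconstruction, not the paper's implementation: the function name, frame rate, and synthetic input are my assumptions, and the real pipeline would first extract per-frame optical-flow magnitudes from the webcam video (e.g., with a dense optical flow method) before applying the FFT.

```python
import numpy as np

def motion_frequency_feature(roi_motion, fps=30.0):
    """Hypothetical sketch: given per-frame mean optical-flow magnitude
    inside a region of interest, return the dominant oscillation
    frequency (Hz) and its spectral magnitude via the FFT."""
    signal = np.asarray(roi_motion, dtype=float)
    signal = signal - signal.mean()              # remove the DC component
    spectrum = np.abs(np.fft.rfft(signal))       # one-sided magnitude spectrum
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / fps)
    peak = int(np.argmax(spectrum[1:])) + 1      # skip the zero-frequency bin
    return freqs[peak], spectrum[peak]

# Synthetic stand-in for ROI motion: a 5 Hz tremor-like oscillation
# sampled at 30 fps for 4 seconds, with a little noise.
t = np.arange(0, 4, 1 / 30.0)
rng = np.random.default_rng(0)
sig = np.sin(2 * np.pi * 5 * t) + 0.1 * rng.normal(size=t.size)
dominant_hz, magnitude = motion_frequency_feature(sig, fps=30.0)
```

On this synthetic signal the dominant frequency recovered is 5 Hz, illustrating how a rhythmic motion pattern in a region of interest becomes a compact scalar feature.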
Expressive Biofeedback Chat
Expressive biofeedback is the use of physiological data to enable deeper understanding in social interactions. In this pilot study, I explored user perceptions of biofeedback data and how to share such data through messages with kinetic typography in a web chat. Using the Empatica E4 wristband sensor and the Muse brain-sensing headset, seven participants monitored their heart rate variability and EEG signals while answering a variety of personal questions.
I then developed an instant messenger that uses the Muse headset and E4 wristband to animate messages based on the sender's physiological state, making biofeedback data easier to interpret. In a follow-up study, I surveyed 22 participants to understand their perceptions of the text animations and the emotions they associated with them.
Developed a chat interface that animates text to convey the user's emotional state using EEG signals, heart rate, and electrodermal activity sensed in real time.
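As a toy illustration of how live readings might drive kinetic typography, the sketch below maps heart rate and electrodermal activity onto animation parameters. Every name, range, and mapping here is my own assumption for illustration; the study's actual signal-to-animation design is not specified in this summary.

```python
def clamp01(x):
    """Clamp a value to the [0, 1] range."""
    return min(max(x, 0.0), 1.0)

def typography_params(heart_rate_bpm, eda_microsiemens,
                      hr_rest=60.0, hr_max=120.0, eda_max=20.0):
    """Hypothetical mapping from physiological readings to kinetic
    typography parameters (not the study's actual design).
    Returns per-message animation settings."""
    arousal = clamp01((heart_rate_bpm - hr_rest) / (hr_max - hr_rest))
    intensity = clamp01(eda_microsiemens / eda_max)
    return {
        "shake_px": round(6 * arousal, 2),        # jitter amplitude grows with heart rate
        "speed": round(0.5 + 1.5 * intensity, 2), # animation speed tracks skin conductance
        "weight": 400 + int(300 * arousal),       # heavier font weight at high arousal
    }

# A moderately elevated state: 90 bpm, 10 microsiemens EDA.
params = typography_params(90.0, 10.0)
```

For the example reading, the message would shake by 3 px, animate at 1.25x base speed, and render at font weight 550; a resting reading would leave the text nearly still.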