Computational models of emotion


By Jenny Oh

Summary

Artificial neural networks (ANNs) have been used in emotion recognition and expression, an approach known as affective computing. Affective computing has historically been harnessed to understand and explore emotion recognition and expression in human-computer (brain-computer) interfaces [1], as well as to process different forms of data (video [2], audio [3], text [4], and physiological [5]) to infer or predict emotional states. Diverse ANN structures, such as deep, convolutional, and recurrent neural networks, have been harnessed to achieve emotion modeling. Aside from modeling human emotion, another area of interest in the literature has been modeling artificial emotion; both efforts seek to use or validate different theories of emotion to inform a computational approach. This article focuses on ANN-based emotion modeling.

Background

Emotions are physical and mental states brought on by neurophysiological changes, variously associated with thoughts, feelings, behavioral responses, and a degree of pleasure or displeasure. There is no scientific consensus on a definition. Major emotional models in psychology and physiology include the James-Lange, Cannon-Bard, and Schachter-Singer theories, which implicate the physiological response and the actual emotional affect in different ways [citation needed]. Neuroscientific approaches have largely centered on neural circuits that appear active in emotional processing, such as the Papez and Yakovlev circuits [citation needed]. In addition, systems of emotion in the brain such as the limbic system (basal ganglia, amygdala, and insular and cingulate cortices) provide insight into affective processing. Research on emotion has involved efforts from diverse fields, including psychology, neuroscience, and computer science [6]. Models and theories of emotion are a central concern in these fields. The complexity and multidisciplinary relevance of emotion make it an apt point of investigation in convergent areas of study like computational neuroscience. With the innovation and attention given to artificial intelligence (AI) in contemporary computer science and computational neuroscience, understanding emotion and emotion recognition has also been a major line of research that complements innovation in these fields.

Constructing Computational Models of Emotion

Computational neuroscience studies have both informed and taken cues from neuroscientific perspectives implicating specific areas of the brain, including the basal ganglia [7] and striatum, in constructing human emotion, as well as linking processes involved in sensorimotor, attentional, and emotional brain function [8]. The challenge for a computational model of emotion is to process and interpret emotional states in a manner similar to the way humans do, acknowledging the lack of temporal boundaries and the diverse expression and perception of human emotion [9].

Various computational models of emotion have been proposed, often informed by neurological, physiological, or psychological evidence. Combining contemporary computational methods with these insights, models often harness artificial neural networks (ANNs) and other machine learning techniques to analyze inputs like facial expressions, voice tone, text or language data, physiological data, and behavioral cues in order to generate appropriate emotional responses or to detect emotional states in users [10]. Following Picard's (1997) idea of affective computing, which defines and explores human-computer interaction that includes emotional states, both goals of emotion recognition and emotion construction have helped shape computational approaches to emotion. Diverse types of ANNs and other machine learning methods, including deep neural networks (DNNs), recurrent neural networks (RNNs), and convolutional neural networks (CNNs), have been employed to recognize and construct emotion.
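As a rough illustration of this general recipe, the following minimal Python (PyTorch) sketch maps a vector of pre-extracted input features, for example facial landmark distances or acoustic statistics, to scores over a small set of emotion categories. The feature dimension, label set, and layer sizes are illustrative assumptions rather than details taken from any of the cited models.

 import torch
 import torch.nn as nn

 # Hypothetical emotion categories, used only for this sketch.
 EMOTIONS = ["happy", "sad", "angry", "fearful", "surprised", "neutral"]

 # A small feedforward ANN: pre-extracted features in, emotion scores out.
 model = nn.Sequential(
     nn.Linear(128, 64),            # 128 input features (an assumption)
     nn.ReLU(),
     nn.Linear(64, len(EMOTIONS)),  # one score (logit) per emotion category
 )

 features = torch.randn(1, 128)     # stand-in for one user's extracted features
 logits = model(features)
 predicted = EMOTIONS[logits.argmax(dim=1).item()]
 print(predicted)                   # untrained, so the label is arbitrary here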

Deep Neural Networks (DNNs)

A subset of deep learning, deep neural networks (DNNs) are ANNs with multiple layers between the input and output layers (https://en.wikipedia.org/wiki/Deep_learning#Deep_neural_networks). Their multiple layers of interconnected nodes loosely parallel networks of neurons and are effective for learning hierarchical representations of data. This makes them useful for processing high-dimensional inputs, including images and audio, as well as large datasets. In emotion recognition, these features of DNNs are used for emotion/affect sensing based on facial expression processing, voice analysis, or text sentiment analysis (https://ieeexplore.ieee.org/abstract/document/8070966, https://www.sciencedirect.com/science/article/pii/S0167865518301302, https://ieeexplore.ieee.org/abstract/document/7532431/). DNNs enable pattern recognition in these areas and also inform combined problem-solving approaches across fields, for example in audiovisual recognition or paralinguistics (https://ieeexplore.ieee.org/abstract/document/8070966). Using a multi-layered approach to model combined features is a major advantage of DNN-based models of emotion; aside from combining input-related features, DNNs can be used to combine different components of emotional theory (appraisal, memory, learning) (https://doi.org/10.48550/arXiv.1808.08447). Their flexibility makes DNNs well suited to such complex, high-dimensional tasks, drawing from multiple input types or implementing multiple layers of processing to generate meaningful representations of emotional cues.

However, capturing such complex relationships through feedforward mechanisms can lead to common training issues in DNNs. Additional layers of abstraction may contribute to overfitting with smaller or less diverse datasets, while the complexity and size of larger datasets may require substantial resources and time for training. Efforts in harnessing this methodology have therefore often focused on combining multiple neural networks and employing multiple data processing techniques, resulting in adjustments to both the models and the data cleaning methods used in emotion recognition (https://dl.acm.org/doi/abs/10.1145/2522848.2531745, https://dl.acm.org/doi/abs/10.1145/2663204.2666274).

Recurrent Neural Networks (RNNs)

RNNs are a class of ANN designed to handle sequential data by processing inputs across multiple time steps, unlike feedforward neural networks (https://en.wikipedia.org/wiki/Recurrent_neural_network). Feedback loops allow the network to retain information from previous time steps, which makes RNNs well suited to modeling and processing text, speech, and time series. In modeling emotion, RNNs help capture temporal features of emotion, which is useful for tracking emotional processing in speech patterns, behavioral changes, or physiological cues over time (R. J. Williams and D. Zipser, "A learning algorithm for continually running fully recurrent neural networks", Neural Comput., vol. 1, no. 2, pp. 270-280, 1989). Capturing temporal information is valuable in emotion recognition because human behavior is often contextualized by time cues, making this approach a valuable way to harness long-range or time-dependent data. RNNs have often been combined with Long Short-Term Memory (LSTM), a type of RNN architecture useful for the classification, regression, encoding, or decoding of long sequences or time-series data (http://brainengineering.dartmouth.edu/psyc40wiki/index.php/Long_Short-Term_Memory).

LSTM-RNNs are a state-of-the-art modeling technique in emotion recognition and have been used for automatic audiovisual recognition through joint modeling of audio and visual features (M. Wöllmer, M. Kaiser, F. Eyben, B. Schuller, G. Rigoll. LSTM-Modeling of continuous emotions in an audiovisual affect recognition framework. Image and Vision Computing, Volume 31, Issue 2, Pages 153-163, February 2013). This approach allows the modeling of long-range time dependencies while taking different forms of data into account. Because of their temporal features, RNNs can model emotional shifts and predict future emotional states based on past inputs, allowing for a more dynamic model of emotion. Though the vanishing gradient problem (https://en.wikipedia.org/wiki/Vanishing_gradient_problem) initially limited RNNs in learning long-term dependencies, combining them with an LSTM architecture allows them to handle longer sequences of data as well as high-dimensional inputs. RNNs are often combined with DNN or CNN techniques to add a temporal dimension to emotion modeling.
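To make this temporal formulation concrete, the following minimal PyTorch sketch uses an LSTM to map a sequence of per-frame acoustic features to one emotion prediction for a whole utterance. The per-frame feature size, sequence length, label count, and layer sizes are illustrative assumptions, not the architecture of the cited LSTM-RNN work.

 import torch
 import torch.nn as nn

 class EmotionLSTM(nn.Module):
     def __init__(self, n_features=40, hidden_size=64, n_classes=6):
         super().__init__()
         # batch_first=True -> inputs shaped (batch, time_steps, n_features)
         self.lstm = nn.LSTM(n_features, hidden_size, batch_first=True)
         self.classifier = nn.Linear(hidden_size, n_classes)

     def forward(self, x):
         outputs, _ = self.lstm(x)        # hidden state at every time step
         last_step = outputs[:, -1, :]    # summary of the full sequence
         return self.classifier(last_step)

 model = EmotionLSTM()
 speech = torch.randn(4, 100, 40)         # 4 utterances, 100 frames of 40 features each
 logits = model(speech)                   # one set of emotion scores per utterance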
Convolutional Neural Networks (CNNs)

A regularized type of feed-forward ANN, CNNs learn features through filter optimization (https://en.wikipedia.org/wiki/Convolutional_neural_network). CNNs are able to process grid-like data such as images or videos; in emotion modeling, this makes them advantageous for recognition tasks that involve visual data, such as facial expression recognition. Though image and video recognition is the main area of CNN research, CNNs have also been used in emotion classification tasks involving other types of data, including EEG signals, audio/speech, and physiological signals (heart rate, pulse, temperature, etc.) (https://www.sciencedirect.com/science/article/pii/S1746809420300501, https://ojs.aaai.org/index.php/AAAI/article/view/19105, https://ieeexplore.ieee.org/abstract/document/7953131/, EEG emotion recognition using dynamical graph convolutional neural networks, https://ieeexplore.ieee.org/abstract/document/8543567/). Often built as deep CNNs, these models have achieved emotion detection through classification of the emotional features of arousal and valence. Their automatic feature learning makes CNNs an efficient method for processing spatial relationships within data, which is an obvious advantage for image-based tasks but also carries over to other domains once the architecture is adapted to the specific data. Though CNNs tend to be less capable of handling sequential or time-dependent data, such as physiological or behavioral data, integrating RNN components into a CNN architecture can mitigate this limitation. Additionally, their historical applications in medical image analysis and natural language processing (doi: 10.1145/1390156.1390177), as well as in defining brain-computer interfaces ("Deep Learning Techniques to Improve Intraoperative Awareness Detection from Electroencephalographic Signals"), set a precedent for using CNNs to model artificial emotion (https://www.frontiersin.org/journals/robotics-and-ai/articles/10.3389/frobt.2016.00021/full).
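As a minimal sketch of the image-based case, the following PyTorch model classifies emotion from small grayscale face crops. The 48x48 input size, layer sizes, and six-way label set are illustrative assumptions rather than details of the cited studies.

 import torch
 import torch.nn as nn

 class EmotionCNN(nn.Module):
     def __init__(self, n_classes=6):
         super().__init__()
         self.features = nn.Sequential(
             nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
             nn.MaxPool2d(2),                    # 48x48 -> 24x24
             nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
             nn.MaxPool2d(2),                    # 24x24 -> 12x12
         )
         self.classifier = nn.Linear(32 * 12 * 12, n_classes)

     def forward(self, x):
         x = self.features(x)                    # learned spatial filters
         return self.classifier(x.flatten(start_dim=1))

 model = EmotionCNN()
 faces = torch.randn(8, 1, 48, 48)               # batch of 8 grayscale face crops
 logits = model(faces)                           # emotion scores per image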
Hybrid Models and Multimodal Approaches

Most research into modeling emotion tends to integrate different model architectures to create multimodal neural networks, for example combining CNNs for facial recognition with RNNs for speech or physiological data, allowing audiovisual emotion recognition or time-dependent, arousal/valence-based physiological emotion recognition (https://ieeexplore.ieee.org/abstract/document/8320798); a minimal sketch of one such combination appears at the end of this article. These hybrid models can improve accuracy in emotion detection by integrating complementary sources of information. Though such architectures may require more synchronization or alignment of data sources, as well as larger or more diverse training datasets to decrease error, their integration may prove more reliable in creating more accurate and nuanced representations of human emotion.

Models of Artificial Emotion

Emotions have been used to inform the construction of AI systems, agents, and robots. The challenge of modeling self-awareness in AI (https://www.sciencedirect.com/science/article/pii/S0262407915307776) is one that the "understanding," "feeling," or "expressing" of emotion in AI may mitigate. Though some projects have endeavored to realize emotions in robots, the issue seems to lie in the feeling or functioning of emotions rather than the acting out of emotions (https://www.frontiersin.org/journals/robotics-and-ai/articles/10.3389/frobt.2016.00021/full). This issue calls for computational models of emotion, but most deep learning or ANN-based models focus on bottom-up models of emotion processing rather than a top-down emotional system.
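Returning to the hybrid architectures described under Hybrid Models and Multimodal Approaches, the sketch below combines a small CNN, which encodes each video frame, with an LSTM that integrates the frame encodings over time into one emotion prediction per clip. All shapes and layer sizes are illustrative assumptions, not a published architecture.

 import torch
 import torch.nn as nn

 class CNNLSTMEmotion(nn.Module):
     def __init__(self, n_classes=6, cnn_dim=32, hidden_size=64):
         super().__init__()
         self.cnn = nn.Sequential(
             nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
             nn.MaxPool2d(2),
             nn.Conv2d(16, cnn_dim, kernel_size=3, padding=1), nn.ReLU(),
             nn.AdaptiveAvgPool2d(1),             # one value per channel per frame
         )
         self.lstm = nn.LSTM(cnn_dim, hidden_size, batch_first=True)
         self.classifier = nn.Linear(hidden_size, n_classes)

     def forward(self, frames):
         # frames: (batch, time, 1, height, width) grayscale face crops
         b, t, c, h, w = frames.shape
         feats = self.cnn(frames.reshape(b * t, c, h, w)).reshape(b, t, -1)
         outputs, _ = self.lstm(feats)             # integrate frame encodings over time
         return self.classifier(outputs[:, -1, :])

 model = CNNLSTMEmotion()
 clip = torch.randn(2, 16, 1, 48, 48)              # 2 clips of 16 frames, 48x48 pixels
 logits = model(clip)                              # one emotion prediction per clip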