Hierarchical feature analysis and dimensionality reduction in the brain

From Psyc 40 Wiki
Revision as of 07:26, 22 October 2022 by User (talk | contribs) (The nature of perceptual systems)

By Jason A. Davis

Dimensionality reduction and hierarchical feature analysis are essential techniques for understanding how the brain processes vast amounts of sensory information. Dimensionality reduction helps the brain filter out irrelevant data and focus on critical features by identifying and preserving patterns that capture the essence of the input data in a reduced format. Hierarchical feature analysis complements dimensionality reduction by organizing these features in a layered, hierarchical structure. This allows the brain to build up representations from simple to complex, providing a highly efficient system for recognizing and responding to complex stimuli. Together, dimensionality reduction and hierarchical feature analysis facilitate efficient information processing in the brain, especially during perception.

The nature of perceptual systems

Within every perceptual system, there is a flow of information that arrives from the environment to be transduced and interpreted by our sensory systems. These perceptual systems are not limited to the traditional senses, such as the transduction of sound pressure changes into audition or of changes in physical light intensity into vision. Rather, the nervous system has a wide range of perceptual systems, such as proprioception (the perception of one's body in space), interoception (the conscious perception of inner bodily changes based on allostatic state), and chronoception (the conscious perception of time). These qualitative exogenous and endogenous details, such as color, motion, pitch, and affect, can be understood from a quantitative perspective as data: individual units containing raw inputs that do not carry any specific meaning (CITE). With this perspective in mind, any information type can be understood as a dataset of numerical values along specific domains. Vision is analogous to computer pixels on a screen with certain RGB values, and audition is analogous to a set of time-varying functions.

The sensory information received by the brain can be understood as a giant matrix of data values.

Cognitive information processing in perception relies on four neural properties: functional specialization, structural organization, hierarchical feature analysis, and distributed representations. First, different neural circuits (e.g. cells, brain networks) are specialized to solve different adaptive problems. This is implemented via neuronal tuning, where cells exhibit a selective preference for certain characteristics over others. Second, the components within a system (e.g. neurons in a circuit) are arranged in a coherent, systematic manner based on their input. The brain exhibits logically organized layouts that facilitate the separation of input into distinct categories. Third, the brain processes information at increasing levels of abstraction: basic, concrete components are processed in earlier brain regions, while complex, abstract components are processed in later brain regions. Finally, the brain encodes information not with any single unit, but through patterns of activity across many units; as a result, a circuit can learn and encode more categories than it has neurons. These fundamental properties of cognitive processing can be applied from a quantitative perspective using neural networks, creating models that perform sequential computations on incoming datasets. The function of these models is constrained by their architecture and learning rules, but at its core, we can understand the brain as a complex machine that performs successive steps of mathematical computation on complex sensory data, data that is understood under the framework of a high-dimensional information space.

A schematic summarizing the four key properties of cognitive processing via neural systems

High-dimensional information spaces

By understanding sensory information as datasets, we can model these datasets in ordinary geometric space. Limited by our perceptual senses, we typically picture spaces in three dimensions – x, y, and z – where each data point contains three values that can be graphed with respect to their dimensions, specifying a location in geometric space. However, individual data points can contain more than three features, and these features can help extract meaning from the stimulus, such as placing a visual object into a specific category. For example, imagine playing a game of Guess Who? applied to animal categories, where each player is given a set of features as inputs and must generate an animal category as output. If one were given a set of three features and asked to distinguish two categories such as dogs versus cats, the features "mammal", "warm-blooded", and number of legs would be useless, since they do not differ between the two categories. However, other features such as "calling sound", "whiskers", and "ability to climb trees" would certainly facilitate category identification. This information can now be modeled as data points in a six-dimensional geometric space, with values ascribed to each of these features. Notably, these data points need not contain continuous values with an infinite range: features such as whiskers and the ability to climb trees can be encoded as binary values. Regardless, each dimension represents a measurable quantity of some kind that can be modeled in space. The measurable quantities the brain uses are not always locations in physical space, but values pertaining to the occurrence of certain features, which can be modeled similarly in a geometric space with many dimensions – otherwise known as a high-dimensional information space.
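The animal example above can be sketched as vectors in a six-dimensional feature space. This is a minimal illustration; the feature names, their ordering, and the numeric values are hypothetical, chosen only to mirror the example:

```python
import numpy as np

# Each animal is a point (vector) in a 6-dimensional information space.
# Dimensions: [mammal, warm_blooded, num_legs, call_pitch, whiskers, climbs_trees]
# The first three are shared between dogs and cats; the last three differ.
# Binary features are encoded as 0/1; call_pitch is a made-up continuous value.
animals = {
    "dog": np.array([1.0, 1.0, 4.0, 0.3, 0.0, 0.0]),
    "cat": np.array([1.0, 1.0, 4.0, 0.7, 1.0, 1.0]),
}

# Distance along the shared dimensions carries no category information...
shared_dist = np.linalg.norm(animals["dog"][:3] - animals["cat"][:3])
# ...while distance along the distinguishing dimensions separates the categories.
distinct_dist = np.linalg.norm(animals["dog"][3:] - animals["cat"][3:])
```

The uninformative features contribute zero distance between the two categories, while the informative ones place dogs and cats at separated locations in the space.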

In a high-dimensional information space, each data point represents a stimulus or observation with many associated features. Each individual attribute of the object becomes a dimension in this space, and since each data point has many features, it can be thought of as a vector of values corresponding to its locations along each dimension (Haxby 2020). For example, one face might be high in the dimension of mouth size and low in the dimension of face fullness, while another might be low in mouth size and high in face fullness. These two stimuli occupy different locations in the high-dimensional space due to differences in the information they represent (Haxby 2020).

The individual feature values of a stimulus along a given dimension can be encoded by a population of neurons based on the firing patterns of its individual members. The combined activity of single neurons thus produces outputs at the population level that correspond to feature values, and these population-level patterns of activity become the representations that distinguish one stimulus from another.
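A toy population code can make this concrete. In the sketch below (an illustration, not a model of any measured circuit), eight neurons with Gaussian tuning curves jointly encode a single stimulus feature – say, visual orientation – and the feature value can be read back out from the population's firing pattern with a population-vector average. The neuron count, tuning width, and preferred values are all invented:

```python
import numpy as np

# Preferred orientations of 8 hypothetical neurons, evenly tiling 0-180 degrees
preferred = np.linspace(0.0, 180.0, 8, endpoint=False)

def responses(stimulus, width=30.0):
    """Firing rates of the whole population to one stimulus value."""
    d = np.abs(stimulus - preferred)
    d = np.minimum(d, 180.0 - d)  # circular distance (orientation has period 180)
    return np.exp(-(d ** 2) / (2 * width ** 2))  # Gaussian tuning curves

def decode(rates):
    """Read the feature value back out of the population activity pattern."""
    angles = np.deg2rad(preferred * 2)  # map the 180-degree period onto a circle
    x = np.sum(rates * np.cos(angles))
    y = np.sum(rates * np.sin(angles))
    return np.rad2deg(np.arctan2(y, x)) / 2 % 180

# No single neuron prefers 60 degrees, yet the population jointly encodes it
est = decode(responses(60.0))
```

No individual neuron in this sketch fires maximally at 60 degrees, yet the pattern of activity across the population carries the feature value almost exactly.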


Neural network implementations in cognitive functions

In perception, the brain uses previous experience to assign meaning to incoming sensations, and this interpretation or assessment of incoming signals is known as conceptualization. Conceptualization is common across perceptual systems, as variant physical codes (e.g. the spatial patterns of vision) are transformed into invariant objects. This occurs at increasing levels of abstraction in the brain, from specific details in earlier areas to more general, abstract features in later areas. Conceptualization is often implemented through categorization, the process of grouping instances that are similar for some purpose or function. The brain functions as a powerful category machine, and humans rely primarily on categorization to make sense of the world around us. Yet humans are also proficient at orthogonalization, or separating distinct instances within groups based on different underlying characteristics. This balance of categorization and orthogonalization is essential in perception, and research has demonstrated that humans prefer to categorize objects at the basic level in language, since it provides a balance of generality and specificity.

In vision, an unlimited number of individual faces can each be given a unique representation that remains stable across transformations such as a different haircut or a different viewing angle. At the same time, we can perceive the distinct nuances of a face, such as its expression, independent of its robust identity. Thus, a system for face perception must be able to represent the shared features within the data that link it to a broader category while also representing the distinct features that distinguish instances within that category. Similarly, in olfaction, variant spatial codes propagated from the mitral cells through the granule cells are converted into invariant feature codes in the paleocortex, the simpler precursor to the neocortex that is implemented in various other forms of perception. What sort of system can represent distinct physical inputs simultaneously as a member of a broader category and as a distinct instance of a subcategory? In other words, what sort of system can provide the balance of generality and specificity seen in conscious perception?

Competitive neural networks can implement this function, as these networks can learn to categorize input patterns by assigning a different output neuron to each cluster of input patterns. Each output neuron can represent a particular category, and oftentimes many stimuli are assigned to one category, reflecting similarities between stimuli. Competitive neural networks update their weight matrix through unsupervised associative learning, and over the course of learning, specific input patterns become mapped to specific output categories. Patterns within a category (for example, apples and oranges, which are both fruits) are more similar to one another than to patterns across categories (apples and carrots, a fruit and a vegetable), and competitive neural networks allow us to generalize to new circumstances without relearning the inherent structure.


To discover features of the world without feedback, competitive learning relies on a winner-take-all format in which only one output cell is active for a given input pattern. Through this local inhibition, only certain neurons become associated with certain inputs, and clusters can be formed through learning. As incoming feedforward signals are processed under this winner-take-all mechanism, the overlap between representations is reduced and the data is compressed: it is encoded more efficiently, taking up less space without losing essential information. For instance, a face may be represented by roughly one million axons in the optic nerve, but by the end of the ventral pathway, the representation has been compressed so that information about which face is present can be carried by fewer than 100 neurons.
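A minimal competitive-learning sketch with a winner-take-all rule is below. Everything in it is illustrative (the cluster centers, learning rate, network size, and the trick of initializing units at noisy data points to avoid "dead" units are choices of this sketch, not details from the source):

```python
import numpy as np

rng = np.random.default_rng(0)
lr = 0.1  # learning rate

# Two categories of 4-dimensional input patterns
center_a = np.array([1.0, 1.0, 0.0, 0.0])
center_b = np.array([0.0, 0.0, 1.0, 1.0])

# Initialize each output unit near a noisy data point (avoids dead units)
W = np.stack([center_a, center_b]) + 0.1 * rng.normal(size=(2, 4))
W /= np.linalg.norm(W, axis=1, keepdims=True)

for _ in range(200):
    # Draw a noisy exemplar from one of the two categories (unsupervised:
    # the network never sees a category label)
    x = (center_a if rng.random() < 0.5 else center_b) + 0.05 * rng.normal(size=4)
    x /= np.linalg.norm(x)
    winner = np.argmax(W @ x)          # winner-take-all: one active output unit
    W[winner] += lr * (x - W[winner])  # only the winner's weights move toward x
    W[winner] /= np.linalg.norm(W[winner])

# The 4-D input is compressed to the index of a single winning output unit
cat_a = np.argmax(W @ (center_a / np.linalg.norm(center_a)))
cat_b = np.argmax(W @ (center_b / np.linalg.norm(center_b)))
```

After learning, each cluster of inputs reliably activates its own output unit, so a high-dimensional pattern is summarized by which single unit wins – the compression described above.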

Once these signals propagate as feedback, the input neurons that were co-activated with the output neurons become active again, except that only the most strongly co-activated neurons win the competition while the others are suppressed through the winner-take-all mechanism. This allows the system to find the features that most strongly predict the category and reinstantiate those features into cognitive processing. Features shared between categories do not efficiently predict a category, since they are linked to many different categories and thus do not provide enough information to segregate them. Rather, it is the distinct features between categories that propagate through the feedback signals, linking the category to a distinct instance with distinct features. Thus, incoming feedforward processing can be understood as categorization, while downward feedback processing can be understood as orthogonalization, allowing the mind to maintain the balance of generality and specificity.
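The claim that shared features are poor category predictors can be made concrete with a toy calculation (the feature labels below are hypothetical, echoing the earlier animal example):

```python
import numpy as np

# Rows are exemplars, columns are binary features:
# [has_cells, warm_blooded, whiskers, climbs_trees] (made-up labels).
# The first two features are shared by every exemplar of both categories.
X = np.array([
    [1, 1, 0, 0],  # dog exemplar
    [1, 1, 0, 0],  # dog exemplar
    [1, 1, 1, 1],  # cat exemplar
    [1, 1, 1, 1],  # cat exemplar
])
labels = np.array([0, 0, 1, 1])  # 0 = dog, 1 = cat

# A feature predicts the category to the extent its average value differs
# between categories; features shared by everything carry zero signal.
diff = np.abs(X[labels == 0].mean(axis=0) - X[labels == 1].mean(axis=0))
```

The shared features come out with zero between-category difference, while only the distinctive features separate the two categories, which is why feedback that reinstates distinctive features supports orthogonalization.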

This entire process is known as hierarchical clustering, and competitive neural networks serve as a mathematical implementation of the hierarchical feature analysis seen in perception. Competitive neural networks with multiple layers can conduct this hierarchical feature analysis much as the visual system combines features such as color, orientation, and motion to represent higher-level objects. Through this process, the dimensionality of the input is sequentially reduced, retaining the essential information while freeing storage space for new representations. Such networks can then store millions of representations of unique categories in terms of their shared features and distinct characteristics. Thus, the structure and rules of a competitive neural network serve as a strong parallel to perceptual systems in the brain.
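The compression idea can also be illustrated numerically. The sketch below uses PCA via the singular value decomposition as a stand-in for the brain's dimensionality reduction – a deliberately simple linear method, not the mechanism described above – to show that high-dimensional data varying along only a few underlying factors can be drastically compressed with almost no information loss. All the sizes and noise levels are synthetic choices:

```python
import numpy as np

rng = np.random.default_rng(1)

# 200 stimuli that truly vary along only 3 underlying factors,
# embedded (with a little noise) in a 100-dimensional measurement space
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 100))
data = latent @ mixing + 0.01 * rng.normal(size=(200, 100))

# PCA via SVD: project onto the top 3 principal components
centered = data - data.mean(axis=0)
U, S, Vt = np.linalg.svd(centered, full_matrices=False)
reduced = centered @ Vt[:3].T  # 100-D -> 3-D code for each stimulus

# Fraction of total variance retained by the 3-D code
explained = (S[:3] ** 2).sum() / (S ** 2).sum()
```

Here a 100-dimensional representation shrinks to 3 numbers per stimulus while retaining essentially all of the variance, mirroring the idea that sequential reduction can preserve the essential information in a far smaller code.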