Attention and Transformers

(By Aleksa Sotirov)

Transformers are a type of deep learning model characterised above all by their use of attention mechanisms. Attention in machine learning is a technique for differentially weighting the significance of different parts of the input data, in essence mimicking human attention. Transformers in particular use self-attention to process sequential data, which makes natural language processing (NLP) and computer vision (CV) their main fields of application. They are distinct from earlier sequence models, such as recurrent neural networks (RNNs) and long short-term memory (LSTM) networks, in their greater parallelisation and efficiency, which is largely owed to the attention mechanism.
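To make the idea of differential weighting concrete, below is a minimal sketch of scaled dot-product self-attention, the variant used in transformers, written in NumPy. The function and variable names (self_attention, W_q, W_k, W_v) and the toy dimensions are illustrative assumptions, not any particular library's API.

    import numpy as np

    def softmax(x, axis=-1):
        # Numerically stable softmax.
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    def self_attention(X, W_q, W_k, W_v):
        # X: (seq_len, d_model) input embeddings.
        # W_q, W_k, W_v: learned projection matrices (d_model, d_k).
        Q = X @ W_q  # queries
        K = X @ W_k  # keys
        V = X @ W_v  # values
        d_k = Q.shape[-1]
        # Each row of `weights` holds one position's attention weights
        # over every position in the sequence -- the differential
        # weighting of significance described above.
        weights = softmax(Q @ K.T / np.sqrt(d_k))
        return weights @ V

    # Toy example: a sequence of 4 tokens with 8-dimensional embeddings.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(4, 8))
    W_q, W_k, W_v = (rng.normal(size=(8, 8)) for _ in range(3))
    out = self_attention(X, W_q, W_k, W_v)
    print(out.shape)  # (4, 8)

Each output position is a weighted average of the value vectors, with weights computed from query-key similarity; because every position attends to every other position in a single matrix operation, the whole sequence can be processed in parallel rather than step by step as in an RNN.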

A brief history

Attention

Transformers

Applications

References