Attention and Transformers

(By Aleksa Sotirov)

Transformers are a type of deep learning model characterised above all by their use of attention mechanisms. Attention in machine learning is a technique for differentially weighting the significance of different parts of the input data, in essence mimicking human attention. Transformers in particular use self-attention to process sequential data, which makes natural language processing (NLP) and computer vision (CV) their main fields of application. They are distinct from earlier sequence models, such as recurrent neural networks (RNNs) and long short-term memory (LSTM) networks, in their greater parallelisation and efficiency, which is largely owed to the attention mechanism.
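To make the idea of differential weighting concrete, below is a minimal sketch of scaled dot-product self-attention, the variant used in transformers, written in NumPy. The function and variable names (self_attention, W_q, W_k, W_v) and the toy dimensions are illustrative assumptions, not any particular library's API.

    import numpy as np

    def softmax(x, axis=-1):
        # Numerically stable softmax.
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    def self_attention(X, W_q, W_k, W_v):
        # X: (seq_len, d_model) input embeddings.
        # W_q, W_k, W_v: learned projection matrices (d_model, d_k).
        Q = X @ W_q  # queries
        K = X @ W_k  # keys
        V = X @ W_v  # values
        d_k = Q.shape[-1]
        # Each row of `weights` holds one position's attention weights
        # over every position in the sequence -- the differential
        # weighting of significance described above.
        weights = softmax(Q @ K.T / np.sqrt(d_k))
        return weights @ V

    # Toy example: a sequence of 4 tokens with 8-dimensional embeddings.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(4, 8))
    W_q, W_k, W_v = (rng.normal(size=(8, 8)) for _ in range(3))
    out = self_attention(X, W_q, W_k, W_v)
    print(out.shape)  # (4, 8)

Each output position is a weighted average of the value vectors, with weights computed from query-key similarity; because every position attends to every other position in a single matrix operation, the whole sequence can be processed in parallel rather than step by step as in an RNN.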

A brief history

Attention

Transformers

Applications

References