From b832f8f0c90673462ecc990f45595761565b027d Mon Sep 17 00:00:00 2001 From: SAM <60264918+SAM-DEV007@users.noreply.github.com> Date: Fri, 31 May 2024 13:40:20 +0530 Subject: [PATCH] Update Transformers.md Added topics --- contrib/machine-learning/Transformers.md | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+) diff --git a/contrib/machine-learning/Transformers.md b/contrib/machine-learning/Transformers.md index e69de29..7bcc102 100644 --- a/contrib/machine-learning/Transformers.md +++ b/contrib/machine-learning/Transformers.md @@ -0,0 +1,21 @@ +# Transformers +## Introduction +A transformer is a deep learning architecture developed by Google and based on the multi-head attention mechanism. It is based on the softmax-based attention +mechanism. Before transformers, predecessors of attention mechanism were added to gated recurrent neural networks, such as LSTMs and gated recurrent units (GRUs), which +processed datasets sequentially. Dependency on previous token computations prevented them from being able to parallelize the attention mechanism. + +## Key Concepts + +## Architecture + +## Implementation +### Theory +Text is converted to numerical representations called tokens, and each token is converted into a vector via looking up from a word embedding table. +At each layer, each token is then contextualized within the scope of the context window with other tokens via a parallel multi-head attention mechanism +allowing the signal for key tokens to be amplified and less important tokens to be diminished. + +### HuggingFace + +### Tensorflow and Keras + +### PyTorch