From 44303ff8e04e496e0f545d92a60e9758b971e314 Mon Sep 17 00:00:00 2001
From: SAM <60264918+SAM-DEV007@users.noreply.github.com>
Date: Fri, 31 May 2024 18:45:07 +0530
Subject: [PATCH] Update Transformers.md

Extended Introduction
---
 contrib/machine-learning/Transformers.md | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/contrib/machine-learning/Transformers.md b/contrib/machine-learning/Transformers.md
index d30bd63..97346bd 100644
--- a/contrib/machine-learning/Transformers.md
+++ b/contrib/machine-learning/Transformers.md
@@ -1,8 +1,9 @@
 # Transformers
 ## Introduction
 A transformer is a deep learning architecture developed by Google and based on the multi-head attention mechanism. It is based on the softmax-based attention
-mechanism. Before transformers, predecessors of attention mechanism were added to gated recurrent neural networks, such as LSTMs and gated recurrent units (GRUs), which
-processed datasets sequentially. Dependency on previous token computations prevented them from being able to parallelize the attention mechanism.
+mechanism. Before transformers, predecessors of the attention mechanism were added to gated recurrent neural networks, such as LSTMs and gated recurrent units (GRUs), which processed datasets sequentially. Dependency on previous-token computations prevented them from parallelizing the attention mechanism.
+
+Transformers are a revolutionary approach to natural language processing (NLP). Unlike older models, they excel at understanding long-range connections between words. This "attention" mechanism lets them grasp the context of a sentence, making them powerful for tasks like machine translation, text summarization, and question answering. Introduced in 2017, transformers are now the backbone of many large language models, including tools you might use every day. Their ability to handle complex relationships in language is fueling advancements in AI across various fields.
 
 ## Model Architecture
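
The patched introduction leans on the claim that softmax-based attention, unlike recurrent models, has no dependency on previous-token computations and so can be parallelized across the sequence. A minimal NumPy sketch of single-head scaled dot-product attention illustrates this; the function names, toy dimensions, and use of one unprojected head are illustrative assumptions, not part of the patched file (the real architecture adds learned Q/K/V projections and multiple heads).

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row max before exponentiating for numerical stability
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Q, K, V: (seq_len, d_k) arrays. All pairwise token interactions are
    # computed in one matrix multiplication -- no loop over previous tokens,
    # which is what lets transformers parallelize where RNNs/LSTMs cannot.
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)     # (seq_len, seq_len) similarity scores
    weights = softmax(scores, axis=-1)  # each row is a distribution over tokens
    return weights @ V                  # (seq_len, d_k) context-mixed outputs

# Toy self-attention over 4 tokens with 8-dimensional embeddings
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
out = scaled_dot_product_attention(x, x, x)
print(out.shape)  # (4, 8)
```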