NLP history

The history of NLP can be broadly divided into two eras: before and after machine learning (ML).

What divides these two eras is the innovation of using ML models to solve NLP problems, specifically neural networks and deep learning in NLP.


The journey of NLP can be traced back more than 70 years, to the late 1940s.


In 1949, researchers first began exploring the idea of using machines to assist with translation.

The primary methods back then were based on either hand-coded rules or statistical reasoning.


Neural language models, introduced in 2001, marked the innovation of applying machine learning to NLP.

Neural language models use neural networks to predict the next word given the previous words.

This work also introduced word embeddings, a key technique that represents words as vectors.
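
As an illustration only (not the 2001 model itself), the sketch below shows both ideas in miniature: an embedding layer that maps word ids to vectors, and a small network that turns the vectors of the previous words into scores for the next word. The vocabulary, sizes, and names are made up, and PyTorch is just one possible framework.

```python
import torch
import torch.nn as nn

vocab = ["<pad>", "the", "cat", "sat", "on", "mat"]   # toy vocabulary
vocab_size, embed_dim, context = len(vocab), 16, 2

class TinyNeuralLM(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)      # word id -> vector (word embedding)
        self.hidden = nn.Linear(context * embed_dim, 32)
        self.out = nn.Linear(32, vocab_size)                  # one score per possible next word

    def forward(self, prev_word_ids):
        vectors = self.embed(prev_word_ids)                   # (batch, context, embed_dim)
        flat = vectors.flatten(start_dim=1)                   # concatenate the context vectors
        return self.out(torch.tanh(self.hidden(flat)))        # logits for the next word

model = TinyNeuralLM()
prev = torch.tensor([[vocab.index("the"), vocab.index("cat")]])
next_word_logits = model(prev)                                # highest logit = predicted next word
```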


Motivated by advances in computer hardware, multi-task learning appeared in 2008.

It allows a model to be trained on more than one learning task, such as entity recognition and topic classification, using a set of shared parameters.
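
A minimal sketch of that idea, with made-up sizes and PyTorch as an illustrative framework: one shared embedding layer and encoder feed two separate task heads, so both tasks update the same shared parameters.

```python
import torch
import torch.nn as nn

vocab_size, embed_dim, hidden_dim = 1000, 64, 64
num_entity_tags, num_topics = 9, 5

# Shared parameters: one embedding layer and one encoder serve both tasks.
embed = nn.Embedding(vocab_size, embed_dim)
encoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True)

# Task-specific heads on top of the shared representation.
entity_head = nn.Linear(hidden_dim, num_entity_tags)     # per-token entity labels
topic_head = nn.Linear(hidden_dim, num_topics)           # per-sentence topic label

tokens = torch.randint(0, vocab_size, (1, 12))            # one sentence of 12 token ids
hidden_states, _ = encoder(embed(tokens))                 # (1, 12, hidden_dim), shared by both tasks

entity_logits = entity_head(hidden_states)                # one prediction per token
topic_logits = topic_head(hidden_states.mean(dim=1))      # one prediction per sentence
```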


The years 2013 and 2014 marked the emergence of multiple new NLP models and architectures enabled by neural networks.

Some of these produced remarkable results and are still widely used today.

They are:

  • recurrent neural networks (RNNs), which were soon replaced by their variants, long short-term memory (LSTM) networks and gated recurrent units (GRUs);

  • convolutional neural networks (CNNs); and

  • sequence-to-sequence models.


In 2015, the attention mechanism was introduced so that the neural network processes only the most relevant parts of the input rather than the entire sentence uniformly.

It addresses the bottleneck of the fixed-length encoding vector that previous NLP models relied on.

Additionally, it improves model performance by focusing only on the information most relevant to the task at hand.
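
A minimal sketch of the idea, using scaled dot-product attention (one common formulation) with made-up sizes: each query scores every input position, the scores become weights via softmax, and the output is a weighted sum that emphasizes the most relevant positions instead of a single fixed-length summary vector.

```python
import torch
import torch.nn.functional as F

seq_len, dim = 6, 32                                      # 6 input positions, 32-dim vectors
queries = torch.randn(1, seq_len, dim)
keys = torch.randn(1, seq_len, dim)
values = torch.randn(1, seq_len, dim)

scores = queries @ keys.transpose(-2, -1) / dim ** 0.5    # how relevant each position is to each query
weights = F.softmax(scores, dim=-1)                       # attention weights, summing to 1 per query
attended = weights @ values                               # weighted sum: focus on the relevant parts
```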


In 2017, the attention mechanism gave rise to a new state-of-the-art class of models: large language models, also known as pre-trained language models.

Large pre-trained language models are trained on large amounts of data with a large number of parameters to learn general language patterns, and can later be fine-tuned for more specific tasks.
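
A minimal sketch of the pre-train/fine-tune pattern, assuming the Hugging Face transformers library and the bert-base-uncased checkpoint (illustrative choices, not ones named in this history): a pre-trained general model is loaded with a new task-specific head and updated with one gradient step on a task example.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2                     # new head for our hypothetical task
)

batch = tokenizer("A short example sentence.", return_tensors="pt")
labels = torch.tensor([1])                                # hypothetical task label

outputs = model(**batch, labels=labels)                   # pre-trained weights + new head
outputs.loss.backward()                                   # gradients for one fine-tuning step
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
optimizer.step()
```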


This trend triggered a series of new models and architectures, such as the Transformer architecture by Google Brain in 2017, BERT (Bidirectional Encoder Representations from Transformers) by Google in 2018, T5 in 2019, GPT-3 (the third-generation Generative Pre-trained Transformer) by OpenAI in 2020, and PaLM (Pathways Language Model) by Google.