graykode / nlp-tutorial
Natural Language Processing Tutorial for Deep Learning Researchers
<p align="center"><img width="100" src="https://upload.wikimedia.org/wikipedia/commons/thumb/1/11/TensorFlowLogo.svg/225px-TensorFlowLogo.svg.png" /> <img width="100" src="https://media-thumbs.golden.com/OLqzmrmwAzY1P7Sl29k2T9WjJdM=/200x200/smart/golden-storage-production.s3.amazonaws.com/topic_images/e08914afa10a4179893eeb07cb5e4713.png" /></p>

nlp-tutorial is a tutorial for those studying NLP (Natural Language Processing) using PyTorch. Most of the models are implemented in fewer than 100 lines of code (excluding comments and blank lines).
- [08-14-2020] The old TensorFlow v1 code is archived in the archive folder. For beginner readability, only PyTorch version 1.0 or higher is supported.
Curriculum - (Example Purpose)
1. Basic Embedding Model
- 1-1. NNLM (Neural Network Language Model) - Predict Next Word
- Paper - A Neural Probabilistic Language Model (2003)
- Colab - NNLM.ipynb
- 1-2. Word2Vec (Skip-gram) - Embedding Words and Show Graph
- 1-3. FastText (Application Level) - Sentence Classification
- Paper - Bag of Tricks for Efficient Text Classification (2016)
- Colab - FastText.ipynb
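The NNLM in 1-1 predicts the next word by concatenating the embeddings of the previous `n` words and passing them through a tanh hidden layer. A minimal sketch, assuming illustrative hyperparameter names and sizes (`emb_dim`, `n_step`, `n_hidden` are not the repo's exact values):

```python
import torch
import torch.nn as nn

class NNLM(nn.Module):
    """Sketch of a Neural Network Language Model (Bengio et al., 2003)."""
    def __init__(self, vocab_size, emb_dim=16, n_step=2, n_hidden=32):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.hidden = nn.Linear(n_step * emb_dim, n_hidden)
        self.out = nn.Linear(n_hidden, vocab_size)

    def forward(self, x):             # x: [batch, n_step] word indices
        e = self.emb(x)               # [batch, n_step, emb_dim]
        e = e.view(e.size(0), -1)     # concatenate context embeddings
        h = torch.tanh(self.hidden(e))
        return self.out(h)            # logits over the vocabulary

model = NNLM(vocab_size=10)
logits = model(torch.tensor([[1, 2], [3, 4]]))  # shape [2, 10]
```

Training then minimizes cross-entropy between these logits and the index of the actual next word.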
2. CNN (Convolutional Neural Network)
- 2-1. TextCNN - Binary Sentiment Classification
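TextCNN slides convolutional filters of several widths over word embeddings, max-pools each feature map, and classifies from the pooled features. A sketch in the spirit of 2-1, with illustrative filter sizes and dimensions (not the repo's exact values):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TextCNN(nn.Module):
    """Sketch of TextCNN (Kim, 2014) for binary sentiment classification."""
    def __init__(self, vocab_size, emb_dim=16, n_filters=8, sizes=(2, 3)):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        # one Conv2d per n-gram width; each filter spans the full embedding dim
        self.convs = nn.ModuleList(
            [nn.Conv2d(1, n_filters, (k, emb_dim)) for k in sizes])
        self.fc = nn.Linear(n_filters * len(sizes), 2)  # 2 classes

    def forward(self, x):                     # x: [batch, seq_len]
        e = self.emb(x).unsqueeze(1)          # [batch, 1, seq_len, emb_dim]
        pooled = []
        for conv in self.convs:
            c = F.relu(conv(e)).squeeze(3)    # [batch, n_filters, seq_len-k+1]
            pooled.append(F.max_pool1d(c, c.size(2)).squeeze(2))
        return self.fc(torch.cat(pooled, 1))  # logits: [batch, 2]

model = TextCNN(vocab_size=20)
out = model(torch.randint(0, 20, (4, 6)))  # 4 sentences of 6 tokens
```

Max-pooling over time makes the classifier insensitive to where in the sentence the strongest n-gram feature occurs.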
3. RNN (Recurrent Neural Network)
- 3-1. TextRNN - Predict Next Step
- Paper - Finding Structure in Time (1990)
- Colab - TextRNN.ipynb
- 3-2. TextLSTM - Autocomplete
- Paper - Long Short-Term Memory (1997)
- Colab - TextLSTM.ipynb
- 3-3. Bi-LSTM - Predict Next Word in Long Sentence
- Colab - Bi_LSTM.ipynb
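The recurrent models in 3-2 and 3-3 read a token sequence and predict the next token from the final hidden state. A minimal sketch covering both (set `bidirectional=True` for the Bi-LSTM variant); all dimensions here are illustrative:

```python
import torch
import torch.nn as nn

class TextLSTM(nn.Module):
    """Sketch of an LSTM next-token predictor (3-2 / 3-3)."""
    def __init__(self, vocab_size, emb_dim=16, n_hidden=32, bidirectional=False):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, n_hidden, batch_first=True,
                            bidirectional=bidirectional)
        d = 2 if bidirectional else 1          # Bi-LSTM doubles the feature size
        self.out = nn.Linear(d * n_hidden, vocab_size)

    def forward(self, x):                      # x: [batch, seq_len]
        h, _ = self.lstm(self.emb(x))          # h: [batch, seq_len, d*n_hidden]
        return self.out(h[:, -1])              # predict from the last time step

logits = TextLSTM(vocab_size=12, bidirectional=True)(torch.tensor([[1, 2, 3]]))
```

The bidirectional pass lets the last-step features summarize context from both directions, which helps on the longer sentences targeted in 3-3.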
4. Attention Mechanism
- 4-1. Seq2Seq - Change Word
- 4-2. Seq2Seq with Attention - Translate
- 4-3. Bi-LSTM with Attention - Binary Sentiment Classification
- Colab - Bi_LSTM(Attention).ipynb
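The attention variants in 4-2 and 4-3 weight encoder states by their similarity to the current decoder state and blend them into a context vector. A minimal dot-product attention sketch (function name and shapes are illustrative, not the repo's code):

```python
import torch
import torch.nn.functional as F

def dot_attention(decoder_state, encoder_outputs):
    """Sketch of dot-product attention.
    decoder_state: [batch, hidden]; encoder_outputs: [batch, src_len, hidden]."""
    # score each source position against the decoder state
    scores = torch.bmm(encoder_outputs, decoder_state.unsqueeze(2)).squeeze(2)
    weights = F.softmax(scores, dim=1)               # [batch, src_len], sums to 1
    # weighted sum of encoder states -> context vector
    context = torch.bmm(weights.unsqueeze(1), encoder_outputs).squeeze(1)
    return context, weights                          # context: [batch, hidden]

ctx, w = dot_attention(torch.randn(2, 8), torch.randn(2, 5, 8))
```

The decoder then conditions its next prediction on `context` concatenated with its own state, and `weights` can be plotted as an alignment map between source and target words.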
5. Model based on Transformer
- 5-1. The Transformer - Translate
- 5-2. BERT - Next Sentence Classification & Masked Token Prediction
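Both the Transformer (5-1) and BERT (5-2) are built around scaled dot-product attention: queries score against keys, the scores are scaled by sqrt(d_k) and softmaxed, and the result weights the values. A minimal sketch, assuming a multi-head layout of [batch, heads, seq_len, d_k]:

```python
import math
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v, mask=None):
    """Sketch of scaled dot-product attention (Vaswani et al., 2017)."""
    # similarity of every query to every key, scaled for stable gradients
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    if mask is not None:                       # e.g. padding or causal mask
        scores = scores.masked_fill(mask == 0, float('-inf'))
    attn = F.softmax(scores, dim=-1)           # attention weights per query
    return attn @ v, attn                      # output: same shape as q

q = k = v = torch.randn(1, 2, 4, 8)            # batch 1, 2 heads, 4 tokens, d_k 8
out, attn = scaled_dot_product_attention(q, k, v)
```

BERT stacks this (without a causal mask) and trains it on masked token prediction and next-sentence classification, as in 5-2.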
Dependencies
- Python 3.5+
- PyTorch 1.0.0+
Author
- Tae Hwan Jung (Jeff Jung) @graykode
- Author Email: nlkey2022@gmail.com
- Acknowledgements to mojitok for the NLP Research Internship.