Hoai-Chau Tran
Folder: papers
4 items under this folder.
- Introduction to probability for data science (Sep 15, 2024) [math, prob_stats]
- Towards Efficient Generative Large Language Model Serving: A Survey from Algorithms to Systems (Sep 15, 2024) [llm, compression, decoding_algorithms, GPU]
- Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation (Sep 15, 2024) [paper]
- RoFormer: Enhanced Transformer with Rotary Position Embedding (Sep 15, 2024) [paper]