activations
Aidan
Artem
Ashish
adamw
AdamW
Babenko
bert
cardinalities
CategoricalFeatureTokenizer
CLS
CLSToken
Colab
devlin
embeddings
ensembling
FeatureTokenizer
FTTransformer
GEGLU
glu
gorishniy
Hao
hyperparameter
Illia
Jakob
Khabsa
Khrulkov
linformer
logits
Llion
Lukasz
Madian
multihead
MLP
MultiheadAttention
multilayer
Niki
nn
Noam
NumericalFeatureTokenizer
optim
optimizers
Parmar
Perceptron
Polosukhin
Pre
rtdl
runtime
ReGLU
ResNet
Rubachev
shazeer
Sinong
Tokenizer
Toutanova
Uszkoreit
vaswani
Vaswani
wang