torch
local-attention
product-key-memory
mixture-of-experts>=0.2.0
axial-positional-embedding>=0.1.0
