Memorization Capacity of Multi-Head Attention in Transformers

By Sadegh Mahdavi, Renjie Liao, Christos Thrampoulidis
Published 2023-06-03 05:06:29