Memorization Capacity of Multi-Head Attention in Transformers
By Sadegh Mahdavi, Renjie Liao, Christos Thrampoulidis
Published 2023-06-03 05:06:29