@Algomancer
Created January 10, 2026 17:51
import torch
import torch.nn as nn


class CrossPositionAttentionStem(nn.Module):
    """Attention whose weights are computed purely from positional embeddings."""

    def __init__(self, dim):
        super().__init__()
        self.scale = dim ** -0.5

    def forward(self, pe_q, pe_k, kv):
        # Scaled dot-product logits between query and key position embeddings
        attn = (pe_q @ pe_k.transpose(-2, -1)) * self.scale
        attn = attn.softmax(dim=-1)
        # Aggregate the content values with position-derived weights
        return attn @ kv
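
A minimal usage sketch follows. The tensor shapes, (batch, n_q, dim) for pe_q and (batch, n_k, dim) for pe_k and kv, are assumptions about the intended interface and are not stated in the gist.

# Usage sketch; all shapes below are assumed, not specified by the gist.
batch, n_q, n_k, dim = 2, 16, 32, 64
stem = CrossPositionAttentionStem(dim)
pe_q = torch.randn(batch, n_q, dim)  # positional embeddings at query positions
pe_k = torch.randn(batch, n_k, dim)  # positional embeddings at key positions
kv = torch.randn(batch, n_k, dim)    # content values to mix across positions
out = stem(pe_q, pe_k, kv)           # -> (batch, n_q, dim)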