Created May 14, 2024 03:30
Gist: vyeevani/2b1509c3e610b6319a4fb4e4137cc9e1
diffusion learning with autoregressive perceivers
I wanted to get down my view on why this is needed:
```python
desired_poses = input_project(
    einops.pack([self.desired_poses_start, desired_poses], "* r d")[0][:-1],
    self.desired_poses_positional_embedder,
)
```
The basic premise is that future actions can attend to the denoised prior actions while making their own predictions. This gives us autoregressive action prediction while we are still doing diffusion. Generalizing the perceiver has been challenging because I'm not sure how to abstract it properly without the details of diffusion + autoregressive generation leaking into the interface.
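The pack-then-drop-last line above is essentially a teacher-forcing shift: prepend a learned start token, then drop the final step so position t only conditions on actions before t. Here is a minimal numpy sketch of that shift in isolation (function and variable names are mine, not from the gist, and the real code operates on embedded poses, not raw arrays):

```python
import numpy as np

def shift_with_start_token(actions, start_token):
    """Prepend a start token and drop the final step, so the input at
    position t is action t-1 (causal / teacher-forcing alignment)."""
    # actions: (T, d), start_token: (d,)
    packed = np.concatenate([start_token[None, :], actions], axis=0)  # (T+1, d)
    return packed[:-1]  # (T, d): inputs used to predict actions[0..T-1]

actions = np.arange(6, dtype=float).reshape(3, 2)  # 3 timesteps, dim 2
start = np.zeros(2)
inputs = shift_with_start_token(actions, start)
```

With a causal attention mask on top of this shifted sequence, each future action sees only the (denoised) actions that precede it, which is what makes the autoregressive structure compatible with the diffusion objective.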
I should also mention that this predicts x_0 directly. For small datasets/models like this, epsilon prediction is too challenging: there's generally not enough data to learn to pull the noise out given the long noise schedule that's required. That said, the code should be flexible enough that you can easily swap the loss function to predict the noise directly if you want. And yes, I'm aware the model will suffer posterior collapse; I decided that's fine, since I'd rather make some progress than get bogged down in hyperparameter-tuning hell.
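The x_0-vs-epsilon swap is just a change of regression target: the forward process mixes the clean sample with noise, and the loss can regress either component. A minimal numpy sketch, with names of my own choosing (the gist's actual loss and schedule are not shown):

```python
import numpy as np

def diffuse(x0, eps, alpha_bar):
    """Forward-noise a clean sample x0 at cumulative signal level alpha_bar."""
    return np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps

def diffusion_loss(pred, x0, eps, target="x0"):
    """Swappable regression target: 'x0' regresses the clean sample
    (what the gist uses), 'eps' regresses the injected noise."""
    tgt = x0 if target == "x0" else eps
    return float(np.mean((pred - tgt) ** 2))

# toy usage: noise a sample, then score a (fake) prediction against each target
x0 = np.ones((4, 2))
eps = np.full((4, 2), 0.5)
xt = diffuse(x0, eps, alpha_bar=0.25)
```

Since both targets determine each other given x_t and the schedule, swapping them changes only what the network has to represent, not what the sampler can recover; with little data, regressing the smoother x_0 target is the easier fit.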