* Add custom ops for compatibility with PT Compile
* Add support for varlen functions too
* Add version checks for pytorch API
* Fix PT compile interfaces so it works e2e
* Make sure PT < 2.4 runs fine
* Fix python mistake
* Fix all the autograd magic issues
* typo on head_dim
* Fix deterministic test failures, remove unneeded detaches()
* remove test requires_grad
* Resolve all the pytorch versioning issues
* C++ and python refactor to improve padding management for torch.compile()
* Add improvements suggested by @anijain2305


| Name |
|---|
| src |
| flash_api.cpp |