python314Packages.flashinfer ‐ Nix Packages ‐ Searchix

python314Packages.flashinfer

FlashInfer is a library and kernel generator for Large Language Models that provides high-performance implementations of LLM GPU kernels such as FlashAttention, PagedAttention and LoRA. FlashInfer focuses on LLM serving and inference, and delivers state-of-the-art performance across diverse scenarios.

Name

flashinfer

Homepage

https://flashinfer.ai/

Version

0.3.1

License

Apache License 2.0

Maintainers

Break Yang
Daniel Fahey

Platforms

aarch64-linux
armv5tel-linux
armv6l-linux
armv7a-linux
armv7l-linux
i686-linux
loongarch64-linux
m68k-linux
microblaze-linux
microblazeel-linux
mips-linux
mips64-linux
mips64el-linux
mipsel-linux
powerpc-linux
powerpc64-linux
powerpc64le-linux
riscv32-linux
riscv64-linux
s390-linux
s390x-linux
x86_64-linux
x86_64-darwin
aarch64-darwin
aarch64-windows
x86_64-windows
i686-windows
i686-freebsd
x86_64-freebsd
aarch64-freebsd
