Flash Attention in Docker on AMD is Not Yet Working

Below are my notes on my attempts so far to build Flash Attention inside a Docker container on an AMD GPU.

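# Start from AMD's nightly ROCm build of PyTorch
FROM rocm/pytorch-nightly:latest
COPY . .
# Fetch the ROCm fork of flash-attention together with its submodules
RUN git clone --recursive https://github.com/ROCm/flash-attention.git /tmp/flash-attention
WORKDIR /tmp/flash-attention
# Cap the number of parallel compile jobs to limit memory use during the build
ENV MAX_JOBS=8
# Build and install from source with verbose output
RUN pip install -v .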
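
Since the build is not yet working, a quick probe run inside the container can show whether the install step left an importable package behind. This is only a minimal sketch: it assumes the package installs under the module name `flash_attn` (as the upstream Dao-AILab project does), and the helper name `probe_module` is hypothetical, not part of any library.

```python
import importlib
import importlib.util


def probe_module(name):
    """Report whether `name` is installed and whether importing it succeeds."""
    result = {"module": name, "installed": False, "importable": False, "error": None}
    try:
        # find_spec returns None when no top-level module of that name exists
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError) as exc:
        result["error"] = repr(exc)
        return result
    if spec is None:
        return result
    result["installed"] = True
    try:
        # A package can be present on disk yet fail at import time,
        # for example when a compiled extension did not build correctly.
        importlib.import_module(name)
        result["importable"] = True
    except Exception as exc:
        result["error"] = repr(exc)
    return result


if __name__ == "__main__":
    # Module names are assumptions: "torch" from the base image,
    # "flash_attn" from the pip install step above.
    for name in ("torch", "flash_attn"):
        print(probe_module(name))
```

Separating "installed" from "importable" helps tell a build that never produced a package apart from one that produced a package that fails to load.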

Resources

What is Flash-attention? (How do i use it with Oobabooga?) :...
Adding flash attention to one click installer · Issue #4015 ...
Accelerating Large Language Models with Flash Attention on A...
GitHub - Dao-AILab/flash-attention: Fast and memory-efficien...
GitHub - ROCm/llvm-project: This is the AMD-maintained fork ...
GitHub - ROCm/AITemplate: AITemplate is a Python framework w...
Stable diffusion with RX7900XTX on ROCm5.7 · ROCm/composable...
Current state of training on AMD Radeon 7900 XTX (with benchmarks) : r/LocalLLaMA
llm-tracker - howto/AMD GPUs
RDNA3 support · Issue #27 · ROCm/flash-attention · GitHub
GitHub - ROCm/xformers: Hackable and optimized Transformers ...
[ROCm] support Radeon™ 7900 series (gfx1100) without using...
