diff --git a/docs/source/assets/kernel/v_vec.png b/docs/source/assets/kernel/v_vec.png
index bac3c109..75d344ab 100644
Binary files a/docs/source/assets/kernel/v_vec.png and b/docs/source/assets/kernel/v_vec.png differ
diff --git a/docs/source/assets/kernel/value.png b/docs/source/assets/kernel/value.png
index f585c77b..56b0b9e0 100644
Binary files a/docs/source/assets/kernel/value.png and b/docs/source/assets/kernel/value.png differ
diff --git a/docs/source/dev/kernel/paged_attention.rst b/docs/source/dev/kernel/paged_attention.rst
index 6fcadeee..ba4f7a27 100644
--- a/docs/source/dev/kernel/paged_attention.rst
+++ b/docs/source/dev/kernel/paged_attention.rst
@@ -447,7 +447,7 @@ Value
  a whole block of value tokens. And each ``accs`` in each thread
  contains 8 elements that accumulated at 8 different head positions.
  For the thread 0, the ``accs`` variable will have 8 elements, which
- are 0th, 16th … 112th elements of a value head that are accumulated
+ are 0th, 32th … 224th elements of a value head that are accumulated
  from all assigned 8 tokens.
 
 LV
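
Review note on the changed line: both the removed and the added index lists describe 8 evenly spaced head positions starting at 0, and they differ only in stride (16 versus 32). The sketch below reproduces just that arithmetic as a sanity check. The helper name `acc_positions` and the treatment of the stride as a free parameter are assumptions made for illustration; this is not the kernel's actual indexing code.

```python
def acc_positions(stride, count=8, start=0):
    """Head positions held by one thread's ``accs``, assuming evenly strided rows.

    ``stride`` is a hypothetical parameter; the diff only states the first,
    second, and last positions of each list.
    """
    return [start + i * stride for i in range(count)]


old = acc_positions(16)  # positions on the removed line: 0th, 16th ... 112th
new = acc_positions(32)  # positions on the added line:   0th, 32th ... 224th

print(old[0], old[1], old[-1])  # 0 16 112
print(new[0], new[1], new[-1])  # 0 32 224
```

With 8 elements per thread, a stride of 32 ends at position 224, which matches the added line, while a stride of 16 ends at 112, which matches the removed line.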