FFmpeg devs boast of up to 94x performance boost after implementing handwritten AVX-512 assembly code

Gork@lemm.ee · 2 months ago

chellomere@lemmy.world · edit-2 2 months ago

This is great, but the context is that this is for specific inner loops, and the 94x figure is relative to the C version of that specific inner loop. What was typically used before this on a computer with AVX-512 was the AVX2 version of the inner loop, and the speedup compared to that version appears to be up to 60%: https://x.com/FFmpeg/status/1852542388851601913 . Since a specific inner loop isn’t running all the time, the overall speedup is probably much less than 60%. This is still sizeable, but the actual speedup in practice with this implementation is far from 94x.
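To put rough numbers on that, here is a minimal sketch using Amdahl's law. The 10% share of total runtime spent in the inner loop is purely a made-up assumption for illustration, not a measured FFmpeg figure, and `overall_speedup` is just a helper name I picked. The 94x and 1.6x (i.e. "up to 60%") factors are the ones mentioned above.

```python
def overall_speedup(fraction: float, local_speedup: float) -> float:
    """Amdahl's law: whole-program speedup when only `fraction` of the
    original runtime is accelerated by a factor of `local_speedup`."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    if local_speedup <= 0.0:
        raise ValueError("local_speedup must be positive")
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)


if __name__ == "__main__":
    # ASSUMPTION: the inner loop accounts for 10% of total runtime.
    # This share is hypothetical; the real value depends on the workload.
    loop_share = 0.10

    for label, factor in (("vs. C version", 94.0), ("vs. AVX2 version", 1.6)):
        total = overall_speedup(loop_share, factor)
        print(f"{label}: loop {factor}x faster -> whole program {total:.3f}x faster")
```

Under that assumed share, even the 94x loop speedup caps out at roughly a 1.11x gain for the whole program, since the other 90% of the runtime is untouched. The point is only that the end-to-end number depends far more on how much time the loop takes than on the headline factor.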

archive.ph