mirror of https://github.com/FFmpeg/FFmpeg.git
This version can output multiple coefficients at a time and removes actual Golomb code parsing altogether. It's also able to partially recover the last coefficient if the packet is incomplete.

Total decoder performance gain for 8-bit 4:2:0 1080p lossless: 40%.
Total decoder performance gain for 10-bit 4:2:0 1080p lossless: 40%.

clang was able to vectorize the loop much better than my handwritten assembly, but gcc was very naive and didn't.

Lookup table is a rewritten version of vc2hqdecode.
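The idea of emitting several coefficients per table lookup can be illustrated with a small, self-contained sketch. It is not the table layout or code from this commit: every name in it (`LutEntry`, `build_lut`, `decode_lut`, the `ST_*` states) is hypothetical. It assumes the interleaved exp-Golomb layout used by VC-2, where a follow bit of 0 announces one more data bit, a follow bit of 1 ends the magnitude, and a sign bit follows only for nonzero values. It has no guard against accumulator overflow on very long codes, and a coefficient cut off at the end of the buffer is simply dropped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Parser states; all names here are illustrative, not FFmpeg's. */
enum { ST_START, ST_FOLLOW, ST_DATA, ST_SIGN, ST_COUNT };

typedef struct {
    uint8_t lead_nbits;  /* data bits this byte appends to the pending value */
    uint8_t lead_bits;   /* those bits, MSB first */
    uint8_t lead_done;   /* 1 if the pending coefficient ends in this byte */
    uint8_t lead_neg;    /* 1 if the pending coefficient is negative */
    uint8_t ncoef;       /* coefficients fully contained after the pending one */
    int8_t  coef[8];     /* their decoded values */
    uint8_t tail_acc;    /* accumulator of the coefficient left unfinished */
    uint8_t next_state;  /* state to use for the next byte */
} LutEntry;

static LutEntry lut[ST_COUNT][256];

/* Build the table by running the bitwise state machine over every byte. */
static void build_lut(void)
{
    for (int s = 0; s < ST_COUNT; s++) {
        for (int b = 0; b < 256; b++) {
            LutEntry *e = &lut[s][b];
            int state = s;
            unsigned acc = 1;            /* magnitude + 1, with implicit leading 1 */
            memset(e, 0, sizeof(*e));
            for (int i = 7; i >= 0; i--) {
                int bit = (b >> i) & 1, done = 0, neg = 0;
                switch (state) {
                case ST_START:           /* follow bit of a fresh coefficient */
                case ST_FOLLOW:          /* follow bit after at least one data bit */
                    if (!bit)                   state = ST_DATA;
                    else if (state == ST_START) done  = 1;   /* value 0, no sign */
                    else                        state = ST_SIGN;
                    break;
                case ST_DATA:
                    acc = (acc << 1) | (unsigned)bit;
                    if (!e->lead_done) { /* bit belongs to the pending value */
                        e->lead_bits = (uint8_t)((e->lead_bits << 1) | bit);
                        e->lead_nbits++;
                    }
                    state = ST_FOLLOW;
                    break;
                case ST_SIGN:
                    done = 1;
                    neg  = bit;
                    break;
                }
                if (done) {
                    if (!e->lead_done) {
                        e->lead_done = 1;
                        e->lead_neg  = (uint8_t)neg;
                    } else {
                        int v = (int)acc - 1;
                        assert(e->ncoef < 8);
                        e->coef[e->ncoef++] = (int8_t)(neg ? -v : v);
                    }
                    acc   = 1;
                    state = ST_START;
                }
            }
            e->tail_acc   = (uint8_t)acc;
            e->next_state = (uint8_t)state;
        }
    }
}

/* Decode up to max_out coefficients, consuming one whole byte per lookup. */
static size_t decode_lut(const uint8_t *buf, size_t len,
                         int32_t *out, size_t max_out)
{
    size_t   n     = 0;
    int      state = ST_START;
    uint32_t acc   = 1;

    for (size_t i = 0; i < len; i++) {
        const LutEntry *e = &lut[state][buf[i]];
        acc = (acc << e->lead_nbits) | e->lead_bits;
        if (e->lead_done) {
            int32_t v = (int32_t)(acc - 1);
            if (n < max_out)
                out[n++] = e->lead_neg ? -v : v;
            for (int k = 0; k < e->ncoef && n < max_out; k++)
                out[n++] = e->coef[k];
            acc = e->tail_acc;
        }
        state = e->next_state;
    }
    return n;
}
```

Call `build_lut()` once before `decode_lut()`. With this sketch, the two bytes `0x93 0x85` decode to the coefficients 0, 1, -2, 3, 0, where the final 0 comes from the single padding bit. The inner loop has no per-bit branching at decode time, which is the property that makes this kind of loop a candidate for compiler vectorization.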
parent d778be6e4a
commit 675bb1f4f9
3 changed files with 1102 additions and 249 deletions
File diff suppressed because it is too large