mirror of https://github.com/FFmpeg/FFmpeg.git
You can not select more than 25 topics
Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
Tag:
Branch:
Tree:
6c735b96ee
master
oldabi
release/0.10
release/0.11
release/0.5
release/0.6
release/0.7
release/0.8
release/0.9
release/1.0
release/1.1
release/1.2
release/2.0
release/2.1
release/2.2
release/2.3
release/2.4
release/2.5
release/2.6
release/2.7
release/2.8
release/3.0
release/3.1
release/3.2
release/3.3
release/3.4
release/4.0
release/4.1
release/4.2
release/4.3
release/4.4
release/5.0
release/5.1
release/6.0
release/6.1
release/7.0
release/7.1
N
ffmpeg-0.6.3
n0.10
n0.10.1
n0.10.10
n0.10.11
n0.10.12
n0.10.13
n0.10.14
n0.10.15
n0.10.16
n0.10.2
n0.10.3
n0.10.4
n0.10.5
n0.10.6
n0.10.7
n0.10.8
n0.10.9
n0.11
n0.11-dev
n0.11.1
n0.11.2
n0.11.3
n0.11.4
n0.11.5
n0.12-dev
n0.5.10
n0.5.11
n0.5.12
n0.5.13
n0.5.14
n0.5.15
n0.5.5
n0.5.6
n0.5.7
n0.5.8
n0.5.9
n0.6.4
n0.6.5
n0.6.6
n0.6.7
n0.7.1
n0.7.10
n0.7.11
n0.7.12
n0.7.13
n0.7.14
n0.7.15
n0.7.16
n0.7.17
n0.7.2
n0.7.3
n0.7.4
n0.7.5
n0.7.6
n0.7.7
n0.7.8
n0.7.9
n0.8
n0.8.1
n0.8.10
n0.8.11
n0.8.12
n0.8.13
n0.8.14
n0.8.15
n0.8.2
n0.8.3
n0.8.4
n0.8.5
n0.8.6
n0.8.7
n0.8.8
n0.8.9
n0.9
n0.9.1
n0.9.2
n0.9.3
n0.9.4
n1.0
n1.0.1
n1.0.10
n1.0.2
n1.0.3
n1.0.4
n1.0.5
n1.0.6
n1.0.7
n1.0.8
n1.0.9
n1.1
n1.1-dev
n1.1.1
n1.1.10
n1.1.11
n1.1.12
n1.1.13
n1.1.14
n1.1.15
n1.1.16
n1.1.2
n1.1.3
n1.1.4
n1.1.5
n1.1.6
n1.1.7
n1.1.8
n1.1.9
n1.2
n1.2-dev
n1.2.1
n1.2.10
n1.2.11
n1.2.12
n1.2.2
n1.2.3
n1.2.4
n1.2.5
n1.2.6
n1.2.7
n1.2.8
n1.2.9
n1.3-dev
n2.0
n2.0.1
n2.0.2
n2.0.3
n2.0.4
n2.0.5
n2.0.6
n2.0.7
n2.1
n2.1-dev
n2.1.1
n2.1.2
n2.1.3
n2.1.4
n2.1.5
n2.1.6
n2.1.7
n2.1.8
n2.2
n2.2-dev
n2.2-rc1
n2.2-rc2
n2.2.1
n2.2.10
n2.2.11
n2.2.12
n2.2.13
n2.2.14
n2.2.15
n2.2.16
n2.2.2
n2.2.3
n2.2.4
n2.2.5
n2.2.6
n2.2.7
n2.2.8
n2.2.9
n2.3
n2.3-dev
n2.3.1
n2.3.2
n2.3.3
n2.3.4
n2.3.5
n2.3.6
n2.4
n2.4-dev
n2.4.1
n2.4.10
n2.4.11
n2.4.12
n2.4.13
n2.4.14
n2.4.2
n2.4.3
n2.4.4
n2.4.5
n2.4.6
n2.4.7
n2.4.8
n2.4.9
n2.5
n2.5-dev
n2.5.1
n2.5.10
n2.5.11
n2.5.2
n2.5.3
n2.5.4
n2.5.5
n2.5.6
n2.5.7
n2.5.8
n2.5.9
n2.6
n2.6-dev
n2.6.1
n2.6.2
n2.6.3
n2.6.4
n2.6.5
n2.6.6
n2.6.7
n2.6.8
n2.6.9
n2.7
n2.7-dev
n2.7.1
n2.7.2
n2.7.3
n2.7.4
n2.7.5
n2.7.6
n2.7.7
n2.8
n2.8-dev
n2.8.1
n2.8.10
n2.8.11
n2.8.12
n2.8.13
n2.8.14
n2.8.15
n2.8.16
n2.8.17
n2.8.18
n2.8.19
n2.8.2
n2.8.20
n2.8.21
n2.8.22
n2.8.3
n2.8.4
n2.8.5
n2.8.6
n2.8.7
n2.8.8
n2.8.9
n2.9-dev
n3.0
n3.0.1
n3.0.10
n3.0.11
n3.0.12
n3.0.2
n3.0.3
n3.0.4
n3.0.5
n3.0.6
n3.0.7
n3.0.8
n3.0.9
n3.1
n3.1-dev
n3.1.1
n3.1.10
n3.1.11
n3.1.2
n3.1.3
n3.1.4
n3.1.5
n3.1.6
n3.1.7
n3.1.8
n3.1.9
n3.2
n3.2-dev
n3.2.1
n3.2.10
n3.2.11
n3.2.12
n3.2.13
n3.2.14
n3.2.15
n3.2.16
n3.2.17
n3.2.18
n3.2.19
n3.2.2
n3.2.3
n3.2.4
n3.2.5
n3.2.6
n3.2.7
n3.2.8
n3.2.9
n3.3
n3.3-dev
n3.3.1
n3.3.2
n3.3.3
n3.3.4
n3.3.5
n3.3.6
n3.3.7
n3.3.8
n3.3.9
n3.4
n3.4-dev
n3.4.1
n3.4.10
n3.4.11
n3.4.12
n3.4.13
n3.4.2
n3.4.3
n3.4.4
n3.4.5
n3.4.6
n3.4.7
n3.4.8
n3.4.9
n3.5-dev
n4.0
n4.0.1
n4.0.2
n4.0.3
n4.0.4
n4.0.5
n4.0.6
n4.1
n4.1-dev
n4.1.1
n4.1.10
n4.1.11
n4.1.2
n4.1.3
n4.1.4
n4.1.5
n4.1.6
n4.1.7
n4.1.8
n4.1.9
n4.2
n4.2-dev
n4.2.1
n4.2.10
n4.2.2
n4.2.3
n4.2.4
n4.2.5
n4.2.6
n4.2.7
n4.2.8
n4.2.9
n4.3
n4.3-dev
n4.3.1
n4.3.2
n4.3.3
n4.3.4
n4.3.5
n4.3.6
n4.3.7
n4.3.8
n4.4
n4.4-dev
n4.4.1
n4.4.2
n4.4.3
n4.4.4
n4.4.5
n4.5-dev
n5.0
n5.0.1
n5.0.2
n5.0.3
n5.1
n5.1-dev
n5.1.1
n5.1.2
n5.1.3
n5.1.4
n5.1.5
n5.1.6
n5.2-dev
n6.0
n6.0.1
n6.1
n6.1-dev
n6.1.1
n6.1.2
n6.2-dev
n7.0
n7.0.1
n7.0.2
n7.1
n7.1-dev
n7.2-dev
v0.5
v0.5.1
v0.5.2
v0.5.3
v0.6
v0.6.1
${ noResults }
FFmpeg/libswscale/aarch64
Sebastian Pop
bd83191271
This patch implements ff_hscale_8_to_15_neon with NEON fused multiply accumulate and bumps the vectorization factor from 2 to 4. The speedup is of 25% on Graviton1 A1 instances based on A-72 cpus: $ ffmpeg -nostats -f lavfi -i testsrc2=4k:d=2 -vf bench=start,scale=1024x1024,bench=stop -f null - before: t:0.040303 avg:0.040287 max:0.040371 min:0.039214 after: t:0.032168 avg:0.032215 max:0.033081 min:0.032146 The speedup is of 39% on Graviton2 m6g instances based on Neoverse-N1 cpus: $ ffmpeg -nostats -f lavfi -i testsrc2=4k:d=2 -vf bench=start,scale=1024x1024,bench=stop -f null - before: t:0.019446 avg:0.019423 max:0.019493 min:0.019181 after: t:0.014015 avg:0.014096 max:0.015018 min:0.013971 Tested with `make check` on aarch64-linux. Signed-off-by: Sebastian Pop <spop@amazon.com> Reviewed-by: Jean-Baptiste Kempf <jb@videolan.org> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc> |
5 years ago | |
---|---|---|
.. | ||
Makefile |
…
|
|
hscale.S | swscale/aarch64: use multiply accumulate and increase vector factor to 4 | 5 years ago |
output.S |
…
|
|
swscale.c |
…
|
|
swscale_unscaled.c |
…
|
|
yuv2rgb_neon.S |
…
|