Diego Biurrun
5ab03e41e5
x86: h264dsp: Fix link failure with optimizations disabled
...
With optimzations disabled compilers have trouble doing dead code
elimination on 'if (foo && 0)' expressions, while 'if (0 && foo)'
still works, so use the latter to avoid problems.
Bug-Id: 707
11 years ago
Michael Niedermayer
1ace0ca60f
avcodec/x86/hevc_idct: fix function name in comment
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
plepere
9ba6b17add
avcodec/x86/hevc_idct: fix number of sse registers
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
plepere
942e22c651
avcodec/x86/hevc: add avx2 dc idct
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
Michael Niedermayer
eab2509f8c
avcodec/x86/h264_qpel_10bit: locally define pb_0
...
somehow old llvm-gcc manages to ignore the alignment from ff_pb_0 causing a crash on freebsd
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
James Almer
476bd3c7e4
x86/dsputil: move put_signed_pixels_clamped out of bswapdsp.asm
...
It's still a dsputil function
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
Diego Biurrun
fab9df63a3
dsputil: Split off global motion compensation bits into a separate context
11 years ago
Diego Biurrun
c67b449beb
dsputil: Split bswap*_buf() off into a separate context
11 years ago
James Almer
c172683bf4
x86/dsputil: remove redundant global motion compensation code
...
The SSE version has been no different than the mmx one since commit a41bf09d
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
James Almer
6ec3dc97fc
x86/audiodsp: move asm code out of dsputil
...
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
Diego Biurrun
9a9e2f1c8a
dsputil: Split audio operations off into a separate context
11 years ago
Michael Niedermayer
33f83a2157
avcodec/x86/rv40dsp_init: fix () in macros
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
James Almer
a5ce608fc7
x86/blockdsp: restore author attribution
...
See commits
649c00c96d
5fecfb7d58
73b02e2460
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
Michael Niedermayer
08c5859f17
avcodec: add simpleauto idct
...
This will pick the "best" simple idct compatible idct
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
James Almer
454c019cb5
x86/hevc_idct: fix movd parameter size in DC_ADD_INIT
...
Fixes compilation with NASM x86_64
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
James Almer
fe782233aa
x86/blockdsp: move asm code out of dsputil
...
Also replace INLINE_<opt> with EXTERNAL_<opt> that were wrongly
changed by commit 2b05db4f81
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
Michael Niedermayer
042a82ca37
avcodec/x86/lossless_videodsp: Fix size of values read for left/left_top
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
Diego Biurrun
e74433a8e6
dsputil: Split clear_block*/fill_block* off into a separate context
11 years ago
plepere
92cccb7bcd
avcodec/hevc: new idct + asm
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
Christophe Gisquet
9107612818
x86util: add and use RSHIFT/LSHIFT macros
...
Those macros take a byte number as shift argument, as this argument
differs between MMX and SSE2 instructions.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
Ronald S. Bultje
385a3420d1
vp9/x86: fix overwrite in ipred_vl_4x4_ssse3.
...
Fixes track ticket 3717.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
Christophe Gisquet
508e7a5c16
x86: huffyuv: fix {add,diff}_int16
...
They used an extra, undeclared register. Fixes a crash in
fate-vsynth3-ffvhuff444p16
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
Martin Storsjö
570d4b2186
x86: h264: Don't keep data in the redzone across function calls on 64 bit unix
...
We know that the called function (ff_chroma_inter_body_mmxext)
doesn't touch the redzone, and thus will be kept intact - thus,
this doesn't fix any bug per se.
However, valgrind's memcheck tool intentionally assumes that the
redzone is clobbered on every function call and function return
(see a long comment in valgrind/memcheck/mc_main.c). This avoids
false positives in that tool, at the cost of an extra stack pointer
adjustment.
The other alternative would be a valgrind suppression for this issue,
but that's an extra burden for everybody that wants to run libavcodec
within valgrind.
Signed-off-by: Martin Storsjö <martin@martin.st>
11 years ago
Michael Niedermayer
06f576c4ab
avcodec/x86/dct_init: fix build failure with clang && disable-optimizations
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
James Almer
6d408495b5
x86/dct32: don't build ff_dct32_float_sse on x86_64
...
There's an SSE2 version already, and technically the SSE version
on x86_64 was wrong (using pshufd and pshuflw, SSE2 instructions).
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
James Almer
fc8db12a73
x86/vp9: inital AVX2 intra_pred
...
tos3k-vp9-b10000.webm on a Core i5-4200U @1.6GHz
1219 decicycles in ff_vp9_ipred_dc_32x32_ssse3, 131070 runs, 2 skips
439 decicycles in ff_vp9_ipred_dc_32x32_avx2, 131070 runs, 2 skips
3570 decicycles in ff_vp9_ipred_dc_top_32x32_ssse3, 4096 runs, 0 skips
2494 decicycles in ff_vp9_ipred_dc_top_32x32_avx2, 4096 runs, 0 skips
1419 decicycles in ff_vp9_ipred_dc_left_32x32_ssse3, 16384 runs, 0 skips
717 decicycles in ff_vp9_ipred_dc_left_32x32_avx2, 16384 runs, 0 skips
2737 decicycles in ff_vp9_ipred_tm_32x32_avx, 1024 runs, 0 skips
2088 decicycles in ff_vp9_ipred_tm_32x32_avx2, 1024 runs, 0 skips
3090 decicycles in ff_vp9_ipred_v_32x32_avx, 512 runs, 0 skips
2226 decicycles in ff_vp9_ipred_v_32x32_avx2, 512 runs, 0 skips
1565 decicycles in ff_vp9_ipred_h_32x32_avx, 1024 runs, 0 skips
922 decicycles in ff_vp9_ipred_h_32x32_avx2, 1024 runs, 0 skips
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
James Almer
ec98f80af4
x86/dsputil: move some mmx init code inside dsputil_init_mmx()
...
This reduces differences with the fork
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
Christophe Gisquet
ccff45a0d3
apedsp: move to llauddsp
...
APE is not the sole codec using scalarproduct_and_madd_int16.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
Michael Niedermayer
d5c9d055ea
avcodec/x86/dsputilenc_mmx: fix build without yasm
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
James Almer
625ffa1457
x86/motion_est: sad_{x, y}2_mmxext functions are bitexact
...
Only the xy2 functions aren't.
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
Timothy Gu
108dec3055
x86: dsputilenc: convert hf_noise*_mmx to yasm
...
Signed-off-by: Timothy Gu <timothygu99@gmail.com>
Several bugfixes by: Christophe Gisquet <christophe.gisquet@gmail.com>
See: [FFmpeg-devel] [WIP] [PATCH 4/4] x86: dsputilenc: convert hf_noise*_mmx to yasm
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
Christophe Gisquet
dcd2a6ca36
x86: hevc_mc: remove unneeded shift
...
The immediate value may be 0.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
Christophe Gisquet
09fc28aed1
x86: hevcdsp_init: fix macro usage
...
The macro was not using the parameter but unconditionally using sse4.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
James Almer
e1bd40fe6b
x86/motion_est: enable sad16_sse2 on k10 CPUs
...
The check is meant for k8 CPUs. sad16_sse2 is ~20% faster than sad16_mmxext on k10.
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
James Almer
f128342df2
build: fix compilation of svq1enc_mmx.c with --disable-mmx
...
It's needed for ff_svq1enc_init_x86() even if simd functions are disabled.
Alternatively, svq1enc_init.c could be made and the relevant code moved there.
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
James Almer
4ac41a52e2
x86/huffyuvdsp: fix some prototypes
...
Remove duplicate prototypes and fix int -> intptr_t in another
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
Christophe Gisquet
d136fe6fd7
x86: huffyuvdsp: fewer functions for x86_64
...
When there are 2 functions that are <= SSE2, only one is needed for x86_64.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
Timothy Gu
154cee9292
x86: dsputilenc: convert ff_sse{8, 16}_mmx() to yasm
...
Signed-off-by: Timothy Gu <timothygu99@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
Timothy Gu
0b6292b7b8
x86: dsputilenc: move all the function prototypes together
...
Signed-off-by: Timothy Gu <timothygu99@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
Christophe Gisquet
f743fa9c7f
x86: huffyuvdsp: add_hfyu_left_pred_bgr32
...
C MMX SSE2
Cycles: 3092 1053 578
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
Michael Niedermayer
7be79c76d3
avcodec/huffyuvdsp: Change w to intptr in add_hfyu_median_pred() and add_hfyu_left_pred()
...
This avoids potential issues with the high 32bits being random in x86-64 asm
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
Christophe Gisquet
884078d2df
x86: huffyuvdsp: add SSE2 median prediction
...
From 5010c to 4566 on lagarith YUY2.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
Michael Niedermayer
8c891d90ca
avcodec/x86/qpeldsp_init: Restore author attribution
...
See: 368f50359e
See: 44eb495128
, and many others
See:
similarity index 83%
copy from libavcodec/x86/dsputil_init.c
copy to libavcodec/x86/qpeldsp_init.c
index ebbf97f..8f296a1 100644
--- a/libavcodec/x86/dsputil_init.c
+++ b/libavcodec/x86/qpeldsp_init.c
@@ -1,6 +1,5 @@
/*
- * Copyright (c) 2000, 2001 Fabrice Bellard
- * Copyright (c) 2002-2004 Michael Niedermayer <michaelni@gmx.at>
+ * quarterpel DSP functions
*
* This file is part of FFmpeg.
*
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
Michael Niedermayer
c814a6c778
avcodec/x86/svq1enc_mmx: Add author attribution
...
See: 5900637219
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
James Almer
02a3e327f1
x86/dsputilenc: add missing guards to ff_pix_sum16_xop
...
XOP support was added in Yasm 1.0.0 and Nasm 2.06, and we still
support older versions.
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
Christophe Gisquet
99a319c4e7
x86: huffyuvdsp: port add_bytes to yasm
...
C MMX SSE2
Cycles: 2972 587 302
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
Christophe Gisquet
2267003981
x86: hpeldsp: better factorization
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
Michael Niedermayer
7b4c46050e
rename add_hfyu_left_prediction_int16 to add_hfyu_left_pred_int16
...
This makes the naming more consistent with the 8bit variant
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
Michael Niedermayer
550ae6c02f
rename add_hfyu_median_prediction_int16 to add_hfyu_median_pred_int16
...
This makes the naming more consistent with the 8bit variant
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
Michael Niedermayer
40a4ab8ba4
rename sub_hfyu_median_prediction_int16 to sub_hfyu_median_pred_int16
...
This makes the naming more consistent with the 8bit variant
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago