Mans Rullgard
f054a82727
ARM: move NEON H264 chroma mc to a separate file
...
This allows sharing code with the rv40 version of these functions.
Signed-off-by: Mans Rullgard <mans@mansr.com>
13 years ago
Janne Grunau
42d32cf53c
rv34: NEON optimised inverse transform functions
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
13 years ago
Mans Rullgard
59807fee6d
ARM: h264dsp_neon cosmetics
...
- Replace 'ip' with 'r12'.
- Use correct size designators for vld1/vst1.
- Whitespace fixes.
Signed-off-by: Mans Rullgard <mans@mansr.com>
13 years ago
Janne Grunau
a760f530bb
ARM: make some NEON macros reusable
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
13 years ago
Mans Rullgard
3adba2de3d
ARM: fix indentation in ff_dsputil_init_neon()
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
13 years ago
Mans Rullgard
96fef6cf31
ARM: NEON put/avg_pixels8/16 cosmetics
...
This makes whitespace and register names consistent with
the style used in more recent code.
Signed-off-by: Mans Rullgard <mans@mansr.com>
13 years ago
Mans Rullgard
716f1705e9
ARM: add remaining NEON avg_pixels8/16 functions
13 years ago
Mans Rullgard
94267ddfb2
ARM: clean up NEON put/avg_pixels macros
...
Although this adds a few lines, the macro calls are less convoluted.
Signed-off-by: Mans Rullgard <mans@mansr.com>
13 years ago
Mans Rullgard
00a856e3f9
dca: ARMv6 optimised decode_blockcode()
...
This is a hand-tuned version of the code with impossible parts of
the FASTDIV function omitted.
2-5% faster overall on Cortex-A8.
Signed-off-by: Mans Rullgard <mans@mansr.com>
13 years ago
Mans Rullgard
3a0b72dee0
ARM: remove needless .text/.align directives
...
The 'function' macro already includes the appropriate
directives.
Signed-off-by: Mans Rullgard <mans@mansr.com>
13 years ago
Mans Rullgard
8ee2b4672f
ARM: add explicit .arch and .fpu directives to asm.S
...
This prevents build errors when compiler and assembler default
targets differ. Ideally each file would declare the highest
level it requires. This is however not easily possible as it
complicates assembling pre-armv6t2 code in Thumb-2 mode.
HAVE_NEON is used as indicator for ARMv7-A since no other
symbol exists for this and NEON is only available in this
variant.
Signed-off-by: Mans Rullgard <mans@mansr.com>
13 years ago
Diego Biurrun
ce33320b30
Remove redundant filename self-references inside files.
...
Filenames are brittle across renames and add no useful information.
13 years ago
Anton Khirnov
acffe45732
mpegvideo: remove some unused variables from MpegEncContext.
13 years ago
Mans Rullgard
d4999e0a79
dca: ARMv6 optimised decode_blockcode()
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
(cherry picked from commit 08e3dea3f7f69309574dafc0af6671615e909720)
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
13 years ago
Ronald S. Bultje
c2d337429c
H264: change weight/biweight functions to take a height argument.
...
Neon parts by Mans Rullgard <mans@mansr.com>.
13 years ago
Baptiste Coudurier
76741b0e56
h264: 4:2:2 intra decoding support
...
Signed-off-by: Diego Biurrun <diego@biurrun.de>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
13 years ago
Mans Rullgard
6308729e68
ARM: check for inline asm 'y' operand modifier support
...
The inline asm added in bf5d46d uses the 'y' modifier which
is only supported from gcc 4.5. This check allows building
with older compilers.
Signed-off-by: Mans Rullgard <mans@mansr.com>
13 years ago
Ronald S. Bultje
a5dfeb612e
VP8: armv6 optimizations.
...
From 52.503s (~40fps) to 27.973s (~80fps) decoding of 480p sintel
trailer, i.e. a ~2x speedup overall, on a Nexus S.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
13 years ago
Mans Rullgard
bf5d46d8e6
dca: NEON optimised high freq VQ decoding
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
14 years ago
Mans Rullgard
baf6b738f2
ARM: NEON optimised vector_fmac_scalar()
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
14 years ago
Anton Khirnov
297d9cb3dc
mpeg12enc: add intra_vlc private option.
...
Deprecate CODEC_FLAG2_INTRA_VLC.
14 years ago
Michael Niedermayer
565cabf5c8
h264: Try to fix 422 intra NEON
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
14 years ago
Michael Niedermayer
95b5b525b1
h264pred_init_arm: compile hotfix
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
14 years ago
Baptiste Coudurier
231a6df9ea
h264dec: h264: 4:2:2 intra decoding
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
14 years ago
Clément Bœsch
5231454560
timecode: introduce timecode and honor it in MPEG-1/2.
...
This is based on the original work by Baptiste Coudurier.
14 years ago
Måns Rullgård
9a83adaf34
arm: Avoid using the movw instruction needlessly
...
This fixes building for ARM11 without Thumb2.
Signed-off-by: Martin Storsjö <martin@martin.st>
14 years ago
Martin Storsjö
d0a2f0af9d
Move an int64_t down in MpegEncContext
...
This allows using the same arm assembler offsets for both EABI
and the mach-o ABI.
Signed-off-by: Martin Storsjö <martin@martin.st>
14 years ago
Mans Rullgard
cbd58a872d
dsputil: remove some unused functions
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
14 years ago
Mans Rullgard
a617c6aaa3
dsputil: update per-arch init funcs for non-h264 high bit depth
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
14 years ago
Mans Rullgard
874f1a901d
dsputil: template get_pixels() for different bit depths
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
14 years ago
Mans Rullgard
e7a972e113
simple_idct: add 10-bit version
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
14 years ago
Diego Biurrun
8342a82680
arm: remove disabled function dct_unquantize_h263_inter_iwmmxt()
14 years ago
Mans Rullgard
11043d80f6
ARM: use const macro to define constant data in asm
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
14 years ago
Mans Rullgard
4a28e26ea4
ac3enc: NEON optimised sum_square_butterfly_float
14 years ago
Mans Rullgard
a4928cf380
ac3enc: neon optimised sum_square_butterfly_int32
14 years ago
Mans Rullgard
fce1e43410
ARM: workaround for bug in GNU assembler
...
Some versions of the GNU assembler do not handle 64-bit
immediate operands containing arithmetic. Writing the
value out in full works correctly.
Signed-off-by: Mans Rullgard <mans@mansr.com>
14 years ago
Mans Rullgard
3824ef08e0
ARM: allow unaligned buffer in fixed-point NEON FFT4
...
This function is called with only 8-byte alignment from
imdct for size 16. The fft4 function is not called for
the larger FFT or MDCT sizes, so this has no impact on
typical uses.
Signed-off-by: Mans Rullgard <mans@mansr.com>
14 years ago
Mans Rullgard
5dd045ebc1
ARM: ac3: update ff_ac3_extract_exponents_neon per 8b7b2d6
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
14 years ago
Mans Rullgard
8aa63f0b31
ARM: NEON optimised vector_clip_int32()
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
14 years ago
Mans Rullgard
a3e1f80e8b
ARM: remove check for PLD instruction
...
PLD is present in ARMv5TE and later, which is checked for separately.
Signed-off-by: Mans Rullgard <mans@mansr.com>
14 years ago
Mans Rullgard
8986fddc2b
ARM: allow building in Thumb2 mode
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
14 years ago
Mans Rullgard
88ff180ad6
ARM: update ff_h264_idct8_add4_neon for 4:4:4 changes
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
14 years ago
Mans Rullgard
e897a633cd
ARM: factor some repetitive code into macros
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
14 years ago
Jason Garrett-Glaser
c90b94424c
4:4:4 H.264 decoding support
...
Note: this is 4:4:4 from the 2007 spec revision, not the previous (now deprecated) 4:4:4 mode in H.264.
14 years ago
Mans Rullgard
9776e25db9
ARM: jrevdct_arm: simplify stack usage
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
14 years ago
Mans Rullgard
13743c7ab0
ARM: jrevdct_arm: use push/pop mnemonics
...
Use push/pop instead of stmdb/ldmia for stack operations. This
is the preferred syntax.
Signed-off-by: Mans Rullgard <mans@mansr.com>
14 years ago
Mans Rullgard
77cdfde73e
ARM: jrevdct_arm: misc cleanup
...
- use 'const' macro to define coeff table
- add missing endfunc
- remove superfluous directives
Signed-off-by: Mans Rullgard <mans@mansr.com>
14 years ago
Mans Rullgard
5c46ad1da0
ARM: optimised mpadsp_apply_window_fixed
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
14 years ago
Mans Rullgard
21c6512542
ARM: remove MUL64 and MAC64 inline asm
...
Current GCC versions know how to generate these instructions
properly and avoiding inline asm gives better code. The MULH
function for ARMv5 uses the same instruction and is also not
needed any more.
The MLS64 macro remains since negating an input would normally
not be allowed as it would fail for INT_MIN. In our uses, the
inputs never have this value and thus negating is safe.
Signed-off-by: Mans Rullgard <mans@mansr.com>
14 years ago
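The INT_MIN remark in the commit above can be illustrated with a small C sketch. This is not the FFmpeg code. It assumes MLS64 means d - a * b evaluated in 64 bits. The names mls64_ref, neg32_wrap and mac64_negated are hypothetical and exist only for this example. The sketch compares a direct multiply-subtract with the "negate one input, then multiply-accumulate" alternative, and shows that the two agree except when the negated input is INT32_MIN.

```c
#include <assert.h>   /* only needed for the checks in the usage note */
#include <stdint.h>

/* Assumed reference semantics for MLS64: d - a * b, computed in 64 bits. */
static int64_t mls64_ref(int64_t d, int32_t a, int32_t b)
{
    return d - (int64_t)a * b;
}

/* 32-bit two's-complement negation without signed-overflow undefined
 * behaviour. Converting the out-of-range result back to int32_t is
 * implementation-defined; common compilers wrap, so INT32_MIN maps
 * to itself. */
static int32_t neg32_wrap(int32_t a)
{
    return (int32_t)(0u - (uint32_t)a);
}

/* Hypothetical alternative: negate one input, then multiply-accumulate. */
static int64_t mac64_negated(int64_t d, int32_t a, int32_t b)
{
    return d + (int64_t)neg32_wrap(a) * b;
}
```

For ordinary inputs both functions return the same value, for example -2 for d = 10, a = 3, b = 4. With a = INT32_MIN the 32-bit negation cannot represent +2^31, so on a wrapping implementation mac64_negated(0, INT32_MIN, 3) has the opposite sign from mls64_ref(0, INT32_MIN, 3). This is the case the commit message says never arises in the actual callers.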
Mans Rullgard
594fbe42c6
ARM: remove MULL inline asm
...
Reasonable gcc versions get this one right on their own.
Signed-off-by: Mans Rullgard <mans@mansr.com>
14 years ago