FFmpeg

Commit Graph

Author	SHA1	Message	Date
Andreas Rheinhardt	428ff7bd8c	swscale/ppc/swscale_ppc_template: Reindent after the previous commit Reviewed-by: Lynne <dev@lynne.ee> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	9 months ago
Andreas Rheinhardt	95b4aea5e3	swscale/ppc/swscale_ppc_template: Remove code not passing checkasm Reviewed-by: Lynne <dev@lynne.ee> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	9 months ago
Andreas Rheinhardt	f3c197b129	Include attributes.h directly Some files currently rely on libavutil/cpu.h to include it for them; yet said file won't include it any more after the currently deprecated functions are removed, so include attributes.h directly. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	4 years ago
Anton Khirnov	c8c2dfbc37	lavu: move LOCAL_ALIGNED from internal.h to mem_internal.h That is a more appropriate place for it.	4 years ago
Lauri Kasanen	6b5ea90eac	swscale/ppc: Add av_unused to template vars only used in one includer	6 years ago
Lauri Kasanen	ac3062f1a4	swscale/ppc: Clean up some mixed decl warnings	6 years ago
Lauri Kasanen	8522d219ce	libswscale/ppc: VSX-optimize 9-16 bit yuv2planeX ./ffmpeg_g -f rawvideo -pix_fmt rgb24 -s hd1080 -i /dev/zero -pix_fmt yuv420p16be \ -s 1920x1728 -f null -vframes 100 -v error -nostats - 9-14 bit funcs get about 6x speedup, 16-bit gets about 15x. Fate passes, each format tested with an image to video conversion. Only POWER8 includes 32-bit vector multiplies, so POWER7 is locked out of the 16-bit function. This includes the vec_mulo/mule functions too, not just vmuluwm. With TIMER_REPORT skips disabled: yuv420p9le 12412 UNITS in planarX, 131072 runs, 0 skips 73136 UNITS in planarX, 131072 runs, 0 skips yuv420p9be 12481 UNITS in planarX, 131072 runs, 0 skips 73410 UNITS in planarX, 131072 runs, 0 skips yuv420p10le 12322 UNITS in planarX, 131072 runs, 0 skips 72546 UNITS in planarX, 131072 runs, 0 skips yuv420p10be 12291 UNITS in planarX, 131072 runs, 0 skips 72935 UNITS in planarX, 131072 runs, 0 skips yuv420p12le 12316 UNITS in planarX, 131072 runs, 0 skips 72708 UNITS in planarX, 131072 runs, 0 skips yuv420p12be 12319 UNITS in planarX, 131072 runs, 0 skips 72577 UNITS in planarX, 131072 runs, 0 skips yuv420p14le 12259 UNITS in planarX, 131072 runs, 0 skips 72516 UNITS in planarX, 131072 runs, 0 skips yuv420p14be 12440 UNITS in planarX, 131072 runs, 0 skips 72962 UNITS in planarX, 131072 runs, 0 skips yuv420p16le 10548 UNITS in planarX, 131072 runs, 0 skips 73429 UNITS in planarX, 131072 runs, 0 skips yuv420p16be 10634 UNITS in planarX, 131072 runs, 0 skips 150959 UNITS in planarX, 131072 runs, 0 skips Signed-off-by: Lauri Kasanen <cand@gmx.com>	6 years ago
Lauri Kasanen	78c7ff7d25	swscale/ppc: Move VSX-using code to its own file Passes fate on LE (with "lavc/jrevdct: Avoid an aliasing violation" applied). Signed-off-by: Lauri Kasanen <cand@gmx.com> Tested-by: Michael Kostylev on BE Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	6 years ago
Lauri Kasanen	46c5693ea3	swscale/output: Altivec-optimize yuv2plane1_8 ./ffmpeg_g -f rawvideo -pix_fmt rgb24 -s hd1080 -i /dev/zero -pix_fmt yuv420p \ -f null -vframes 100 -v error -nostats - 1158 UNITS in planar1, 65528 runs, 8 skips -cpuflags 0 19082 UNITS in planar1, 65533 runs, 3 skips 16.48 speedup ratio. On x86, SSE2 is ~7. Curiously, the Power C version takes as many cycles as the x86 SSE2 version, yikes it's fast. Note that this function uses VSX instructions, but is not marked so. This is because several existing functions also make that mistake. I'll submit a patch moving them once this is reviewed. Signed-off-by: Lauri Kasanen <cand@gmx.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	6 years ago
Sergey Lavrushkin	582bc5a348	libswscale: Adds conversions from/to float gray format. Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	6 years ago
Michael Niedermayer	d736b52a04	swscale: Drop is9_OR_10BPS() use, its name is not correct Found-by: Luca Barbato Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	8 years ago
Michael Niedermayer	328ea6a9a5	swscale: Add input support for 12-bit formats Implemented for AV_PIX_FMT_GBRP12. Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>	8 years ago
Luca Barbato	2b5b1e1e9b	swscale: Rename is9_OR_10 to match what it does It is used to select functions that work with 9-15bits.	8 years ago
Pedro Arthur	6de58b4903	swscale: cleanup unused code Removed previous swscale code under '#ifndef NEW_FILTER' and removed unused fields of SwsContext	9 years ago
Luca Barbato	da60b99a88	ppc: Restrict some Altivec implementations to Big Endian In Little Endian the vec_ld/vec_st operations work as expected only for byte-vectors.	10 years ago
Rong Yan	603c839398	swscale/ppc/swscale_altivec.c: POWER LE support in yuv2planeX_8() Delete the macro GET_VF(); it was wrong. The GCC tool had a bug in interpreting a PPC intrinsic, which has been fixed in GCC 4.9.1. This bug led to errors in two of our previous patches. We found this when we updated our GCC tools to 4.9.1 and by reading the related info on the GCC website. We fix our previous error in two separate commits. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	10 years ago
Christophe Gisquet	5d38c628b0	ppc: libswscale: use LOCAL_ALIGNED instead of DECLARE_ALIGNED The latter may yield incorrect code for on-stack variables. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	10 years ago
Rong Yan	e74e14608f	libswscale/ppc/swscale_altivec.c: fix hScale_altivec_real() yuv2planeX_16_altivec() yuv2planeX_8() for little endian add macros GET_LS() GET_VF() LOAD_FILTER() LOAD_L1() GET_VF4() FIRST_LOAD() UPDATE_PTR() LOAD_SRCV() LOAD_SRCV8() GET_VFD() for POWER LE Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	10 years ago
Diego Biurrun	c2503d9c8a	swscale: ppc: Hide arch-specific initialization details Also give consistent names to init functions.	12 years ago
Anton Khirnov	716d413c13	Replace PIX_FMT_* -> AV_PIX_FMT_*, PixelFormat -> AVPixelFormat	12 years ago
Mans Rullgard	07eb7e20af	ppc: swscale: rework yuv2planeX_altivec() This gets rid of the variable-length scratch buffer by filtering 16 pixels at a time and writing directly to the destination. The extra loads this requires to load the source values are compensated by not doing a round-trip to memory before shifting. Signed-off-by: Mans Rullgard <mans@mansr.com>	12 years ago
Diego Biurrun	5a6e3c039c	swscale: Mark all init functions as av_cold	13 years ago
Michael Niedermayer	fa36f33422	sws: support 12&14 bit planar colorspaces Reviewed-by: Paul B Mahol <onemda@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	13 years ago
Ronald S. Bultje	2254b559cb	swscale: make filterPos 32bit. Fixes overflows for large image sizes. Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind CC: libav-stable@libav.org	13 years ago
Diego Biurrun	04217de4d6	swscale: K&R formatting cosmetics for PowerPC code (part I/II)	13 years ago
Diego Biurrun	33ad8c3cab	cosmetics: Remove some unnecessary block braces.	13 years ago
Ronald S. Bultje	f48b12e0a6	swscale: update altivec yuv2planeX asm to new per-plane API.	13 years ago
Kieran Kunhya	ff7913aef1	Split up yuv2yuvX functions Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>	13 years ago
Mans Rullgard	d853e571ad	ppc: fix some pointer to integer casts Use uintptr_t instead of plain int. Without this change, the comparisons will come out wrong for pointers in certain ranges. Fixes random failures on ppc64. Also fixes some compiler warnings. Signed-off-by: Mans Rullgard <mans@mansr.com>	13 years ago
Ronald S. Bultje	3f04ab4fcd	swscale: split hScale() function pointer into h[cy]Scale(). This allows using more specific implementations for chroma/luma, e.g. we can make assumptions on filterSize being constant, thus avoiding that test at runtime.	14 years ago
Luca Barbato	3304a1e69a	swscale: add dithering to yuv2yuvX_altivec_real It just does that part in scalar form, I doubt using a vector store over 2 array would speed it up particularly. The function should be written to not use a scratch buffer.	14 years ago
Ronald S. Bultje	28c1115a91	swscale: use 15-bit intermediates for 9/10-bit scaling.	14 years ago
Ronald S. Bultje	948ccdadf4	swscale: for >8bit scaling, read in native bit-depth. For 9/10bit, it means we don't have to upscale to 16bit before actual scaling or pixel format conversion, and thus a performance gain.	14 years ago
Ronald S. Bultje	8a8d0ce208	swscale: for >8bit scaling, read in native bit-depth. For 9/10bit, it means we don't have to upscale to 16bit before actual scaling or pixel format conversion, and thus a performance gain.	14 years ago
Ronald S. Bultje	45f6ffe5e9	swscale: implement >8bit scaling support. This means that precision is retained when scaling between sample formats with >8 bits per component (48bit RGB, 16bit grayscale, 9/10/16bit YUV).	14 years ago
Ronald S. Bultje	ef1ee362b3	swscale: implement >8bit scaling support. This means that precision is retained when scaling between sample formats with >8 bits per component (48bit RGB, 16bit grayscale, 9/10/16bit YUV).	14 years ago
Mans Rullgard	635930d466	PPC: swscale: disable altivec functions for unsupported formats Signed-off-by: Mans Rullgard <mans@mansr.com>	14 years ago
Ronald S. Bultje	13a099799e	swscale: change prototypes of scaled YUV output functions. Remove unused variables "flags" and "dstFormat" in yuv2packed1, merge source rows per plane for yuv2packed[12], and make every source argument int16_t (some were invalidly set to uint16_t). This prevents stack pollution and is part of the Great Evil Plan to simplify swscale.	14 years ago
Ronald S. Bultje	dc179ec819	swscale: split yuv2packedX_altivec in smaller functions. This will likely lead to a considerable performance boost, since it removes a branch from the inner loop. Part of the Great Evil Plan to simplify swscale.	14 years ago
Ronald S. Bultje	97535ffb97	swscale: remove unused xInc/srcW arguments from hScale().	14 years ago
Ronald S. Bultje	ca364a5b43	swscale: extract SWS_FULL_CHR_H_INT conditional into init code.	14 years ago
Ronald S. Bultje	bda9b20fa4	swscale: un-special-case yuv2yuvX16_c(). Make yuv2yuvX16_c a function pointer for yuv2yuvX(), so that the function pointer becomes bitdepth-independent.	14 years ago
Ronald S. Bultje	075d0ae72c	swscale: enable hScale_altivec_real.	14 years ago
Ronald S. Bultje	67d80a5421	swscale: split out ppc _template.c files from main swscale.c.	14 years ago
Ronald S. Bultje	a3e9bb5dee	swscale: remove indirections in ppc/swscale_template.c.	14 years ago
Ronald S. Bultje	0e5d31b16b	swscale: split out unscaled altivec YUV converters in their own file.	14 years ago
Reimar Döffinger	54dc95634d	Cast pointers to uintptr_t rather than unsigned int. Avoids potential warnings on PPC64 systems.	14 years ago
Michael Niedermayer	986f0d86cb	Commits that could not be pulled earlier due to bugs. commit `93681fbd50` Author: Ronald S. Bultje <rsbultje@gmail.com> Date: Thu May 26 11:32:32 2011 -0400 swscale: fix compile on ppc. commit `e758573a88` Author: Ronald S. Bultje <rsbultje@gmail.com> Date: Thu May 26 10:36:47 2011 -0400 swscale: fix compile on x86-32. commit `0f4eb8b043` Author: Ronald S. Bultje <rsbultje@gmail.com> Date: Thu May 26 09:17:52 2011 -0400 swscale: remove VOF/VOFW. commit `b4a224c5e4` Author: Ronald S. Bultje <rsbultje@gmail.com> Date: Wed May 25 14:30:09 2011 -0400 swscale: split chroma buffers into separate U/V planes. Preparatory step to implement support for sizes > VOFW.	14 years ago
Anton Khirnov	b8e893399f	sws: replace all long with int. Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>	14 years ago
Ronald S. Bultje	93681fbd50	swscale: fix compile on ppc.	14 years ago
