FFmpeg

Commit Graph

Author	SHA1	Message	Date
Peter Collingbourne	9bcb1cb6ed	Add assembly support for -fsanitize=hwaddress tagged globals. As of LLVM r368102, Clang will set a pointer tag in bits 56-63 of the address of a global when compiling with -fsanitize=hwaddress. This requires an adjustment to assembly code that takes the address of such globals: the code cannot use the regular R_AARCH64_ADR_PREL_PG_HI21 relocation to refer to the global, since the tag would take the address out of range. Instead, the code must use the non-checking (_NC) variant of the relocation (the link-time check is substituted by a runtime check). This change makes the necessary adjustment in the movrel macro, where it is needed when compiling with -fsanitize=hwaddress. Signed-off-by: Peter Collingbourne <pcc@google.com> Reviewed-by: Martin Storsjö Reviewed-by: Janne Grunau	5 years ago
Shiyou Yin	e1039b09c4	avutil/mips: remove redundant code in TRANSPOSE16x8_UB_UB. Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	5 years ago
gxw	a3e572d96f	avutil/mips: refine msa macros CLIP_*. Details of the changes: 1. Remove the local variable 'out_m' in 'CLIP_SH' and store the result in the source vector. 2. Refine the implementation of macros 'CLIP_SH_0_255' and 'CLIP_SW_0_255'. VP8 decoding sped up by about 1.1% (from 7.03x to 7.11x). H264 decoding sped up by about 0.5% (from 4.35x to 4.37x). Theora decoding sped up by about 0.7% (from 5.79x to 5.83x). 3. Remove the redundant macro 'CLIP_SH/Wn_0_255_MAX_SATU' and use 'CLIP_SH/Wn_0_255' instead, because there is no difference in the effect of these two macros. Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	5 years ago
Shiyou Yin	11f99a9a45	avutil/mips: Avoid instruction exception caused by gssqc1/gslqc1. Ensure the addresses accessed by gssqc1/gslqc1 are 16-byte aligned.	5 years ago
Lynne	42e2319ba9	lavu/tx: add support for double precision FFT and MDCT Simply moves and templates the actual transforms to support an additional data type. Unlike the float version, which is equal or better than libfftw3f, double precision output is bit identical with libfftw3.	5 years ago
Linjie Fu	b3b7523feb	lavu/hwcontext_qsv: fix the memory leak av_dict_free child_device_opts to fix the memory leak. Signed-off-by: Linjie Fu <linjie.fu@intel.com> Signed-off-by: Zhong Li <zhong.li@intel.com>	5 years ago
Michael Niedermayer	80bb65fafa	Bump minor versions again on master to keep 4.2 versions separate from master Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	6 years ago
Michael Niedermayer	22db337a40	Bump minor versions to separate 4.2 from master Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	6 years ago
Michael Niedermayer	82e389d066	avutil/softfloat_ieee754: Fix odd bit position for exponent and sign in av_bits2sf_ieee754() Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	6 years ago
Shiyou Yin	153c607525	avutil/mips: refactor msa load and store macros. Replace STnxm_UB and LDnxm_SH with new macros ST_{H/W/D}{1/2/4/8}. The old macros are difficult to use because they don't follow the same parameter passing rules. Changing details as following: 1. remove LD4x4_SH. 2. replace ST2x4_UB with ST_H4. 3. replace ST4x2_UB with ST_W2. 4. replace ST4x4_UB with ST_W4. 5. replace ST4x8_UB with ST_W8. 6. replace ST6x4_UB with ST_W2 and ST_H2. 7. replace ST8x1_UB with ST_D1. 8. replace ST8x2_UB with ST_D2. 9. replace ST8x4_UB with ST_D4. 10. replace ST8x8_UB with ST_D8. 11. replace ST12x4_UB with ST_D4 and ST_W4. Examples of new macro: ST_H4(in, idx0, idx1, idx2, idx3, pdst, stride) ST_H4 store four half-word elements in vector 'in' to pdst with stride. About the macro name: 1) 'ST' means store operation. 2) 'H/W/D' means type of vector element is 'half-word/word/double-word'. 3) Number '1/2/4/8' means how many elements will be stored. About the macro parameter: 1) 'in0, in1...' 128-bits vector. 2) 'idx0, idx1...' elements index. 3) 'pdst' destination pointer to store to 4) 'stride' stride of each store operation. Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	6 years ago
Steven Liu	1498e39439	avutil/hwcontext_vaapi: move kernel_driver into CONFIG_LIBDRM Reviewed-by: Zhong Li <zhong.li@intel.com> Signed-off-by: Steven Liu <lq@onvideo.cn>	6 years ago
Shiyou Yin	a45e8ade2d	avutil/mips: optimize UNPCK&SAD macros with MSA2.0 instruction. Loongson 3A4000 and 2k1000 support MSA2.0. This patch optimizes SAD_UB2_UH, UNPCK_R_SH_SW, UNPCK_SB_SH and UNPCK_SH_SW with MSA2.0 instructions. Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	6 years ago
Mark Thompson	451a51124d	lavu/frame: Improve ROI documentation Clarify and add examples for the behaviour of the quantisation offset, and define how multiple ranges should be handled.	6 years ago
Amir Pauker	a30e44098a	avutil: add FF_DECODE_ERROR_DECODE_SLICES for AVFrame.decode_error_flags Signed-off-by: Amir Pauker <amir@livelyvideo.tv> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	6 years ago
Amir Pauker	edfced8c04	avutil: add FF_DECODE_ERROR_CONCEALMENT_ACTIVE flag for AVFrame.decode_error_flags FF_DECODE_ERROR_CONCEALMENT_ACTIVE is set when the decoded frame has error(s) but the returned value from avcodec_receive_frame is zero i.e. concealed errors Signed-off-by: Amir Pauker <amir@livelyvideo.tv> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	6 years ago
Mark Thompson	468f003843	hwcontext_qsv: Try to select a matching VAAPI device by default Tries to find a device backed by the i915 kernel driver and loads the iHD VAAPI driver to use with it. This reduces confusion on machines with multiple DRM devices and removes the surprising requirement to set the LIBVA_DRIVER_NAME environment variable to use libmfx at all.	6 years ago
Mark Thompson	0b4696fbe8	hwcontext_vaapi: Try to create devices via DRM before X11 Opening the device via X11 (DRI2/DRI3) rather than opening a DRM render node directly is only useful if you intend to use the legacy X11 interop functions. That's never true for the ffmpeg utility, and a library user who does want this will likely provide their own display instance rather than making a new one here.	6 years ago
Mark Thompson	7f3f5a24a1	hwcontext_vaapi: Add option to set driver name For example: -init_hw_device vaapi:/dev/dri/renderD128,driver=foo This may be more convenient than using the environment variable, and allows loading different drivers for different devices in the same process.	6 years ago
Mark Thompson	6b6b8a6371	hwcontext_vaapi: Make default DRM device selection more helpful Iterate over available render devices and pick the first one which looks usable. Adds an option to specify the name of the kernel driver associated with the desired device, so that it is possible to select a specific type of device in a multiple-device system without knowing the card numbering. For example: -init_hw_device vaapi:,kernel_driver=amdgpu will select only devices using the "amdgpu" driver (as used with recent AMD graphics cards). Kernel driver selection requires libdrm to work.	6 years ago
Mark Thompson	d2141a9b65	hwcontext_vaapi: Add option to specify connection type Can be set to "drm" or "x11" to force a specific connection type.	6 years ago
Steven Liu	76ef18fd39	avutil/dynarry.h: fix comment grammar mistakes of FF_DYNARRAY_ADD Reviewed-by: James Almer <jamrial@gmail.com> Signed-off-by: Steven Liu <lq@chinaffmpeg.org>	6 years ago
Ruiling Song	65646db8e8	avutil/tx: should check against (*ctx) ctx is a pointer to pointer here. Signed-off-by: Ruiling Song <ruiling.song@intel.com>	6 years ago
Lynne	6044534964	avutil/tx: fix forward compound non-mod-15 based MDCTs There was a hardcoded value left. Wasn't caught earlier as no code uses compound forward mod-3/5 MDCTs yet.	6 years ago
Lynne	87ee9d580c	lavu: bump minor and update APIchanges for the new transform API	6 years ago
Lynne	b79b29ddb1	libavutil: add an FFT & MDCT implementation This commit adds a new API to libavutil to allow for arbitrary transformations on various types of data. This is a partly new implementation, with the power of two transforms taken from libavcodec/fft_template, the 5 and 15-point FFT taken from mdct15, while the 3-point FFT was written from scratch. The (i)mdct folding code is taken from mdct15 as well, as the mdct_template code was somewhat old, messy and not easy to separate. A notable feature of this implementation is that it allows for 3xM and 5xM based transforms, where M is a power of two, e.g. 384, 640, 768, 1280, etc. AC-4 uses 3xM transforms while Siren uses 5xM transforms, so the code will allow for decoding of such streams. A non-exhaustive list of supported sizes: 4, 8, 12, 16, 20, 24, 32, 40, 48, 60, 64, 80, 96, 120, 128, 160, 192, 240, 256, 320, 384, 480, 512, 640, 768, 960, 1024, 1280, 1536, 1920, 2048, 2560... The API was designed such that it allows for not only 1D transforms but also 2D transforms of certain block sizes. This was partly by accident as the stride argument is required for Opus MDCTs, but can be used in the context of a 2D transform as well. Also, various data types would be implemented eventually as well, such as "double" and "int32_t". 
Some performance comparisons with libfftw3f (SIMD disabled for both): 120: 22353 decicycles in fftwf_execute, 1024 runs, 0 skips 21836 decicycles in compound_fft_15x8, 1024 runs, 0 skips 128: 22003 decicycles in fftwf_execute, 1024 runs, 0 skips 23132 decicycles in monolithic_fft_ptwo, 1024 runs, 0 skips 384: 75939 decicycles in fftwf_execute, 1024 runs, 0 skips 73973 decicycles in compound_fft_3x128, 1024 runs, 0 skips 640: 104354 decicycles in fftwf_execute, 1024 runs, 0 skips 149518 decicycles in compound_fft_5x128, 1024 runs, 0 skips 768: 109323 decicycles in fftwf_execute, 1024 runs, 0 skips 164096 decicycles in compound_fft_3x256, 1024 runs, 0 skips 960: 186210 decicycles in fftwf_execute, 1024 runs, 0 skips 215256 decicycles in compound_fft_15x64, 1024 runs, 0 skips 1024: 163464 decicycles in fftwf_execute, 1024 runs, 0 skips 199686 decicycles in monolithic_fft_ptwo, 1024 runs, 0 skips With SIMD we should be faster than fftw for 15xM transforms as our fft15 SIMD is around 2x faster than theirs, even if our ptwo SIMD is slightly slower. The goal is to remove the libavcodec/mdct15 code and deprecate the libavcodec/avfft interface once aarch64 and x86 SIMD code has been ported. New code throughout the project should use this API. The implementation passes fate when used in Opus, AAC and Vorbis, and the output is identical with ATRAC9 as well.	6 years ago
Philip Langdale	5de4f1d871	avutil: Add NV24 and NV42 pixel formats These are the 4:4:4 variants of the semi-planar NV12/NV21 formats. These formats are not used much, so we've never had a reason to add them until now. VDPAU recently added support for HEVC 4:4:4 content, and when you use the OpenGL interop, the returned surfaces are in NV24 format, so we need the pixel format for media players, even if there's no direct use within ffmpeg. Separately, there are apparently webcams that use NV24, but I've never seen one.	6 years ago
ManojGuptaBonda	d617d54efa	avutil/hwcontext_vdpau: Map 444 pix fmts to new VdpYCbCr types New VdpYCbCr Formats VDP_YCBCR_FORMAT_Y_U_V_444 and, VDP_YCBCR_FORMAT_Y_UV_444 have been added in VDPAU with libvdpau-1.2 to be used in get/putbits for YUV 4:4:4 surfaces. Earlier mapping of AV_PIX_FMT_YUV444P to VDP_YCBCR_FORMAT_YV12 is not valid. Hence this Change maps AV_PIX_FMT_YUV444P to VDP_YCBCR_FORMAT_Y_U_V_444 to access the YUV 4:4:4 surface via read-back API's of VDPAU.	6 years ago
Linjie Fu	2d81acaa1a	lavu/hwcontext_qsv: Fix the realign check for hwupload Fix the aligned check in hwupload, input surface should be 16 aligned too. Partly fix #7830. Signed-off-by: Linjie Fu <linjie.fu@intel.com> Signed-off-by: Zhong Li <zhong.li@intel.com>	6 years ago
Michael Niedermayer	6f0e9a8634	avutil/avstring: Fix bug and undefined behavior in av_strncasecmp() The function in case of n=0 would read more bytes than 0. The end pointer could be beyond the allocated space, which is undefined. Reviewed-by: Paul B Mahol <onemda@gmail.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	6 years ago
Carl Eugen Hoyos	a24a1523e8	lavu/hwcontext_d3d: Cast src pointers calling av_image_copy*(). Silences several warnings: libavutil/hwcontext_d3d11va.c:413:49: warning: passing argument 3 of ‘av_image_copy’ from incompatible pointer type libavutil/hwcontext_d3d11va.c:425:47: warning: passing argument 3 of ‘av_image_copy’ from incompatible pointer type libavutil/hwcontext_dxva2.c:351:45: warning: passing argument 3 of ‘av_image_copy’ from incompatible pointer type libavutil/hwcontext_dxva2.c:382:52: warning: passing argument 3 of ‘av_image_copy_uc_from’ from incompatible pointer type	6 years ago
Gyan Doshi	3bef1dab6e	avutil/colorspace: add macros for RGB->YUV BT.709	6 years ago
Carl Eugen Hoyos	5ba769214f	lavu/hwcontext_qsv: Mark a pointer as const. Silences a warning: libavutil/hwcontext_qsv.c:912:15: warning: assignment discards 'const' qualifier from pointer target type	6 years ago
Martin Storsjö	c4642788e8	time_internal: Prefix fallback versions of gmtime_r/localtime_r with ff_ Use a macro to redirect calling code from the official name to the ff_ prefixed one. Detecting these functions in configure can be tricky (on mingw, they are conditionally available depending on posix feature defines). If configure didn't detect them, but they still are visible at compile time (due to an unrelated header defining the posix feature defines), providing the local fallback versions with a prefixed name is safer. Signed-off-by: Martin Storsjö <martin@martin.st>	6 years ago
Michael Niedermayer	9485cce6d5	time_internal: Do not attempt to override *time_r() macros In case these already are defined as macros, we shouldn't try to redefine them. Signed-off-by: Martin Storsjö <martin@martin.st>	6 years ago
fumoboy007	036b4b0f85	avcodec/videotoolbox: add support for 10bit pixel format this patch was originally posted on issue #7704 and was slightly adjusted to check for the availability of the pixel format.	6 years ago
Jarek Samic	1c50d61a5a	libavutil/hwcontext_opencl: Fix channel order in format support check The `opencl_get_plane_format` function was incorrectly determining the value used to set the image channel order. This resulted in all RGB pixel formats being set to the `CL_RGBA` pixel format, regardless of whether or not they actually were RGBA. This patch fixes the issue by using the `offset` and depth of components rather than the loop index to determine the value of `order`. Signed-off-by: Jarek Samic <cldfire3@gmail.com> Signed-off-by: Mark Thompson <sw@jkqxz.net>	6 years ago
Philip Langdale	52d8f35b14	avutil/hcontext_cuda: Remove unnecessary stream synchronisation Similarly to the previous changes, we don't need to synchronise after a memcpy to device memory. On the other hand, we need to keep synchronising after a copy to host memory, otherwise there's no guarantee that subsequent host reads will return valid data.	6 years ago
Ruiling Song	61cb505d18	lavu/opencl: replace va_ext.h with standard name Khronos OpenCL header (https://github.com/KhronosGroup/OpenCL-Headers) uses cl_va_api_media_sharing_intel.h. And Intel's official OpenCL driver for Intel GPU (https://github.com/intel/compute-runtime) was compiled against Khronos OpenCL header. So it's better to align with Khronos. Signed-off-by: Ruiling Song <ruiling.song@intel.com>	6 years ago
Zhong Li	15d016be30	lavu/qsv: allow surface size larger than requirement Just like commit `6829a07944`, surface size larger than requirement should not be treated as error. Signed-off-by: Zhong Li <zhong.li@intel.com>	6 years ago
gxw	4571c7c05d	avcodec/mips: [loongson] mmi optimizations for VP9 put and avg functions VP9 decoding speed improved about 60.5%(from 38fps to 61fps, tested on loongson 3A3000). Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	6 years ago
Philip Langdale	96d79ff5b5	avutil/cuda_check: Fix non-dynamic-loader implementation The function typedefs we were using are only present when using the dynamic loader, which means compilation breaks for code directly using the cuda SDK. To fix this, let's just duplicate the function typedefs locally. These are not going to change.	6 years ago
Timo Rothenpieler	15c6390139	avutil/cuda_check: avoid pointlessly exporting same symbol from two libraries	6 years ago
Carl Eugen Hoyos	0cac68bcf9	lavu/parseutils: Allow to parse >= 100 hours. Reported and tested by gamnark. Fixes ticket #7721.	6 years ago
Lauri Kasanen	fc6022e108	avutil/ppc/cpu: Fix power8 linux detection The existing code was in no released kernel that I can see. The corrected code was added in 3.9.	6 years ago
Carl Eugen Hoyos	73d4efc596	lavu/imgutils: Use FFABS() instead of abs() for ptrdiff_t. Fixes a warning with clang: libavutil/imgutils.c:314:16: warning: absolute value function 'abs' given an argument of type 'ptrdiff_t' (aka 'long') but has parameter of type 'int' which may cause truncation of value	6 years ago
Martin Storsjö	41cf3e3b1c	arm: Create proper .rdata sections for COFF As .rodata isn't one of the default created sections for COFF, it was created as a read-write data section. By using the default .rdata section name for COFF, it automatically becomes a read-only data section. The existing ".section .rodata" works as intended for ELF though. This is based on an original patch and diagnosis by Tom Tan <Tom.Tan@microsoft.com>. Signed-off-by: Martin Storsjö <martin@martin.st>	6 years ago
Shiyou Yin	6d19164811	avcodec/mips: [loongson] optimize put_hevc_qpel_hv_8 with mmi. Optimize put_hevc_qpel_hv_8 with mmi in the case width=4/8/12/16/24/32/48/64. This optimization improved HEVC decoding performance 11%(1.81x to 2.01x, tested on loongson 3A3000). Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	6 years ago
Michael Niedermayer	f64c0dffa1	avutil/imgutils: Optimize memset_bytes() by using av_memcpy_backptr() This is strongly based on code by Marton Balint, and depends on the previous commit Fixes: Timeout Fixes: 11502/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920 Before: Executed clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920 in 11209 ms After: Executed clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920 in 4104 ms Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg Reviewed-by: Marton Balint <cus@passwd.hu> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	6 years ago
Michael Niedermayer	12b1338be3	avutil/mem: Optimize fill32() by unrolling and using 64bit Reviewed-by: Marton Balint <cus@passwd.hu> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	6 years ago
Carl Eugen Hoyos	399c8e860f	lavu/frame: Fix typo.	6 years ago
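
The transform sizes listed in commit b79b29ddb1 appear to follow a simple pattern: a factor of 1, 3, 5 or 15 multiplied by a power of two that is at least 4. The Python sketch below enumerates sizes under that assumption and compares them with the list quoted in the commit message. The rule is inferred from the listed numbers only, not taken from the libavutil implementation, and `supported_sizes` is a hypothetical helper name.

```python
# Sizes quoted in the commit message of b79b29ddb1 (up to 2560).
LISTED = [
    4, 8, 12, 16, 20, 24, 32, 40, 48, 60, 64, 80, 96, 120, 128, 160,
    192, 240, 256, 320, 384, 480, 512, 640, 768, 960, 1024, 1280,
    1536, 1920, 2048, 2560,
]


def supported_sizes(limit):
    """Enumerate factor * M for factor in {1, 3, 5, 15} and M = 4, 8, 16, ...

    Assumption: this rule is inferred from the sizes listed in the commit
    message; the real library may accept or reject other lengths.
    """
    sizes = set()
    for factor in (1, 3, 5, 15):
        m = 4  # smallest power of two consistent with the listed sizes
        while factor * m <= limit:
            sizes.add(factor * m)
            m *= 2
    return sorted(sizes)


if __name__ == "__main__":
    # Compare the inferred rule with the list from the commit message.
    print(supported_sizes(2560) == LISTED)
```

Under this reading, 384 is 3x128 and 640 is 5x128, matching the `compound_fft_3x128` and `compound_fft_5x128` benchmark entries in the same commit message.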

4941 Commits (493438fafc5c43b7b7c62bf0c21b7cc884034ce9)