mirror of https://github.com/FFmpeg/FFmpeg.git
Originally committed as revision 15316 to svn://svn.ffmpeg.org/ffmpeg/trunk
parent
70735a3f9e
commit
38d174b375
1 changed files with 98 additions and 0 deletions
@@ -0,0 +1,98 @@
The official guide to swscale for confused developers.
========================================================

Current (simplified) Architecture:
---------------------------------
                        Input
                          v
                   _______OR_________
                 /                   \
               /                       \
       special converter     [Input to YUV converter]
              |                         |
              |          (8bit YUV 4:4:4 / 4:2:2 / 4:2:0 / 4:0:0 )
              |                         |
              |                         v
              |                  Horizontal scaler
              |                         |
              |      (15bit YUV 4:4:4 / 4:2:2 / 4:2:0 / 4:1:1 / 4:0:0 )
              |                         |
              |                         v
              |          Vertical scaler and output converter
              |                         |
              v                         v
                         output


Swscale has 2 scaler paths. Each path must be capable of handling
slices, that is, consecutive non-overlapping rectangles of dimension
(0,slice_top) - (picture_width, slice_bottom).

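As a minimal sketch of the slice constraint, the check below verifies that a
list of slices is consecutive and non-overlapping. The Slice type and the
function name are hypothetical and not part of swscale; requiring the slices
to cover the whole picture is an additional assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one slice: picture rows [top, bottom). */
typedef struct Slice {
    int top;
    int bottom;
} Slice;

/* Returns 1 if the slices are non-empty, start at row 0, follow each other
 * without gap or overlap, and end at picture_height; returns 0 otherwise. */
int slices_are_valid(const Slice *s, size_t n, int picture_height)
{
    int next_row = 0;

    assert(picture_height > 0);
    for (size_t i = 0; i < n; i++) {
        if (s[i].top != next_row || s[i].bottom <= s[i].top)
            return 0;               /* gap, overlap or empty slice */
        next_row = s[i].bottom;
    }
    return next_row == picture_height;
}
```

For a 48-line picture, the slices (0,16), (16,40), (40,48) pass the check,
while (0,16), (12,48) fail it because the second slice overlaps the first.
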
special converter
These are generally unscaled converters of common
formats, like YUV 4:2:0/4:2:2 -> RGB15/16/24/32, though it could also
in principle contain scalers optimized for specific common cases.

Main path
The main path is used when no special converter can be used. The code
is designed as a destination line pull architecture. That is, for each
output line the vertical scaler pulls lines from a ring buffer. When a
line is not available in the ring buffer, it is pulled from the
horizontal scaler and input converter of the current slice.
When no more output can be generated because lines from the next slice
would be needed, all remaining lines in the current slice are converted,
horizontally scaled and put in the ring buffer.
[This is done for luma and chroma, each with possibly different numbers
of lines per picture.]

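The pull scheme described above can be sketched as a toy model. Everything
here is an assumption made for illustration: the names, the fixed WIDTH, the
ring size, the stand-in "horizontal scaler" (a plain 8 bit to 15 bit shift)
and the 2-tap vertical average. It is not the swscale implementation.

```c
#include <assert.h>
#include <stdint.h>

enum { WIDTH = 4, RING_LINES = 4 };  /* RING_LINES >= vertical filter length */

typedef struct Ring {
    int16_t line[RING_LINES][WIDTH]; /* horizontally scaled 15-bit lines */
    int newest;                      /* newest source line held, -1 if none */
    int hscale_calls;                /* how many lines were pulled so far */
} Ring;

/* Stand-in for "input converter + horizontal scaler": 8 bit -> 15 bit. */
void hscale_line(Ring *r, const uint8_t *src_row, int y)
{
    for (int x = 0; x < WIDTH; x++)
        r->line[y % RING_LINES][x] = (int16_t)(src_row[x] << 7);
    r->newest = y;
    r->hscale_calls++;
}

/* Process one slice holding source rows [top, bottom) of a src_h-row picture.
 * `slice` points at row `top`, `dst` at the whole output picture.
 * Output line y is the average of source lines y and y+1 (clamped).
 * Returns the index of the first output line not yet written. */
int process_slice(Ring *r, const uint8_t *slice, int top, int bottom,
                  int src_h, uint8_t *dst, int next_out)
{
    assert(0 <= top && top < bottom && bottom <= src_h);
    for (; next_out < src_h; next_out++) {
        int last = next_out + 1 < src_h ? next_out + 1 : src_h - 1;
        if (last >= bottom)
            break;                       /* needs a line of the next slice */
        while (r->newest < last) {       /* pull missing lines on demand */
            int y = r->newest + 1;
            hscale_line(r, slice + (y - top) * WIDTH, y);
        }
        const int16_t *a = r->line[next_out % RING_LINES];
        const int16_t *b = r->line[last % RING_LINES];
        for (int x = 0; x < WIDTH; x++)
            dst[next_out * WIDTH + x] = (uint8_t)((a[x] + b[x] + 128) >> 8);
    }
    /* No more output possible: buffer the rest of this slice for later. */
    while (r->newest < bottom - 1) {
        int y = r->newest + 1;
        hscale_line(r, slice + (y - top) * WIDTH, y);
    }
    return next_out;
}
```

Feeding a 4-line picture as slices (0,1) and (1,4) produces no output for the
first slice, whose single line is only buffered. The second call then emits
all four output lines while pulling each remaining source line exactly once.
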
Input to YUV Converter
When the input to the main path is not planar 8 bit per component YUV or
8 bit gray, it is converted to planar 8 bit YUV. Two sets of converters
currently exist for this: one performs horizontal downscaling by 2
before the conversion, the other keeps the full chroma resolution
but is slightly slower. The scaler will try to preserve full chroma
here when the output uses it. It is possible to force full chroma with
SWS_FULL_CHR_H_INP, even for cases where the scaler thinks it is
useless.

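The difference between the two converter sets can be sketched as follows for
packed RGB24 input. The function names are hypothetical, and the integer
coefficients are a commonly used BT.601 approximation chosen for the example,
not the constants swscale uses.

```c
#include <assert.h>
#include <stdint.h>

/* Widely used 8-bit BT.601 integer approximation (assumption, see above). */
uint8_t rgb_to_y(int r, int g, int b)
{
    return (uint8_t)(((66 * r + 129 * g + 25 * b + 128) >> 8) + 16);
}

/* The +128 chroma offset is added before the shift (as 128 << 8) so that
 * the shifted value is never negative. */
uint8_t rgb_to_u(int r, int g, int b)
{
    return (uint8_t)((-38 * r - 74 * g + 112 * b + 128 + (128 << 8)) >> 8);
}

uint8_t rgb_to_v(int r, int g, int b)
{
    return (uint8_t)((112 * r - 94 * g - 18 * b + 128 + (128 << 8)) >> 8);
}

/* Full chroma: one U/V sample per input pixel. Returns samples written. */
int rgb24_to_uv_full(const uint8_t *rgb, int width, uint8_t *u, uint8_t *v)
{
    for (int i = 0; i < width; i++) {
        const uint8_t *p = rgb + 3 * i;
        u[i] = rgb_to_u(p[0], p[1], p[2]);
        v[i] = rgb_to_v(p[0], p[1], p[2]);
    }
    return width;
}

/* Half chroma: average each horizontal pixel pair before the conversion,
 * so only half as many conversions are needed. Returns samples written. */
int rgb24_to_uv_half(const uint8_t *rgb, int width, uint8_t *u, uint8_t *v)
{
    assert(width % 2 == 0);
    for (int i = 0; i < width / 2; i++) {
        const uint8_t *p = rgb + 6 * i;
        int r = (p[0] + p[3] + 1) >> 1;
        int g = (p[1] + p[4] + 1) >> 1;
        int b = (p[2] + p[5] + 1) >> 1;
        u[i] = rgb_to_u(r, g, b);
        v[i] = rgb_to_v(r, g, b);
    }
    return width / 2;
}
```

The half-resolution variant does half the chroma conversions per line, which
is consistent with the full-resolution variant being described as slower.
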
Horizontal scaler
There are several horizontal scalers. A special case worth mentioning is
the fast bilinear scaler, which is made of runtime-generated MMX2 code
using specially tuned pshufw instructions.
The remaining scalers are specially tuned for various filter lengths.
They scale 8 bit unsigned planar data to 16 bit signed planar data.
Future >8 bit per component inputs will need to add a new scaler here
that preserves the input precision.

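A plain C sketch of a generic horizontal scaler of this kind is shown below.
The function name and parameter layout are assumptions for the example. It
uses the 1.0 point of 1<<14 for horizontal coefficients given in the filter
coefficients section, so that 8 bit input ends up as a 15 bit value stored in
16 bit signed output.

```c
#include <assert.h>
#include <stdint.h>

/* dst[i] = sum over j of src[filter_pos[i] + j] * filter[i * filter_size + j]
 * 8 bit sample * 14 bit coefficient = 22 bit; dropping 7 bits leaves 15 bit.
 * Negative sums rely on arithmetic right shift, as on common compilers. */
void hscale_sketch(int16_t *dst, int dst_w, const uint8_t *src,
                   const int16_t *filter, const int *filter_pos,
                   int filter_size)
{
    assert(filter_size > 0);
    for (int i = 0; i < dst_w; i++) {
        int val = 0;
        for (int j = 0; j < filter_size; j++)
            val += src[filter_pos[i] + j] * filter[filter_size * i + j];
        val >>= 7;
        /* Filters with overshoot can exceed 15 bit, so clip the result. */
        dst[i] = (int16_t)(val > 32767 ? 32767 : val);
    }
}
```

With a single coefficient of 1<<14, an input sample of 255 maps to 32640,
that is 255<<7.
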
Vertical scaler and output converter
There is a large number of combined vertical scalers + output converters.
Some are:
* unscaled output converters
* unscaled output converters that average 2 chroma lines
* bilinear converters (C, MMX and accurate MMX)
* arbitrary filter length converters (C, MMX and accurate MMX)
And
* Plain C 8bit 4:2:2 YUV -> RGB converters using LUTs
* Plain C 17bit 4:4:4 YUV -> RGB converters using multiplies
* MMX 11bit 4:2:2 YUV -> RGB converters
* Plain C 16bit Y -> 16bit gray
...

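As an illustration of the "arbitrary filter length" case with planar 8 bit
output, a plain C sketch could look like this. The function name and argument
layout are assumptions. The shift of 19 follows from 15 bit intermediate
samples times coefficients with a 1.0 point of 1<<12, reduced to 8 bit.

```c
#include <assert.h>
#include <stdint.h>

/* lines[j] is the j-th horizontally scaled 15-bit line used by this output
 * line, filter[j] its vertical coefficient (1.0 == 1 << 12). */
void vscale_sketch(uint8_t *dst, int width, const int16_t *const *lines,
                   const int16_t *filter, int filter_size)
{
    assert(filter_size > 0);
    for (int x = 0; x < width; x++) {
        int val = 1 << 18;                  /* rounding: half of 1 << 19 */
        for (int j = 0; j < filter_size; j++)
            val += lines[j][x] * filter[j];
        if (val < 0)
            val = 0;                        /* clip undershoot */
        val >>= 19;                         /* 15 + 12 - 19 = 8 bit */
        dst[x] = (uint8_t)(val > 255 ? 255 : val);
    }
}
```

Averaging a line of 255<<7 with a line of 0 using two coefficients of 2048
gives 128, and a single coefficient of 4096 maps 255<<7 back to 255.
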
RGB with less than 8 bit per component uses dither to improve the
subjective quality and low-frequency accuracy.


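To see why dither helps low-frequency accuracy, consider reducing an 8 bit
component to 5 bit. The sketch below uses a 2x2 ordered dither matrix. The
matrix, its size and the function name are assumptions for illustration and
not the tables swscale uses.

```c
#include <assert.h>
#include <stdint.h>

/* Offsets spread evenly over one 5-bit quantization step (8 levels). */
static const uint8_t dither_2x2[2][2] = { { 0, 4 }, { 6, 2 } };

/* Reduce an 8-bit component at pixel (x, y) to 5 bit with ordered dither. */
uint8_t dither_to_5bit(uint8_t v, int x, int y)
{
    int t;

    assert(x >= 0 && y >= 0);
    t = v + dither_2x2[y & 1][x & 1];
    if (t > 255)
        t = 255;
    return (uint8_t)(t >> 3);
}
```

A flat area of value 4 lies halfway between the 5 bit levels 0 and 1. Plain
truncation would map every pixel to 0, while the dithered version outputs 1
for two of every four pixels, so the local average is preserved.
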
Filter coefficients:
--------------------
There are several different scalers (bilinear, bicubic, lanczos, area, sinc, ...).
Their coefficients are calculated in initFilter().
Horizontal filter coefficients have a 1.0 point at 1<<14, vertical ones at 1<<12.
The 1.0 points have been chosen to maximize precision while leaving a
little headroom for convolutional filters like sharpening filters and
minimizing the SIMD instructions needed to apply them.
It would be trivial to use a different 1.0 point if some specific scaler
would benefit from it.
Also, as already hinted at, initFilter() accepts an optional convolutional
filter as input that can be used for contrast, saturation, blur, sharpening,
shift, chroma vs. luma shift, ...
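
A minimal sketch of turning floating point filter taps into fixed point
coefficients for a given 1.0 point is shown below. The function name is
hypothetical, and carrying the rounding error over to the next tap is one
possible way to make the coefficients sum to the 1.0 point. It is not a copy
of what initFilter() does.

```c
#include <assert.h>
#include <stdint.h>

/* Normalize n taps so that they sum to `one` (for example 1 << 14 or
 * 1 << 12) and round them, carrying each rounding error to the next tap. */
void quantize_filter(const double *coeff, int n, int one, int16_t *out)
{
    double sum = 0.0, error = 0.0;

    assert(n > 0);
    for (int i = 0; i < n; i++)
        sum += coeff[i];
    assert(sum != 0.0);

    for (int i = 0; i < n; i++) {
        double v = coeff[i] * one / sum + error;
        int iv = (int)(v >= 0.0 ? v + 0.5 : v - 0.5);  /* round to nearest */
        out[i] = (int16_t)iv;
        error = v - iv;
    }
}
```

Three equal taps with a 1.0 point of 1<<14 become 5461, 5462 and 5461, which
sum to exactly 16384, so a flat input keeps its level after filtering.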