In hb_ot_tag_from_language(), if the first component of an unknown
language is three letters long, use it directly as the OpenType language
tag (after case conversion and padding).
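A minimal sketch of that fallback, not the actual HarfBuzz code. The
helper names make_tag() and tag_from_three_letter_language() are
hypothetical, and the sketch assumes uppercase conversion and padding
with a single space to fill the four-byte tag:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: pack four bytes into a 32-bit tag, first byte in
 * the most significant position (the layout OpenType tags use). */
static uint32_t make_tag(unsigned char a, unsigned char b,
                         unsigned char c, unsigned char d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
           ((uint32_t)c << 8)  |  (uint32_t)d;
}

/* ASCII-only checks, so the result does not depend on the C locale. */
static int is_ascii_alpha(unsigned char c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

static unsigned char ascii_upper(unsigned char c)
{
    return (c >= 'a' && c <= 'z') ? (unsigned char)(c - 'a' + 'A') : c;
}

/* Hypothetical fallback: if the first component of `lang` (up to the
 * first '-' or '_') is exactly three ASCII letters, return it
 * uppercased and space-padded as a tag. Returns 0 otherwise. */
static uint32_t tag_from_three_letter_language(const char *lang)
{
    size_t n = strcspn(lang, "-_");
    if (n != 3)
        return 0;
    for (size_t i = 0; i < 3; i++)
        if (!is_ascii_alpha((unsigned char)lang[i]))
            return 0;
    return make_tag(ascii_upper((unsigned char)lang[0]),
                    ascii_upper((unsigned char)lang[1]),
                    ascii_upper((unsigned char)lang[2]),
                    ' ');
}
```

Under these assumptions, "xyz-Latn" yields the tag 'XYZ ', while a
two-letter first component such as "en-US" is rejected.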
Still not sure about:
1) Case. We pass lowercase for now. It would be nice if Graphite used
uppercase three-letter tags like OpenType.
2) Padding. IMO, tag padding is always with spaces, but Martin was
talking about NUL bytes.
The length argument can be -1 for a NUL-terminated string. An explicit
length is useful for passing part of a larger string to a function
without having to copy or modify the string first.
Affected functions:
hb_tag_from_string()
hb_direction_from_string()
hb_language_from_string()
hb_script_from_string()
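A sketch of the length convention, using a hypothetical stand-in
function sketch_tag_from_string() rather than the real HarfBuzz
implementation. It assumes a four-byte tag padded with spaces, and
treats any negative length as "NUL-terminated":

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a *_from_string() function. Builds a 4-byte
 * tag from the first `len` bytes of `str`, padding with spaces.
 * A negative `len` (conventionally -1) means `str` is NUL-terminated
 * and its length is measured with strlen(). */
static uint32_t sketch_tag_from_string(const char *str, int len)
{
    unsigned char buf[4] = { ' ', ' ', ' ', ' ' };
    size_t n;

    if (str == NULL)
        return 0;

    n = (len < 0) ? strlen(str) : (size_t)len;
    if (n > 4)
        n = 4;

    /* Stop at an embedded NUL so an explicit length never reads past
     * the end of a shorter string. */
    for (size_t i = 0; i < n && str[i] != '\0'; i++)
        buf[i] = (unsigned char)str[i];

    return ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
           ((uint32_t)buf[2] << 8)  |  (uint32_t)buf[3];
}
```

With an explicit length, a caller can point into the middle of
"en-Latn-US" (offset 3, length 4) to read "Latn" without copying the
substring or writing a NUL into the original.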
As reported by Khaled on the list:
"After the introduction of canonical reordering of combining marks
(commit 34c22f8), I'm no longer able to do mark/mark substitution or
positioning for mark sequences that involve shadda as a first mark (or
most interesting sequences at least).
"After some digging, it turned out that shadda have a ccc=33 while most
Arabic marks that combine with it have a lower ccc value, which results
in the shadda being reordered after the other mark which,
unsurprisingly, breaks my contextual substitution and mkmk anchors."
See:
http://unicode.org/faq/normalization.html#8
http://unicode.org/faq/normalization.html#9
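To illustrate the reordering from the report above, here is a small
sketch of canonical reordering by combining class. It is not the
HarfBuzz normalizer: sketch_ccc() and sketch_canonical_reorder() are
hypothetical names, and the lookup only covers a handful of Arabic
marks, with ccc values taken from the Unicode Character Database.

```c
#include <stddef.h>
#include <stdint.h>

/* Canonical combining class for a few Arabic marks. Every other code
 * point is treated as a starter (ccc = 0) in this sketch. */
static unsigned sketch_ccc(uint32_t u)
{
    switch (u) {
    case 0x064E: return 30; /* ARABIC FATHA  */
    case 0x064F: return 31; /* ARABIC DAMMA  */
    case 0x0650: return 32; /* ARABIC KASRA  */
    case 0x0651: return 33; /* ARABIC SHADDA */
    case 0x0652: return 34; /* ARABIC SUKUN  */
    default:     return 0;
    }
}

/* Stable insertion sort of each run of marks by ascending ccc.
 * Starters (ccc = 0) never move and marks never move past them. */
static void sketch_canonical_reorder(uint32_t *text, size_t len)
{
    for (size_t i = 1; i < len; i++) {
        uint32_t u = text[i];
        unsigned c = sketch_ccc(u);
        size_t j = i;

        if (c == 0)
            continue;
        /* Strict '>' keeps marks of equal ccc in their original order. */
        while (j > 0 && sketch_ccc(text[j - 1]) > c) {
            text[j] = text[j - 1];
            j--;
        }
        text[j] = u;
    }
}
```

Given BEH, SHADDA, FATHA as input, the sketch produces BEH, FATHA,
SHADDA: the shadda ends up after the fatha, which is the order the
report says breaks contextual substitution and mkmk anchors that
expect the shadda first.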
For two reasons:
1. Users can always call hb_buffer_pre_allocate() themselves, and
2. We now pre-allocate in add_utfX, so the total number of reallocs is
limited to a small number (~3) anyway.
This just makes the API cleaner.