|
| ssize_t | vlc_towc (const char *str, uint32_t *restrict pwc) |
| | Decodes a code point from UTF-8.
|
| |
| static const char * | IsUTF8 (const char *str) |
| | Checks UTF-8 validity.
|
| |
| static const char * | IsASCII (const char *str) |
| | Checks ASCII validity.
|
| |
| static char * | EnsureUTF8 (char *str) |
| | Removes non-UTF-8 sequences.
|
| |
| int | utf8_vfprintf (FILE *stream, const char *fmt, va_list ap) |
| | Formats a UTF-8 string like vfprintf(), then prints it, converting to the local encoding as appropriate.
|
| |
| int | utf8_fprintf (FILE *, const char *,...) |
| | Formats a UTF-8 string like fprintf(), then prints it, converting to the local encoding as appropriate.
|
| |
| char * | vlc_strcasestr (const char *, const char *) |
| | Looks for a UTF-8 string within another one in a case-insensitive fashion.
|
| |
| char * | FromCharset (const char *charset, const void *data, size_t data_size) |
| | Converts a string from the given character encoding to UTF-8.
|
| |
| void * | ToCharset (const char *charset, const char *in, size_t *outsize) |
| | Converts a nul-terminated UTF-8 string to a given character encoding.
|
| |
| static char * | FromLatin1 (const char *latin) |
| | Converts a nul-terminated string from ISO-8859-1 to UTF-8.
|
| |
| static inline char * EnsureUTF8 (char *str) |
Removes non-UTF-8 sequences.
Replaces invalid or over-long UTF-8 byte sequences within a null-terminated string with question marks, so that the string can be printed at least partially.
- Warning
- Do not use this where correctness is critical. Use IsUTF8() and handle the error case instead. This function is mainly for display or debugging.
- Note
- Converting from Latin-1 to UTF-8 in place is not possible because the string would grow, so it is not attempted even though it would otherwise be less disruptive.
- Return values
-
| str | the string is a valid null-terminated UTF-8 sequence (i.e. no changes were made) |
| NULL | the string was not a valid UTF-8 sequence (invalid bytes were replaced with question marks) |
References likely and vlc_towc().
Referenced by filename_sanitize(), input_item_SetURI(), and InputMetaUser().
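The replace-in-place behaviour described above can be illustrated with a minimal, self-contained sketch. It is not the VLC implementation: `sketch_ensure_utf8()` and `valid_seq_len()` are hypothetical names, and the choice to write one question mark per offending byte is an assumption made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: length (1-4) of the well-formed UTF-8 sequence
 * starting at s, or 0 if the bytes there are invalid. s must not point
 * at the terminating null. */
static size_t valid_seq_len(const unsigned char *s)
{
    size_t len;
    uint32_t cp = s[0], min;

    if (cp < 0x80)
        return 1;                       /* plain ASCII */
    if (cp >= 0xC2 && cp <= 0xDF)      { len = 2; cp &= 0x1F; min = 0x80; }
    else if (cp >= 0xE0 && cp <= 0xEF) { len = 3; cp &= 0x0F; min = 0x800; }
    else if (cp >= 0xF0 && cp <= 0xF4) { len = 4; cp &= 0x07; min = 0x10000; }
    else
        return 0;                       /* stray continuation or bad lead byte */

    for (size_t i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;                   /* truncated sequence (stops at the null) */
        cp = (cp << 6) | (s[i] & 0x3F);
    }
    /* Reject over-long forms, surrogates and values beyond U+10FFFF. */
    if (cp < min || (cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF)
        return 0;
    return len;
}

/* Sketch of the documented contract: returns str if nothing was changed,
 * NULL if at least one byte had to be replaced. ASSUMPTION: each invalid
 * byte becomes one '?', so the string never changes length. */
static char *sketch_ensure_utf8(char *str)
{
    assert(str != NULL);
    char *ret = str;
    unsigned char *p = (unsigned char *)str;

    while (*p != '\0') {
        size_t n = valid_seq_len(p);
        if (n > 0) {
            p += n;                     /* keep the valid sequence */
        } else {
            *p++ = '?';                 /* overwrite one offending byte */
            ret = NULL;
        }
    }
    return ret;
}
```

With this sketch, a Latin-1 buffer such as `"caf\xE9"` becomes `"caf?"` and the call returns NULL, while a valid UTF-8 buffer is returned untouched. Because bytes are only overwritten, never inserted, the buffer size stays the same, which matches the note about why an in-place Latin-1 conversion is not attempted.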
| char * vlc_strcasestr (const char *haystack, const char *needle) |
Looks for a UTF-8 string within another one in a case-insensitive fashion.
Beware that this is quite slow. Unlike strcasestr(), this function works regardless of the system character encoding and handles multibyte code points correctly.
- Parameters
-
| haystack | string to look into |
| needle | string to look for |
- Returns
- a pointer to the first occurrence of the needle within the haystack, or NULL if no occurrence was found.
References unlikely and vlc_towc().
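To show why a code-point-aware search differs from a byte-wise strcasestr(), here is a rough, self-contained sketch. It is not the VLC implementation. The names `sketch_strcasestr()`, `decode()` and `fold()` are hypothetical. The case folding covers only ASCII and the Latin-1 Supplement letters, the decoder is lenient (it does not reject over-long forms or surrogates), and returning NULL on malformed input is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical lenient decoder: bytes consumed (1-4), 0 at the end of the
 * string, -1 on a malformed sequence. */
static int decode(const char *s, uint32_t *cp)
{
    const unsigned char *p = (const unsigned char *)s;
    int len = p[0] < 0x80 ? 1
            : (p[0] & 0xE0) == 0xC0 ? 2
            : (p[0] & 0xF0) == 0xE0 ? 3
            : (p[0] & 0xF8) == 0xF0 ? 4 : -1;
    if (len < 0)
        return -1;
    *cp = len == 1 ? p[0] : (uint32_t)(p[0] & (0xFF >> (len + 1)));
    for (int i = 1; i < len; i++) {
        if ((p[i] & 0xC0) != 0x80)
            return -1;
        *cp = (*cp << 6) | (p[i] & 0x3F);
    }
    return p[0] != 0 ? len : 0;
}

/* ASSUMPTION: simplified folding for A-Z and U+00C0..U+00DE (except the
 * multiplication sign U+00D7); real Unicode case folding is far larger. */
static uint32_t fold(uint32_t cp)
{
    if (cp >= 'A' && cp <= 'Z')
        return cp + 0x20;
    if (cp >= 0xC0 && cp <= 0xDE && cp != 0xD7)
        return cp + 0x20;
    return cp;
}

/* Naive search: tries every code point boundary of the haystack as a start. */
static const char *sketch_strcasestr(const char *haystack, const char *needle)
{
    assert(haystack != NULL && needle != NULL);
    for (;;) {
        const char *h = haystack, *n = needle;
        uint32_t ch, cn;
        int lh, ln;

        for (;;) {
            ln = decode(n, &cn);
            if (ln < 0)
                return NULL;            /* malformed needle */
            if (ln == 0)
                return haystack;        /* whole needle matched */
            lh = decode(h, &ch);
            if (lh <= 0)
                return NULL;            /* malformed or exhausted haystack */
            if (fold(ch) != fold(cn))
                break;                  /* mismatch: try the next start */
            h += lh;
            n += ln;
        }
        lh = decode(haystack, &ch);
        if (lh <= 0)
            return NULL;
        haystack += lh;                 /* advance by one whole code point */
    }
}
```

Comparing decoded code points instead of bytes is what lets an upper-case "É" (bytes C3 89) match a lower-case "é" (bytes C3 A9). Restarting the comparison at every code point is also a plausible reason for the slowness the description warns about, since the cost grows with the product of the two string lengths.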
| ssize_t vlc_towc (const char *str, uint32_t *restrict pwc) |
Decodes a code point from UTF-8.
Converts the first character in a UTF-8 sequence into a Unicode code point.
- Parameters
-
| str | a UTF-8 byte sequence [IN] |
| pwc | address of a location to store the code point [OUT] |
- Returns
- the number of bytes occupied by the decoded code point
- Return values
-
| -1 | not a valid UTF-8 sequence |
| 0 | null character (i.e. str points to an empty string) |
| 1 | (non-null) ASCII character |
| 2-4 | non-ASCII character |
References likely and unlikely.
Referenced by EnsureUTF8(), IsUTF8(), print_desc(), vlc_str2keycode(), vlc_strcasestr(), vlc_swidth(), and vlc_xml_encode().
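The return-value contract above (-1, 0, 1, 2-4) can be made concrete with a small standalone decoder. This is a sketch written against the documented contract only, not the VLC source. `sketch_towc()` is a hypothetical name, `ssize_t` is taken from the POSIX header `<sys/types.h>`, and rejecting surrogates and values above U+10FFFF is an assumption based on standard UTF-8 rules.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>   /* ssize_t (POSIX) */

/* Hypothetical stand-in mirroring the documented vlc_towc() contract. */
static ssize_t sketch_towc(const char *str, uint32_t *restrict pwc)
{
    assert(str != NULL && pwc != NULL);
    const unsigned char *p = (const unsigned char *)str;
    uint32_t cp = p[0];
    size_t len;

    if (cp < 0x80) {                    /* ASCII, including the null character */
        *pwc = cp;
        return cp != 0;                 /* 0 for an empty string, otherwise 1 */
    }
    if (cp >= 0xC2 && cp <= 0xDF)      { len = 2; cp &= 0x1F; }
    else if (cp >= 0xE0 && cp <= 0xEF) { len = 3; cp &= 0x0F; }
    else if (cp >= 0xF0 && cp <= 0xF4) { len = 4; cp &= 0x07; }
    else
        return -1;                      /* stray continuation or invalid lead byte */

    for (size_t i = 1; i < len; i++) {
        if ((p[i] & 0xC0) != 0x80)
            return -1;                  /* truncated or malformed sequence */
        cp = (cp << 6) | (p[i] & 0x3F);
    }

    /* Smallest code point that legitimately needs len bytes. */
    static const uint32_t min_cp[] = { 0, 0, 0x80, 0x800, 0x10000 };
    if (cp < min_cp[len] || (cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF)
        return -1;                      /* over-long, surrogate or out of range */

    *pwc = cp;
    return (ssize_t)len;
}
```

A caller would typically loop with `while ((n = sketch_towc(s, &cp)) > 0) s += n;` and treat `n == -1` as a decoding error. The continuation-byte check stops at the terminating null, so a truncated sequence never causes a read past the end of the string.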