|
ssize_t | vlc_towc (const char *str, uint32_t *restrict pwc) |
| Decodes a code point from UTF-8.
|
|
static const char * | IsUTF8 (const char *str) |
| Checks UTF-8 validity.
|
|
static const char * | IsASCII (const char *str) |
| Checks ASCII validity.
|
|
static char * | EnsureUTF8 (char *str) |
| Removes non-UTF-8 sequences.
|
|
int | utf8_vfprintf (FILE *stream, const char *fmt, va_list ap) |
| Formats a UTF-8 string like vfprintf(), then prints it, converting to the local encoding as appropriate.
|
|
int | utf8_fprintf (FILE *, const char *,...) |
| Formats a UTF-8 string like fprintf(), then prints it, converting to the local encoding as appropriate.
|
|
char * | vlc_strcasestr (const char *, const char *) |
| Looks for a UTF-8 string within another one in a case-insensitive fashion.
|
|
char * | FromCharset (const char *charset, const void *data, size_t data_size) |
| Converts a string from the given character encoding to UTF-8.
|
|
void * | ToCharset (const char *charset, const char *in, size_t *outsize) |
| Converts a nul-terminated UTF-8 string to a given character encoding.
|
|
static char * | FromLatin1 (const char *latin) |
| Converts a nul-terminated string from ISO-8859-1 to UTF-8.
|
|
static char * EnsureUTF8 (char *str)
inline static
Removes non-UTF-8 sequences.
Replaces invalid or over-long UTF-8 byte sequences within a null-terminated string with question marks, so that the string can be printed at least partially.
- Warning
- Do not use this where correctness is critical. Use IsUTF8() and handle the error case instead. This function is mainly for display or debugging.
- Note
- Converting from Latin-1 to UTF-8 in place is not possible because the string would grow, so it is not attempted even though it would otherwise be less disruptive.
- Return values
-
str | the string is a valid null-terminated UTF-8 sequence (i.e. no changes were made) |
NULL | the string is not a valid UTF-8 sequence (the offending bytes were replaced with question marks) |
References likely and vlc_towc().
Referenced by filename_sanitize(), input_item_SetURI(), and InputMetaUser().
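The in-place replacement described above can be sketched as follows. This is an illustrative approximation, not the VLC source: `sketch_towc()` and `sketch_ensure_utf8()` are hypothetical names, the decoder is a stand-in that only mirrors the return contract documented for vlc_towc(), and `ssize_t` is assumed to come from the POSIX header `<sys/types.h>`.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>  /* ssize_t (POSIX; an assumption of this sketch) */

/* Hypothetical stand-in mirroring the documented vlc_towc() contract:
 * -1 = invalid, 0 = terminator, 1 = ASCII, 2-4 = non-ASCII. */
static ssize_t sketch_towc(const char *str, uint32_t *pwc)
{
    const unsigned char *p = (const unsigned char *)str;
    uint32_t cp, min;
    int len;

    if (p[0] < 0x80) {              /* ASCII, including the terminator */
        *pwc = p[0];
        return p[0] != 0;
    }
    if ((p[0] & 0xE0) == 0xC0)      { len = 2; cp = p[0] & 0x1F; min = 0x80; }
    else if ((p[0] & 0xF0) == 0xE0) { len = 3; cp = p[0] & 0x0F; min = 0x800; }
    else if ((p[0] & 0xF8) == 0xF0) { len = 4; cp = p[0] & 0x07; min = 0x10000; }
    else
        return -1;                  /* stray continuation or invalid lead byte */

    for (int i = 1; i < len; i++) {
        if ((p[i] & 0xC0) != 0x80)  /* also stops at a premature terminator */
            return -1;
        cp = (cp << 6) | (p[i] & 0x3F);
    }
    if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return -1;                  /* over-long, out of range, or surrogate */
    *pwc = cp;
    return len;
}

/* Overwrites every undecodable byte with '?'.
 * Returns str if nothing was changed, NULL otherwise. */
static char *sketch_ensure_utf8(char *str)
{
    char *ret = str;
    uint32_t cp;
    ssize_t n;

    while ((n = sketch_towc(str, &cp)) != 0) {
        if (n > 0)
            str += n;               /* valid code point: skip over it */
        else {
            *str++ = '?';           /* invalid byte: replace in place */
            ret = NULL;
        }
    }
    return ret;
}
```

Because each invalid byte is replaced one for one, the string never changes length, which is why the function can work in place. A caller that only needs to display the text can ignore the return value; a caller that cares about correctness should validate first, as the warning above advises.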
char * vlc_strcasestr (const char *haystack, const char *needle)
Looks for a UTF-8 string within another one in a case-insensitive fashion.
Beware that this is quite slow. Contrary to strcasestr(), this function works regardless of the system character encoding, and handles multibyte code points correctly.
- Parameters
-
haystack | string to look into |
needle | string to look for |
- Returns
- a pointer to the first occurrence of the needle within the haystack, or NULL if no occurrence was found.
References unlikely and vlc_towc().
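One way to search by code point rather than by byte is sketched below. It is a hedged illustration, not the library's implementation: `sketch_towc()` and `sketch_strcasestr()` are hypothetical names, the decoder is a stand-in for vlc_towc(), and case folding with `towlower()` is an assumption of this sketch (how far it folds non-ASCII letters depends on the C library and its current locale).

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>  /* ssize_t (POSIX; an assumption of this sketch) */
#include <wctype.h>     /* towlower */

/* Hypothetical stand-in mirroring the documented vlc_towc() contract:
 * -1 = invalid, 0 = terminator, 1 = ASCII, 2-4 = non-ASCII. */
static ssize_t sketch_towc(const char *str, uint32_t *pwc)
{
    const unsigned char *p = (const unsigned char *)str;
    uint32_t cp, min;
    int len;

    if (p[0] < 0x80) {              /* ASCII, including the terminator */
        *pwc = p[0];
        return p[0] != 0;
    }
    if ((p[0] & 0xE0) == 0xC0)      { len = 2; cp = p[0] & 0x1F; min = 0x80; }
    else if ((p[0] & 0xF0) == 0xE0) { len = 3; cp = p[0] & 0x0F; min = 0x800; }
    else if ((p[0] & 0xF8) == 0xF0) { len = 4; cp = p[0] & 0x07; min = 0x10000; }
    else
        return -1;                  /* stray continuation or invalid lead byte */

    for (int i = 1; i < len; i++) {
        if ((p[i] & 0xC0) != 0x80)  /* also stops at a premature terminator */
            return -1;
        cp = (cp << 6) | (p[i] & 0x3F);
    }
    if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return -1;                  /* over-long, out of range, or surrogate */
    *pwc = cp;
    return len;
}

/* Tries to match the needle at every code point boundary of the haystack. */
static char *sketch_strcasestr(const char *haystack, const char *needle)
{
    ssize_t s;

    do {
        const char *h = haystack, *n = needle;

        for (;;) {
            uint32_t cph, cpn;

            s = sketch_towc(n, &cpn);
            if (s == 0)
                return (char *)haystack;   /* whole needle matched */
            if (s < 0)
                return NULL;               /* needle is not valid UTF-8 */
            n += s;

            s = sketch_towc(h, &cph);
            if (s <= 0 || towlower(cph) != towlower(cpn))
                break;                     /* mismatch or end of haystack */
            h += s;
        }

        /* Advance the start position by one whole code point. */
        s = sketch_towc(haystack, &(uint32_t){0});
        haystack += s;
    } while (s > 0);

    return NULL;
}
```

The sketch decodes both strings again at every start position, which illustrates why a search of this kind is slow compared with a byte-wise `strstr()`. Advancing by whole code points also means a match can never begin in the middle of a multibyte sequence.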
ssize_t vlc_towc (const char *str, uint32_t *restrict pwc)
Decodes a code point from UTF-8.
Converts the first character in a UTF-8 sequence into a Unicode code point.
- Parameters
-
str | a UTF-8 byte sequence [IN] |
pwc | address of a location to store the code point [OUT] |
- Returns
- the number of bytes occupied by the decoded code point
- Return values
-
-1 | not a valid UTF-8 sequence |
0 | null character (i.e. str points to an empty string) |
1 | (non-null) ASCII character |
2-4 | non-ASCII character |
References likely and unlikely.
Referenced by EnsureUTF8(), IsUTF8(), print_desc(), vlc_str2keycode(), vlc_strcasestr(), vlc_swidth(), and vlc_xml_encode().
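The return-value contract above lends itself to a simple iteration pattern: advance by the returned byte count until 0 (end of string) or -1 (invalid input). The sketch below illustrates that pattern with a stand-in decoder. `sketch_towc()` and `sketch_count()` are hypothetical names and the decoder is an approximation written for this example, not the VLC source; `ssize_t` is assumed to come from the POSIX header `<sys/types.h>`.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>  /* ssize_t (POSIX; an assumption of this sketch) */

/* Hypothetical stand-in mirroring the documented vlc_towc() contract:
 * -1 = invalid, 0 = terminator, 1 = ASCII, 2-4 = non-ASCII. */
static ssize_t sketch_towc(const char *str, uint32_t *pwc)
{
    const unsigned char *p = (const unsigned char *)str;
    uint32_t cp, min;
    int len;

    if (p[0] < 0x80) {              /* ASCII, including the terminator */
        *pwc = p[0];
        return p[0] != 0;
    }
    if ((p[0] & 0xE0) == 0xC0)      { len = 2; cp = p[0] & 0x1F; min = 0x80; }
    else if ((p[0] & 0xF0) == 0xE0) { len = 3; cp = p[0] & 0x0F; min = 0x800; }
    else if ((p[0] & 0xF8) == 0xF0) { len = 4; cp = p[0] & 0x07; min = 0x10000; }
    else
        return -1;                  /* stray continuation or invalid lead byte */

    for (int i = 1; i < len; i++) {
        if ((p[i] & 0xC0) != 0x80)  /* also stops at a premature terminator */
            return -1;
        cp = (cp << 6) | (p[i] & 0x3F);
    }
    if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return -1;                  /* over-long, out of range, or surrogate */
    *pwc = cp;
    return len;
}

/* Counts code points; returns -1 if the string is not valid UTF-8. */
static ssize_t sketch_count(const char *str)
{
    ssize_t count = 0, n;
    uint32_t cp;

    while ((n = sketch_towc(str, &cp)) > 0) {
        str += n;                   /* step over one whole code point */
        count++;
    }
    return n < 0 ? -1 : count;
}
```

With this sketch, `sketch_count("h\xC3\xA9llo")` walks five code points even though the string occupies six bytes. The same loop shape works for any per-code-point processing, such as validation or width computation.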