mbrtowc -- Convert Multibyte Character to Wide Character

Format

#include <wchar.h>
size_t mbrtowc (wchar_t *pwc, const char *string,
                size_t n, mbstate_t *ps);

Language Level: ANSI 93
mbrtowc is a restartable version of mbtowc, and performs the same function. It first determines the length of the multibyte character pointed to by string. It then converts the multibyte character to the corresponding wide character, and stores the converted character in the location pointed to by pwc, if pwc is not a null pointer. A maximum of n bytes are examined.

With mbrtowc, you can switch from one multibyte string to another, because the conversion state of each string is kept in the mbstate_t object that ps points to. On systems that support shift states, a zero-initialized object represents the initial shift state. If you read in only part of the string, mbrtowc updates the object to reflect the string's shift state at the point where you stopped. You can then call mbrtowc again for that string, passing the same ps, to continue reading where you left off.

Note: Because OS/2 and Windows code pages do not support shift states, the ps parameter is provided only for compatibility with other ANSI/ISO platforms. IBM C and C++ Compilers ignore the value passed for ps.
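
The restart behavior can be sketched as follows. This is a minimal illustration, assuming an implementation that honors ps as ISO C describes (the note above says that IBM C and C++ Compilers ignore it). The helper name decode_in_chunks and its parameters are hypothetical and are not part of the library. It hands the input to mbrtowc in pieces of at most chunk bytes and carries the same mbstate_t object across calls, so a character that is split between two pieces is completed by the later call.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper (not part of the library): converts the first len
   bytes of buf, giving mbrtowc at most `chunk` bytes per call.
   Returns the number of wide characters stored in out, or (size_t)-1
   if an encoding error occurs. */
size_t decode_in_chunks(const char *buf, size_t len, size_t chunk,
                        wchar_t *out, size_t outcap)
{
    mbstate_t st;
    size_t pos = 0, count = 0;

    assert(chunk > 0);
    memset(&st, 0, sizeof st);          /* initial conversion state */

    while (pos < len && count < outcap) {
        size_t avail = len - pos;
        size_t n = avail < chunk ? avail : chunk;
        wchar_t wc;
        size_t rc = mbrtowc(&wc, buf + pos, n, &st);

        if (rc == (size_t)-1)
            return (size_t)-1;          /* encoding error, errno is EILSEQ */
        if (rc == (size_t)-2) {
            pos += n;                   /* all n bytes were absorbed into st */
            continue;                   /* the next call resumes from st */
        }
        if (rc == 0)
            break;                      /* null wide character: stop here */
        out[count++] = wc;
        pos += rc;                      /* rc bytes completed the character */
    }
    return count;
}
```

If the input ends in the middle of a character, the sketch simply returns the characters completed so far; a caller that needs to detect truncation could test the state with mbsinit.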

The behavior of mbrtowc is affected by the LC_CTYPE category of the current locale.

Return Value
If string is a null pointer, mbrtowc returns 0.

If string is not a null pointer, mbrtowc returns the first of the following that applies. Because the return type is size_t, the values shown as -2 and -1 are returned as (size_t)-2 and (size_t)-1.

0
If the next n or fewer bytes complete a valid multibyte character that corresponds to the null wide character.
positive
If the next n or fewer bytes complete a valid multibyte character; the value returned is the number of bytes that complete the multibyte character.
-2
If the next n bytes form an incomplete (but potentially valid) multibyte character, and all n bytes have been processed.
-1
If an encoding error occurs (the next n or fewer bytes do not contribute to a complete and valid multibyte character). mbrtowc sets errno to EILSEQ.
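
The return values above might be handled as in the following sketch. The helper name count_mb_chars is hypothetical and is not part of the library. It walks a buffer one character at a time and tests for (size_t)-1 and (size_t)-2 before treating the result as a byte count, because both values are very large when viewed as a size_t.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper (not part of the library): counts the multibyte
   characters in the first len bytes of s, stopping at a null character.
   Returns the count, or -1 if the bytes are invalid or end in the
   middle of a character. */
long count_mb_chars(const char *s, size_t len)
{
    mbstate_t st;
    long count = 0;

    assert(s != NULL);
    memset(&st, 0, sizeof st);          /* initial conversion state */

    while (len > 0) {
        /* pwc is a null pointer: only the length is wanted */
        size_t rc = mbrtowc(NULL, s, len, &st);

        if (rc == 0)
            break;                      /* null wide character */
        if (rc == (size_t)-1 || rc == (size_t)-2)
            return -1;                  /* invalid or incomplete */
        s += rc;                        /* positive: bytes consumed */
        len -= rc;
        count++;
    }
    return count;
}
```

For a string that contains only single-byte characters, such as "abc" with a length of 3, the helper returns 3.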

Example
This example uses mbrlen to move to the second character in a string, then calls mbrtowc to convert the multibyte character to a wide character.

#include <wchar.h>
#include <stdio.h>
#include <stdlib.h>
#include <locale.h>
#define LOCNAME "ja_jp.ibm-932"
int main(void)
{
   char       mbs1[] = "abc";
   char       mbs2[] = "\x81\x41\x81\x42" "m";
   mbstate_t  ss1 = {0};   /* initial conversion state */
   mbstate_t  ss2 = {0};
   size_t     length1, length2;
   wchar_t    wc1, wc2;
   if (NULL == setlocale(LC_ALL, LOCNAME)) {
      printf("Locale \"%s\" could not be loaded\n", LOCNAME);
      exit(1);
   }
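   /* Measure the first character of each string. */
   length1 = mbrlen(mbs1, MB_CUR_MAX, &ss1);
   length2 = mbrlen(mbs2, MB_CUR_MAX, &ss2);
   if (length1 == (size_t)-1 || length1 == (size_t)-2 ||
       length2 == (size_t)-1 || length2 == (size_t)-2) {
      printf("The first character could not be measured\n");
      exit(1);
   }
   /* Skip the first character and convert the second one. */
   mbrtowc(&wc1, mbs1 + length1, MB_CUR_MAX, &ss1);
   mbrtowc(&wc2, mbs2 + length2, MB_CUR_MAX, &ss2);
   printf("The second character in mbs1 is: <%lc>\n", (wint_t)wc1);
   printf("The second character in mbs2 is: <%lc>\n", (wint_t)wc2);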
   return 0;
   /**********************************************************
      The output should be similar to:
      The second character in mbs1 is: <b>
      The second character in mbs2 is: < B>
   **********************************************************/
}


Related Information
mblen -- Determine Length of Multibyte Character
mbrlen -- Calculate Length of Multibyte Character
mbsrtowcs -- Convert Multibyte String to Wide-Character String
setlocale -- Set Locale
wcrtomb -- Convert Wide Character to Multibyte Character
wcsrtombs -- Convert Wide-Character String to Multibyte String
<locale.h>
<wchar.h>