Character transcoding functions

This page applies to Harlequin v13.1r0 and later, and to Harlequin MultiRIP but not Harlequin Core.

The PFI provides the following character transcoding functions:

Convert a PlgFwTextString into Unicode

This function converts a PlgFwTextString of given length into Unicode and writes it into a pre-allocated Unicode string. Conversion stops early at the first zero PlgFwTextByte in the input, even if fewer than cbInLength bytes have been read.

The caller must ensure that the pre‐allocated buffer is large enough to receive the output; it should have room for as many 16‐bit Unicode characters as there are bytes (not characters) in the input string, plus one for the Unicode terminator. The function has the following parameters:

ptb : a pointer to the input string.
cbInLength : the number of bytes (not characters) in the input string.
puc : a pointer to the pre‐allocated buffer for the output Unicode string.
cucAllocSize : the size of the output buffer in 16‐bit Unicode characters.
fAllowInvalid : determines the behavior when an invalid encoding is encountered. If FALSE, an invalid encoding is treated as an error and zero is returned. If TRUE, the Unicode “Unknown Character” (0xFFFD) is substituted for each invalid character.

The function returns the number of Unicode characters written to puc, not including the zero terminator.


    uint32 PlgFwUniFromNTextStringToBuffer(
                  int32 fAllowInvalid, PlgFwTextByte * ptb, uint32 cbInLength, PlgFwUniChar * puc, uint32 cucAllocSize);
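
The sizing rule above (one 16-bit character per input byte, plus one for the terminator) can be sketched as a small helper. This is only an illustration: `ExampleUniChar` is a hypothetical stand-in for PlgFwUniChar, assumed here to be a 16-bit unsigned type, and both helper functions are invented for this example rather than taken from the PFI.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for PlgFwUniChar; assumed to be 16 bits wide. */
typedef uint16_t ExampleUniChar;

/* Number of 16-bit characters the output buffer needs: one per input
 * byte, plus one for the Unicode terminator. */
static uint32_t example_unicode_alloc_size(uint32_t cbInLength)
{
    assert(cbInLength < UINT32_MAX); /* guard against wrap-around */
    return cbInLength + 1u;
}

/* Allocate a zero-filled output buffer of that size. Returns NULL on
 * allocation failure; the caller releases the buffer with free(). */
static ExampleUniChar *example_alloc_unicode_buffer(uint32_t cbInLength,
                                                    uint32_t *pcucAllocSize)
{
    *pcucAllocSize = example_unicode_alloc_size(cbInLength);
    return calloc(*pcucAllocSize, sizeof(ExampleUniChar));
}
```

The resulting pointer and size would then be passed as puc and cucAllocSize to PlgFwUniFromNTextStringToBuffer, with the same cbInLength used for the input.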

Convert a Unicode string into a PlgFwStrRecord

This function converts a UTF-16 string (puc) of given length (cucInLength, specified in Unicode characters) into a PlgFwStrRecord (pRecord). The function returns the number of bytes inserted.

Any zero Unicode character will cause premature termination of the input.


    uint32 PlgFwStrNPutUnicode(
                  PlgFwStrRecord * pRecord, PlgFwUniChar * puc, uint32 cucInLength);

Convert a PlgFwTextString into an unterminated UTF-8 buffer

This function converts a PlgFwTextString (cbInLength bytes, pointed to by ptb) into UTF-8 and writes it to a UTF-8 buffer (putf8) whose length in bytes is given by cbAllocLength. The output is not zero-terminated.

The function returns the number of bytes written to the output buffer.

Note: In the current implementation, fAllowInvalid is ignored; instead, the function fails, returning 0, if an invalid encoding is encountered.


    uint32 PlgFwStrToUTF8Buffer(
                  int32 fAllowInvalid, PlgFwTextByte * ptb, uint32 cbInLength, PlgFwUTF8Char * putf8, uint32 cbAllocLength);

Convert a UTF-8 string of given length (in bytes) into a PlgFwStrRecord

This function converts a given length of a UTF-8 string (pointed to by putf8) and copies it into a PlgFwStrRecord (pRecord), allowing for a byte order mark (BOM) at the start of the UTF-8 string. Because the length to convert (cbInLength) is specified in bytes, not characters, take care not to supply a length that ends in the middle of a UTF-8 encoded character: this could corrupt a message and possibly lead to other undesirable behavior.

Any zero character in putf8 will cause premature termination of the input. The function returns the number of bytes inserted into pRecord.


    uint32 PlgFwStrNPutUTF8(
                  PlgFwStrRecord * pRecord, PlgFwUTF8Char * putf8, uint32 cbInLength);

Convert a UTF-16 buffer into an unterminated UTF-8 buffer

This function converts a given length of a UTF-16 buffer (cucInLength Unicode characters, pointed to by puc) into a UTF-8 buffer (putf8). cbAllocLength specifies the length, in bytes, of the destination buffer. The output is not zero-terminated.

The function returns the number of bytes written to the output buffer.


    uint32 PlgFwUniToUTF8Buffer(
                  PlgFwUniChar * puc, uint32 cucInLength, PlgFwUTF8Char * putf8, uint32 cbAllocLength);

Convert a UTF-8 buffer into a UTF-16 buffer

This function converts a given length of a UTF-8 buffer (cbInLength bytes, pointed to by putf8) to a UTF-16 buffer (puc), allowing for a byte order mark (BOM) at the start of the UTF-8 string. cucAllocSize specifies the length, in Unicode characters, of the destination buffer.

The function returns the number of Unicode characters written to the output buffer.


    uint32 PlgFwUniFromNUTF8ToBuffer(
                  PlgFwUTF8Char * putf8, uint32 cbInLength, PlgFwUniChar * puc, uint32 cucAllocSize);