Plugin SDK String Reference

From Real Software Documentation

REALBuildString

Allocates a REALbasic string object and copies "length" bytes from "contents" into it. If "contents" is NULL, the returned string still has "length" bytes allocated for storage. This function does not assign an encoding, so the returned string object has a nil encoding. To assign one, either call REALSetStringEncoding afterwards or use REALBuildStringWithEncoding instead. When you are done with the string, you must call REALUnlockString to free the memory allocated by this call.

REALstring REALBuildString(const char *contents, int length);

  • contents: a buffer of data to copy into the string object. This parameter can be NULL, and it can contain non-ASCII data.
  • length: the number of bytes (not characters) to allocate in the string object. If contents is non-NULL, this parameter also specifies the number of bytes to copy from contents into the string object.

Return Value The newly created REALstring object, or NULL on failure
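
The distinction between bytes and characters in the "length" parameter matters as soon as the data is not plain ASCII. The following is a minimal sketch that does not use the plugin SDK at all; utf8_code_points and kSample are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of the plugin SDK): counts the code points
   in a NUL-terminated UTF-8 buffer by skipping continuation bytes, which
   always have the bit pattern 10xxxxxx. */
static size_t utf8_code_points(const char *s)
{
    size_t count = 0;
    assert(s != NULL);
    for (; *s != '\0'; ++s) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            ++count;
    }
    return count;
}

/* "cafe" with an acute accent on the e, encoded as UTF-8:
   4 characters stored in 5 bytes. */
static const char kSample[] = "caf\xC3\xA9";
```

For this buffer, the value to pass as "length" is the byte count given by strlen( kSample ), which is 5, and not the character count of 4 reported by utf8_code_points.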


REALBuildStringWithEncoding

Allocates a REALbasic string object, copies "byteCount" bytes from "contents" into it, and then assigns "encoding" to the object. If "contents" is NULL, the returned string still has "byteCount" bytes allocated for storage. For more information about text encodings, see the entry for Text Encodings. When you are done with the string, you must call REALUnlockString to free the memory allocated by this call.

REALstring REALBuildStringWithEncoding( const char *contents, int byteCount, unsigned long encoding );

  • contents: a buffer of data to copy into the string object. This parameter can be NULL, and it can contain non-ASCII data.
  • byteCount: the number of bytes (not characters) to allocate in the string object. If contents is non-NULL, this parameter also specifies the number of bytes to copy from contents into the string object.
  • encoding: the text encoding value to assign to the returned string object.

Return Value the newly created REALstring object, or NULL on failure


REALGetStringContents

Retrieves a pointer to a REALbasic string object's internal data buffer. In most cases, the return value should be treated as a const void * (an immutable buffer of bytes). The only case in which you can safely modify the contents of this buffer is when the "str" parameter was created by calling REALBuildString (or REALBuildStringWithEncoding) with a NULL contents buffer. This function optionally returns the number of bytes that are valid in the returned buffer.

You must check the REALstring object's encoding to determine the format of the returned data. For instance, if the string's encoding is ASCII, the buffer is a "CString" (char *). If the string's encoding is UTF-16, the buffer is a "WString" (wchar_t * on platforms where wchar_t is 16 bits wide, such as Windows). You can assume that the returned buffer always has a null terminator appropriate to the string's encoding.

void *REALGetStringContents( REALstring str, size_t *numBytes );

  • str: the REALstring object to query
  • numBytes: (optional, out) the number of bytes which are valid in the returned buffer, not including the terminating null byte(s). This parameter can be NULL.

Return Value a pointer to the string object's internal data, or NULL on failure. See remarks for details about this returned pointer.
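
Because numBytes excludes the terminator and the terminator depends on the encoding, code that walks the buffer needs to know the width of one code unit. The sketch below is one way to derive it; terminator_width and code_unit_count are hypothetical helpers, and the assumption that the terminator is exactly one code unit wide (1, 2 or 4 bytes) is an interpretation of the description above, not something the SDK states explicitly.

```c
#include <assert.h>
#include <stddef.h>

/* Values copied from the Text Encodings entry; redeclared here only so the
   sketch compiles without the plugin SDK header. */
enum {
    kREALTextEncodingASCII = 0x0600,
    kREALTextEncodingUTF8  = 0x08000100,
    kREALTextEncodingUTF16 = 0x0100,
    kREALTextEncodingUTF32 = 0x0c000100
};

/* Hypothetical helper: size in bytes of one code unit, assumed to also be
   the size of the null terminator for that encoding. */
static size_t terminator_width(unsigned long encoding)
{
    switch (encoding) {
    case kREALTextEncodingUTF16: return 2;
    case kREALTextEncodingUTF32: return 4;
    default:                     return 1; /* ASCII, UTF-8, other byte-oriented encodings */
    }
}

/* Hypothetical helper: number of code units in a buffer whose valid length
   is numBytes, as reported through the numBytes out parameter. */
static size_t code_unit_count(size_t numBytes, unsigned long encoding)
{
    size_t width = terminator_width(encoding);
    assert(numBytes % width == 0); /* a valid buffer holds whole code units */
    return numBytes / width;
}
```

With these helpers, a UTF-16 string that reports 10 valid bytes holds 5 code units, followed by a 2-byte terminator under the stated assumption.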


REALGetStringEncoding

Retrieves the text encoding value associated with the given string. If a string does not have a text encoding value associated with it (commonly referred to as a nil encoding), then this function will return kREALTextEncodingUnknown (0xFFFF). For more information about possible return values, please see the entry for Text Encodings.

unsigned long REALGetStringEncoding(REALstring str);

  • str: the REALstring object to query

Return Value an integer value corresponding to the string's Text Encoding.


REALSetStringEncoding

Assigns the given text encoding value to the string object. This is akin to calling DefineEncoding in REALbasic code. For more information about encoding values, please see the entry for Text Encodings.

void REALSetStringEncoding(REALstring str, unsigned long encoding);

  • str: the REALstring object to assign the encoding to
  • encoding: the encoding value to assign to the string object

Return Value none


REALConvertString

Converts the given string from its current encoding to the one specified by the "encoding" parameter (akin to the ConvertEncoding function in REALbasic). This function allocates and returns an entirely new string, which you should free with a call to REALUnlockString. For more information about encoding values, please see the entry for Text Encodings.

String encoding conversion may not always succeed, because some characters cannot be mapped from one encoding to another. For instance, if you attempt to convert the Latin letter wynn (U+01BF) from UTF-8 to ASCII, the conversion fails because ASCII does not contain a code point for wynn. When the conversion fails, the results are OS-dependent. On Windows, for instance, the above conversion replaces U+01BF with U+003F, more commonly known as the "question mark."

REALstring REALConvertString(REALstring str, unsigned long encoding);

  • str: the original REALstring object to convert
  • encoding: the encoding value the resulting string should have

Return Value a REALstring object whose contents have been converted from the original encoding to the new encoding
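
The lossy behavior described for Windows can be pictured with a small stand-in. This is only a sketch of the substitution effect, not the conversion routine the framework or the OS actually uses; to_ascii_lossy is a hypothetical function that takes already-decoded code points.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a lossy conversion to ASCII. Code points above
   0x7F have no ASCII equivalent and are replaced with '?' (U+003F).
   "out" must have room for count + 1 bytes. Returns the number of
   substitutions that were made. */
static size_t to_ascii_lossy(const unsigned long *codePoints, size_t count, char *out)
{
    size_t i;
    size_t substitutions = 0;
    assert(codePoints != NULL && out != NULL);
    for (i = 0; i < count; ++i) {
        if (codePoints[i] <= 0x7F) {
            out[i] = (char)codePoints[i];
        } else {
            out[i] = '?';
            ++substitutions;
        }
    }
    out[count] = '\0';
    return substitutions;
}
```

Passing the three code points 'w', U+01BF and 'n' yields the bytes "w?n" and a substitution count of 1. A nonzero count is the signal that information was lost, which the converted string alone does not reveal.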


Beware of inadvertent memory leaks from code like the following:

// Do not do this! It leaks memory, because the reference to the original string is overwritten before it is unlocked.
someStr = REALConvertString( someStr, someEncoding );

Instead, your code should look like this:

// This is the correct way to convert the encoding and overwrite the original string.
REALstring temp = REALConvertString( someStr, someEncoding );
REALUnlockString( someStr );
someStr = temp;

REALWin32CodePageToEncoding

Converts a Windows "code page" identifier into a REALbasic encoding value. This encoding value can then be used in calls that define or convert encodings. For instance, to specify the ANSI Arabic (Windows-1256) encoding, call REALWin32CodePageToEncoding( 1256 ) to convert it into an encoding value REALbasic can work with. A list of common Win32 code pages can be found on MSDN.

unsigned long REALWin32CodePageToEncoding( unsigned long codePage );

  • codePage: the Win32 code page identifier to convert

Return Value the REALbasic encoding value based on the Win32 code page
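
To make the kind of mapping concrete, here is a table-driven sketch built from the code page comments in the Text Encodings list. It is not how REALWin32CodePageToEncoding is implemented, and the real function may cover more code pages; CodePageEntry, kCodePageTable and lookup_code_page are hypothetical names, and returning 0xFFFF for an unlisted code page is an assumption made for this sketch only.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    unsigned long codePage;  /* Win32 code page identifier */
    unsigned long encoding;  /* encoding value from the Text Encodings list */
} CodePageEntry;

/* A subset of the pairs documented in the Text Encodings entry. */
static const CodePageEntry kCodePageTable[] = {
    {  437, 0x0400 }, /* DOSLatinUS */
    {  850, 0x0410 }, /* DOSLatin1 */
    {  932, 0x0420 }, /* DOSJapanese */
    { 1250, 0x0501 }, /* WindowsLatin2 */
    { 1251, 0x0502 }, /* WindowsCyrillic */
    { 1252, 0x0500 }, /* WindowsLatin1 */
    { 1253, 0x0503 }, /* WindowsGreek */
    { 1254, 0x0504 }, /* WindowsLatin5 */
    { 1255, 0x0505 }, /* WindowsHebrew */
    { 1256, 0x0506 }, /* WindowsArabic */
    { 1257, 0x0507 }, /* WindowsBalticRim */
    { 1258, 0x0508 }  /* WindowsVietnamese */
};

/* Hypothetical lookup. Assumption: unlisted code pages map to 0xFFFF
   (the value of kREALTextEncodingUnknown); the behavior of the real SDK
   function for unknown code pages is not documented here. */
static unsigned long lookup_code_page(unsigned long codePage)
{
    size_t i;
    for (i = 0; i < sizeof kCodePageTable / sizeof kCodePageTable[0]; ++i) {
        if (kCodePageTable[i].codePage == codePage)
            return kCodePageTable[i].encoding;
    }
    return 0xFFFFUL;
}

/* Usage example: code page 1256 corresponds to WindowsArabic (0x0506). */
void code_page_example(void)
{
    assert(lookup_code_page(1256) == 0x0506UL);
}
```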


REALLockString

This method retains a reference to a REALbasic string object so that it will not be released by the framework's memory manager. You should lock a string any time your plugin holds a permanent reference to it. For instance, if you have a property setter which holds onto the string reference, then you should lock the string before exiting the method.

void REALLockString( REALstring str );

  • str: the string to lock

Return Value none


Beware of inadvertent memory leaks when "exchanging" locks on strings, like the following:

// Wrong (forgets to unlock old string)
REALLockString( s );
mStoredStringRef = s;

// Wrong (improper lock exchange order)
REALUnlockString( mStoredStringRef );
REALLockString( s );
mStoredStringRef = s;

// Correct
REALLockString( s );
REALUnlockString( mStoredStringRef );
mStoredStringRef = s;
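
The examples above do not say why the order matters. The likely reason is the case where s and mStoredStringRef refer to the same string: unlocking first can drop the last reference before the new lock is taken. The sketch below models that with a mock reference count; MockString, mock_lock, mock_unlock and the two setters are hypothetical and are not the SDK's implementation, and the mock only records a release instead of freeing memory.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted string. */
typedef struct {
    int refCount;
    int released;  /* set to 1 once the count has dropped to zero */
} MockString;

static void mock_lock(MockString *s)
{
    if (s != NULL)
        ++s->refCount;
}

static void mock_unlock(MockString *s)
{
    if (s != NULL && --s->refCount == 0)
        s->released = 1;  /* a real string would be destroyed here */
}

/* Correct order: lock the new reference, then unlock the old one. */
static void set_correct(MockString **stored, MockString *s)
{
    assert(stored != NULL);
    mock_lock(s);
    mock_unlock(*stored);
    *stored = s;
}

/* Wrong order: if s is the stored string and its count is 1, the unlock
   releases it before the lock can keep it alive. */
static void set_wrong_order(MockString **stored, MockString *s)
{
    assert(stored != NULL);
    mock_unlock(*stored);
    mock_lock(s);
    *stored = s;
}
```

Re-assigning the same string through set_correct leaves its count at 1 and never marks it released, while set_wrong_order marks it released even though the setter still stores it.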

REALUnlockString

This method releases a reference to a REALbasic string object so that the framework's memory manager can release it when appropriate. You should unlock a string any time your plugin no longer needs to hold a reference to it. For instance, if you have a property setter which holds onto the string reference, then you should unlock the old stored reference, since you no longer require it.

void REALUnlockString( REALstring str );

  • str: the string to unlock

Return Value none


Beware of inadvertent memory leaks when "exchanging" locks on strings, like the following:

// Wrong (forgets to unlock old string)
REALLockString( s );
mStoredStringRef = s;

// Wrong (improper lock exchange order)
REALUnlockString( mStoredStringRef );
REALLockString( s );
mStoredStringRef = s;

// Correct
REALLockString( s );
REALUnlockString( mStoredStringRef );
mStoredStringRef = s;

Text Encodings

In REALbasic, text encodings are represented by a TextEncoding object. A string has an associated TextEncoding (or nil, if there is no associated encoding). REALbasic plugins, however, do not deal with text encodings as objects. Instead, an encoding is specified by a 32-bit integer value, or "encoding value." When you query a string for its encoding, or assign one to a string, you generally use the encoding value instead of an object.

These values are Mac OS TextEncoding (or CFStringEncoding) values, which can be found in the Apple header files. Win32 code page identifiers can be converted into encoding values by way of REALWin32CodePageToEncoding.

The plugin SDK does not document the complete list of text encoding values. Instead, it provides the most common encodings as constants, as follows:


  • kREALTextEncodingUnknown (0xFFFF)
  • kREALTextEncodingASCII (0x0600)
  • kREALTextEncodingUTF8 (0x08000100)
  • kREALTextEncodingUTF16 (0x0100)
  • kREALTextEncodingUTF32 (0x0c000100)
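
One thing visible in these values is that the three Unicode constants share the same low 16 bits (0x0100), which matches the Mac OS TextEncoding convention of keeping the base encoding in the low 16 bits. The sketch below uses that observation; encoding_base, is_unicode_encoding and has_encoding are hypothetical helpers, not SDK functions, and the constants are redeclared locally with the values listed above.

```c
#include <assert.h>

/* Values copied from the list above; redeclared here only so the sketch
   compiles without the plugin SDK header. */
enum {
    kREALTextEncodingUnknown = 0xFFFF,
    kREALTextEncodingASCII   = 0x0600,
    kREALTextEncodingUTF8    = 0x08000100,
    kREALTextEncodingUTF16   = 0x0100,
    kREALTextEncodingUTF32   = 0x0c000100
};

/* Hypothetical helper: the low 16 bits of an encoding value. */
static unsigned long encoding_base(unsigned long encoding)
{
    return encoding & 0xFFFFUL;
}

/* Hypothetical helper: nonzero for values whose base is 0x0100, which
   covers the UTF-8, UTF-16 and UTF-32 constants above. */
static int is_unicode_encoding(unsigned long encoding)
{
    return encoding_base(encoding) == 0x0100UL;
}

/* Hypothetical helper: nonzero unless the value denotes a nil encoding. */
static int has_encoding(unsigned long encoding)
{
    return encoding != (unsigned long)kREALTextEncodingUnknown;
}

/* Usage example showing the expected classification of each constant. */
void encoding_constants_example(void)
{
    assert(is_unicode_encoding(kREALTextEncodingUTF8));
    assert(is_unicode_encoding(kREALTextEncodingUTF16));
    assert(is_unicode_encoding(kREALTextEncodingUTF32));
    assert(!is_unicode_encoding(kREALTextEncodingASCII));
    assert(!has_encoding(kREALTextEncodingUnknown));
    assert(has_encoding(kREALTextEncodingASCII));
}
```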

However, a more complete list of encoding values is:


  • MacRoman (0)
  • MacJapanese (1)
  • MacChineseTrad (2)
  • MacKorean (3)
  • MacArabic (4)
  • MacHebrew (5)
  • MacGreek (6)
  • MacCyrillic (7)
  • MacDevanagari (9)
  • MacGurmukhi (10)
  • MacGujarati (11)
  • MacOriya (12)
  • MacBengali (13)
  • MacTamil (14)
  • MacTelugu (15)
  • MacKannada (16)
  • MacMalayalam (17)
  • MacSinhalese (18)
  • MacBurmese (19)
  • MacKhmer (20)
  • MacThai (21)
  • MacLaotian (22)
  • MacGeorgian (23)
  • MacArmenian (24)
  • MacChineseSimp (25)
  • MacTibetan (26)
  • MacMongolian (27)
  • MacEthiopic (28)
  • MacCentralEurRoman (29)
  • MacVietnamese (30)
  • MacExtArabic (31) /* The following use script code 0, smRoman */
  • MacSymbol (33)
  • MacDingbats (34)
  • MacTurkish (35)
  • MacCroatian (36)
  • MacIcelandic (37)
  • MacRomanian (38)
  • MacCeltic (39)
  • MacGaelic (40)
  • UnicodeDefault (0x0100) /* Meta-value, should never appear in a table. */
  • ISOLatin1 (0x0201) /* ISO 8859-1 */
  • ISOLatin2 (0x0202) /* ISO 8859-2 */
  • ISOLatin3 (0x0203) /* ISO 8859-3 */
  • ISOLatin4 (0x0204) /* ISO 8859-4 */
  • ISOLatinCyrillic (0x0205) /* ISO 8859-5 */
  • ISOLatinArabic (0x0206) /* ISO 8859-6 (ASMO 708), = DOS CP 708 */
  • ISOLatinGreek (0x0207) /* ISO 8859-7 */
  • ISOLatinHebrew (0x0208) /* ISO 8859-8 */
  • ISOLatin5 (0x0209) /* ISO 8859-9 */
  • ISOLatin6 (0x020A) /* ISO 8859-10 */
  • ISOLatin7 (0x020D) /* ISO 8859-13, Baltic Rim */
  • ISOLatin8 (0x020E) /* ISO 8859-14, Celtic */
  • ISOLatin9 (0x020F) /* ISO 8859-15, 8859-1 changed for EURO & CP1252 letters */
  • DOSLatinUS (0x0400) /* code page 437 */
  • DOSGreek (0x0405) /* code page 737 (formerly code page 437G) */
  • DOSBalticRim (0x0406) /* code page 775 */
  • DOSLatin1 (0x0410) /* code page 850, "Multilingual" */
  • DOSGreek1 (0x0411) /* code page 851 */
  • DOSLatin2 (0x0412) /* code page 852, Slavic */
  • DOSCyrillic (0x0413) /* code page 855, IBM Cyrillic */
  • DOSTurkish (0x0414) /* code page 857, IBM Turkish */
  • DOSPortuguese (0x0415) /* code page 860 */
  • DOSIcelandic (0x0416) /* code page 861 */
  • DOSHebrew (0x0417) /* code page 862 */
  • DOSCanadianFrench (0x0418) /* code page 863 */
  • DOSArabic (0x0419) /* code page 864 */
  • DOSNordic (0x041A) /* code page 865 */
  • DOSRussian (0x041B) /* code page 866 */
  • DOSGreek2 (0x041C) /* code page 869, IBM Modern Greek */
  • DOSThai (0x041D) /* code page 874, also for Windows */
  • DOSJapanese (0x0420) /* code page 932, also for Windows; Shift-JIS with additions */
  • DOSChineseSimplif (0x0421) /* code page 936, also for Windows; was EUC-CN, now GBK (EUC-CN extended) */
  • DOSKorean (0x0422) /* code page 949, also for Windows; Unified Hangul Code (EUC-KR extended) */
  • DOSChineseTrad (0x0423) /* code page 950, also for Windows; Big-5 */
  • WindowsLatin1 (0x0500) /* code page 1252 */
  • WindowsANSI (0x0500) /* code page 1252 (alternate name) */
  • WindowsLatin2 (0x0501) /* code page 1250, Central Europe */
  • WindowsCyrillic (0x0502) /* code page 1251, Slavic Cyrillic */
  • WindowsGreek (0x0503) /* code page 1253 */
  • WindowsLatin5 (0x0504) /* code page 1254, Turkish */
  • WindowsHebrew (0x0505) /* code page 1255 */
  • WindowsArabic (0x0506) /* code page 1256 */
  • WindowsBalticRim (0x0507) /* code page 1257 */
  • WindowsVietnamese (0x0508) /* code page 1258 */
  • WindowsKoreanJohab (0x0510) /* code page 1361, for Windows NT */
  • US_ASCII (0x0600)
  • ShiftJIS (0x0A01) /* plain Shift-JIS*/
  • KOI8_R (0x0A02) /* Russian internet standard*/
  • MacRomanLatin1 (0x0A04) /* Mac OS Roman permuted to align with ISO Latin-1*/