Sagittarius Users' Reference

Unicode

[R6RS] The procedures exported by the (rnrs unicode (6)) library provide access to some aspects of the Unicode semantics for characters and strings: category information, case-independent comparisons, case mappings, and normalization.

Some of the procedures that operate on characters or strings ignore the difference between upper case and lower case. These procedures have "-ci" (for "case insensitive") embedded in their names.

Characters

Function char-upcase char

Function char-downcase char

Function char-titlecase char

Function char-foldcase char

[R6RS] These procedures take a character argument and return a character result. If the argument is an upper-case or title-case character, and if there is a single character that is its lower-case form, then char-downcase returns that character. If the argument is a lower-case or title-case character, and there is a single character that is its upper-case form, then char-upcase returns that character. If the argument is a lower-case or upper-case character, and there is a single character that is its title-case form, then char-titlecase returns that character. If the argument is not a title-case character and there is no single character that is its title-case form, then char-titlecase returns the upper-case form of the argument. Finally, if the character has a case-folded character, then char-foldcase returns that character. Otherwise the character returned is the same as the argument. For Turkic characters İ (#\x130) and ı (#\x131), char-foldcase behaves as the identity function; otherwise char-foldcase is the same as char-downcase composed with char-upcase.
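As a rough illustration of the rules above, the following sketch applies the four procedures to a few characters. The results in the comments are the ones the R6RS standard library report gives for these inputs (the Turkic case follows the identity rule stated above); it is assumed here that Sagittarius matches them. Non-ASCII characters are written as hex escapes only to keep the example encoding-independent.

```scheme
(import (rnrs))

;; Plain ASCII letter
(char-upcase #\i)        ; => #\I
(char-downcase #\I)      ; => #\i
(char-titlecase #\i)     ; => #\I
(char-foldcase #\I)      ; => #\i

;; #\xDF is the German sharp s; it has no single-character
;; upper-case form, so the argument is returned unchanged.
(char-upcase #\xDF)      ; => #\xDF

;; #\x3A3 is capital sigma, #\x3C3 small sigma, #\x3C2 final small sigma.
(char-downcase #\x3A3)   ; => #\x3C3
(char-upcase #\x3C2)     ; => #\x3A3
(char-foldcase #\x3C2)   ; => #\x3C3

;; Turkic dotted capital I: char-foldcase acts as the identity.
(char-foldcase #\x130)   ; => #\x130
```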

Function char-ci=? char1 char2 char3 ...

Function char-ci>? char1 char2 char3 ...

Function char-ci<? char1 char2 char3 ...

Function char-ci>=? char1 char2 char3 ...

Function char-ci<=? char1 char2 char3 ...

[R6RS] These procedures are similar to char=?, etc., but operate on the case-folded versions of the characters.
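A short sketch of the case-insensitive character comparisons. The expected results are taken from the examples in the R6RS standard library report and are assumed to hold in Sagittarius as well.

```scheme
(import (rnrs))

;; Both arguments fold to #\z, so they compare equal, not less.
(char-ci=? #\z #\Z)          ; => #t
(char-ci<? #\z #\Z)          ; => #f

;; Final small sigma (#\x3C2) and small sigma (#\x3C3)
;; share the same case-folded form.
(char-ci=? #\x3C2 #\x3C3)    ; => #t
```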

Function char-alphabetic? char

Function char-numeric? char

Function char-whitespace? char

Function char-upper-case? char

Function char-lower-case? char

Function char-title-case? char

[R6RS] These procedures return #t if their arguments are alphabetic, numeric, whitespace, upper-case, lower-case, or title-case characters, respectively; otherwise they return #f.

A character is alphabetic if it has the Unicode "Alphabetic" property. A character is numeric if it has the Unicode "Numeric" property. A character is whitespace if it has the Unicode "White_Space" property. A character is upper case if it has the Unicode "Uppercase" property, lower case if it has the "Lowercase" property, and title case if it is in the Lt general category.
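The predicates can be tried out as in the sketch below. The results shown are those listed in the R6RS standard library report for the same inputs; that Sagittarius agrees with them is an assumption.

```scheme
(import (rnrs))

(char-alphabetic? #\a)       ; => #t
(char-numeric? #\1)          ; => #t
(char-whitespace? #\space)   ; => #t
(char-whitespace? #\x00A0)   ; => #t  (no-break space has White_Space)

(char-upper-case? #\x3A3)    ; => #t  (Greek capital sigma)
(char-lower-case? #\x3C3)    ; => #t  (Greek small sigma)

;; Title case means general category Lt, not "starts a word".
(char-title-case? #\I)       ; => #f
(char-title-case? #\x01C5)   ; => #t  (a Latin digraph in category Lt)
```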

Function char-general-category char

[R6RS] Returns a symbol representing the Unicode general category of char, one of Lu, Ll, Lt, Lm, Lo, Mn, Mc, Me, Nd, Nl, No, Ps, Pe, Pi, Pf, Pd, Pc, Po, Sc, Sm, Sk, So, Zs, Zp, Zl, Cc, Cf, Cs, Co, or Cn.
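For illustration, a minimal sketch of querying the general category. The results are the standard Unicode categories of these characters and are assumed to be what Sagittarius returns.

```scheme
(import (rnrs))

(char-general-category #\a)         ; => Ll  (lower-case letter)
(char-general-category #\A)         ; => Lu  (upper-case letter)
(char-general-category #\1)         ; => Nd  (decimal digit)
(char-general-category #\space)     ; => Zs  (space separator)
(char-general-category #\x10FFFF)   ; => Cn  (unassigned)

;; The result is an ordinary symbol, so it can be compared with eq?.
(eq? (char-general-category #\a) 'Ll)   ; => #t
```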

Strings

Function string-upcase string start end

Function string-downcase string start end

Function string-titlecase string start end

Function string-foldcase string start end

[R6RS+][SRFI-13] These procedures take a string argument and return a string result. They are defined in terms of Unicode's locale-independent case mappings from Unicode scalar-value sequences to scalar-value sequences. In particular, the length of the result string can be different from the length of the input string. When the specified result is equal in the sense of string=? to the argument, these procedures may return the argument instead of a newly allocated string.

The string-upcase procedure converts a string to upper case; string-downcase converts a string to lower case.

The string-foldcase procedure converts the string to its case-folded counterpart, using the full case-folding mapping, but without the special mappings for Turkic languages.

The string-titlecase procedure converts the first cased character of each word via char-titlecase, and downcases all other cased characters.
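The string case conversions might be exercised as follows. The expected values come from the examples in the R6RS standard library report, on the assumption that Sagittarius follows them; the string "Stra\xDF;e" is the German word Straße with the sharp s written as a hex escape.

```scheme
(import (rnrs))

(string-upcase "Hi")            ; => "HI"
(string-downcase "Hi")          ; => "hi"
(string-foldcase "Hi")          ; => "hi"

;; The result can be longer than the input: the sharp s
;; upcases and folds to two characters.
(string-upcase "Stra\xDF;e")    ; => "STRASSE"
(string-foldcase "Stra\xDF;e")  ; => "strasse"

;; Only the first cased character of each word is title-cased;
;; all other cased characters are downcased.
(string-titlecase "kNock KNoCK")   ; => "Knock Knock"
(string-titlecase "r6rs")          ; => "R6rs"
```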

If the optional arguments start and end are given, they must be exact integers. The procedures first take the substring of the given string from start to end, and then convert that substring.
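A sketch of the optional start and end arguments, which are a Sagittarius extension to R6RS. The results in the comments are inferred from the description above (substring first, then convert), with end assumed to be exclusive as in substring; they have not been checked against a particular Sagittarius release.

```scheme
(import (rnrs))

;; Equivalent, by the description above, to
;; (string-upcase (substring "hello world" 0 5)).
(string-upcase "hello world" 0 5)      ; => "HELLO"

;; Only the selected range is returned, not the whole string.
(string-titlecase "hello world" 6 11)  ; => "World"
```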

Function string-ci=? string1 string2 string3 ...

Function string-ci>? string1 string2 string3 ...

Function string-ci<? string1 string2 string3 ...

Function string-ci>=? string1 string2 string3 ...

Function string-ci<=? string1 string2 string3 ...

[R6RS] These procedures are similar to string=?, etc., but operate on the case-folded versions of the strings.
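The case-insensitive string comparisons can be sketched like this. The expected results are those given in the R6RS standard library report and are assumed to carry over to Sagittarius. Because the comparison works on case-folded strings, strings of different lengths can compare equal.

```scheme
(import (rnrs))

(string-ci=? "z" "Z")                  ; => #t
(string-ci<? "z" "Z")                  ; => #f

;; "Stra\xDF;e" (Straße) folds to "strasse".
(string-ci=? "Stra\xDF;e" "Strasse")   ; => #t
(string-ci=? "Stra\xDF;e" "STRASSE")   ; => #t
```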

Function string-nfd string

Function string-nfkd string

Function string-nfc string

Function string-nfkc string

[R6RS] These procedures take a string argument and return a string result, which is the input string normalized to Unicode normalization form D, KD, C, or KC, respectively. When the specified result is equal in the sense of string=? to the argument, these procedures may return the argument instead of a newly allocated string.
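As an illustration of the difference between the decomposed and composed forms, the sketch below normalizes "é" written two ways: as the single character #\xE9, and as "e" followed by the combining acute accent #\x301. The expected results are the ones given in the R6RS standard library report, assumed to apply to Sagittarius too.

```scheme
(import (rnrs))

;; Form D decomposes the precomposed character.
(string-nfd "\xE9;")          ; => "\x65;\x301;"
(string-nfd "\x65;\x301;")    ; => "\x65;\x301;"

;; Form C composes the base letter and the combining accent.
(string-nfc "\xE9;")          ; => "\xE9;"
(string-nfc "\x65;\x301;")    ; => "\xE9;"

;; The two spellings differ under string=? until normalized.
(string=? "\xE9;" "\x65;\x301;")                            ; => #f
(string=? (string-nfc "\xE9;") (string-nfc "\x65;\x301;"))  ; => #t
```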