Many applications deal with blocks of binary data by accessing them in various
ways, such as extracting signed or unsigned numbers of various sizes. Therefore, the
`(rnrs bytevectors (6))` library provides a single type for blocks of binary
data with multiple ways to access that data. It deals with integers and
floating-point representations in various sizes with specified endianness.

Bytevectors are objects of a disjoint type. Conceptually, a bytevector represents a sequence of 8-bit bytes. The description of bytevectors uses the term byte for an exact integer object in the interval {-128, ..., 127} and the term octet for an exact integer object in the interval {0, ..., 255}. A byte corresponds to its two's complement representation as an octet.

The length of a bytevector is the number of bytes it contains. This number is fixed. A valid index into a bytevector is an exact, non-negative integer object less than the length of the bytevector. The first byte of a bytevector has index 0; the last byte has an index one less than the length of the bytevector.

Generally, the access procedures come in different flavors according to the size of the represented integer and the endianness of the representation. The procedures also distinguish signed and unsigned representations. The signed representations all use two's complement.

Like string literals, literals representing bytevectors do not need to be quoted:

`#vu8(12 23 123)` => `#vu8(12 23 123)`
Library
(rnrs bytevectors (6))

[R6RS] This library provides a single type for blocks of binary data with multiple ways to access that data.

Macro
endianness
*symbol*

[R6RS] The name of *symbol* must be a symbol describing an endianness.
`(endianness symbol)` evaluates to the symbol named *symbol*.
Whenever one of the procedures operating on bytevectors accepts an endianness as
an argument, that argument must be one of these symbols. It is a syntax violation
for *symbol* to be anything other than an endianness symbol supported by Sagittarius.

Currently, Sagittarius supports these symbols: `big`, `little` and `native`.

Function
native-endianness

[R6RS] Returns the endianness symbol associated with the platform's endianness. The result is either the symbol `big` or the symbol `little`.
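
As a small illustration of the two forms above, the following sketch evaluates the `endianness` macro and queries the host byte order. The variable names `be`, `le` and `host` are chosen here for illustration only.

```scheme
(import (rnrs))

;; (endianness big) evaluates to the symbol big itself.
(define be (endianness big))
(define le (endianness little))

;; native-endianness reports the byte order of the host platform,
;; so its value depends on the machine running the code.
(define host (native-endianness))

(display (list be le host))
(newline)
```

Writing `(endianness big)` instead of `'big` lets a misspelled endianness name be rejected as a syntax violation rather than surfacing later at run time.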

Function
bytevector?
*obj*

[R6RS] Returns #t if *obj* is a bytevector, otherwise returns #f.

Function
make-bytevector
*k*
*:optional*
*fill*

[R6RS] Returns a newly allocated bytevector of *k* bytes.

If the *fill* argument is missing, the initial contents of the returned
bytevector are 0.

If the *fill* argument is present, it must be an exact integer object in the
interval {-128, ..., 255} that specifies the initial value for the bytes of the
bytevector: If *fill* is positive, it is interpreted as an octet; if it is
negative, it is interpreted as a byte.

Function
bytevector-length
*bytevector*

[R6RS] Returns, as an exact integer object, the number of bytes in
*bytevector*.

Function
bytevector=?
*bytevector1*
*bytevector2*

[R6RS] Returns #t if *bytevector1* and *bytevector2* are
equal, that is, if they have the same length and equal bytes at all valid
indices. It returns #f otherwise.
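
The constructor and predicates described so far can be exercised together. This is a minimal sketch; the names `zeros`, `from-octet` and `from-byte` are illustrative, and the zero-filled default relies on the behaviour stated above for an omitted *fill*.

```scheme
(import (rnrs))

;; Four bytes; per the description above, the contents are 0
;; when the fill argument is omitted.
(define zeros (make-bytevector 4))

;; 255 (read as an octet) and -1 (read as a byte) denote the same
;; bit pattern, so both calls build equal bytevectors.
(define from-octet (make-bytevector 4 255))
(define from-byte  (make-bytevector 4 -1))

(display (bytevector? zeros))                  ; prints #t
(newline)
(display (bytevector-length zeros))            ; prints 4
(newline)
(display (bytevector=? from-octet from-byte))  ; prints #t
(newline)
```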

Function
bytevector-fill!
*bytevector*
*fill*
*:optional*
*start*
*end*

[R6RS+] The *fill* argument is as in the description of the
`make-bytevector` procedure. The `bytevector-fill!` procedure stores
*fill* in every element of *bytevector* and returns unspecified values.
It is analogous to `vector-fill!`.

If the optional argument *start* or *end* is given, the procedure
fills only the range from index *start* (inclusive) to index *end* (exclusive)
of *bytevector*. When *end* is omitted, the length of the
given bytevector is used.

Function
bytevector-copy!
*source*
*source-start*
*target*
*target-start*
*k*

[R6RS] *Source* and *target* must be bytevectors. *Source-start*,
*target-start*, and *k* must be non-negative exact integer objects that satisfy

0 <= *source-start* <= *source-start* + *k* <= *source-length*

0 <= *target-start* <= *target-start* + *k* <= *target-length*

where *source-length* is the length of *source* and *target-length* is the length of *target*.

The `bytevector-copy!` procedure copies the bytes from *source* at indices
*source-start*, ..., *source-start* + *k* - 1
to consecutive indices in *target* starting at *target-start*.

This returns unspecified values.

Function
bytevector-copy
*bytevector*
*:optional*
*(start 0)*
*(end -1)*

[R6RS+] Returns a newly allocated copy of *bytevector*.

If optional argument *start* was given, the procedure copies from the given
*start* index.

If optional argument *end* was given, the procedure copies to the given
*end* index (exclusive).
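
The two copying procedures can be compared side by side. This sketch uses only the argument order *source*, *source-start*, *target*, *target-start*, *k* described above and the one-argument form of `bytevector-copy`; the names `src`, `dst` and `dup` are illustrative.

```scheme
(import (rnrs))

(define src #vu8(1 2 3 4 5 6 7 8))
(define dst (make-bytevector 8 0))

;; Copy k = 3 bytes from src indices 2, 3, 4 to dst indices 1, 2, 3.
(bytevector-copy! src 2 dst 1 3)
;; dst is now #vu8(0 3 4 5 0 0 0 0)

;; bytevector-copy returns a fresh bytevector, so mutating the copy
;; leaves the original untouched.
(define dup (bytevector-copy dst))
(bytevector-u8-set! dup 0 99)

(display (bytevector->u8-list dst))  ; prints (0 3 4 5 0 0 0 0)
(newline)
(display (bytevector->u8-list dup))  ; prints (99 3 4 5 0 0 0 0)
(newline)
```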

Function
bytevector-u8-ref
*bytevector*
*k*

Function
bytevector-s8-ref
*bytevector*
*k*

[R6RS] *K* must be a valid index of *bytevector*.

The `bytevector-u8-ref` procedure returns the byte at index *k* of
*bytevector*, as an octet.

The `bytevector-s8-ref` procedure returns the byte at index *k* of
*bytevector*, as a (signed) byte.

Function
bytevector-u8-set!
*bytevector*
*k*
*octet*

Function
bytevector-s8-set!
*bytevector*
*k*
*byte*

[R6RS] *K* must be a valid index of *bytevector*.

The `bytevector-u8-set!` procedure stores *octet* in element *k* of *bytevector*.

The `bytevector-s8-set!` procedure stores the two's-complement
representation of *byte* in element *k* of *bytevector*.

Both procedures return unspecified values.
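
The relationship between the octet and byte views of the same element can be seen in a short sketch. The values 200 and -56 are picked here because they share the bit pattern #xC8; the variable names are illustrative.

```scheme
(import (rnrs))

(define bv (make-bytevector 2 0))

;; Store the same bit pattern (#xC8) through both setters:
;; 200 as an octet, and -56 as a signed byte (200 - 256 = -56).
(bytevector-u8-set! bv 0 200)
(bytevector-s8-set! bv 1 -56)

;; Read each element back through the opposite accessor.
(define as-byte  (bytevector-s8-ref bv 0))  ; -56
(define as-octet (bytevector-u8-ref bv 1))  ; 200

(display (list as-byte as-octet))  ; prints (-56 200)
(newline)
```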

Function
bytevector->u8-list
*bytevector*

Function
u8-list->bytevector
*list*

[R6RS] *List* must be a list of octets.

The `bytevector->u8-list` procedure returns a newly allocated list of the
octets of *bytevector* in the same order.

The `u8-list->bytevector` procedure returns a newly allocated bytevector
whose elements are the elements of list *list*, in the same order. It is
analogous to `list->vector`.
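
A brief round trip shows both conversions. The names `octets`, `bv` and `round-trip` are illustrative.

```scheme
(import (rnrs))

;; Bytevector to list of octets, preserving order.
(define octets (bytevector->u8-list #vu8(1 2 255)))

;; List of octets to a newly allocated bytevector.
(define bv (u8-list->bytevector '(10 20 30)))

;; Converting there and back yields an equal list.
(define round-trip
  (bytevector->u8-list (u8-list->bytevector '(0 128 255))))

(display octets)      ; prints (1 2 255)
(newline)
(display round-trip)  ; prints (0 128 255)
(newline)
```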

[R6RS] *Size* must be a positive exact integer object.
*K*, ..., *k* + *size* - 1 must be valid indices of *bytevector*.

The `bytevector-uint-ref` procedure retrieves the exact integer object
corresponding to the unsigned representation of size *size* and specified
by *endianness* at indices *k*, ..., *k* + *size* - 1.

The `bytevector-sint-ref` procedure retrieves the exact integer object
corresponding to the two's-complement representation of size *size* and
specified by *endianness* at indices *k*, ..., *k* + *size* - 1.

For `bytevector-uint-set!`, *n* must be an exact integer object in the
interval *{0, ..., 256^size - 1}*. The `bytevector-uint-set!`
procedure stores the unsigned representation of
size *size* and specified by *endianness* into *bytevector* at indices
*k*, ..., *k* + *size* - 1.

For `bytevector-sint-set!`, *n* must be an exact integer object in the
interval *{-256^size / 2, ..., 256^size / 2 - 1}*. The
`bytevector-sint-set!` procedure stores the two's-complement representation of size
*size* and specified by *endianness* into *bytevector* at indices
*k*, ..., *k* + *size* - 1.

The `...-set!` procedures return unspecified values.
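
The arbitrary-size accessors can be illustrated as follows. This sketch assumes the standard R6RS argument order, that is, *bytevector*, *k*, *endianness*, *size* for the `-ref` procedures and *bytevector*, *k*, *n*, *endianness*, *size* for the `-set!` procedures; the variable names are illustrative.

```scheme
(import (rnrs))

(define bv (make-bytevector 4 0))

;; Store #x010203 as a 3-byte big-endian unsigned integer at index 0.
(bytevector-uint-set! bv 0 #x010203 (endianness big) 3)
;; bv is now #vu8(1 2 3 0)

;; The same three bytes read with either byte order.
(define as-big    (bytevector-uint-ref bv 0 (endianness big) 3))     ; #x010203
(define as-little (bytevector-uint-ref bv 0 (endianness little) 3))  ; #x030201

;; Overwrite the first two bytes with -2 in two's complement (#xFFFE).
(bytevector-sint-set! bv 0 -2 (endianness big) 2)
(define signed   (bytevector-sint-ref bv 0 (endianness big) 2))  ; -2
(define unsigned (bytevector-uint-ref bv 0 (endianness big) 2))  ; #xFFFE

(display (list as-big as-little signed unsigned))
(newline)
```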

[R6RS] *Size* must be a positive exact integer object. For
`uint-list->bytevector`, *list* must be a list of exact integer objects
in the interval *{0, ..., 256^size - 1}*. For `sint-list->bytevector`,
*list* must be a list of exact integer objects in the interval
*{-256^size / 2, ..., 256^size / 2 - 1}*. The length of *bytevector* or, respectively, of *list* must be divisible by *size*.

These procedures convert between lists of integer objects and their consecutive
representations according to *size* and *endianness* in the *bytevector* objects in the same way as `bytevector->u8-list`
and `u8-list->bytevector` do for one-byte representations.
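
The list conversions can be sketched with an eight-byte input split into 2-byte little-endian integers. The procedure names `bytevector->uint-list` and `bytevector->sint-list` and the argument order (*bytevector* or *list*, then *endianness*, then *size*) are taken from R6RS rather than from the text above; the variable names are illustrative.

```scheme
(import (rnrs))

(define bv (u8-list->bytevector '(1 2 3 255 1 2 1 2)))

;; Each pair of bytes is one little-endian 16-bit integer:
;; (1 2) -> 1 + 2*256 = 513, (3 255) -> 3 + 255*256 = 65283.
(define unsigned (bytevector->uint-list bv (endianness little) 2))

;; Read as two's complement, 65283 becomes 65283 - 65536 = -253.
(define signed (bytevector->sint-list bv (endianness little) 2))

;; The reverse direction rebuilds the bytes.
(define rebuilt (uint-list->bytevector '(513 65283) (endianness little) 2))

(display unsigned)  ; prints (513 65283 513 513)
(newline)
(display signed)    ; prints (513 -253 513 513)
(newline)
```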

[R6RS] *K* must be a valid index of *bytevector*; so must *k* + 1.
For `bytevector-u16-set!` and `bytevector-u16-native-set!`, *n* must be an exact integer object in the interval *{0, ..., 2^16 - 1}*.
For `bytevector-s16-set!` and `bytevector-s16-native-set!`, *n* must be an exact integer object in the interval *{-2^15, ..., 2^15 - 1}*.

These retrieve and set two-byte representations of numbers at indices *k* and *k* + 1, according to the endianness specified by *endianness*.
The procedures with `u16` in their names deal with the unsigned representation;
those with `s16` in their names deal with the two's-complement representation.

The procedures with `native` in their names employ the native endianness,
and work only at aligned indices: *k* must be a multiple of 2.

The `...-set!` procedures return unspecified values.
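
A short sketch of the 16-bit accessors follows. It assumes the R6RS signatures, where the `-ref` procedures take *bytevector*, *k*, *endianness*, the `-set!` procedures take *bytevector*, *k*, *n*, *endianness*, and the `native` variants omit the endianness argument; the variable names are illustrative.

```scheme
(import (rnrs))

(define bv (u8-list->bytevector '(#x12 #x34 #xff #xfe)))

;; The same two bytes under each byte order.
(define u16-be (bytevector-u16-ref bv 0 (endianness big)))     ; #x1234
(define u16-le (bytevector-u16-ref bv 0 (endianness little)))  ; #x3412

;; Bytes #xFF #xFE read big-endian: unsigned 65534, signed -2.
(define hi-unsigned (bytevector-u16-ref bv 2 (endianness big)))
(define hi-signed   (bytevector-s16-ref bv 2 (endianness big)))

;; The native variant agrees with an explicit native endianness.
;; Index 0 is a multiple of 2, as the alignment rule requires.
(define native-ok?
  (= (bytevector-u16-native-ref bv 0)
     (bytevector-u16-ref bv 0 (native-endianness))))

;; Little-endian store puts the low byte first: #xCD then #xAB.
(bytevector-u16-set! bv 0 #xabcd (endianness little))

(display (list u16-be u16-le hi-unsigned hi-signed native-ok?))
(newline)
```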

[R6RS] *K* must be a valid index of *bytevector*; so must *k* + 3.
For `bytevector-u32-set!` and `bytevector-u32-native-set!`, *n* must be an exact integer object in the interval *{0, ..., 2^32 - 1}*.
For `bytevector-s32-set!` and `bytevector-s32-native-set!`, *n* must be an exact integer object in the interval *{-2^31, ..., 2^31 - 1}*.

These retrieve and set four-byte representations of numbers at indices *k*, ..., *k* + 3, according to the endianness specified by *endianness*.
The procedures with `u32` in their names deal with the unsigned representation;
those with `s32` in their names deal with the two's-complement representation.

The procedures with `native` in their names employ the native endianness,
and work only at aligned indices: *k* must be a multiple of 4.

The `...-set!` procedures return unspecified values.

[R6RS] *K* must be a valid index of *bytevector*; so must *k* + 7.
For `bytevector-u64-set!` and `bytevector-u64-native-set!`, *n* must be an exact integer object in the interval *{0, ..., 2^64 - 1}*.
For `bytevector-s64-set!` and `bytevector-s64-native-set!`, *n* must be an exact integer object in the interval *{-2^63, ..., 2^63 - 1}*.

These retrieve and set eight-byte representations of numbers at indices *k*, ..., *k* + 7, according to the endianness specified by *endianness*.
The procedures with `u64` in their names deal with the unsigned representation;
those with `s64` in their names deal with the two's-complement representation.

The procedures with `native` in their names employ the native endianness,
and work only at aligned indices: *k* must be a multiple of 8.

The `...-set!` procedures return unspecified values.
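
The 32-bit and 64-bit accessors follow the same pattern as the 16-bit ones. This sketch assumes the R6RS signatures (*bytevector*, *k*, *endianness* for `-ref`; *bytevector*, *k*, *n*, *endianness* for `-set!`); the variable names are illustrative.

```scheme
(import (rnrs))

(define bv (make-bytevector 8 0))

;; Big-endian store: bytes become #xDE #xAD #xBE #xEF 0 0 0 0.
(bytevector-u32-set! bv 0 #xdeadbeef (endianness big))

;; Reading the same four bytes little-endian reverses them.
(define u32-le (bytevector-u32-ref bv 0 (endianness little)))  ; #xefbeadde

;; The top bit is set, so the signed reading is #xdeadbeef - 2^32.
(define s32-be (bytevector-s32-ref bv 0 (endianness big)))

;; All eight bytes as one unsigned 64-bit integer.
(define u64-be (bytevector-u64-ref bv 0 (endianness big)))  ; #xdeadbeef00000000

;; -1 in 64-bit two's complement sets every bit.
(bytevector-s64-set! bv 0 -1 (endianness little))
(define u64-all-ones (bytevector-u64-ref bv 0 (endianness big)))  ; 2^64 - 1

(display (list u32-le s32-be u64-be u64-all-ones))
(newline)
```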

[R6RS] *K*, ..., *k* + 3 must be valid indices of *bytevector*.
For `bytevector-ieee-single-native-ref`, *k* must be a multiple of 4.

These procedures return the inexact real number object that best represents the
IEEE-754 single-precision number represented by the four bytes beginning at index
*k*.

[R6RS] *K*, ..., *k* + 7 must be valid indices of *bytevector*.
For `bytevector-ieee-double-native-ref`, *k* must be a multiple of 8.

These procedures return the inexact real number object that best represents the
IEEE-754 double-precision number represented by the eight bytes beginning at index
*k*.

[R6RS] *K*, ..., *k* + 3 must be valid indices of *bytevector*.
For `bytevector-ieee-single-native-set!`, *k* must be a multiple of 4.

These procedures store an IEEE-754 single-precision representation of *x* into elements *k* through *k* + 3 of *bytevector*, and return
unspecified values.

[R6RS] *K*, ..., *k* + 7 must be valid indices of *bytevector*.
For `bytevector-ieee-double-native-set!`, *k* must be a multiple of 8.

These procedures store an IEEE-754 double-precision representation of *x* into elements *k* through *k* + 7 of *bytevector*, and return
unspecified values.
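
The floating-point accessors can be sketched as a store followed by a read. The non-native names `bytevector-ieee-double-set!`, `bytevector-ieee-double-ref`, `bytevector-ieee-single-set!` and `bytevector-ieee-single-ref`, with the endianness as the last argument, are assumed from R6RS; the variable names are illustrative.

```scheme
(import (rnrs))

(define bv (make-bytevector 8 0))

;; 1.0 as an IEEE-754 double has the bit pattern #x3FF0000000000000.
(bytevector-ieee-double-set! bv 0 1.0 (endianness big))
(define bits (bytevector-u64-ref bv 0 (endianness big)))
(define back (bytevector-ieee-double-ref bv 0 (endianness big)))

;; 0.1 is not exactly representable in single precision, so the
;; value read back is only close to 0.1.
(bytevector-ieee-single-set! bv 0 0.1 (endianness little))
(define single (bytevector-ieee-single-ref bv 0 (endianness little)))

(display (list bits back single))
(newline)
```

Reading the stored double back through `bytevector-u64-ref` is a convenient way to inspect the raw bit pattern of a floating-point value.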

This section describes procedures that convert between strings and bytevectors containing Unicode encodings of those strings. When decoding bytevectors, encoding errors are handled as with the replace semantics of textual I/O: If an invalid or incomplete character encoding is encountered, then the replacement character U+FFFD is appended to the string being generated, an appropriate number of bytes are ignored, and decoding continues with the following bytes.

[R6RS+] [R7RS] Returns a newly allocated (unless empty) bytevector that
contains the UTF-8 encoding of the given *string*.

If the optional argument *start* is given, the procedure converts the given
string from index *start* (inclusive).

If the optional argument *end* is given, the procedure converts the given
string up to index *end* (exclusive).

If given, these optional arguments must be fixnums.

[R6RS] If *endianness* is specified, it must be the symbol `big`
or the symbol `little`. The `string->utf16` procedure returns a newly
allocated (unless empty) bytevector that contains the UTF-16BE or UTF-16LE
encoding of the given *string* (with no byte-order mark). If *endianness* is not specified or is `big`, then UTF-16BE is used. If *endianness* is
`little`, then UTF-16LE is used.

[R6RS] If *endianness* is specified, it must be the symbol `big`
or the symbol `little`. The `string->utf32` procedure returns a newly
allocated (unless empty) bytevector that contains the UTF-32BE or UTF-32LE
encoding of the given *string* (with no byte-order mark). If *endianness* is not specified or is `big`, then UTF-32BE is used. If *endianness* is
`little`, then UTF-32LE is used.
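
The three encoders can be compared on small inputs. The procedure name `string->utf8` is assumed from R6RS, since the text above describes it without naming it; the variable names are illustrative, and only the forms without *start* and *end* are used.

```scheme
(import (rnrs))

;; ASCII characters encode to one byte each in UTF-8.
(define ascii (string->utf8 "abc"))

;; U+03BB (Greek small lambda) needs two bytes in UTF-8: #xCE #xBB.
(define lam (string->utf8 "\x3bb;"))

;; UTF-16 defaults to big-endian and emits no byte-order mark.
(define u16-default (string->utf16 "a"))
(define u16-little  (string->utf16 "a" (endianness little)))

;; UTF-32 uses four bytes per character.
(define u32-default (string->utf32 "a"))

(display (map bytevector->u8-list
              (list ascii lam u16-default u16-little u32-default)))
(newline)
```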

Function
utf8->string
*bytevector*

[R6RS] Returns a newly allocated (unless empty) string whose character
sequence is encoded by the given *bytevector*.

If the optional argument *start* is given, the procedure converts the given
bytevector from index *start* (inclusive).

If the optional argument *end* is given, the procedure converts the given
bytevector up to index *end* (exclusive).

If given, these optional arguments must be fixnums.

[R6RS] *Endianness* must be the symbol `big` or the symbol
`little`. The `utf16->string` procedure returns a newly allocated
(unless empty) string whose character sequence is encoded by the given
*bytevector*. *Bytevector* is decoded according to UTF-16BE or UTF-16LE:
If *endianness-mandatory?* is absent or #f, `utf16->string` determines
the endianness according to a UTF-16 BOM at the beginning of *bytevector* if a BOM is present; in this case, the BOM is not decoded as a character. Also
in this case, if no UTF-16 BOM is present, *endianness* specifies the endianness
of the encoding. If *endianness-mandatory?* is a true value, *endianness* specifies the endianness of the encoding, and any UTF-16 BOM in the encoding is
decoded as a regular character.

[R6RS] *Endianness* must be the symbol `big` or the symbol
`little`. The `utf32->string` procedure returns a newly allocated
(unless empty) string whose character sequence is encoded by the given
*bytevector*. *Bytevector* is decoded according to UTF-32BE or UTF-32LE:
If *endianness-mandatory?* is absent or #f, `utf32->string` determines
the endianness according to a UTF-32 BOM at the beginning of *bytevector* if a BOM is present; in this case, the BOM is not decoded as a character. Also
in this case, if no UTF-32 BOM is present, *endianness* specifies the endianness
of the encoding. If *endianness-mandatory?* is a true value, *endianness* specifies the endianness of the encoding, and any UTF-32 BOM in the encoding is
decoded as a regular character.
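
The effect of *endianness-mandatory?* is easiest to see on input that starts with a BOM. This sketch assumes the R6RS argument order *bytevector*, *endianness*, *endianness-mandatory?*; the variable names are illustrative.

```scheme
(import (rnrs))

;; Little-endian BOM (#xFF #xFE) followed by "a" in UTF-16LE.
(define with-bom #vu8(#xff #xfe #x61 #x00))

;; Without endianness-mandatory?, the BOM decides the byte order and
;; is dropped, even though big was requested.
(define by-bom (utf16->string with-bom (endianness big)))

;; With endianness-mandatory? true, the given endianness is used and
;; the BOM is kept as the character U+FEFF.
(define forced (utf16->string with-bom (endianness little) #t))

(define plain (utf8->string #vu8(97 98 99)))

(display (list by-bom (string-length forced) plain))  ; prints (a 2 abc)
(newline)
```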