(rfc uri) - Parse and construct URIs

Library (rfc uri)

This library provides RFC3986 'URI Generic Syntax' procedures.

Function uri-parse uri

uri must be a string.

Parses the given uri and returns the following 7 values: scheme, user-info, host, port, path, query and fragment.

The following examples are from the RFC 3986 text:

   foo://example.com:8042/over/there?name=ferret#nose
   \_/   \______________/\_________/ \_________/ \__/
    |           |            |            |        |
 scheme     authority       path        query   fragment
    |   _____________________|__
   / \ /                        \
   urn:example:animal:ferret:nose

authority = [ user-info "@" ] host [ ":" port ]

If the given uri does not contain one of the parts described above, the corresponding value is #f. For example:

(uri-parse "http://localhost")
(values http #f localhost #f #f #f #f)
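
Because uri-parse returns multiple values, a caller normally receives them with let-values. The following is a minimal sketch, assuming the seven values come back in the order scheme, user-info, host, port, path, query, fragment; it only prints each part and makes no claim about the exact representation of the port.

```scheme
(import (rnrs) (rfc uri))

;; Bind all seven return values at once and print them with labels.
;; Parts that are missing from the URI are expected to be #f.
(let-values (((scheme user-info host port path query fragment)
              (uri-parse "foo://example.com:8042/over/there?name=ferret#nose")))
  (for-each (lambda (label value)
              (display label) (display ": ") (write value) (newline))
            '("scheme" "user-info" "host" "port" "path" "query" "fragment")
            (list scheme user-info host port path query fragment)))
```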

Function uri-scheme&specific uri

uri must be a string.

Parses the given uri into its scheme and the rest (the scheme-specific part), and returns them as 2 values.

Function uri-decompose-hierarchical specific

specific must be a string.

specific is a URI without its scheme. For example, the specific of the URI 'http://localhost/foo.html' is '//localhost/foo.html'.

Parses the given specific into 4 values: authority, path, query and fragment.

If the specific does not contain one of these parts, the corresponding value is #f.

Function uri-decompose-authority authority

authority must be a string.

Parses the given authority into 3 values: user-info, host and port.

If the authority does not contain one of these parts, the corresponding value is #f.
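
The three decomposition procedures can be chained, each one consuming a value produced by the previous one. The sketch below is illustrative only: the sample URI is made up, and it assumes the authority is present (uri-decompose-authority requires a string, so a real caller would check for #f first).

```scheme
(import (rnrs) (rfc uri))

;; Hypothetical sample URI used only for this illustration.
(define sample "foo://user@example.com:8042/over/there?name=ferret#nose")

;; Step 1: scheme + specific, step 2: split the specific,
;; step 3: split the authority. let*-values lets each step
;; use the bindings of the previous one.
(let*-values (((scheme specific) (uri-scheme&specific sample))
              ((authority path query fragment)
               (uri-decompose-hierarchical specific))
              ((user-info host port)
               (uri-decompose-authority authority)))
  (for-each (lambda (label value)
              (display label) (display ": ") (write value) (newline))
            '("scheme" "specific" "authority" "user-info" "host" "port"
              "path" "query" "fragment")
            (list scheme specific authority user-info host port
                  path query fragment)))
```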

Function uri-decode in out :key (cgi-decode #f)

in must be a binary input port.

out must be a binary output port.

Reads percent-encoded data from in, decodes it, and writes the result to out.

If the keyword argument cgi-decode is #t, the procedure also decodes #x2b ('+') to #x20 (#\space).

Function uri-decode-string string :key (encoding 'utf-8) (cgi-decode #f)

Decodes the given string and returns the decoded string.
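
The sketch below shows both decoding entry points side by side. It assumes the default reader mode in which `:cgi-decode` is read as a keyword; decode-bytes is a hypothetical helper written for this example, not part of the library.

```scheme
(import (rnrs) (rfc uri))

;; Hypothetical helper: run the port-based uri-decode over a bytevector
;; and return the decoded bytes as a new bytevector.
(define (decode-bytes bv)
  (let-values (((out extract) (open-bytevector-output-port)))
    (uri-decode (open-bytevector-input-port bv) out)
    (extract)))

;; String convenience procedure: percent-escapes become characters.
(display (uri-decode-string "hello%20world")) (newline)

;; Without :cgi-decode a literal "+" is kept; with it, "+" becomes a space.
(display (uri-decode-string "a+b")) (newline)
(display (uri-decode-string "a+b" :cgi-decode #t)) (newline)

;; Port-based procedure, working on raw bytes.
(display (utf8->string (decode-bytes (string->utf8 "x%3Dy")))) (newline)
```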

Function uri-encode in out :key (noescape *rfc3986-unreserved-char-set*) (upper-case #t)

in must be a binary input port.

out must be a binary output port.

Reads data from in, percent-encodes it, and writes the result to out.

The keyword argument noescape specifies the character set whose members are left unescaped; every other character is escaped.

The keyword argument upper-case specifies the case of the hexadecimal digits in the encoded value. If it is a true value (the default), the digits are upper case; otherwise they are lower case.

Function uri-encode-string string :key (encoding 'utf-8) :allow-other-keys

Encodes the given string and returns the encoded string.
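
As an illustration of the encoding options, the sketch below wraps the port-based uri-encode in a small helper so the keyword arguments can be tried on strings. The helper name encode is hypothetical, and the example assumes the default reader mode in which `:upper-case` and `:noescape` are read as keywords. '!' is used because it is unreserved in RFC 2396 but not in RFC 3986.

```scheme
(import (rnrs) (rfc uri))

;; Hypothetical helper: encode a string through the port-based uri-encode,
;; passing any keyword options straight through.
(define (encode str . options)
  (let-values (((out extract) (open-bytevector-output-port)))
    (apply uri-encode
           (open-bytevector-input-port (string->utf8 str))
           out
           options)
    (utf8->string (extract))))

;; Default behaviour via the string convenience procedure.
(display (uri-encode-string "hello world")) (newline)

;; Lower-case hexadecimal digits.
(display (encode "a/b" :upper-case #f)) (newline)

;; Same input, two different sets of characters left unescaped.
(display (encode "a b!")) (newline)
(display (encode "a b!" :noescape *rfc2396-unreserved-char-set*)) (newline)
```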

Function uri-compose :key (scheme #f) (userinfo #f) (host #f) (port #f) (authority #f) (path #f) (path* #f) (query #f) (fragment #f) (specific #f)

Composes a URI from the given arguments.

If all keyword arguments are #f, the procedure returns an empty string.

The procedure gives priority to the larger chunk of a URI. For example, if the keyword argument specific is given, the procedure uses only scheme and specific and ignores the finer-grained parts. The priority hierarchy is as follows:

   scheme
   specific
     +- authority
          +- userinfo
          +- host
          +- port
     +- path*
          +- path
          +- query
          +- fragment

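
The priority rule can be tried out with a few calls. This is a sketch under assumptions: the host names and paths are made up, the default reader mode is assumed so that `:scheme` and friends are read as keywords, and the comments describe what the rule above implies rather than verified output.

```scheme
(import (rnrs) (rfc uri))

;; No arguments at all: the documentation states the result is "".
(write (uri-compose)) (newline)

;; Built from the fine-grained parts.
(write (uri-compose :scheme "http" :host "localhost" :path "/foo.html"))
(newline)

;; specific outranks the finer-grained parts, so :host and :path
;; should be ignored here and only scheme + specific used.
(write (uri-compose :scheme "mailto"
                    :specific "user@example.com"
                    :host "ignored.example"
                    :path "/ignored"))
(newline)
```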
Function uri-merge base-uri relative-uri1 relative-uri2 ...

Merges given relative-uris to base-uri according to RFC 3986 section 5.
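
RFC 3986 section 5.4.1 lists reference-resolution examples against the base URI "http://a/b/c/d;p?q". The sketch below feeds a few of them to uri-merge; the comments quote the results the RFC gives, which is what a conforming merge is expected to produce.

```scheme
(import (rnrs) (rfc uri))

;; Base URI taken from RFC 3986 section 5.4.
(define base "http://a/b/c/d;p?q")

;; Print each relative reference next to its merged result.
(for-each (lambda (relative)
            (write relative)
            (display " -> ")
            (write (uri-merge base relative))
            (newline))
          '("g"       ; RFC lists http://a/b/c/g
            "../g"    ; RFC lists http://a/b/g
            "?y"      ; RFC lists http://a/b/c/d;p?y
            "//g"))   ; RFC lists http://g
```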

Variable *rfc3986-unreserved-char-set*
Variable *rfc2396-unreserved-char-set*

Character sets containing the characters that do not need to be escaped.

There is a slight difference between the RFC 2396 and RFC 3986 sets. This library uses the RFC 3986 set by default when encoding.