URI
A Uniform Resource Identifier (URI) is a unique sequence of characters that identifies a logical or physical resource used by web technologies.
General syntax of a URI:
[scheme:]scheme-specific-part[#fragment]
An absolute URI specifies a scheme; a URI that is not absolute is said to be relative.
An opaque URI is an absolute URI whose scheme-specific part does not begin with a slash character ('/'). For example,
- mailto:java-net@java.sun.com - email address
- urn:isbn:096139210x - URN
- tel:1-408-555-5555 - phone number
A hierarchical URI is either an absolute URI whose scheme-specific part begins with a slash character, or a relative URI, that is, a URI that does not specify a scheme. For example,
- https://socode4.com/articles/en/html/links
- articles/en/html/links
- ../../../links
- file:///~/download/sitemap.xml
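The opaque/hierarchical distinction above can be checked mechanically. The following is a minimal Python sketch, not a reference implementation: the function name classify and its return labels are made up for illustration, and it relies on the standard urllib.parse.urlsplit only to detect the scheme.

```python
from urllib.parse import urlsplit


def classify(uri: str) -> str:
    """Label a URI using the definitions from the text (hypothetical helper)."""
    parts = urlsplit(uri)
    if not parts.scheme:
        # No scheme: the URI is relative, and relative URIs are hierarchical.
        return "hierarchical (relative)"
    # Scheme-specific part: everything after "scheme:", without the fragment.
    ssp = uri.split(":", 1)[1].split("#", 1)[0]
    if ssp.startswith("/"):
        return "hierarchical (absolute)"
    return "opaque"


for example in (
    "mailto:java-net@java.sun.com",
    "https://socode4.com/articles/en/html/links",
    "../../../links",
):
    print(example, "->", classify(example))
```

The check mirrors the wording of the definitions: first look for a scheme, then test whether the scheme-specific part begins with a slash.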
Syntax of a hierarchical URI:
[scheme:][//authority][path][?query][#fragment]
where authority is [username:password@]host[:port]
The path component of a hierarchical URI is itself said to be absolute if it begins with a slash character ('/'); otherwise it is relative. The path of a hierarchical URI that is either absolute or specifies an authority is always absolute.
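To see how the components of a hierarchical URI map onto a real parser, here is a short Python sketch using the standard urllib.parse.urlsplit. The URI, the credentials, and the helper name describe are invented for illustration only.

```python
from urllib.parse import urlsplit


def describe(uri: str) -> dict:
    """Split a hierarchical URI into the components named in the syntax above."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,
        "authority": parts.netloc,      # [username:password@]host[:port]
        "username": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,
        "path": parts.path,
        "query": parts.query,
        "fragment": parts.fragment,
        # A path is absolute when it begins with a slash.
        "path_is_absolute": parts.path.startswith("/"),
    }


# Hypothetical example URI, not taken from the article.
info = describe("https://user:secret@example.com:8443/articles/en?lang=en&page=2#intro")
for name, value in info.items():
    print(f"{name}: {value}")
```

Note that urlsplit calls the authority "netloc"; the other attribute names match the terms used in the syntax line.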
Characters in URI
The following characters are allowed in a URI without encoding:
- a-z, A-Z, i.e. English letters
- 0-9, i.e. digits
- -
- _
- .
- ~
The following characters are reserved in a URI and may have special meaning. For example, / is used to separate different parts of a URL.
! | # | $ | & | ' | ( | ) | * | + | , | / | : | ; | = | ? | @ | [ | ]
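All other characters must be percent-encoded: the character is first converted to bytes, and each byte is written as %xy, where xy is the two-digit hexadecimal value of that byte. The recommended character encoding is UTF-8, so a single non-ASCII character may produce several %xy groups.
Reserved characters can be escaped the same way when they must appear as plain data rather than as delimiters.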
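As an illustration of percent-encoding, the sketch below uses quote and unquote from Python's standard urllib.parse. The wrapper names percent_encode and percent_decode are hypothetical; passing safe="" is a choice made here so that reserved characters such as / are escaped too.

```python
from urllib.parse import quote, unquote


def percent_encode(text: str) -> str:
    """Encode everything except unreserved characters, using UTF-8 bytes."""
    return quote(text, safe="", encoding="utf-8")


def percent_decode(text: str) -> str:
    """Reverse percent_encode, interpreting the bytes as UTF-8."""
    return unquote(text, encoding="utf-8")


print(percent_encode("café"))        # caf%C3%A9 (é is two bytes in UTF-8)
print(percent_encode("a/b?c=d"))     # a%2Fb%3Fc%3Dd (reserved characters escaped)
print(percent_encode("Az09-_.~"))    # Az09-_.~ (unreserved characters unchanged)
print(percent_decode("caf%C3%A9"))   # café
```

By default quote leaves / unescaped (safe="/"), which suits encoding a whole path; safe="" is more appropriate when encoding a single path segment or value.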
Note for URLs used with HTML forms: encode a space as "+" or "%20" in the query string, and as "%20" in the rest of the URL.
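The difference between the two space encodings can be sketched in Python with the standard urllib.parse functions: quote produces "%20" and is used here for the path, while urlencode produces "+" and is used for the query string. The helper build_search_url and the example.com URL are assumptions made for illustration.

```python
from urllib.parse import parse_qs, quote, urlencode


def build_search_url(base: str, path: str, params: dict) -> str:
    """Join a base, a path, and query parameters with suitable space encoding."""
    encoded_path = quote(path)          # spaces become %20, "/" is kept
    encoded_query = urlencode(params)   # spaces become "+", form style
    return f"{base}/{encoded_path}?{encoded_query}"


url = build_search_url("https://example.com", "my files/report 1.txt", {"q": "red shoes"})
print(url)  # https://example.com/my%20files/report%201.txt?q=red+shoes

# Both encodings decode to the same value inside a query string.
print(parse_qs("a=red+shoes&b=red%20shoes"))
```

A "+" in the path is not decoded as a space, which is why the path portion uses "%20" only.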