URI

A Uniform Resource Identifier (URI) is a unique sequence of characters that identifies a logical or physical resource used by web technologies.

General syntax of a URI:

[scheme:]scheme-specific-part[#fragment]

An absolute URI specifies a scheme; a URI that is not absolute is said to be relative.

An opaque URI is an absolute URI whose scheme-specific part does not begin with a slash character ('/'). For example,

  • mailto:java-net@java.sun.com - email address
  • urn:isbn:096139210x - URN
  • tel:1-408-555-5555 - phone number

A hierarchical URI is either an absolute URI whose scheme-specific part begins with a slash character, or a relative URI, that is, a URI that does not specify a scheme. For example,

  • https://socode4.com/articles/en/html/links
  • articles/en/html/links
  • ../../../links
  • file:///~/download/sitemap.xml
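The opaque/hierarchical distinction above can be sketched in a few lines of Python. This is a simplified illustration, not a validating parser: the function name classify and the regular expression are assumptions made for this example, and the scheme pattern follows the usual rule of a letter followed by letters, digits, "+", "-" or ".".

```python
import re

# Assumed scheme pattern: a letter, then letters, digits, "+", "-" or ".",
# followed by ":" and the scheme-specific part.
_SCHEME = re.compile(r"^([A-Za-z][A-Za-z0-9+.-]*):(.*)$", re.DOTALL)


def classify(uri: str) -> str:
    """Return 'opaque' or 'hierarchical' using the definitions in the text."""
    # The fragment is not part of the scheme-specific part, so drop it first.
    body = uri.split("#", 1)[0]
    match = _SCHEME.match(body)
    if match is None:
        # No scheme: a relative URI, which is always hierarchical.
        return "hierarchical"
    scheme_specific_part = match.group(2)
    if scheme_specific_part.startswith("/"):
        return "hierarchical"
    return "opaque"


for example in (
    "mailto:java-net@java.sun.com",
    "urn:isbn:096139210x",
    "https://socode4.com/articles/en/html/links",
    "../../../links",
):
    print(example, "->", classify(example))
```

The first two examples are reported as opaque and the last two as hierarchical.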

Syntax of a hierarchical URI:

[scheme:][//authority][path][?query][#fragment]
 where authority is [username:password@]host[:port]
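As a rough illustration of this syntax, Python's standard urllib.parse.urlsplit breaks a hierarchical URI into the same components. The URL below, including the user name, password and port, is made up for the example.

```python
from urllib.parse import urlsplit

# Hypothetical URL that uses every optional component of the syntax.
parts = urlsplit("https://user:secret@example.com:8443/articles/en?lang=en&page=2#top")

print(parts.scheme)    # https
print(parts.netloc)    # user:secret@example.com:8443  (the authority)
print(parts.username)  # user
print(parts.password)  # secret
print(parts.hostname)  # example.com
print(parts.port)      # 8443
print(parts.path)      # /articles/en
print(parts.query)     # lang=en&page=2
print(parts.fragment)  # top

# A relative URI has no scheme and no authority, only a path here.
relative = urlsplit("articles/en/html/links")
print(repr(relative.scheme), repr(relative.netloc), relative.path)
```

Note that urlsplit does not validate the URI; it only splits the string at the delimiters.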

The path component of a hierarchical URI is itself said to be absolute if it begins with a slash character ('/'); otherwise it is relative. The path of a hierarchical URI that is either absolute or specifies an authority is always absolute.

Characters in URI

The following characters are unreserved and can appear in a URI without encoding:

  • a-zA-Z, i.e. English letters
  • 0-9, i.e. digits
  • -
  • _
  • .
  • ~

The following characters are reserved in a URI and may have a special meaning. For example, / separates the segments of a path, and ? introduces the query.

! # $ & ' ( ) * + , / : ; = ? @ [ ]

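All other characters must be percent-encoded: the character is converted to bytes, and each byte is written as %xy, where xy is the two-digit hexadecimal value of that byte. The recommended character encoding is UTF-8.

Reserved characters can be escaped the same way when they must appear as literal data rather than as delimiters.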
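A minimal sketch of percent-encoding in Python may make the byte-level rule concrete. The helper percent_encode is hypothetical and written only for this illustration; in practice the standard urllib.parse.quote does the same job, and the last lines compare the two.

```python
import string
from urllib.parse import quote, unquote

# Characters that may appear unescaped: letters, digits, "-", "_", "." and "~".
UNRESERVED = set(string.ascii_letters + string.digits + "-_.~")


def percent_encode(text: str) -> str:
    """Escape every character outside the unreserved set, byte by byte."""
    out = []
    for ch in text:
        if ch in UNRESERVED:
            out.append(ch)
        else:
            # One character can become several bytes in UTF-8.
            out.extend(f"%{byte:02X}" for byte in ch.encode("utf-8"))
    return "".join(out)


encoded = percent_encode("a b/é")
print(encoded)                          # a%20b%2F%C3%A9
print(quote("a b/é", safe="") == encoded)  # True
print(unquote(encoded))                 # a b/é
```

Passing safe="" to quote makes it escape reserved characters such as "/" as well; by default quote leaves "/" untouched because it is usually a path separator.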

Note for URLs used with HTML forms: encode whitespace as "+" or "%20" in the query string, and as "%20" in the rest of the URL.
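The difference between the two whitespace encodings can be shown with Python's urllib.parse. This is only a sketch: the path "/docs/", the file name and the query parameter q are invented for the example.

```python
from urllib.parse import quote, urlencode

# In the path, a space must be written as %20.
path = "/docs/" + quote("my file.txt")

# In a form-style query string, urlencode writes a space as "+" by default.
query_plus = urlencode({"q": "uri syntax"})

# Passing quote_via=quote produces %20 in the query instead.
query_pct = urlencode({"q": "uri syntax"}, quote_via=quote)

print(path)        # /docs/my%20file.txt
print(query_plus)  # q=uri+syntax
print(query_pct)   # q=uri%20syntax
```

Both query forms decode to the same value on the server side, but "+" is only understood as a space inside the query string, so it should not be used in the path.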