HTML syntax overview

The following terms are used when working with HTML.

term           description
document type  Defines the document type or HTML version.
element        Represents an individual component of an HTML document.
tags           Mark the beginning and end of an element.
attribute      Provides extra information about an element.
HTML entity    An escaped character.
CDATA          Marks a section of unescaped text; mostly used in XHTML.
comment        Text that is invisible to the user.
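
To tie these terms to actual markup, here is a small annotated fragment. It is only a sketch: the class name and the text are made up for illustration.

```html
<!DOCTYPE html>                    <!-- document type -->
<p class="note">                   <!-- opening tag with one attribute (class) -->
    Fish &amp; chips               <!-- content containing an HTML entity -->
</p>                               <!-- closing tag; <p>...</p> as a whole is one element -->
<!-- every note on the right is itself a comment -->
```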

Document type

Since HTML5 you no longer need to worry about this. Just start the HTML page with:

<!DOCTYPE html>
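
For context, a minimal page that starts with this doctype might look like the sketch below. The language code, title and paragraph text are placeholders, not something the doctype requires.

```html
<!DOCTYPE html>
<html lang="en">
<head>
    <!-- declare the character encoding early in the head -->
    <meta charset="utf-8">
    <title>Page title</title>
</head>
<body>
    <p>Hello, world!</p>
</body>
</html>
```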

Elements and tags

An element commonly consists of:

  • opening tag - a name in angle brackets, like <p>
  • content - text, which can contain nested elements
  • closing tag - the same name as in the opening tag, preceded by a slash, like </p>

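Unlike XML, some HTML elements, such as br, are used without an end tag. Browsers also accept the XML-style form <br /> for them. For other elements the trailing slash is ignored in HTML, so <div /> is parsed as an opening <div> tag rather than as an empty element.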


    <p> Some paragraph text <br> Some text of paragraph.</p>


Attributes provide extra information about an element. You can think of them as properties of the element.

<div id="myUniqId" class="mt-2 mb-0">
    <p>Some paragraph text <br> Some text of paragraph.</p>
</div>
Some anchors <a href="#myUniqId">back to paragraph</a>

HTML entities

An entity lets you escape a character in HTML code. It begins with an ampersand & and ends with a semicolon ;. The most useful entities are:

  • &quot; - double quotation mark " (code #34)
  • &apos; - apostrophe ' (code #39)
  • &amp; - ampersand & (code #38)
  • &lt; - less than < (code #60)
  • &gt; - greater than > (code #62)
  • &nbsp; - non-breaking space (code #160)
  • &copy; - copyright sign © (code #169)

You can specify any character:

  • &#XXXX; - where XXXX is the decimal Unicode code point of the character
  • &#xXXXX; - where XXXX is the hexadecimal Unicode code point of the character
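
As an illustration of both forms, the sketch below uses named entities and numeric references in a small page. The title and sample text are invented for the example.

```html
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="utf-8">
    <title>Entities demo</title>
</head>
<body>
    <!-- escaped markup is shown as literal text instead of being parsed as tags -->
    <p>&lt;b&gt; Tom &amp; Jerry said &quot;hi&quot; &lt;/b&gt;</p>

    <!-- the same character written three ways: named, decimal, hexadecimal -->
    <p>&copy; &#169; &#xA9;</p>

    <!-- a non-breaking space keeps "10" and "km" on the same line -->
    <p>10&nbsp;km</p>
</body>
</html>
```

The decimal and hexadecimal references in the second paragraph name the same code point (169 is A9 in hexadecimal), so all three render as the copyright sign.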

Character data

CDATA is a heritage of XML and is obsolete in HTML5, where it is recognized only inside SVG and MathML content. A CDATA section tells the parser not to interpret anything inside it as tags.

<![CDATA[ some text that can contain < > & etc. ]]>

<!-- in the old days JavaScript code was added like this -->
<script type="text/javascript">//<![CDATA[
    var ok = 1 < 2 && 3 > 2;
//]]></script>


Comments are text that is invisible to the user; the browser does not render them.

<!-- a comment text, can be multi-lines