Character entities are a method for inserting special characters into HTML documents. These include characters that might not appear on your keyboard, such as ©, and characters that have a special meaning in HTML, such as <, which a browser will interpret as the start of an HTML tag.

There are three methods for inserting these special characters into HTML documents. Each has its own advantages and disadvantages.

Entities

Entities are named references to characters. An entity begins with an ampersand (&) and ends with a semicolon (;). For example, the following is an entity for the copyright symbol:

&copy;

This is displayed in a browser like this:

©
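
The decoding step a browser performs can be sketched outside the browser as well. The following is a minimal illustration in Python (a language chosen here purely for demonstration, not something this tutorial otherwise uses), relying on the standard library's html module:

```python
import html
from html.entities import name2codepoint

# Decode the named entity, roughly as a browser would when rendering the page.
decoded = html.unescape("&copy;")

# Look up the character number that the entity name "copy" stands for.
code_point = name2codepoint["copy"]

print(decoded)      # the copyright sign
print(code_point)   # 169
```

The number 169 is the same one used by the numeric character reference for the copyright symbol.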

Observant readers may notice that this makes the & character itself a special character in HTML. Whenever you want a literal & in your HTML documents (that is, anywhere other than at the start of an entity or numeric character reference), you should use the entity for & instead, which is &amp;.

Entities are arguably better than numeric character references because they're quite easy to remember. In XHTML, however, entities are an optional feature of browsers, so not all browsers will be able to understand them. The five named entities that browsers are guaranteed to understand are:

&lt;
The "less than" symbol (<). Whenever you need to use a less than symbol in the flow of text, use this entity instead. Otherwise your less than symbol will be interpreted as the beginning of a tag.
&gt;
The "greater than" symbol (>). Similarly, use this entity whenever you need a greater than symbol in the flow of text.
&amp;
The ampersand (&). Whenever you need to use an ampersand in the flow of text, use this entity instead. Otherwise your ampersand will be interpreted as the beginning of another entity.
&quot;
Quote marks ("). In the normal flow of text, you should not need to use this entity, but this can be very handy if an attribute contains a quote character, for example, if we wanted to insert a link to a page that had a quote character in its address.
&apos;
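An apostrophe ('). Similarly, this will come in handy if you need to insert an apostrophe into an attribute. Note that &apos; is defined in XML and XHTML but not in HTML 4, so the numeric reference &#39; is the safer choice in plain HTML documents.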
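
If you generate HTML from a program, these replacements are usually done for you. As a rough sketch in Python (an assumption for illustration; the helper names escape_text and escape_attribute are made up here), the standard html.escape function covers all five characters:

```python
import html

def escape_text(text: str) -> str:
    """Escape &, < and > for use in the normal flow of text."""
    return html.escape(text, quote=False)

def escape_attribute(value: str) -> str:
    """Escape &, <, > and both quote characters for use inside an attribute."""
    return html.escape(value, quote=True)

text = escape_text("Fish & chips < 5 > 3")
attr = escape_attribute('say "hi" & it\'s done')

print(text)  # Fish &amp; chips &lt; 5 &gt; 3
print(attr)  # say &quot;hi&quot; &amp; it&#x27;s done
```

Note that html.escape writes the apostrophe as the numeric reference &#x27; rather than &apos;.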

Also, check out the official list of all other entities in HTML or this slightly more user-friendly table of entities.

Numeric Character References

Although harder to remember, numeric references give you access to a much larger set of characters to choose from and are more likely to work in XHTML browsers.

These work similarly: they start with an ampersand (&) and end with a semicolon (;), but in between, instead of an easy-to-remember name, there is a hash sign (#), optionally followed by a lowercase x (if the reference is in hexadecimal), and then a number.

For example, the copyright symbol is &#169; in decimal or &#xA9; in hexadecimal. Generally, browser support is stronger for the decimal variants.
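
To see how the two forms relate to the character's number, here is a small Python sketch (the function names decimal_reference and hex_reference are hypothetical helpers invented for this illustration):

```python
import html

def decimal_reference(char: str) -> str:
    """Build a decimal numeric character reference such as &#169;."""
    return f"&#{ord(char)};"

def hex_reference(char: str) -> str:
    """Build a hexadecimal numeric character reference such as &#xA9;."""
    return f"&#x{ord(char):X};"

copyright_sign = "\u00a9"
dec = decimal_reference(copyright_sign)
hexa = hex_reference(copyright_sign)

print(dec)   # &#169;
print(hexa)  # &#xA9;

# Both forms decode back to the same character.
print(html.unescape(dec) == html.unescape(hexa))  # True
```

Both references name the same character; only the number base differs (169 in decimal is A9 in hexadecimal).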

To find a full list of these numeric references, see the PDF charts at the Unicode website.

Tricks with Entities

Disguising Your E-mail Address

An @-sign can be encoded as &#x40; or &#64;. This can be used to disguise e-mail addresses from the software spammers use to collect e-mail addresses from web pages.

For example:

tobyink&#64;goddamn.co.uk

will get displayed to the user as:

tobyink@goddamn.co.uk
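
If you have many addresses to disguise, the replacement can be automated. This is a minimal sketch in Python (disguise_email is a hypothetical helper name, and the approach assumes only the @-sign needs encoding):

```python
import html

def disguise_email(address: str) -> str:
    """Replace the @-sign with its decimal numeric character reference."""
    return address.replace("@", "&#64;")

encoded = disguise_email("tobyink@goddamn.co.uk")
print(encoded)                  # tobyink&#64;goddamn.co.uk

# A browser decodes the reference, so the reader still sees the real address.
print(html.unescape(encoded))   # tobyink@goddamn.co.uk
```

Bear in mind that any harvesting software that decodes character references, as the second print statement does, will see straight through this disguise.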

Ampersands in Addresses

A lot of web pages have an & sign in their address. For example, . Remember to replace the & signs in links with &amp;, but only in HTML and XHTML documents — not for general usage!
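
As an illustration, here is a short Python sketch that escapes an address before placing it in a link. The address used is a made-up example (example.com with two invented query parameters), not one taken from this tutorial:

```python
import html

# Hypothetical address containing an & sign between two query parameters.
url = "http://example.com/search?q=entities&page=2"

# Escape the address before writing it into the href attribute.
link = f'<a href="{html.escape(url, quote=True)}">Search</a>'
print(link)
# <a href="http://example.com/search?q=entities&amp;page=2">Search</a>

# The browser decodes &amp; back to &, so the link still points at the original address.
print(html.unescape(html.escape(url, quote=True)) == url)  # True
```

The escaped form belongs only in the document's source. The address itself, as typed into a browser or sent in an e-mail, keeps its plain & sign.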

Anyway, that is enough about entities, which are a rather boring topic. On to the next chapter!