Line25

10 HTML Entity Crimes You Really Shouldn’t Commit

June 30, 2011

It has been over a couple of years since I posted my HTML tag and usability crimes posts, both of which are amongst the most popular articles here on Line25. There’s something about this title people just can’t resist! Let’s take a look at ten crimes you may be committing in your HTML content. These won’t exactly land you a life sentence, but I bet almost every one of us will be guilty of at least one of these petty crimes.

Crime 1: Not converting your ampersands

One of the most common HTML validation errors I see when checking the code behind Sites of the Week features is unconverted ampersand characters. It’s easy to simply paste in your content from an external document and forget to transform your & characters into the correct &amp; HTML entity.
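
As a rough illustration, pasted content could be cleaned up before it reaches the page. The sketch below is only one possible approach, written in Python; the function name `convert_ampersands` is made up for this example. It replaces bare ampersands with &amp; and leaves anything that already looks like an entity alone.

```python
import re

# Matches an "&" that does not already begin a named (&amp;), decimal (&#169;)
# or hexadecimal (&#xA9;) character reference.
BARE_AMPERSAND = re.compile(
    r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)"
)


def convert_ampersands(text: str) -> str:
    """Replace bare ampersands with &amp;, leaving existing entities alone."""
    return BARE_AMPERSAND.sub("&amp;", text)


print(convert_ampersands("Fish & Chips &copy; 2011"))
```

If the text is known to contain no entities at all, Python's built-in `html.escape` does the same job and also converts < and >, but it would double-encode an existing &amp;.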

Crime 2: Making your own ellipsis

Did you know those three dots used to indicate a pause in a sentence are called an ‘ellipsis’? Rather than typing three periods or full stops, it actually has its own glyph as the &hellip; HTML entity. The spacing between the dots in the entity is much tighter than the standard spacing between three full stops or periods. Remember, there are only ever three dots (four in certain situations), so don’t be the person who extends their ellipsis to 6+ dots………..
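
If typed dots keep slipping into your copy, a small find and replace can catch them. This is a minimal Python sketch with a hypothetical helper name, `convert_ellipses`. It assumes that only runs of exactly three dots should become &hellip; and that longer runs are left for a human to sort out.

```python
import re

# Exactly three consecutive full stops, not part of a longer run of dots.
THREE_DOTS = re.compile(r"(?<!\.)\.{3}(?!\.)")


def convert_ellipses(text: str) -> str:
    """Swap three typed full stops for the &hellip; entity."""
    return THREE_DOTS.sub("&hellip;", text)


print(convert_ellipses("To be continued..."))
```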

Crime 3: Incorrect use of the em dash

I’m definitely guilty of this one myself. I can never remember when to use Em dashes over En dashes, and what’s worse, I tend to stick in a plain old hyphen, which makes the crime even more serious! The Em dash, or &mdash; in HTML entity format, is typically used to mark a break of thought in a sentence.

Crime 4: Incorrect use of the en dash

Similar to the Em dash crime, the En dash is another form of dash often misused in our body copy. The En dash, or &ndash; in its HTML entity form, is used to express a range of values or a distance, so essentially the En dash is placed where you would say the word “to” when talking about age ranges or a journey between two places.
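
Both dash crimes could be tidied up with a couple of substitutions. The Python sketch below uses a hypothetical helper, `convert_dashes`, and rests on two assumptions: a double hyphen was meant as an Em dash set without surrounding spaces, and a hyphen between two digits was meant as a range. Dates and phone numbers break the second assumption, so treat this as a starting point rather than a finished filter.

```python
import re


def convert_dashes(text: str) -> str:
    """Convert makeshift dashes in plain text to &mdash; and &ndash;."""
    # A double hyphen, with or without spaces around it, stands in for an Em dash.
    text = re.sub(r"\s*--\s*", "&mdash;", text)
    # A single hyphen directly between two digits is treated as a range.
    text = re.sub(r"(?<=\d)-(?=\d)", "&ndash;", text)
    return text


print(convert_dashes("Ages 20-25 -- roughly"))
```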

Crime 5: DIY Copyright symbol

I’m sure we’re all familiar with the international copyright symbol, the letter C with a circle around it: ©. Rather than use this standardised symbol, sometimes people decide to construct their own with brackets, (c), probably because they couldn’t figure out the keyboard combination to insert the symbol in Word. Thankfully it’s much easier for web designers with the easy-to-remember HTML entity of &copy;.

Crime 6: DIY Trademark symbols

Similar to the copyright symbol, the trademark symbol has its own HTML entity glyph as &trade;, so there’s no need to design your own from styled <sup> or <span> tags.

Crime 7: Plain text fractions

Another common addition to everyday body text is the use of fractions to show quarter, half or three quarters in their shorthand, numeric format. I’m sure we’re all guilty of writing 1/4, 1/2 or 3/4 as opposed to their HTML entity counterparts of ¼, ½ and ¾, written as &frac14;, &frac12; and &frac34;.
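
The three fractions above are easy to swap automatically. Here is a small Python sketch; the names `FRACTION_ENTITIES` and `convert_fractions` are invented for this example. It assumes a fraction should only be converted when it stands alone, so longer numbers and slash-separated dates are left untouched.

```python
import re

# The three common fractions and their named entities.
FRACTION_ENTITIES = {"1/4": "&frac14;", "1/2": "&frac12;", "3/4": "&frac34;"}

# A fraction that is not part of a longer number or a date such as 1/2/2011.
PLAIN_FRACTION = re.compile(r"(?<![\d/])(?:1/4|1/2|3/4)(?![\d/])")


def convert_fractions(text: str) -> str:
    """Replace stand-alone 1/4, 1/2 and 3/4 with their HTML entities."""
    return PLAIN_FRACTION.sub(lambda m: FRACTION_ENTITIES[m.group(0)], text)


print(convert_fractions("Add 1/2 cup of flour and 3/4 tsp of salt"))
```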

Crime 8: Plain text mathematics symbols

Common mathematical symbols such as add, minus, multiply and divide are also misused in general body text. The add and minus symbols not so much, as they’re easily created with their designated keyboard buttons, but multiply and divide are usually seen as the letter x, an asterisk *, or a slash /. × and ÷ symbols can be easily created with the HTML entities of &times; and &divide;.

Crime 9: Supersized degree symbols

The degree symbol is not the most common glyph used in everyday body text, but it’s necessary when talking about temperature, angles or longitude/latitude map locations. We often see the degree format in disguise as a plain old letter o, but it can be created correctly using the &deg; HTML entity like so: 45°.

Crime 10: Somewhat straight curly quotes

A massive pet peeve for many, the incorrect use of quotes in textual content is one of the most common typographic errors on earth. The standard quotation marks entered with our keyboards are the numerical type which indicate a unit of measurement such as 5′10″. When quoting people, curly quotes should be used with the &ldquo; and &rdquo; entities: “Now I, Skeletor, am master of the universe”, or for single quotes the &lsquo; and &rsquo; entities.
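
Curling quotes by hand gets tedious, so here is a rough Python sketch of how it might be automated. The helper name `curl_quotes` is hypothetical. It assumes the input is plain text rather than HTML markup (it would mangle attribute quotes), that double quotes come in matched pairs, and that an apostrophe between two letters is a contraction or possessive.

```python
import re


def curl_quotes(text: str) -> str:
    """Convert straight quotes in plain text (not HTML markup) to curly entities."""
    # Paired straight double quotes become opening and closing curly quotes.
    text = re.sub(r'"([^"]*)"', r"&ldquo;\1&rdquo;", text)
    # An apostrophe between two letters (it's, don't) becomes a right single quote.
    text = re.sub(r"(?<=[A-Za-z])'(?=[A-Za-z])", "&rsquo;", text)
    return text


print(curl_quotes('He said "hello" and it\'s fine'))
```

Measurements such as 5'10 are deliberately left alone, since those marks are not quotation marks.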

HTML entity names vs entity numbers

Every HTML entity can be written as its name value &copy; or as its numerical value &#169;. The main advantage of using an entity name is that the character is easily recognisable, as the name often relates to the actual entity in question (copy = copyright). However, entity names aren’t as well supported as entity numbers, which have good all-round support.
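
To see how a name maps to its number, Python's standard library ships a lookup table of HTML 4 entity names, `html.entities.name2codepoint`. The sketch below uses a hypothetical helper, `to_numeric_entity`, and simply looks a name up in that table and prints the numeric form for a few of the entities mentioned above.

```python
import html
from html.entities import name2codepoint


def to_numeric_entity(name: str) -> str:
    """Return the decimal character reference for a named entity such as 'copy'."""
    return f"&#{name2codepoint[name]};"


for name in ("copy", "hellip", "mdash", "ndash"):
    print(f"&{name}; -> {to_numeric_entity(name)}")

# Both forms decode to the same character.
print(html.unescape("&copy;") == html.unescape("&#169;"))
```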

Written by Iggy

Iggy is a designer who loves experimenting with new web design techniques, collating creative website designs, and writing about the latest design trends, inspiration, design freebies, and more.

67 Comments

  1. Obsessive Editor says:

    One grammar crime you should not commit: “It’s” with apostrophe always means “it is.” Possessive (“belonging to it”) is always “its” without apostrophe. In the ellipsis section, you write, “it actually has it’s own glyph,” which means “it actually has it is own glyph,” which is nonsensical. Please correct.

  2. Peter J. Hart says:

    Ellipses are only ever three dots. Sometimes they come at the end of a sentence where they are followed by a period.

    The “H” in “hellip” stands for “horizontal”. A vertical ellipsis is &vellip;.

    On OSX an ellipsis is alt+;

    • David Neale says:

      “On OSX an ellipsis is alt+;”

      That depends on the keyboard being used. For example, on a Spanish keyboard it is Alt-. (Alt and full stop).

  3. Violet Weed says:

    Your DYK for ellipsis is incorrect.
    As a longtime master typographer (and programmer and network engineer and published author (not of blogs or self-published ebooks, but REAL books)……. the 3-char ellipsis indicates that CONTENT HAS BEEN LEFT OUT.

  4. Chris Christofferson says:

    Very useful article! All web designers and developers, copywriters and authors etc. keen on being as professional as possible (which can ultimately only be a good thing) should have this page bookmarked.

    The &hellip; entity is a very useful one to remember, and even though I’ve been using it for a few years, occasionally three dots will slip through the net. When you’re typing at 80 words per minute and thinking at 300 words per minute, it’s easily done!

    So I find a periodic and very careful find & replace is sometimes the order of the day.

  5. Vijay Gupta says:

    Thanks Chris! for this Excellent post, I often overlooked some of these entities – Now I'm going to correct my old mistakes :-)

  6. Nikita Prokopov says:

    Great post, but what about hellip? There’s actually no such symbol, just three dots. It was once invented for monospaced fonts where three dots were too wide, but if you are not using monospace, you should not mess with hellip. It doesn’t add anything to your typography.

    • Mark says:

      There's actually another as yet unmentioned reason for using &hellip; that goes well beyond how pretty (or typographically correct) it might look – and that’s accessibility. Screen readers like JAWS don’t say anything when they encounter an ellipsis, but they say “dot dot dot” when ever they come up against three periods. Which can get very tiresome if it’s reading a particularly suspenseful passage&hellip;

  7. Guillaume says:

    Or you could *just* use Linux with its keyboard variants containing all the characters you need at the press of a few buttons.

    Ex: Copy = © (AltGR+C).

  8. Amar InfoTech says:

    Great post!. for more details about creative designing visit our website..

  9. Patricia says:

    Hmm I like your post but the heading is totally wrong; most of these examples you bring up are spelling crimes, not HTML crimes. Whether you use &mdash; or &ndash; has nothing to do with HTML, right?

    :-)

    / Patricia

  10. TechHello says:

    Wow, there's actually a few there I wasn't aware of – thanks Chris!

  11. Mike says:

    I couldn't agree more with the list. The sad thing is, that most of these should be common sense.

  12. DesignerPush says:

    Great article. These are more typography crimes than anything, but still worth considering. Thank you.

  13. Cosmin Negoita says:

    Thanks for the article Chris! Anyway, there are a lot of characters I don't use when creating a design :D

  14. Mike says:

    I think many of us as designers are guilty of all these. I know I was when I was just starting out. As someone mentioned above, more typography errors/mistakes than HTML.

    This is a very well put together post Chris! Thanks!

  15. Laurence Penney says:

    Of the ellipsis, you say “The spacing between the dots in the entity is much tighter than the standard spacing between three full stops or periods.”

    Not true! The ellipses in many good, common fonts have more generously spaced dots than three periods. Vincent Connare’s output seems to be exceptional in this regard…

    (Also in that para, there’s a redundant apostrophe in “has it’s own glyph”.)

  16. Drew says:

    Excellent coverage of an often overlooked area- especially the ellipses and quotes! Makes me want to go through my old code and fix all of my mistakes!

  17. lab says:

    Crime 11: Use of hand-made quotes at all instead of marking the quote with the correct html element q and doing the rest with css.
    This also prevents you from crime 10.5 &hellip; (<- curious how this will turn out)

  18. Paul says:

    Thanks for these!

  19. Ryan Gannon says:

    Todd: Daddy? Are you going to jail?
    Ned: We'll see, son. We'll see.

    Thankfully there is a WP plug-in that does these for me. I know I should still do it.

  20. Rushan says:

    Thanks Chris, a lot of new stuff here.

  21. Tim says:

    A List Apart actually has an excellent article written by Peter K Sheerin about when to use Em or En dashes.

    In short, Em dashes are for indicating a sudden break — like this one.

    En dashes are to indicate a range, like 20–25.

    The article can be found here: https://www.alistapart.com/articles/emen/

  22. nfroidure says:

    Thanks for this article. Just a word on numeric entities: they are XML compatible, so I recommend using them instead of named HTML entities.

    +1 to Julian Burgess, when in UTF8, using characters instead of entities will save your database integrity.

  23. Webdesign Eindhoven says:

    Good stuff! Thanks for sharing.

  24. ABDUL JANOO says:

    Thanks Chris it will be use full for my blog.

  25. dapad says:

    Yeah.You say right.
    Good nine.

  26. Julian Burgess says:

    You can just use normal UTF-8 characters, no need to encode except for the 5 in XML, & < > " '

    © ☃ …

    https://en.wikipedia.org/wiki/Character_encodings_in_HTML#XML_character_references

  27. Allen says:

    Great article! This is a good resource for HTML codes: https://www.ascii.cl/htmlcodes.htm

  28. brill design says:

    The Em and En hyphens I didn't know about! I've always used &ndash; for all dashes wherever they were… sorry, I mean &hellip;

    Thanks, Mr. Spooner!

  29. Goncalo Espinha says:

    Thanks!! I just leveled up on the "content selection and customization" skill

  30. amzad says:

    U r rocking

  31. Chris says:

    Great post! Here is an extensive list of special characters for HTML and Flash. Handy so you don't have to memorize them all! https://www.designshifts.com/special-character-code-for-html-and-flash/

  32. Laura says:

    Yep, I'm guilty of crime 10. And I'll be honest, it's just down to laziness! Great list Chris, this will make a useful quick reference page. Thanks.

  33. rohnn says:

    Crime 10.5 :
    using &ldquo; and &rdquo; when quoting in French.
    &laquo; and &raquo; should be used

  34. Matt says:

    A quick glance though an Oxford or Chicago style guide will explain the differences between the various dash lengths. Their uses have obvious functionalities which should be honoured, and not merely used for style.

    Why craft design and code and not the use of the written language?

  35. James says:

    You forgot;
    é – &eacute;
    à – &agrave;
    ö – &ouml;

  36. London Car Rental says:

    Good post, but I always check my code in the W3C Validator. If it passes, I continue my work; otherwise I stop.

  37. Henry says:

    Thanks, but I wonder why.

    Why use an en dash really when - looks just fine? Who cares?

  38. Madeline says:

    Thanks for the list. I usually use an editor to fix those issues automatically but sometimes it doesn't so I use w3c to find those mistakes.

  39. AJ says:

    I blame my text editor for not converting those ;)

  40. anyulled says:

    I have my own code to replace those "crimes" with the correct entities.

    Thanks a lot for your post.

  41. FlowerMountain says:

    That's great but what about CMSs ? How do you turn 3 dots into an &hellip; and make people use the right kind of dash?

    • FletcherAdams says:

      This is exactly what I was thinking. CMS's cannot be conditioned to replace characters with the correct entity.

      • Fen says:

        Well actually some editors (FCKEditor) do the work for you. If you don't have an RTF editor installed, you can always prune the input via PHP (preg_replace etc.).

  42. Ignacio Larrain says:

    What about UTF-8?
    Isn't it a valid alternative to not use html entities?

  43. Brad Czerniak says:

    What about just using the Unicode characters instead? Alt+0151 on the numeric keypad of an em dash —, Alt+0133 for an ellipsis… etc.

    • Avangelist says:

      If you use the unicode characters they will ultimately get converted by a parser and break.

      Basically – don't use them unless you can absolutely guarantee that your output is only going to be unicode based.

      • Chris Adams says:

        In 2000, that was a reasonable consideration but by now it's not worth doing anything other than fixing non-UTF-8-safe software. It's a simple, easy feature and it won't treat your global audience as second class.

  44. BerkshireKatie says:

    Very useful post. I often fall victim to the '&' incident!

  45. Greg McAusland says:

    Genius post mate, I'm embarrassed to say despite years in the industry I'm still typing …'s

    No more!

  46. Ilia says:

    I agree with most of these but…

    As someone for whom English is a second language I, frankly, find the existence of four different designations for a "line in the text" a bit insane. I mean I'm expected to remember the difference between what – a hyphen, two types of dashes and a minus – and that difference being only the length of the line. Yep, crazy :)

    • Avangelist says:

      That's understandable. But you have to remember that html is a language written in English so it carries with it certain idiosyncrasies.

      It gets worse when you start mixing the charsets.

  47. Avangelist says:

    Err… isn't the numeric value #000? or is that other than if you require the use of the ASCII equivalent?

    Another important thing to remember with these entities is that it also depends on your parser within your application, the charset specified and the MIME TYPE.

    A good start. Money symbols are another classic error you see everywhere. you'd think &pound; was obvious.

  48. Louis Gubitosi says:

    so obvious, yet so often overlooked. Great list of HTML characters, Chris!

  49. el says:

    And, of course, the characters don't appear properly in the comment.

    Curses!

  50. el says:

    Small correction: the straight quote and double straight quote (the non-curly characters) aren't supposed to be used in place of the prime and double-prime characters.

    Instead use &prime; and &Prime; to properly format 5′10″.

    Another small niggle: The hyphen character doesn't replace the proper &minus; character.

    Thanks for pointing out the importance of using the right character in the right place.

  51. Jamie says:

    Definitely just didn't read through my latest coding…

  52. Pete Fairhurst says:

    Crime 11: Using images to represent meaningful characters available in any browser you care to name.

  53. inPhocus Media says:

    This is definitely a good post, you make me want to go back and check my code and look for any of these errors.

  54. Saifu says:

    Very good post…very helpful…….thnk u chris..

  55. Tiffany Reed says:

    Awesome post, Chris. I am definitely guilty of a few of these… Oops! haha!

  56. Eric says:

    Thanks Chris, interesting post! This site usually helps me out with entities ;)
    https://www.symbolicode.com/

  57. Gonzo the Great says:

    Good reminders, I always check my code afterwards in the W3C Validator, because it always shows these "faults"! .. or should I write:
    &lsquo;faults&rsquo;

    Thanks for sharing, Cheers & Ciao ..

  58. Dan Moat says:

    Thanks for this Chris, I knew all of them but have to admit I have still continued to be guilty of a few – especially with dashes and quote marks which I always tend to get wrong even in print (where at least thankfully it's not my writing).

    The above hyphen being proof haha.