HTML Style Guide
How to write HTML that's correct, readable, and elegant.

Table of Contents

  1. Preamble
  2. Introduction
  3. Consistency
  4. The case: UPPER vs. lower.
  5. Open, open, close, close.
  6. To "quote" or not to quote.
  7. <P>
  8. <BR>
  9. Tables
  10. Forms
  11. Pixel GIFs
  12. The End

Preamble
With the current influx of new staff, the glimmering potential for new employees, and the insufficient time spent training veteran workers, I've decided to write a document detailing a series of HTML coding conventions. (Also, I'm planning on turning this into a book.)

Introduction
In the world of computer programming, one of the major problems encountered is the dilemma of maintenance. Specifically, what happens when a programmer works on a piece of software, but then leaves the company? If changes need to be made, who can make them? All the knowledge about the inner workings and details of the code is essentially gone forever.

Computer code, when formatted arbitrarily and without comments, is surprisingly difficult to read: people cannot glance through the lines and easily discover where needed additions or corrections should go. Even with standard conventions and styles, this is still a formidable task. HTML is not as complex to decipher as a computer programming language like C or C++; nevertheless, it is still easy to create markup that will befuddle all but the most experienced of people, yourself included.

At Student.Net Publishing, many people are involved in the process of publishing an article. Freelancers write the article, sectional editors suggest changes and place the article into ADAPS, managing editors make further revisions, and the production staff adds in fancy block quotes and headlines. And this is just the life cycle of a typical article, to say nothing of special pieces and complex packages. While ADAPS admirably handles the exchange of the document between each party, no piece of software can ensure that when a person tries to edit an article, they understand why previous people have chosen to format the document in their own special ways.

In order to fix this problem, here are solid guidelines (along with the best of my tips) that future articles, polls, features, section fronts, and the like should uphold. I hope you'll find them helpful and gain a stronger understanding of the foundation of HTML as you read through the guide.

Consistency
Without a doubt, the most important rule in coding an HTML document is to be consistent. When I review a person's HTML, the first thing I check is: "Does this person mark up each article in the same manner?" This is crucial! If you can uphold this rule, you are well on your way to becoming a great HTML coder. Why? Because I (and you) will be able to fix your mistakes quickly! If you do everything the same way, it is pretty simple to scan down a page looking for things that are "strange." If an article is coded haphazardly, you need to check through every single character for errors. This is a slow process and highly prone to error.

If you follow my suggestions, I think you'll discover that the code you write will be significantly more readable. Also, when your co-workers come to edit your articles, they'll understand what you're doing and be able to make changes even if you're out of the office.

The case: UPPER vs. lower.
While HTML is case-insensitive, you should always use upper case letters when coding, with the only exceptions being ALT text, URLs, and other case-sensitive names, like fonts. Since the majority of a page's text is in mixed case, upper case HTML tags will stand apart visually from the content. To wit:

<FONT SIZE=7>Snow White and the Seven Dwarfs</FONT><BR>
<FONT FACE="Arial, Helvetica" SIZE=3><I>Looking-glass, looking-glass, on the wall, who in this land is the fairest of all?</I></FONT><BR>
<P>
compared to
<font size=7>Snow White and the Seven Dwarfs</font><br>
<font face="Arial, Helvetica" size=3><i>Looking-glass, looking-glass, on the wall, who in this land is the fairest of all?</i></font><br>
<p>
or, heaven forbid,
<Font Size=7>Snow White and the Seven Dwarfs</Font><Br>
<Font Face="arial, helvetica" Size=3><i>Looking-glass, looking-glass, on the wall, who in this land is the fairest of all?</I></Font><Br>
<P>
Using all capital letters helps create HTML that is readable and easy to edit.

Open, open, close, close.
This one is short, but important. If you have two open tags, for example because you made some text both bold and italic, you should close the tag you opened second first. That's:

<B><I>And all the straw the silly pig had heaped against some thin poles, fell down in the great blast.</I></B>
not
<B><I>And all the straw the silly pig had heaped against some thin poles, fell down in the great blast.</B></I>
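
The same rule extends to any depth of nesting: tags close in the reverse of the order in which they were opened. A small sketch with three tags (the sentence and the choice of tags are only illustrative, not taken from a real article):

```html
<B><I><U>Then I'll huff, and I'll puff, and I'll blow your house in.</U></I></B>
```

Reading inward from both ends, the pairs match up: <B> with </B>, <I> with </I>, and <U> with </U>.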

To "quote" or not to quote.
You've probably seen people write attributes in two different ways: VALUE=FOO and VALUE="FOO". According to Section 3.2.2 (Attributes) of the HTML 4.0 specification:

By default, SGML [The parent markup language of HTML] requires that all attribute values be delimited using either double quotation marks (ASCII decimal 34) or single quotation marks (ASCII decimal 39)....

In certain cases, authors may specify the value of an attribute without any quotation marks. The attribute value may only contain letters (a-z and A-Z), digits (0-9), hyphens (ASCII decimal 45), and periods (ASCII decimal 46). We recommend using quotation marks even when it is possible to eliminate them.

Accordingly, I recommend using double quotes for all attributes, as nobody ever uses single quotation marks. The only time you can skip the double quotes is when the attribute value (that's the right side of the pair) is, in addition to meeting the above requirements, a case-insensitive token. By token, I mean one of the values prespecified as part of the HTML specification. So, for an <IMG> tag, you can write ALIGN=LEFT, ALIGN=RIGHT, ALIGN=ABSMIDDLE, or whichever of the various ALIGN values excites you, without double quotes. But when you're giving the SRC attribute, you have to write SRC="foo.gif" and not SRC=foo.gif, even if you're only using letters, numbers, and periods.

This is mainly to protect against problems with case sensitivity. ALIGN=LEFT and ALIGN=left have an identical meaning, but SRC=FOO.GIF and SRC=foo.gif can have very different meanings. By putting the value in double quotes, you provide an extra measure of protection against confusion. Additionally, since attributes like SRC usually require their values to be inside double quotes, you should always place them there for consistency. For instance, SRC=/foo.gif would be invalid HTML; you have to write SRC="/foo.gif". So, it's best to get into habits that won't cause trouble.
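
Put together, an image tag that follows this rule might look like the sketch below. The file name and ALT text are made up for illustration; the point is only which values carry quotes.

```html
<IMG SRC="/images/queen.gif" ALT="The Queen and her looking-glass" ALIGN=LEFT>
```

SRC holds a case-sensitive path and ALT holds free text, so both are quoted. LEFT is one of the fixed values for ALIGN, so it may go without quotes, although ALIGN="LEFT" is equally valid.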

<P>
Between paragraphs of text, you should place one <P>, as such:

The Father took some strong cord and his hatchet, and ran quickly to the cottage, and got there just in time to catch the wolf, so he tied him up with cord, and killed him. Then he took Little Red Riding Hood home to her Mother, and oh, how glad she was to be there, at home, where she knew she was quite safe.
<P>
And the woodman took the skin of the wolf and made it into a hearthrug, and every time Little Red Riding Hood saw it she thought of her adventure, and so she tried not to forget what Mother told her, and was good and happy ever after.
This mimics the "two returns" many people put between paragraphs. It also makes it easy for someone to scan down a page looking to edit a sentence in, say, the fourth paragraph. Putting the <P> at the end of the paragraph is bad for a variety of reasons, besides scannability.
  1. Since lines sometimes scroll off the window in the editor, it makes it difficult to check if the <P> is in the document. You end up scrolling back and forth like you're watching a tennis match with very long rallies.
  2. Semantically, just as you put <I> and </I> tags around a piece of text you wish to italicize, you put <P> and </P> tags around a paragraph. After all, that's what the P stands for, right? But, since it's such a pain to do this, you're allowed to skip the </P>. The closing </P> tag is inferred to exist by the browser whenever it encounters the next <P>. So, if anything, you should be putting the <P> at the beginning of each paragraph, not the end. With the kingdom of cascading style sheets closing in around us, the sloppy usage of <P>s will be revealed. Don't bring shame upon yourself and your company!

    Note that you can use both opening and closing <P> tags, but since the </P> tag usually inserts a space similar to the <P>, you'll eventually run into instances where you wish it didn't. So, you'll be stuck with either space where you don't want it, or you'll have to omit the </P> for that paragraph. Quick! What's the most important rule of HTML coding? Okay then. Now you know why I don't tell everyone to put </P>s at the end of each paragraph.

  3. Also, note that I said you should put one <P> between paragraphs. For the record, one does not equal two, or three, or four, or whatever number the Microsoft Word HTML exporter decides one should equal at that given second. One equals one. Don't be caught making a first grader's addition error, or you'll be given a corporate number line to count with.
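
The points above can be sketched side by side. The story text is placeholder content, and the comments are there only to label the two halves:

```html
<!-- Recommended: exactly one <P>, on a line by itself, between paragraphs. -->
Once upon a time there was a dear little girl.
<P>
Her grandmother gave her a little cap of red velvet.

<!-- Avoid: a <P> hidden at the end of a line, and more than one <P> in a row. -->
Once upon a time there was a dear little girl.<P>
<P>
<P>
Her grandmother gave her a little cap of red velvet.
```

In the first half, a quick scan down the left edge shows every paragraph break. In the second, one break is easy to miss and the extra <P>s add nothing but confusion.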

<BR>
If you have two lines of text that are separated by a <BR>, for example our headline and subhead, place the <BR> at the end of the line.

Now, I hear you asking: "Wait! Didn't you just tell me that putting the <P> at the end of the paragraph was bad? How come it's okay for the <BR>?"

Well, that's a very good question. Here are my answers:

  1. Since there's no space between the lines separated by a <BR> when the browser renders the file, why should there be one when you look at the text?
  2. The <BR> is an "empty element" in the world of HTML. Empty elements (like the <HR> and the <BR>) don't have any content, so they don't require closing tags. This avoids the whole <P> and </P> problem we had before.
  3. While HTML browsers are supposed to ignore line feeds immediately before and after a tag when displaying a page, they don't. As a result, if you're trying to line up a bunch of images on top of each other, or some other acrobatic feat of daring HTML skill,
    <IMG SRC="top.gif" ALT="top" WIDTH=10 HEIGHT=10><BR>
    <IMG SRC="bottom.gif" ALT="bottom" WIDTH=10 HEIGHT=10>
    does not display the same as
    <IMG SRC="top.gif" ALT="top" WIDTH=10 HEIGHT=10>
    <BR>
    <IMG SRC="bottom.gif" ALT="bottom" WIDTH=10 HEIGHT=10>
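    To wit:
    [Images not reproduced: on the left, the two stacked images as rendered from the first snippet; on the right, the same two images as rendered from the second snippet. The ALT text of each image reads "Me, uh... Dave!"]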
    See the thin white line between the images on the right? If you don't, you're either blind or using a good browser. Let's just say that if you can't see the space, other people can and you should always make sure that people see what you want them to.
  4. Besides causing problems with image alignment, moving the <BR> to the next line also disturbs text alignment when coupled with the <FONT> tag. As before:
    <FONT SIZE="-1">Peter, Peter, pumpkin eater,<BR>
    Had a wife and couldn't keep her.<BR>
    He put her in a pumpkin shell<BR>
    And there he kept her, very well.</FONT><BR>
    does not display the same as
    <FONT SIZE="-1">Peter, Peter, pumpkin eater,<BR>
    Had a wife and couldn't keep her.<BR>
    He put her in a pumpkin shell<BR>
    And there he kept her, very well.</FONT>
    <BR>
    Look:
    Peter, Peter, pumpkin eater,
    Had a wife and couldn't keep her.
    He put her in a pumpkin shell
    And there he kept her, very well.

    and Peter, Peter, pumpkin eater,
    Had a wife and couldn't keep her.
    He put her in a pumpkin shell
    And there he kept her, very well.

Tables
Tables can become very messy, especially when you use them in ways they weren't intended to be used. So, let me share a few lessons I have learned from my battles.

  1. The browser won't begin to render cells within a table until it has parsed the entire table. What does this mean? Well, in our case, since most of our templates force you to place your content within a <TD> cell, you shouldn't put a lot of hard-to-render material inside of them. Large pictures and movies won't slow the immediate display of the page as much as many nested tables. This leads us to:
  2. Nested tables are bad and evil! They cause all sorts of browser conniptions and, besides, most of the time, there is a much simpler (not to mention faster) way to accomplish your goal. If you try to put more than 2 or 3 tables inside another, your page will take so long to load, most people will think their browser has crashed.
  3. Forgetting to close your table tags will cause your page to display in one of two ways.
    1. As a completely blank screen.
    2. As a giant, messy, weirded-out acid trip.
    Both are ugly. While technically, <TD>s and <TR>s should imply a closing tag, like the <P> does, Netscape doesn't believe this and, therefore, you shouldn't either.
  4. To make sure you don't forget any tags, you should format your tables nicely. This is the best way to do it:
    <TABLE CELLPADDING=0 CELLSPACING=0 BORDER=0 WIDTH="100%">
    <TR>
        <TD>to-day I bake,</TD>
        <TD>to-morrow brew,</TD>
    </TR>
    <TR>
        <TD COLSPAN=2>the next I'll have the young queen's child.</TD>
    </TR>
    <TR>
        <TD>Ha, glad am I that no one knew</TD>
        <TD>that Rumpelstiltskin I am styled.</TD>
    </TR>
    </TABLE>
    Just as putting a <P> on a line by itself helps you skim your HTML for paragraphs, indenting your <TD>s helps you skim for cells. From an aesthetic point of view, I've often wanted to indent twice, once for each <TR> and again for each <TD>. Unfortunately, this causes your document to drift across the page like it's desperately running away from the left margin. Once you see that this current of tabs only grows with each subtable, you'll quickly realize this isn't a great idea. Look:
    <TABLE>
        <TR>
            <TD>Is your name Conrad? No. Is your name Harry? No.</TD>
            <TD><TABLE>
                <TR>
                    <TD>Perhaps your name is Rumpelstiltskin?</TD>
                    <TD>The devil has told you that!</TD>
                </TR>
            </TABLE></TD>
        </TR>
    </TABLE>
    Enough said.
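
As item 2 suggests, there is often a simpler route than nesting. One possible rewrite of the nested example above is a single table with three cells, formatted in the recommended style. This assumes the inner table was only there to split the second cell in two; a layout that truly depends on the inner table would need a different approach.

```html
<TABLE>
<TR>
    <TD>Is your name Conrad? No. Is your name Harry? No.</TD>
    <TD>Perhaps your name is Rumpelstiltskin?</TD>
    <TD>The devil has told you that!</TD>
</TR>
</TABLE>
```

With one table instead of two, there is only one set of closing tags to keep track of, and the indentation stays where it started.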

Forms
Argh! Forms will drive you crazy if you're the type of person that wants to nestle a form gently within your document without wanting some extra space before and after the form. If you just place a form within a document, you end up with a page that looks like this:

Type in your wish and it will come true
Isn't this nice?
You can suppress the opening space by hiding the form in a table, but this won't work for the second space.
Type in your wish and it will come true
Isn't this nice?
So, you may have to resort to some stomach churning HTML.
Type in your wish and it will come true
Isn't this nice?
Type in your wish and it will come true
<TABLE CELLPADDING=0 CELLSPACING=0 BORDER=0>
<TR>
<TD><FORM METHOD=GET ACTION="kazam.cgi"><INPUT TYPE="text" NAME=""> <INPUT TYPE="submit" VALUE="Grant my wish."></TD>
</TR>
</TABLE>
Isn't this nice?
</FORM>
Blech! Don't tell anybody I told you this; the HTML police may come and arrest me.

Pixel GIFs
It's really not nice to sling invisible GIFs all over your page to provide you with spacing.
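
As a rough illustration of the habit in question and one possible alternative, the sketch below uses the HSPACE and VSPACE attributes of <IMG> in place of a spacer image. The file names spacer.gif and wolf.gif are hypothetical, and this is only one option, not necessarily the fix the guide has in mind.

```html
<!-- Avoid: an invisible GIF used purely as padding. -->
<IMG SRC="spacer.gif" ALT="" WIDTH=10 HEIGHT=1><IMG SRC="wolf.gif" ALT="The wolf" WIDTH=100 HEIGHT=80>

<!-- One alternative: let the image carry its own horizontal and vertical space. -->
<IMG SRC="wolf.gif" ALT="The wolf" WIDTH=100 HEIGHT=80 HSPACE=10 VSPACE=5>
```

Note that HSPACE adds space on both the left and right of the image and VSPACE adds it both above and below, so the two versions are not pixel-for-pixel identical.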

The End
That's it for now.


adam trachtenberg
adam@student.net
may twenty seven, nineteen ninety eight