[personal profile] cdybedahl

[livejournal.com profile] osanna said: so what's the difference between HTML and XML?

Well. Largish question, and not entirely easy to answer in a way that's understandable for non-technical people. But I'll give it a shot.

Let's begin with what HTML is. HTML is a markup language. That means that it's a language used to put marks in text, marks that give some meaning or structure to the text. You can, as you probably know, mark bits of text as being a heading, or a paragraph, or an item in a list, or lots of other things.

So where does that put things like <IMG> and <FONT> that don't really say what a bit of text is, you say? Well, they don't really belong. They weren't there in the beginning. They appeared as the web grew in popularity and people started wanting not only to say what a bit of text was but also specify what it would look like to the reader. More and more such things were added, until we get to the mess we use today: HTML version 4.01.

XML comes from an attempt to clean up the mess. The main problem with HTML is that it is one tool that gets used for everything, and while you can pound in a screw with a hammer it's not really a good idea. Sometimes you need a screwdriver. So, rather than try to do the Super-Mega-Tool From Hell, the people who designed XML took a step back and made XML a tool with which to build other tools. It is a markup meta-language.

In practical terms, XML looks very much like HTML where you make up the tags yourself. "<kitchensink>Example</kitchensink>" is a valid XML fragment. Unlike HTML, an XML document has to follow a few fairly simple rules:

  1. All tags must balance. For every opening tag, there must be a closing tag. <IMG> must be followed by </IMG>. This is not as bad as it may sound, since there is a special shorthand for a tag pair with nothing in it. The IMG example here can also be written as <IMG/>.
  2. All tags must nest properly. You can't write <b><i>Something</b></i>, it must be <b><i>Something</i></b>.
  3. All tag attributes must have values, and those values must be surrounded by quotes. You can't say <hr noshade>, instead you have to say <hr noshade="yes">.

There are a few more, but they quickly get into the territory where about the only people who have to care are those writing programs that use XML. An XML file that follows the rules is called a well-formed XML document. If a document isn't well-formed, it's not XML.
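For the curious, the three rules can be tried out with a few lines of Python. This is only a minimal sketch, assuming the XML parser from Python's standard library as a stand-in for "a program that uses XML"; the helper name is_well_formed is made up for this example. Note that the last fragment is written with the <hr .../> shorthand, since a lone <hr noshade="yes"> would break rule 1 when tested on its own.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the standard-library parser accepts the fragment."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# Pairs of fragments: one that breaks a rule, one that follows it.
examples = [
    "<kitchensink>Example</kitchensink>",  # made-up tag, perfectly fine
    "<IMG>",                               # rule 1: no closing tag
    "<IMG></IMG>",                         # rule 1: balanced
    "<IMG/>",                              # rule 1: shorthand for an empty pair
    "<b><i>Something</b></i>",             # rule 2: badly nested
    "<b><i>Something</i></b>",             # rule 2: properly nested
    "<hr noshade>",                        # rule 3: attribute without a value
    '<hr noshade="yes"/>',                 # rule 3: quoted value
]

for fragment in examples:
    verdict = "well-formed" if is_well_formed(fragment) else "NOT well-formed"
    print(f"{fragment:40} {verdict}")
```

The parser neither knows nor cares what a kitchensink is. All it checks is the form.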

That's it for the form. What about the content? If you can make up the tags yourself, how does anything know what to do with it?

That's a question with two answers. The first answer is "DTD". The second answer is "CSS". Let's take one at a time.

A DTD, or Document Type Definition, describes what tags are allowed in a document and how they may be arranged relative to each other. The DTD I use for my fanfic, for example, specifies that a <para> element may contain <line> elements, and that <line> elements may contain either pure text or <em> tags. <em> tags in their turn may only contain pure text. An XML document that adheres to the rules in a certain DTD is said to be valid according to that DTD. So my fanfic files are valid according to my fanfic DTD, but they sure aren't valid according to the XHTML DTD. The point of having a document conform to a DTD is that it becomes easy to process it by computer. It's also a help for the writer of the document, since the program you use to write the document can complain if you forget something mandatory or put stuff in an order that doesn't make sense.
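To make that a bit more concrete, here is a rough sketch of what the three fanfic rules could look like, and of the kind of checking a program can do with them. Everything in it is an assumption made for illustration: the DTD text is a guess at how those rules might be written down (it is not the real fanfic DTD, and requiring at least one <line> per <para> is my own addition), and since Python's standard library does not validate against DTDs, the check is done by hand rather than by a real validating parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical DTD fragment expressing the three rules. Shown for
# illustration only; the code below never parses it.
FANFIC_DTD = """
<!ELEMENT para (line+)>
<!ELEMENT line (#PCDATA | em)*>
<!ELEMENT em   (#PCDATA)>
"""

# The same rules as plain Python data.
ALLOWED_CHILDREN = {"para": {"line"}, "line": {"em"}, "em": set()}
TEXT_ALLOWED = {"para": False, "line": True, "em": True}
NEEDS_A_CHILD = {"para"}  # from the assumed "line+" above


def has_text(value):
    """True if the string holds something other than whitespace."""
    return bool(value and value.strip())


def follows_rules(elem):
    """Hand-written stand-in for DTD validation of one element tree."""
    if elem.tag not in ALLOWED_CHILDREN:
        return False
    if elem.tag in NEEDS_A_CHILD and len(elem) == 0:
        return False
    if not TEXT_ALLOWED[elem.tag] and has_text(elem.text):
        return False
    for child in elem:
        if child.tag not in ALLOWED_CHILDREN[elem.tag]:
            return False
        # child.tail is the text that follows the child inside elem.
        if not TEXT_ALLOWED[elem.tag] and has_text(child.tail):
            return False
        if not follows_rules(child):
            return False
    return True


good = ET.fromstring("<para><line>She said <em>no</em>.</line></para>")
bad = ET.fromstring("<para><em>Emphasis outside a line</em></para>")
print(follows_rules(good), follows_rules(bad))
```

Both documents are well-formed, but only the first one follows the rules. That is the difference between well-formed and valid.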

A Cascading Style Sheet, on the other hand, is a way to hang formatting information on the tags of an XML document. By providing a CSS file for your XML document, you can have sufficiently clever browsers display nicely formatted versions of your document to the viewers. It doesn't matter that you made up all the tags yourself. As long as you tell the browser that the content of a <kitchensink> tag should be displayed in a grey box with red borders using ten-point Times Roman, the browser knows what to do with it. If you're using Mozilla or one of its children (Netscape 6/7, Galeon, Phoenix), you can look at one of my fics and see how it works. Follow the link to see the rendered version, then do "View Source" to see the actual XML code. It works a little in IE5+, but not well.
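As a small illustration of the kitchensink example, here is a sketch in Python that writes out an XML file and a style sheet to go with it. The file names, the exact shade of grey and the border width are all assumptions made up for this example. The link between the two files is the xml-stylesheet line at the top of the XML document, which is the standard way to point a browser at a style sheet.

```python
from pathlib import Path
import xml.etree.ElementTree as ET

CSS_NAME = "kitchensink.css"  # hypothetical file name

# The XML document. The second line tells the browser where the CSS is.
XML_DOC = (
    '<?xml version="1.0"?>\n'
    f'<?xml-stylesheet type="text/css" href="{CSS_NAME}"?>\n'
    "<kitchensink>Example</kitchensink>\n"
)

# Grey box, red border, ten-point Times Roman. "display: block" is needed
# because a browser has no built-in idea of what a kitchensink is.
CSS_RULES = """\
kitchensink {
  display: block;
  background-color: #cccccc;
  border: 2px solid red;
  font-family: "Times New Roman", Times, serif;
  font-size: 10pt;
}
"""


def write_demo(directory):
    """Write both files into the directory and return the XML file's path."""
    target = Path(directory)
    target.mkdir(parents=True, exist_ok=True)
    xml_path = target / "kitchensink.xml"
    xml_path.write_text(XML_DOC, encoding="utf-8")
    (target / CSS_NAME).write_text(CSS_RULES, encoding="utf-8")
    return xml_path


# Sanity check: the document itself must still be well-formed XML.
ET.fromstring(XML_DOC)
```

Calling write_demo("demo") and opening the resulting kitchensink.xml in a browser that supports CSS for XML should show the styled box.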

So. To return to the original question: The difference between XML and HTML is that you can use XML to build something that is very much like HTML. Which has been done, of course, and the result is called XHTML. XHTML is almost exactly what you get if you apply the three XML rules I gave above to plain old HTML 4.01. Except that XHTML is much easier to handle with a program, and it is possible to extend it with new functionality without messing up the stuff that's already there.

I hope that answers your question :-)

(no subject)

Date: 2002-11-13 09:09 pm (UTC)
From: [identity profile] osanna.livejournal.com
Ah! Okay, I get it. XML sounds like it would ultimately be more user efficient if one understands the basics of HTML coding.

(no subject)

Date: 2002-11-14 08:57 pm (UTC)
From: [personal profile] minim_calibre
See: second-to-last-or-so LJ entry. It should explain the tears of laughter streaming down my face.

(no subject)

Date: 2002-11-14 10:51 pm (UTC)
From: [identity profile] cdybedahl.livejournal.com
If you mean the Schrödinger's Lesbian entry, it was one of the first I read after writing all this. It was quite surreal.

(no subject)

Date: 2002-11-14 11:26 pm (UTC)
From: [personal profile] minim_calibre
That would be the one.

I work with XML and .ascx files all day, and they seem to have murdered my brain. Which is to say, the high point of my day was reducing the page weight by a whole 1k after I stripped out a hell of a lot of white space.

Which was pretty much the point at which I realized I should get out more.

(no subject)

Date: 2002-11-15 05:59 am (UTC)
From: [identity profile] cdybedahl.livejournal.com
This "out" thing, does that involve the very large room with the grey ceiling that it falls water from?

(no subject)

Date: 2002-11-15 07:24 am (UTC)
From: [personal profile] minim_calibre
That would be the one.
