XHTML Archives

Converting HTML to XML

HTML and XML are two distinct branches in the tree of markup languages. You can’t simply replace HTML with XML. XML can be viewed as a generalized form of HTML, but even that is imprecise. You mainly use HTML to display data, and XML to carry (or store) the data.


There are several reasons and scenarios in which we need to control the access of web robots (also called web crawlers or simply spiders) to our website. Just as Googlebot (Google’s spider) visits our website, spam bots will visit too. Spam bots usually visit to collect private information from our website. When a robot crawls our website, it also uses a considerable amount of the website’s bandwidth. It is easy to restrict the access of web robots to our website with a simple ‘robots.txt’ file.

Creating a robots.txt:

Open a new file in any text editor, such as Notepad. Once the rules are written, save it as ‘robots.txt’ in the root directory of the website.

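The rules in the robots.txt file are entered as ‘field’: ‘value’ pairs, one pair per line.
<field>:<value>

<field>

The name of the directive. The two basic fields are User-agent, which names the robot that the following rules apply to (* matches every robot), and Disallow, which blocks a URL path. Many crawlers also honor an Allow field.

<value>

For User-agent, the name of the robot. For Disallow, the URL path that the rule applies to.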

Eg.:

User-agent: *
Disallow: /private-folder
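To make the field/value structure concrete, here is a minimal Python sketch that splits robots.txt text into (field, value) pairs. The function name parse_rules is made up for this illustration, and it deliberately leaves out what a real crawler adds on top, such as grouping rules by User-agent.

```python
def parse_rules(text):
    """Split robots.txt text into (field, value) pairs (illustrative helper)."""
    rules = []
    for raw_line in text.splitlines():
        # Everything after '#' is a comment in robots.txt.
        line = raw_line.split("#", 1)[0].strip()
        if not line or ":" not in line:
            continue  # skip blank lines and lines that are not field:value pairs
        field, value = line.split(":", 1)
        # Field names are case-insensitive, so normalize them to lowercase.
        rules.append((field.strip().lower(), value.strip()))
    return rules


if __name__ == "__main__":
    sample = "User-agent: *\nDisallow: /private-folder\n"
    for field, value in parse_rules(sample):
        print(field, "->", value)
```

Run on the example above, this yields one pair for the User-agent line and one for the Disallow line.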

Samples:

If we want to exclude all search engine robots from indexing our entire website, we enter the following into the robots.txt file:

User-agent: *
Disallow: /

If we want to exclude all the bots from a certain directory within our website, we would write the following:

User-agent: *
Disallow: /aboutme/

For multiple directories, we add one Disallow line per directory:

User-agent: *
Disallow: /aboutme/
Disallow: /stats/

Access to specific documents can also be specified:

User-agent: *
Disallow: /myFolder/name_me.html

If we want to disallow a specific search engine bot from indexing our website, we name it in the User-agent field:

User-agent: Robot_Name
Disallow: /
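One way to check how such rules behave is Python's standard urllib.robotparser module. The sketch below combines the samples above into a single file and asks whether a given robot may fetch a given path. The helper name is_allowed and the bot name Googlebot in the usage lines are only illustrations, and other crawlers may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# The sample rules from above, combined into one robots.txt.
ROBOTS_TXT = """\
User-agent: Robot_Name
Disallow: /

User-agent: *
Disallow: /aboutme/
Disallow: /stats/
Disallow: /myFolder/name_me.html
"""


def is_allowed(user_agent, path, robots_txt=ROBOTS_TXT):
    """Return True if user_agent may fetch path under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse from text, no network access
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    print(is_allowed("Robot_Name", "/index.html"))          # blocked everywhere
    print(is_allowed("Googlebot", "/index.html"))           # not covered by any rule
    print(is_allowed("Googlebot", "/aboutme/photo.html"))   # inside a blocked directory
```

Keep in mind that this only tells us what a well-behaved robot would do; nothing in the file forces a robot to obey it.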

Advantages of Using Robots.txt:

  • Avoids wastage of server resources
  • Saves bandwidth
  • Removes clutter and complexity from web statistics, which gives smoother analytics
  • Lets us refuse specific robots

Common Errors and Mistakes in robots.txt:

  1. It is not guaranteed to work
    Obeying robots.txt is voluntary, and badly behaved bots simply ignore it. To actually enforce access control on an Apache server, use an .htaccess file in combination with .htpasswd instead.
  2. It is not a method to protect secret directories
    Anyone can read robots.txt, bots and people alike. Don’t list any secret directories or files in it.
  3. Only one directory/file per Disallow line
    Rules are singular: each field takes exactly one value.
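The one-value-per-line rule is easy to get right if the file is generated instead of typed by hand. Here is a minimal sketch, assuming a hypothetical helper called build_robots_txt that takes a robot name and a list of paths and writes one Disallow line per path.

```python
def build_robots_txt(user_agent, disallowed_paths):
    """Build robots.txt text with exactly one path per Disallow line."""
    lines = ["User-agent: " + user_agent]
    for path in disallowed_paths:
        # Never join several paths on one line; each gets its own rule.
        lines.append("Disallow: " + path)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_robots_txt("*", ["/aboutme/", "/stats/"]), end="")
```

For the two directories used earlier, this prints the same three-line file shown in the multiple-directories sample.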

What is XHTML

XHTML

XHTML stands for EXtensible HyperText Markup Language.

Features
• Stricter and cleaner version of HTML.
• Combination of HTML and XML (EXtensible Markup Language).
• XHTML consists of all the elements in HTML 4.01, combined with the strict syntax of XML.

Difference between HTML and XHTML

• XHTML elements must be properly nested

Eg: <b><i>This text is bold and italic</i></b>

• XHTML elements must always be closed

Eg: <p>This is another paragraph</p>

Empty elements must be closed too, for example a line break: <br />

• XHTML elements must be in lowercase

Eg: <body>

<p>This is a paragraph</p>

</body>

• XHTML documents must have one root element

All other elements must be nested inside the <html> root element.
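Most of the rules above (proper nesting, closing every element, a single root element) are plain XML well-formedness rules, so any XML parser can check them. Below is a minimal sketch using Python's standard xml.etree.ElementTree; the helper name is_well_formed is made up for this illustration. Note that an XML parser does not enforce the lowercase rule, since uppercase names are still well-formed XML.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if markup parses as a single well-formed XML element tree."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    samples = [
        "<b><i>This text is bold and italic</i></b>",  # properly nested
        "<b><i>This text is bold and italic</b></i>",  # wrongly nested
        "<p>This is another paragraph",                # never closed
        "<p>A break: <br /></p>",                      # empty element closed
        "<p>one</p><p>two</p>",                        # two root elements
    ]
    for sample in samples:
        print(is_well_formed(sample), sample)
```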

Mandatory XHTML Elements

All XHTML documents must have a DOCTYPE declaration. The html, head, title, and body elements must be present.

This is an XHTML document with a minimum of required tags:

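<!DOCTYPE Doctype goes here>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Title goes here</title>
</head>
<body>
</body>
</html>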

XHTML vs HTML

The main difference between XHTML and HTML is that XHTML is very strict! Carelessness that an HTML browser would quietly forgive is not tolerated here.

All tags must be closed! Empty as well as non-empty elements must be closed. For example:

<p>This is a paragraph</p>

<br /> break (Note: please don’t forget the space before the ‘/’; without it, some older browsers may not handle the tag correctly.)

XHTML is a combination of HTML and the strict syntax of XML.

Main points to remember regarding XHTML are:

  1. All elements must be properly closed and nested
  2. All elements must be in lowercase
  3. All documents must have a root element
  4. DOCTYPE declaration, <head> and <body> sections must be specified.
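Before reaching for a full validator, a quick local check for the mandatory parts can be sketched with Python's standard xml.dom.minidom. The helper name missing_xhtml_parts is made up for this illustration, and the XHTML 1.0 Strict doctype in the sample is just one possible choice. This only checks that the pieces are present; it does not validate the page against the DTD.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# Sample page; the Strict doctype is an assumption picked for the example.
MINIMAL_XHTML = """<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Title goes here</title>
</head>
<body>
</body>
</html>
"""


def missing_xhtml_parts(markup):
    """List the mandatory XHTML parts that are absent from markup."""
    try:
        doc = minidom.parseString(markup)
    except ExpatError:
        # Not even well-formed, so nothing else can be checked.
        return ["well-formed markup"]
    missing = []
    if doc.doctype is None:
        missing.append("DOCTYPE")
    if doc.documentElement.tagName != "html":
        missing.append("html")  # the root element must be <html>
    for name in ("head", "title", "body"):
        if not doc.getElementsByTagName(name):
            missing.append(name)
    return missing


if __name__ == "__main__":
    print(missing_xhtml_parts(MINIMAL_XHTML))
    print(missing_xhtml_parts("<html><head></head><body></body></html>"))
```

An empty list means every mandatory part was found.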

Have an XHTML page ready and wondering whether you have got everything right? Simply run it through the following validator. Easy!

http://validator.w3.org/