Tools To Make Tag Soup Healthy


Jul. 12



Anybody who has ever attempted to build a web page knows what tag soup is. It is the innocent crime that most of us commit without meaning to. Tag soup is an ill-structured, invalid HTML file: because such a file cannot be relied on to produce consistent output, it is nothing more than a soup of tags. Early web standards did not specify how browsers should handle invalid HTML, so many of us ended up writing invalid files, or tag soup, without noticing. Present-day browsers include what is called a tag soup parser, which detects and parses even an invalid HTML file.

A tag soup essentially refers to mistakes such as:

  • Incomplete tags
  • Mismatched tags
  • Improper styling (files lacking proper indentation)
  • Incorrect use of escape characters
  • Use of proprietary HTML extensions
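
Of the mistakes listed above, incorrect escaping is the easiest to avoid programmatically. The following is a minimal sketch using `html.escape` from Python's standard library; the helper name `safe_paragraph` is made up for this illustration.

```python
import html


def safe_paragraph(text):
    """Wrap plain text in a <p> element, escaping characters HTML treats as markup."""
    # html.escape replaces &, < and > (and, by default, quote characters)
    # with their character references.
    return "<p>" + html.escape(text) + "</p>"


print(safe_paragraph("a < b & c"))
```

Escaping at the point where text is inserted into markup keeps a stray `<` or `&` from being read as the start of a tag or a character reference.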

Here is a small example of a very unhealthy and painful tag soup:



<title>Lets nurture</title


<h1> This is the case of mismatched tags</h2>

<img src="letsnurture.jpg", border="5>



As a general practice, on receiving unexpected output from a file like that, one would go back to the HTML file, manually scan the document to find the invalid or missing tags, correct them and try to run it again. But with long HTML files, or sites made up of many pages (which is usually the case), this is a rather long and tiring process.
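
Part of that manual scan can be automated. The sketch below uses Python's standard `html.parser` module to report closing tags that do not match the element currently open, plus elements that are never closed. It is only an illustration: `TagChecker` and `find_tag_problems` are hypothetical names, and the check is far simpler than what a real validator does.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag in HTML.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class TagChecker(HTMLParser):
    """Hypothetical helper: records closing tags that do not match the open element."""

    def __init__(self):
        super().__init__()
        self.open_tags = []   # stack of (tag, line) for elements still open
        self.problems = []    # human-readable findings

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.open_tags.append((tag, self.getpos()[0]))

    def handle_endtag(self, tag):
        line = self.getpos()[0]
        if self.open_tags and self.open_tags[-1][0] == tag:
            self.open_tags.pop()
        else:
            expected = self.open_tags[-1][0] if self.open_tags else "nothing"
            self.problems.append(
                f"line {line}: found </{tag}> but expected to close {expected}")


def find_tag_problems(markup):
    """Return a list of problem descriptions for the given HTML string."""
    checker = TagChecker()
    checker.feed(markup)
    checker.close()
    # Anything left on the stack was opened but never properly closed.
    for tag, line in checker.open_tags:
        checker.problems.append(f"line {line}: <{tag}> is never closed")
    return checker.problems


for problem in find_tag_problems("<h1> This is the case of mismatched tags</h2>"):
    print(problem)
```

Run against the mismatched heading from the example, it reports both the stray `</h2>` and the `<h1>` that is left open.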

HTML5 comes to the rescue at this point. Web browsers that are HTML5 compatible are able to handle tag soup. HTML5 is backward compatible in the sense that it supports HTML4 markup while also adding many new features. HTML5 also lays down rules for parsing HTML, including invalid HTML, which earlier versions did not provide.

Apart from HTML5, there are a few tools that help fix tag soup. Let’s look at each of them briefly:


HTML Tidy

HTML Tidy, developed by Dave Raggett of the World Wide Web Consortium (W3C), is a library whose source code is written in ANSI C. It is available for a number of platforms. Fixes provided by HTML Tidy include:

  • Correcting missing or mismatched tags
  • Adding missing items such as quotation marks and escape characters
  • Applying proper styling and indentation to HTML files
  • Reporting the use of proprietary HTML extensions
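
One way to apply these fixes from a script is to call the `tidy` command-line program. The sketch below assumes `tidy` is installed and on the PATH; the helper names `build_tidy_command` and `tidy_markup` are made up for this illustration, and only the `-q` (quiet) and `-i` (indent) options are used.

```python
import shutil
import subprocess


def build_tidy_command(indent=True):
    """Build an argument list for the `tidy` command-line program."""
    command = ["tidy", "-q"]      # -q: suppress non-essential output
    if indent:
        command.append("-i")      # -i: indent element content
    return command


def tidy_markup(markup):
    """Return Tidy's cleaned-up markup, or None if tidy is missing or reports errors."""
    if shutil.which("tidy") is None:
        return None               # tidy is not installed on this machine
    result = subprocess.run(build_tidy_command(), input=markup,
                            capture_output=True, text=True)
    # Tidy exits with 0 on success, 1 if there were warnings, 2 if there were errors.
    return result.stdout if result.returncode in (0, 1) else None


cleaned = tidy_markup("<h1> This is the case of mismatched tags</h2>")
print(cleaned if cleaned is not None else "tidy is unavailable or reported errors")
```

Tidy reads from standard input when no file name is given, which is why the markup is passed through `input=` rather than written to a temporary file.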

TagSoup

TagSoup is a Java library that parses HTML files. Unlike HTML Tidy, it is not meant to clean up a file permanently; instead, it corrects the HTML on the fly as it parses. It does guarantee well-structured results: tags will wind up properly nested, default attributes will appear appropriately, and so on. It is free and open source.

Beautiful Soup

Beautiful Soup is a Python library that turns an invalid HTML file into a parse tree. The Beautiful Soup constructor takes an XML or HTML document in the form of a string (or an open file-like object). It parses the document and creates a corresponding data structure in memory. If you give Beautiful Soup a perfectly formed document, the parsed data structure looks just like the original document. But if there is something wrong with the document, Beautiful Soup uses heuristics to figure out a reasonable structure for the data.
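
The idea of turning tag soup into a tree with a recovery heuristic can be sketched with nothing but Python's standard `html.parser` module. This is a toy illustration, not Beautiful Soup's actual algorithm: the names `Node`, `SoupTreeBuilder`, `parse` and `outline` are hypothetical, attributes are ignored, and the only heuristic is "a closing tag closes the nearest open element of that name, and is ignored if there is none".

```python
from html.parser import HTMLParser

# Elements that never take a closing tag in HTML.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class Node:
    """One element in the tree; children are Nodes or text strings."""

    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []


class SoupTreeBuilder(HTMLParser):
    """Toy tree builder with a single recovery heuristic (attributes are ignored)."""

    def __init__(self):
        super().__init__()
        self.root = Node("[document]")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.children.append(node)
        if tag not in VOID_TAGS:
            self.current = node   # later content goes inside this element

    def handle_endtag(self, tag):
        # Walk up to the nearest open element with this name.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent   # closes it and everything nested inside
        # Otherwise the closing tag matches nothing and is ignored.

    def handle_data(self, data):
        if data.strip():
            self.current.children.append(data.strip())


def parse(markup):
    """Parse an HTML string and return the root Node of the resulting tree."""
    builder = SoupTreeBuilder()
    builder.feed(markup)
    builder.close()
    return builder.root


def outline(node, depth=0):
    """Return the tree as a list of indented lines."""
    lines = ["  " * depth + node.tag]
    for child in node.children:
        if isinstance(child, Node):
            lines.extend(outline(child, depth + 1))
        else:
            lines.append("  " * (depth + 1) + repr(child))
    return lines


tree = parse("<div><p>Unclosed <b>bold</div><p>next</p>")
print("\n".join(outline(tree)))
```

Here the `</div>` implicitly closes the `<b>` and `<p>` that were left open, so the second paragraph ends up as a sibling of the `<div>` rather than nested inside it.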


After proper use of any of the above solutions, one should have a properly structured file such as the following:



<title>Lets nurture</title>



<h1> This is the case of mismatched tags</h1>

<img src="letsnurture.jpg" border="5">



So go ahead and drink a healthy tag soup.

Let’s discuss tag soup and other aspects of website design further. Leave us a message on our Facebook page, LetsNurture.


Posted by Lets Nurture