Feb 03 2010

Fix html tags, close tags, repair bad quotes and more

Category: Html, Php
Giulio Pons @ 2:31 pm

This class can solve many problems that come with user-generated HTML content, or fix HTML content before your bots do some heavy processing on it! (It’s especially useful for web sites without the HTML Tidy module of PHP.)

Here is a quick list of the magic things it can do.

  1. delete closing tags that have no opening tag
  2. fix open tags that have no closing tag, closing them automatically
  3. check for bad nesting and fix it (if you have a bold inside a bold… or a paragraph that contains a table…)
  4. fix bad quotes in attributes (adding quotes where they are missing…)
  5. merge multiple style attributes in the same tag
  6. remove HTML comments
  7. remove empty tags and other bad tags
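
To make two of these items concrete, here is a minimal sketch of items 5 and 6. It is not the class's actual code: the function names stripHtmlComments() and mergeStyleAttributes() are hypothetical, and the style merge assumes attribute values are double-quoted.

```php
<?php
// Hypothetical helpers for illustration only; they are NOT part of HtmlFixer.

// Item 6: remove HTML comments (the "s" flag lets "." span several lines).
function stripHtmlComments(string $html): string
{
    return preg_replace('/<!--.*?-->/s', '', $html);
}

// Item 5: merge several style="..." attributes of ONE tag into a single one.
// Assumption: attribute values are wrapped in double quotes.
function mergeStyleAttributes(string $tag): string
{
    if (preg_match_all('/\sstyle="([^"]*)"/i', $tag, $m) < 2) {
        return $tag; // zero or one style attribute: nothing to merge
    }

    // Collect the declarations, trimming stray spaces and semicolons.
    $parts = [];
    foreach ($m[1] as $value) {
        $value = trim($value, ' ;');
        if ($value !== '') {
            $parts[] = $value;
        }
    }
    $merged = implode('; ', $parts);

    // Drop every style attribute, then append one merged attribute
    // just before the end of the tag.
    $tag   = preg_replace('/\sstyle="[^"]*"/i', '', $tag);
    $close = substr($tag, -2) === '/>' ? '/>' : '>';
    $body  = rtrim(substr($tag, 0, -strlen($close)));

    return $body . ' style="' . $merged . '"' . $close;
}

echo stripHtmlComments('<p>Hi<!-- note --> there</p>'), "\n";
// prints: <p>Hi there</p>
echo mergeStyleAttributes('<p style="color:red" class="x" style="margin:0;">'), "\n";
// prints: <p class="x" style="color:red; margin:0">
```

Regular expressions are enough for these two isolated cases, but they can't track nesting, which is why the class itself scans the markup char by char.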

It works in a complex way: it analyzes the HTML code char by char and searches for tags. When a tag is found, it starts cleaning the attributes, then stores the data it found in a matrix and searches for the closing tag.
The data saved in the matrix is later used to rebuild the correct, fixed HTML.
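
The scan-store-rebuild idea described above can be sketched roughly as follows. This is a simplified illustration under my own assumptions, not HtmlFixer's real code: balanceTags() is a hypothetical name, the list of void tags is partial, and attribute cleaning, comments and nesting rules are left out.

```php
<?php
// Hypothetical sketch: balanceTags() is NOT HtmlFixer's real implementation.
function balanceTags(string $html): string
{
    $void   = ['br', 'hr', 'img', 'input', 'meta', 'link']; // partial list of tags that never close
    $tokens = []; // the "matrix": one row per text run or tag
    $open   = []; // stack of tag names that are currently open
    $len    = strlen($html);
    $i      = 0;

    while ($i < $len) {
        if ($html[$i] !== '<') {
            // Plain text: read up to the next "<" (or the end of the input).
            $next     = strpos($html, '<', $i);
            $next     = ($next === false) ? $len : $next;
            $tokens[] = ['type' => 'text', 'raw' => substr($html, $i, $next - $i)];
            $i        = $next;
            continue;
        }

        $end = strpos($html, '>', $i);
        if ($end === false) {
            // A "<" that never closes: keep the rest as escaped text.
            $tokens[] = ['type' => 'text', 'raw' => htmlspecialchars(substr($html, $i))];
            break;
        }

        $raw = substr($html, $i, $end - $i + 1);
        $i   = $end + 1;

        if (!preg_match('/^<(\/?)([a-zA-Z][a-zA-Z0-9]*)/', $raw, $m)) {
            // Not a real tag (comments also land here in this sketch): escape it.
            $tokens[] = ['type' => 'text', 'raw' => htmlspecialchars($raw)];
            continue;
        }

        $name = strtolower($m[2]);
        if ($m[1] === '/') {
            if (!in_array($name, $open, true)) {
                continue; // closing tag without an opening tag: drop it
            }
            // Close any tags still open inside this one, then the tag itself.
            do {
                $top      = array_pop($open);
                $tokens[] = ['type' => 'close', 'raw' => '</' . $top . '>'];
            } while ($top !== $name);
        } else {
            $tokens[] = ['type' => 'open', 'raw' => $raw];
            if (!in_array($name, $void, true) && substr($raw, -2) !== '/>') {
                $open[] = $name;
            }
        }
    }

    // Close whatever is still open, innermost first.
    while ($open) {
        $tokens[] = ['type' => 'close', 'raw' => '</' . array_pop($open) . '>'];
    }

    // Rebuild the HTML from the stored rows.
    return implode('', array_column($tokens, 'raw'));
}

echo balanceTags('<p>Hello <b>world</p></i> again<br>'), "\n";
// prints: <p>Hello <b>world</b></p> again<br>
```

Keeping every token in an array before writing anything out is what allows later passes, such as removing empty tags, to run over the stored rows instead of re-parsing the string.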

EXAMPLE:
It’s very simple to use. Suppose you have a variable with the dirty HTML:

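// Assumes the HtmlFixer class file has already been included.
$dirty_html = '<p>Some <b>unclosed bold</p>'; // sample input, for illustration only
$a = new HtmlFixer();
$clean = $a->getFixedHtml($dirty_html);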

You can download the class from the HTML FIXER page.




3 Responses to “Fix html tags, close tags, repair bad quotes and more”

  1. Savita says:

    Hi,

    This class is really helpful, I just need to point out some small issues…
    1) The PHP short tag <? is used; <?php should be used in its place. 2) debug is false, but it still shows all the debugging output on the screen.

  2. admin says:

    Thank you Savita, probably I’ve uploaded the wrong file. I’ll fix it today!

  3. Reflexões says:

    Wow!!!

    This works perfectly!!! It fixes all HTML tags….
    I lost a lot of time manually fixing bad HTML posted by users on a custom blog…

    Thx
