Mar 10 2010

PHP: parse URLs, mailto links, and Twitter usernames and hashtags

Category: Php | admin @ 9:41 pm

This small function receives a text as input and returns HTML with links wherever the source text contains URLs (http://www… but also ftp://… and any other protocol), email addresses, Twitter usernames (with @ at the beginning) and Twitter hashtags (with # at the beginning).
The replacements are done with PHP's preg_replace function:

function parse_twitter($t) {
	// link URLs
	$t = " ".preg_replace( "/(([[:alnum:]]+:\/\/)|www\.)([^[:space:]]*)".
		"([[:alnum:]#?\/&=])/i", "<a href=\"\\1\\3\\4\" target=\"_blank\">".
		"\\1\\3\\4</a>", $t);

	// link mailtos
	$t = preg_replace( "/(([a-z0-9_]|\\-|\\.)+@([^[:space:]]*)".
		"([[:alnum:]-]))/i", "<a href=\"mailto:\\1\">\\1</a>", $t);

	// link Twitter usernames; the trailing space is not consumed, so "@a @b" links both names
	$t = preg_replace( "/ +@([a-z0-9_]+)/i", " <a href=\"http://twitter.com/\\1\" target=\"_blank\">@\\1</a>", $t);

	// link Twitter hashtags
	$t = preg_replace( "/ +#([a-z0-9_]+)/i", " <a href=\"http://twitter.com/search?q=%23\\1\" target=\"_blank\">#\\1</a>", $t);

	// truncates long urls that can cause display problems (optional)
	$t = preg_replace("/>(([[:alnum:]]+:\/\/)|www\.)([^[:space:]]".
		"{30,40})([^[:space:]]*)([^[:space:]]{10,20})([[:alnum:]#?\/&=])".
		"</", ">\\3...\\5\\6<", $t);
	return trim($t);
}
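
One thing to watch in the URL step: a match that starts with www. is copied into the href without a scheme, so a browser will treat it as a relative link. Here is a minimal sketch of one way around that, using preg_replace_callback with the same URL pattern. The function name link_urls and the choice of http:// as the default scheme are assumptions made for this illustration, and only the URL itself is HTML-escaped, not the surrounding text.

```php
<?php
// Hypothetical helper (not part of the original function): links URLs and
// gives "www." matches an explicit scheme so the href is absolute.
function link_urls($text) {
	$pattern = "/(([[:alnum:]]+:\/\/)|www\.)([^[:space:]]*)([[:alnum:]#?\/&=])/i";
	return preg_replace_callback($pattern, function ($m) {
		$url = $m[0];
		// $m[2] holds the scheme ("http://", "ftp://", ...) and is empty for "www." matches
		$href = ($m[2] === "") ? "http://" . $url : $url;
		return '<a href="' . htmlspecialchars($href, ENT_QUOTES) . '" target="_blank">'
			. htmlspecialchars($url, ENT_QUOTES) . '</a>';
	}, $text);
}

echo link_urls("see www.example.com/page and ftp://example.org/file.zip");
```

The visible link text stays exactly as the user typed it; only the href gets the added scheme.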


Feb 03 2010

Fix html tags, close tags, repair bad quotes and more

Category: Html, Php | admin @ 2:31 pm

This class can solve many problems with user-generated HTML content, or clean up HTML before your bots do some heavier work on it! (It's especially useful for web sites that don't have PHP's Html Tidy module.)

Here is a quick list of the things it can do.

  1. delete closing tags that have no opening tag
  2. fix opening tags that have no closing tag, closing them automatically
  3. check for bad nesting and fix it (if you have a bold inside a bold… or a paragraph that contains a table…)
  4. fix bad quotes in attributes (adding quotes where they are missing…)
  5. merge multiple style attributes in the same tag
  6. remove HTML comments
  7. remove empty tags and other bad tags

It works in a fairly complex way: it analyzes the HTML code character by character and searches for tags. When a tag is found, it cleans the tag's attributes, stores the data it found in a matrix, and searches for the matching closing tag.
The data saved in the matrix is later used to rebuild the corrected HTML.
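
The class source is not reproduced in this post, so the following is only a rough sketch of the general idea of finding tags and tracking which ones are still open. It covers just two of the fixes in the list above (dropping stray closing tags and closing tags left open). The function name close_open_tags and the short list of void tags are assumptions made for this illustration, and it finds tags with a regular expression instead of the character-by-character scan the class performs.

```php
<?php
// Illustrative sketch only; this is NOT the HtmlFixer implementation.
function close_open_tags($html) {
	$void = array('br', 'hr', 'img', 'input', 'meta', 'link'); // tags that never get a closing tag
	$stack = array();   // names of tags opened and not yet closed
	$out = '';
	// split the input into tags and the text between them
	$parts = preg_split('/(<[^>]*>)/', $html, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
	foreach ($parts as $part) {
		if (preg_match('/^<\s*(\/?)\s*([a-z][a-z0-9]*)[^>]*?(\/?)>$/i', $part, $m)) {
			$name = strtolower($m[2]);
			if ($m[1] === '/') {
				// closing tag: skip it if no matching tag is open
				if (!in_array($name, $stack)) continue;
				// close anything opened after the matching tag first
				while (($top = array_pop($stack)) !== $name) $out .= "</$top>";
				$out .= "</$name>";
				continue;
			}
			// opening tag: remember it unless it is void or self-closed
			if ($m[3] !== '/' && !in_array($name, $void)) $stack[] = $name;
		}
		$out .= $part;
	}
	// close whatever is still open, innermost first
	while ($stack) $out .= '</' . array_pop($stack) . '>';
	return $out;
}

echo close_open_tags('<p>Hello <b>world</p></i>'); // prints <p>Hello <b>world</b></p>
```

A stack is enough here because a well-formed document always closes the most recently opened tag first; the real class keeps more data per tag so it can also repair attributes and nesting.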

EXAMPLE:
It’s very simple to use. Suppose you have the dirty HTML in a variable:

$a = new HtmlFixer();
$clean = $a->getFixedHtml($dirty_html);

You can download the class from the HTML FIXER page.



Nov 24 2009

Truncate string preserving some words in PHP

Category: Php | admin @ 11:11 am

When you search Google for a string, it shows the words you searched for in bold in the results list. You can use this function to do the same thing in PHP.
It splits the text and the search string into arrays of words, then looks for each search word inside each word of the text. Matching words are highlighted, and a few words before and after each match are also kept so the user can see some context. The output string is then rebuilt from the highlighted words and their neighbours, with the dropped stretches replaced by dots.

// $h = haystack string with the text
// $n = needle string with words to search
// $w = number of additional words to keep
// $tag = tag to use to highlight the results
function truncatePreserveWords ($h,$n,$w=5,$tag='b') {
	// split on whitespace and drop empty entries, so repeated spaces cannot produce an empty needle
	$n = preg_split("/\s+/",strip_tags($n),-1,PREG_SPLIT_NO_EMPTY);	//needle words
	$b = preg_split("/\s+/",strip_tags($h),-1,PREG_SPLIT_NO_EMPTY);	//haystack words
	$c = array();						//array of words to keep/remove
	for ($j=0;$j<count($b);$j++) $c[$j]=false;
	for ($i=0;$i<count($b);$i++)
		for ($k=0;$k<count($n);$k++)
			if (stripos($b[$i],$n[$k])!==false) {
				// preg_quote keeps regex metacharacters in the needle literal
				$b[$i]=preg_replace("/".preg_quote($n[$k],"/")."/i","<$tag>\\0</$tag>",$b[$i]);
				// keep $w words before and $w words after the match
				for ( $j= max( $i-$w , 0 ) ;$j<min( $i+$w+1, count($b)); $j++) $c[$j]=true;
			}
	$o = "";	// reassemble the words to keep
	for ($j=0;$j<count($b);$j++) if ($c[$j]) $o.=" ".$b[$j]; else $o.=".";
	return preg_replace("/\.{3,}/","...",$o);
}

// example:
$s = truncatePreserveWords("this is a long long text, we have to truncate this text and not another one. This is the example!", "long example");
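
To make the "words to keep" step easier to follow on its own, here is a small sketch that only builds the keep/remove array from a list of match positions. The helper name mark_windows is hypothetical, and it assumes a symmetric window of $w words on each side of every match.

```php
<?php
// Hypothetical helper, for illustration only: given the indexes of the
// matching words, mark every word within $w positions of a match.
function mark_windows(array $hits, $total, $w) {
	$keep = array();
	for ($j = 0; $j < $total; $j++) $keep[$j] = false;
	foreach ($hits as $i) {
		$from = max($i - $w, 0);            // do not run past the first word
		$to = min($i + $w, $total - 1);     // do not run past the last word
		for ($j = $from; $j <= $to; $j++) $keep[$j] = true;
	}
	return $keep;
}

// 10 words, matches at positions 1 and 8, one extra word on each side
$mask = mark_windows(array(1, 8), 10, 1);
echo implode('', array_map('intval', $mask)); // prints 1110000111
```

Every run of zeros in the mask is a stretch of text that ends up replaced by dots in the output.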

Watch an example here (source)
