March 10, 2010

This small function receives a text as input and returns HTML text with links wherever the source text contains URLs (http://www… but also ftp://… and any other protocol), email addresses, Twitter usernames (with @ at the beginning) and Twitter tags (with # at the beginning).
These replacements are done with PHP's preg_replace function:

function parse_twitter($t) {
	// link URLs
	$t = " ".preg_replace( "/(([[:alnum:]]+:\/\/)|www\.)([^[:space:]]*)".
		"([[:alnum:]#?\/&=])/i", "<a href=\"\\1\\3\\4\" target=\"_blank\">".
		"\\1\\3\\4</a>", $t);

	// link mailtos
	$t = preg_replace( "/(([a-z0-9_]|\\-|\\.)+@([^[:space:]]*)".
		"([[:alnum:]-]))/i", "<a href=\"mailto:\\1\">\\1</a>", $t);

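	// link twitter users
	// (assumption: the link target was lost from this listing; the
	// twitter.com profile URL is used here)
	$t = preg_replace( "/ +@([a-z0-9_]*) ?/i", " <a href=\"https://twitter.com/\\1\" target=\"_blank\">@\\1</a> ", $t);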

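	// link twitter tags
	// (assumption: the link target was lost from this listing; the
	// twitter.com search URL is used here)
	$t = preg_replace( "/ +#([a-z0-9_]*) ?/i", " <a href=\"https://twitter.com/search?q=%23\\1\" target=\"_blank\">#\\1</a> ", $t);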

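	// truncates long urls that can cause display problems (optional)
	// (assumption: part of this pattern was lost from the listing; the
	// length thresholds below are illustrative)
	$t = preg_replace("/>(([[:alnum:]]+:\/\/)|www\.)([^[:space:]]".
		"{30,40})([^[:space:]]*)([^[:space:]]{10,20})([[:alnum:]#?\/&=])".
		"</", ">\\3...\\5\\6<", $t);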
	return trim($t);
}

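A note on the username step: its pattern ends with an optional space (` ?`), so a match also consumes the separator that the next username needs for its leading ` +`. In a run of adjacent usernames, every second one is therefore left unlinked. Here is a minimal sketch of one way around this. It is not part of the original function: the helper names `link_mentions` and `link_mentions_original` are hypothetical, and `https://twitter.com/` is assumed as the profile URL base. Instead of consuming the trailing space, it captures the whitespace before the username and puts it back.

```php
<?php
// Hypothetical helper, not part of the original function: links @usernames
// without consuming the whitespace that follows them.
// Assumption: profile pages live under https://twitter.com/.
function link_mentions($t) {
	// (^|\s) keeps the separator before the username; nothing after the
	// username is consumed, so the next username can still match.
	return preg_replace(
		'/(^|\s)@([a-z0-9_]+)/i',
		'$1<a href="https://twitter.com/$2" target="_blank">@$2</a>',
		$t
	);
}

// The pattern used in parse_twitter, for comparison (same assumed URL base).
function link_mentions_original($t) {
	return preg_replace(
		'/ +@([a-z0-9_]*) ?/i',
		' <a href="https://twitter.com/$1" target="_blank">@$1</a> ',
		$t
	);
}

$text = ' @alice @bob @carol';
echo link_mentions_original($text), "\n"; // @bob stays plain text
echo link_mentions($text), "\n";          // all three are linked
```

Because the sketch requires whitespace (or the start of the text) before the @, an address such as a@b.com is still left to the mailto step.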

I'm a software engineer, an everyday web developer and a maker. I usually build sites with PHP, with or without WordPress. I build Internet of Things devices with Arduino and ESP8266. I'm the founder of and and I'm currently the Chief Technical Officer of Better Days web agency.

Comments on “PHP parse url, mailto, and also twitter’s usernames and arguments”

6 thoughts

  1. php html says:

    […]A set of regular expressions to retrieve URLs, emails, twitter’s usernames and argument[…]

  2. Aiko says:

    Another great script, Giulio. I'm just having a few “problems”, and regular expressions always confuse me, so I don’t know how to fix them.

    I made a test page so you can see what I mean. If I put two or more hash tags or Twitter usernames right after each other, the script seems to skip every second one.

    Maybe it’s just a small issue?

  3. Aiko says:

    Ah, never mind, I found a solution that works. In

    “/ +@([a-z0-9_]*) ?/i”

    remove the space before the ?, turning it into

    “/ +@([a-z0-9_]*)?/i”

    Now everything works correctly. I've removed the test page.

  4. Giulio Pons says:

    Mmm, are you sure? I haven’t tested it. That space is followed by a ?, which means that the expression matches even if the space isn’t there. If you remove the space, the ? applies to the group instead, so the expression matches even if the block ([a-z0-9_]*) isn’t there. Mmm.

  5. Aiko says:

    As I said, regular expressions are not my thing :-) so I don’t really understand what I’m doing – it’s mainly a case of trial and error.

    But it seems to work with the changes I made. I’ve put the test page back online so you can see for yourself.

    I don’t mind an alternative solution :-)

  6. Giulio Pons says:

    Ok, it works! That’s enough! :-)



