PHP parse url, mailto, and also twitter’s usernames and arguments

March 10, 2010

This small function receives a text as input and returns HTML with links wherever the source text contains URLs (http://www…, but also ftp://… and any other protocol), email addresses, Twitter usernames (starting with @) and Twitter hashtags (starting with #).
The replacements are done with PHP’s preg_replace function:

function parse_twitter($t) {
	// link URLs
	$t = " ".preg_replace( "/(([[:alnum:]]+:\/\/)|www\.)([^[:space:]]*)".
		"([[:alnum:]#?\/&=])/i", "<a href=\"\\1\\3\\4\" target=\"_blank\">".
		"\\1\\3\\4</a>", $t);

	// link mailtos
	$t = preg_replace( "/(([a-z0-9_]|\\-|\\.)+@([^[:space:]]*)".
		"([[:alnum:]-]))/i", "<a href=\"mailto:\\1\">\\1</a>", $t);

	// link Twitter usernames (@name)
	$t = preg_replace( "/ +@([a-z0-9_]*) ?/i", " <a href=\"http://twitter.com/\\1\" target=\"_blank\">@\\1</a> ", $t);

	// link Twitter hashtags (#tag) to the search page
	$t = preg_replace( "/ +#([a-z0-9_]*) ?/i", " <a href=\"http://twitter.com/search?q=%23\\1\" target=\"_blank\">#\\1</a> ", $t);

	// truncates long urls that can cause display problems (optional)
	$t = preg_replace("/>(([[:alnum:]]+:\/\/)|www\.)([^[:space:]]".
		"{30,40})([^[:space:]]*)([^[:space:]]{10,20})([[:alnum:]#?\/&=])".
		"</", ">\\3...\\5\\6<", $t);
	return trim($t);
}
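As noted in the comments, the username and hashtag patterns end with an optional space (` ?`). That trailing space is consumed by the match, so a second username or hashtag right after the first one no longer has the leading space the pattern requires, and it gets skipped. Below is a minimal sketch of one possible alternative, assuming a lookbehind is acceptable: `(?<!\S)` only checks that the token starts the text or follows whitespace, without consuming anything. The function name `link_twitter_tokens` is hypothetical and is not part of the original function.

```php
<?php
// Hypothetical helper, not part of the original function: it links
// @usernames and #hashtags without consuming the surrounding spaces.
function link_twitter_tokens($t) {
	// (?<!\S) is a lookbehind: the @ must sit at the start of the text or
	// right after whitespace, so e-mail addresses are left alone.
	$t = preg_replace(
		'/(?<!\S)@([a-z0-9_]+)/i',
		'<a href="http://twitter.com/$1" target="_blank">@$1</a>',
		$t
	);
	// Same idea for hashtags, which link to the search page.
	$t = preg_replace(
		'/(?<!\S)#([a-z0-9_]+)/i',
		'<a href="http://twitter.com/search?q=%23$1" target="_blank">#$1</a>',
		$t
	);
	return $t;
}

echo link_twitter_tokens("@alice @bob #php #regex");
```

With this variant, consecutive usernames or hashtags should each get their own link, and the original spacing of the text is preserved. It has only been sketched in isolation here, not tested together with the URL and mailto steps of `parse_twitter`.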

Author

PHP expert. Wordpress plugin and theme developer. Father, Maker, Arduino and ESP8266 enthusiast.

Comments on “PHP parse url, mailto, and also twitter’s usernames and arguments”

6 thoughts

  1. php html said:

    […]A set of regular expressions to retrieve URLs, emails, twitter’s usernames and argument[…]

  2. Aiko said:

    Another great script, Giulio. I’m just having a few “problems”, and regular expressions always confuse me, so I don’t know how to fix them.

    I made a test page at http://blog.atgp.nl/parsetest.php so you can see what I mean. If I put two or more hashtags or Twitter usernames right after each other, the script seems to skip every second one.

    Maybe it’s just a small issue?

  3. Aiko said:

    Ah, never mind, I found a solution that works. In

    +@([a-z0-9_]*) ?/i"

    remove the space before the ?, so that it becomes

    +@([a-z0-9_]*)?/i"

    Now everything works correctly. I have removed the test page.

  4. Giulio Pons said:

    Mmm, are you sure? I haven’t tested it. That space is followed by a ?, which means the expression matches even if the space is not there. If you remove the space, the ? applies to the group ([a-z0-9_]*) instead, so the expression matches even if that group is not there. Mmm.

  5. Aiko said:

    As said: regular expressions are not my thing :-) so I don’t really understand what I’m doing – it’s mainly a case of trial and error.

    But, it seems to work the way I made the changes. I’ve put the testpage back online so you can see for yourself:
    http://blog.atgp.nl/parsetest.php

    I don’t mind an alternative solution :-)

  6. Giulio Pons said:

    Ok, it works! That’s enough! :-)
