Nov 01 2011

PHP code to check if remote mp3 exists

Category: Php, Spiders & web bots | Giulio Pons @ 4:30 pm

Hi, I have a big table with thousands of mp3 links. Since these links come from an old database, many of them are old and expired. Here is a function that I’ve included in my Minibots Class. The function first uses checkdnsrr to verify that the domain still resolves, then uses curl to request only the headers of the mp3 file (the file itself is not downloaded) and verifies the MIME type. I run checkdnsrr first because it seems faster.

function checkMp3($url) {
	if (!function_exists("curl_init")) die("checkMp3 needs CURL module, please install CURL on your php.");
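	$a = parse_url($url);
	if (empty($a['host'])) return false; // malformed URL, nothing to look up
	// strip only a leading "www."; without a type, checkdnsrr looks for an MX record
	$host = preg_replace('/^www\./i', '', $a['host']);
	if(checkdnsrr($host,"A") || checkdnsrr($host)) {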
		$ch = @curl_init();
		curl_setopt($ch, CURLOPT_URL, $url);
		curl_setopt($ch, CURLOPT_HEADER, 1);
		curl_setopt($ch, CURLOPT_NOBODY, 1);
		curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
		curl_setopt($ch, CURLOPT_TIMEOUT, 15);
		$results = explode("\n", trim(curl_exec($ch)));
		$mime = "";
		foreach($results as $line) {
			// header names are case-insensitive; ignore parameters such as "; charset=..."
			if (preg_match('/^Content-Type:\s*([^;\s]+)/i', $line, $m)) {
				$mime = strtolower($m[1]);
			}
		}
		return $mime=="audio/mpeg";
	} else {
		return false;
	}
}
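
The header-parsing step can also be pulled out into a small helper, which makes it easy to try without touching the network. This is only a sketch: the name extractContentType is made up for this example and is not part of the Minibots Class, and the sample headers are invented.

```php
<?php
// Hypothetical helper (not part of the Minibots Class): returns the media type
// found in a raw block of HTTP response headers, lowercased and without
// parameters such as "; charset=...". Returns "" if there is no Content-Type.
function extractContentType($rawHeaders) {
	$mime = "";
	foreach (preg_split('/\r\n|\n|\r/', $rawHeaders) as $line) {
		// header names are case-insensitive
		if (preg_match('/^Content-Type:\s*([^;\s]+)/i', $line, $m)) {
			// if several header blocks are present, the last Content-Type wins
			$mime = strtolower($m[1]);
		}
	}
	return $mime;
}

// invented sample response, just to show the call
$headers = "HTTP/1.1 200 OK\r\ncontent-type: Audio/MPEG; charset=binary\r\nContent-Length: 12345\r\n";
var_dump(extractContentType($headers) == "audio/mpeg"); // bool(true)
```

Keeping the parsing separate from the curl call means the comparison with "audio/mpeg" can be checked against any header text you have saved from a real server.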


Aug 18 2011

How to use Instagr.am photos on your site

Category: Php, Spiders & web bots | Giulio Pons @ 4:11 pm

The question is: how can I bring my Instagram photos to my personal web site?

This tutorial shows one way to do it.

I’m addicted to Instagram, I take photos with it almost every day. But I’m also a developer, and I already have a personal portfolio for my photos… which has become obsolete, since I haven’t updated it for a year. Instagr.am is only on iPhone. You don’t get a page with your photos online; there are a few APIs that can be used to retrieve your photos and info, but there isn’t any public feed to retrieve photos.

So, again: how can I bring my Instagram photos to my personal web site? Thanks to followgram.me you can create your “vanity URL”: you register simply by logging in with your Instagram account (OAuth is performed) and get a simple and nice URL with your photos, like followgram.me/giuliopons. This page, with its customizable UI, can be OK for many users, but not for me. :-) I want only photos, and nothing else matters.
Since my vanity URL also publishes my pictures in a public feed, I’ve decided that this will be my door to my photos, without using the Instagram API and its OAuth integration.

I’ve written the following code, which reads the feed, parses it, gets the picture links and info, stores the data (not the pictures) in a local file (and periodically checks for new photos without overloading the followgram.me server) and outputs everything as minimalistic HTML good for any PC, iPhone or iPad.

This is the result: www.ku-ku.it

So, if you want to do the same, create your vanity URL on Followgram.me and use this code on your site:

This is the content of my index.php file.

<?php
ini_set('default_charset', 'UTF-8');

// ----------------------------------------------------------------------
// CONFIG
$instagram_user = "giuliopons"; // your instagram username
$cachetime = 2; // 2 hours
$file = $instagram_user."_instagram.txt"; // file used to cache content
$TITLE = "Foto di Giulio Pons e Roberta Casaliggi con instagr.am"; // your page title
// ----------------------------------------------------------------------

function getFollowgram($u) {
	// function read instagram feed through followgram.me service, thanks Fabio Lalli
	// twitter @fabiolalli
	$url = "http://followgram.me/".$u."/rss";
	$s = file_get_contents($url);
	preg_match_all('#<item>(.*)</item>#Us', $s, $items);
	$ar = array();
	for($i=0;$i<count($items[1]);$i++) {
		$item = $items[1][$i];
		preg_match_all('#<link>(.*)</link>#Us', $item, $temp);
		$link = $temp[1][0];
		preg_match_all('#<pubDate>(.*)</pubDate>#Us', $item, $temp);
		$date = date("d-m-Y H:i:s",strtotime($temp[1][0]));
		preg_match_all('#<title>(.*)</title>#Us', $item, $temp);
		$title = $temp[1][0];
		preg_match_all('#<img src="([^>]*)">#Us', $item, $temp);
		$thumb = $temp[1][0];
		$ar['date'][$i] = $date;
		$ar['image'][$i] = str_replace("_5.jpg","_6.jpg",$thumb);
		$ar['bigimage'][$i] = str_replace("_5.jpg","_7.jpg",$thumb);
		$ar['link'][$i] = $link;
		$ar['title'][$i] = $title;
	}
	return $ar;
}

function checkValidFile($f,$hours) {
	// Returns true if the file exists and is not older than X hours, false otherwise
	$durata=60*60 * $hours;
	$daquanto=$durata+1;
	if (file_exists($f)) $daquanto=time()-filemtime($f); else return false;
	if ($daquanto<=$durata) {
		// existing file is still valid
		return true;
	} else {
		// existing file is old
		return false;
	}
}

// -----------------------------------------------
// load cached file
$archive = file_exists($file) ? (string) file_get_contents($file) : ""; // also safe for an empty cache file

// -----------------------------------------------
// check for new feed every X hours
if(!checkValidFile($file, $cachetime)) {
	$r = getFollowgram($instagram_user);
	// add new images to archive file
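	// walk the feed from oldest to newest so the newest photo ends up first in the archive
	$n = isset($r['image']) ? count($r['image']) : 0;
	for ($i=$n-1; $i>=0; $i--) {
		if(!empty($r['image'][$i])) {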
			$temp = "<li><img src='".$r['image'][$i]."'/><span>".$r['date'][$i]."<br/>".$r['title'][$i]."</span></li>";
			if(!stristr($archive,basename($r['image'][$i]))) $archive = $temp.$archive;
		}
	}
	// save new file
	$f = fopen($file,'w');
	fwrite($f,$archive);
	fclose($f);
}

?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html>
<head>
	<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
	<meta http-equiv="content-language" content="it" />
	<script src="http://ajax.googleapis.com/ajax/libs/jquery/1.4.4/jquery.min.js"></script>
	<meta content="yes" name="apple-mobile-web-app-capable" />
	<meta content="minimum-scale=1.0, width=device-width, maximum-scale=0.6667, user-scalable=no" name="viewport" />
	<title><?=$TITLE?></title>
	<style>
		body { background-color:#000; margin:0;padding:3px;}
		ul,li {list-style:none;margin:0 auto;padding:0;}
		li { float:left; width:306px;height:306px;position:relative;}
		li span { position:absolute; left:0;bottom:0;background-color:#000;color:#fff; height:auto;font-family:trebuchet ms,trebuchet;font-size:12px;width:300px;padding:3px;}
	</style>
	<script>
		$(document).ready(function() {
			$('li span').hide();
			$('li').mouseenter(function(){ $(this).find('span').show(); });
			$('li').mouseleave(function(){ $(this).find('span').hide(); });
		} );
	</script>
</head>
<body>
<ul><?=$archive?></ul>
</body>
</html>
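
The "check for new feed every X hours" step boils down to comparing the cache file's modification time with the current time. Here is a compact sketch of that check. The name isCacheFresh and the optional $now parameter are made up for this example, and the temporary file only stands in for the real cache file.

```php
<?php
// Hypothetical helper: true when $file exists and was modified no more than
// $hours hours before $now (a Unix timestamp, defaulting to the current time).
function isCacheFresh($file, $hours, $now = null) {
	clearstatcache(true, $file); // do not rely on PHP's cached file information
	if (!file_exists($file)) return false;
	if ($now === null) $now = time();
	return ($now - filemtime($file)) <= $hours * 3600;
}

// usage sketch with a throwaway file instead of the real cache
$tmp = tempnam(sys_get_temp_dir(), "ig_");
touch($tmp, time() - 3 * 3600);   // pretend it was written 3 hours ago
var_dump(isCacheFresh($tmp, 2));  // bool(false): older than 2 hours
var_dump(isCacheFresh($tmp, 4));  // bool(true): still within 4 hours
unlink($tmp);
```

Passing the current time as a parameter is just a convenience for trying the function with fixed values.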

I hope this will be enough for you to use it on your site. :)
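
If the regular expressions in getFollowgram feel fragile, the same items could probably be read with PHP's SimpleXML extension instead. The sketch below rests on assumptions: the function name parseFeedItems is made up, the sample feed is invented, and it assumes the image tag sits inside each item's description, which I have not checked against the real followgram.me feed.

```php
<?php
// Hypothetical alternative to the regular expressions: parse RSS text with
// SimpleXML and return one array per <item>.
function parseFeedItems($rss) {
	$out = array();
	$xml = @simplexml_load_string($rss, "SimpleXMLElement", LIBXML_NOCDATA);
	if ($xml === false || !isset($xml->channel->item)) return $out;
	foreach ($xml->channel->item as $item) {
		$ts = strtotime((string) $item->pubDate);
		// assumption: the thumbnail is an <img> tag inside <description>
		$image = "";
		if (preg_match('#<img src="([^"]*)"#', (string) $item->description, $m)) {
			$image = $m[1];
		}
		$out[] = array(
			'title' => (string) $item->title,
			'link'  => (string) $item->link,
			'date'  => $ts === false ? "" : date("d-m-Y H:i:s", $ts),
			'image' => $image,
		);
	}
	return $out;
}

// invented sample feed, only to show the call
$sample = '<?xml version="1.0"?><rss version="2.0"><channel><title>demo</title>'
	. '<item><title>Sunset</title><link>http://example.com/p/1</link>'
	. '<pubDate>Thu, 18 Aug 2011 16:11:00 +0000</pubDate>'
	. '<description><![CDATA[<img src="http://example.com/1_5.jpg">]]></description></item>'
	. '</channel></rss>';
$items = parseFeedItems($sample);
echo $items[0]['title'], " ", $items[0]['image'], "\n"; // Sunset http://example.com/1_5.jpg
```

A failed download or a broken feed simply gives an empty array here, so the calling code can skip the update and keep the old cache.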



Feb 21 2011

get MySpace events with a PHP function

Category: Php, Spiders & web bots | Giulio Pons @ 12:41 pm

Here is a function to read the concerts from a MySpace band page. This code retrieves the “shows page” for a specified MySpace username, and then parses the HTML to find and decode the data.

Since MySpace returns the page in Italian (this probably depends on geographic IP translation), the function uses a months array in Italian. You will probably have to change this, or you can try to improve it by adding a header to the curl request to specify the language of the page (I think it’s possible).

You can watch a DEMO here.

function myspaceConcerts($user) {
	$ch = curl_init("http://www.myspace.com/".$user."/shows");
	curl_setopt($ch, CURLOPT_HTTPGET, TRUE);
	curl_setopt($ch, CURLOPT_POST, FALSE);
	curl_setopt($ch, CURLOPT_HEADER, false);
	curl_setopt($ch, CURLOPT_NOBODY, FALSE);
	curl_setopt($ch, CURLOPT_VERBOSE, FALSE);
	curl_setopt($ch, CURLOPT_REFERER, "");
	curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
	curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
	curl_setopt($ch, CURLOPT_MAXREDIRS, 4);
	curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
	curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 6.1; he; rv:1.9.2.8) Gecko/20100722 Firefox/3.6.8");
	$page = curl_exec($ch);
	// look for band name
	preg_match_all("#<a class=\"userLink\" href=\"/".$user."\">(.*)</a>#Us", $page, $a);
	$band = trim(strip_tags($a[1][0]));
	//
	// the months array is in Italian because my web server receives the pages in Italian;
	// you will probably have to change this array to match the MySpace response you get
	$months = array("gen"=>"01","feb"=>"02","mar"=>"03","apr"=>"04","mag"=>"05","giu"=>"06","lug"=>"07","ago"=>"08","set"=>"09","ott"=>"10","nov"=>"11","dic"=>"12");
	$out = array();
	$c=0;	// concerts counter
	$li = preg_split("/<li class=\"moduleItem event( odd| even)?( first| last)? vevent\" ?>/i",$page);
	for($i=0;$i<count($li);$i++) {
		if(stristr($li[$i],"<div class=\"entryDate\">")) {
			// find date
			preg_match_all("#<span class=\"month\">(.*)</span>#Us", $li[$i], $temp);
			$month = $months[strip_tags(trim($temp[1][0]))];
			preg_match_all("#<span class=\"day\">(.*)</span>#Us", $li[$i], $temp);
			$day = str_pad( strip_tags(trim($temp[1][0])), 2, "0", STR_PAD_LEFT);
			$year = date("Y");
			$data = $year."-".$month."-".$day;
			if($data<date("Y-m-d")) { $data = (date("Y")+1)."-".$month."-".$day; }

			// find venue
			preg_match_all("#<h4>(.*)</h4>#Us", $li[$i], $temp);
			$posto = strip_tags(trim($temp[1][0]));

			// find city
			preg_match_all("#<span class=\"locality\">(.*)</span>#Us", $li[$i], $temp);
			$citta = strip_tags(trim($temp[1][0]));

			// find region
			preg_match_all("#<span class=\"region\">(.*)</span>#Us", $li[$i], $temp);
			$region = strip_tags(trim($temp[1][0]));

			// find country
			preg_match_all("#<span class=\"country-name\">(.*)</span>#Us", $li[$i], $temp);
			$stato = strip_tags(trim($temp[1][0]));

			// build output array
			$out[$c]["band"] = $band;
			$out[$c]["date"] = $data;
			//$out[$c]["time"] = ""; not parsed
			$out[$c]["venue"] = $posto;
			//$out[$c]["url"] = ""; not parsed
			$out[$c]["where"] = $citta.",".$region.",".$stato;
			$c++;
		}
	}
	return $out;
}

This function is included in the Mini Bot Class with many other small spiders.
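
As for asking MySpace for a specific language: with curl, extra request headers can be passed through CURLOPT_HTTPHEADER, so an Accept-Language header is one thing to try. The sketch below only builds the header and attaches it to a handle. The helper name acceptLanguageHeader and the "someband" username are made up, and I have not verified that MySpace honours this header.

```php
<?php
// Hypothetical helper: builds an Accept-Language header line from a list of
// language tags, most preferred first, with decreasing q-values.
function acceptLanguageHeader(array $langs) {
	$parts = array();
	foreach (array_values($langs) as $i => $lang) {
		$q = max(0.1, 1 - $i * 0.1);
		$parts[] = $i == 0 ? $lang : $lang . ";q=" . number_format($q, 1, '.', '');
	}
	return "Accept-Language: " . implode(", ", $parts);
}

echo acceptLanguageHeader(array("en-US", "en")), "\n"; // Accept-Language: en-US, en;q=0.9

// With the curl extension loaded, the header would be attached like this
// (no request is sent in this sketch):
if (function_exists("curl_init")) {
	$ch = curl_init("http://www.myspace.com/someband/shows"); // "someband" is a placeholder
	curl_setopt($ch, CURLOPT_HTTPHEADER, array(acceptLanguageHeader(array("en-US", "en"))));
}
```

If the server respects the header, the months array would then need English abbreviations instead of the Italian ones.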


