July 8, 2005 23:38 | WebTech

Little PHP help pleeeease?

I'm wanting to write a little function to use in a WordPress template which will do the following:

Taking an input haystack (which would be an entire weblog entry), find the first occurrence of the needle ".mp3". First chop off anything that follows it, then move backwards until you find the beginning of the second needle, "http://", and chop off everything that precedes that.

In other words, I want to extract the full URI of an MP3 embedded in a weblog entry.
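
Taken literally, those steps can be sketched with plain string functions. This is only a rough sketch: the function name `first_mp3_uri` is made up for illustration, and it assumes the entry contains a plain "http://" link that ends in ".mp3".

```php
<?php
// Hypothetical helper (name is not from any existing plugin).
// Returns the first http:// URI ending in ".mp3", or null if none is found.
function first_mp3_uri(string $haystack): ?string
{
    $needle = '.mp3';

    // Find the first ".mp3", ignoring case.
    $end = stripos($haystack, $needle);
    if ($end === false) {
        return null;
    }

    // Chop off anything that follows the needle.
    $head = substr($haystack, 0, $end + strlen($needle));

    // Move backwards to the last "http://" before it.
    $start = strripos($head, 'http://');
    if ($start === false) {
        return null;
    }

    // Chop off everything that precedes it.
    return substr($head, $start);
}
```

Note that this blindly trusts whatever sits between the two needles, so a stray "http://" far earlier in the text and a later ".mp3" would be glued together.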

The reasons must be obvious if you consider some of my most recent postings.
This code would probably get released as a WordPress plugin once I've put it through its paces. There's loads of functionality that could be added but this is all I need myself for now. :)

I'd really really really appreciate any help anyone can provide. Credit will of course be given where credit is due. ;)

(Oh, and if you know of such a plugin already in existence, thanks in advance for pointing it out.)

Comments

I know the following is not perfect (I'm not a regex guru), but it will probably work.

$haystack = "the weblog entry";
$matches = array();
preg_match_all("?http://[a-z0-9\./_]+\.mp3?i", $haystack, $matches);
print_r($matches[0]);


Fantastic! Thanks! You took it up a notch; by using preg_match_all, we can pull out all linked MP3s. Sweet.

I should prolly be picky and look for all audio/video media files... :)
So that regex would become something like:

"?http://[a-z0-9\./_]+\.(mp3|m4a|aac|ogg|aiff?|mov|mpg|mpeg|avi)?i"

Awesome. I'm gonna go play now. Max, I don't know you, but I owe ya! Not only did you take the time to help me with the code, but you also single-handedly taught me regex (by providing me with this complex, yet simple, usage example).
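
Roughly how the multi-extension version might look when wrapped up. This is a sketch, not tested against real entries: the function name `extract_media_uris` is made up, and the hyphen in the character class and the trailing `\b` are extra assumptions added so that hyphenated file names match and ".mp3x" does not.

```php
<?php
// Hypothetical helper: returns every http:// URI in $haystack that ends
// in one of the listed media extensions.
function extract_media_uris(string $haystack): array
{
    // Each entry is a small regex fragment, e.g. "aiff?" covers aif and aiff.
    $extensions = ['mp3', 'm4a', 'aac', 'ogg', 'aiff?', 'mov', 'mpe?g', 'avi'];

    // "~" is the delimiter; "i" makes the match case-insensitive.
    $pattern = '~http://[a-z0-9./_-]+\.(?:' . implode('|', $extensions) . ')\b~i';

    preg_match_all($pattern, $haystack, $matches);

    // $matches[0] holds the full matches, in order of appearance.
    return $matches[0];
}
```

Adding another format is then just one more entry in the `$extensions` array.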


Doh, while you've set me on the right track and I have built everything I need around it, the regex itself isn't matching anything. :\


Thanks to CRW for coming through with some regex debugging.

The function as it stands (this is a WordPress plugin function, remember, hence the $post->post_content object):

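function extract_media_uri() {
    global $post; // the current WordPress post
    $haystack = $post->post_content;
    // Case-insensitive match on http:// URLs ending in .mp3 ("?" is the pattern delimiter).
    preg_match_all("?http://[a-z0-9\./\-_]+\.mp3?i", $haystack, $matches, PREG_PATTERN_ORDER);
    // Print only the first match, and only if there was one.
    if (!empty($matches[0])) {
        print_r($matches[0][0]);
    }
}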

Now I need to code up what to do with the URL we've pulled out. Wheeee!
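
For comparison, the visible difference from the first pattern posted in this thread is the escaped hyphen in the character class. A quick sketch of what that changes, assuming a made-up entry whose file name contains a hyphen:

```php
<?php
// Hypothetical sample entry; the URL is invented for illustration.
$entry = 'Listen: http://example.com/my-song.mp3';

// First pattern from the thread: no hyphen allowed in the URL.
$without_hyphen = '?http://[a-z0-9\./_]+\.mp3?i';
// Pattern from the function above: hyphen allowed.
$with_hyphen = '?http://[a-z0-9\./\-_]+\.mp3?i';

// preg_match_all returns the number of full matches found.
$found_without = preg_match_all($without_hyphen, $entry, $m1);
$found_with = preg_match_all($with_hyphen, $entry, $m2);

var_dump($found_without); // int(0)
var_dump($found_with);    // int(1)
```

Without the hyphen, the character class stops at "my" and never reaches ".mp3", so nothing is matched at all.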


After looking at a few RFCs, here's what I came up with:


Nice, the preview showed the source, but it got stripped when I finally posted. You can see it here:
http://newton.waglo.com/~millette/bopuc.phps


oh nice!!! thanks Robin!

For posterity:

<?php

$pattern = "/(http:\/\/[\\:@;\?=\+\$,\-_!~\*'\(\)\.\/a-z0-9A-Z]+\.mp3)/";
$haystack =<<<E_O_T
le voici : "<em>http://example.com:8090/~Millette/artiste?song=(ok)chanson_5@;+$,.mp3</em>" hop.
E_O_T;

echo $haystack;
echo '<br>';
if (1 === preg_match($pattern, $haystack, $matches)) {
    $url = $matches[1];
    echo $url;
}
?>