
PHP Search



#1
Tigger93

Tigger93

    Trusted Helper

  • Retired Staff
  • 1,870 posts
I'm not sure if this is even possible, but I think I've seen something like it so I hope it is.

I want to use PHP to search a website's page source and retrieve anything like this, for example: topic=* (where * is a wildcard), then list all the results it has found. How could I go about doing this?

Any help is appreciated and if needed I'll try to explain it better. :whistling:

Edited by Tigger93, 22 April 2007 - 07:54 PM.

  • 0



#2
mpfeif101

mpfeif101

    Member 1K

  • Retired Staff
  • 1,411 posts
Will * be inside of quotes, like topic="hello"?

If not, your wild card * starts after the = sign, but where does it end?
  • 0

#3
Michael

Michael

    Retired Staff

  • Retired Staff
  • 1,869 posts
You're going to need something like this http://au3.php.net/curl to get the pages, and regular expressions to find what you want on it. I think this would return an array of what you want.

preg_match('`topic="([^\"]*)"`',$pageText);
  • 0
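For illustration, a minimal sketch of the regular-expression half of this suggestion, run against an inline sample string rather than a fetched page. The `extractTopics` helper and the sample markup are assumptions made up for the example, not anything taken from the target site. `preg_match_all` is used here so that every match is collected into an array.

```php
<?php
// Hypothetical helper: returns every value found inside topic="..." in $pageText.
function extractTopics(string $pageText): array
{
    // [^"]* captures everything up to the next double quote.
    preg_match_all('`topic="([^"]*)"`', $pageText, $matches);
    // $matches[1] holds the first capture group of each match.
    return $matches[1];
}

// Sample markup invented for this example.
$sample = '<div topic="hello"></div><span topic="world"></span>';
print_r(extractTopics($sample));
```

With the sample above, the printed array should contain the two captured values, hello and world. On a page with no matches the helper returns an empty array.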

#4
Tigger93

Tigger93

    Trusted Helper

  • Topic Starter
  • Retired Staff
  • 1,870 posts
Thanks, but I still don't quite understand what to do here, sorry! :whistling: Could you give me an example of how to do it? I've been testing around with it but can't get it to work.

mpfeif101, yes, it'll be inside quotes, like "topic=*".
  • 0

#5
Michael

Michael

    Retired Staff

  • Retired Staff
  • 1,869 posts
OK, how's this then:

$url = "http://example.com/";
$ch = curl_init();
$timeout = 2; // set to zero for no timeout
curl_setopt ($ch, CURLOPT_URL, $url);
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt ($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
$content = @curl_exec($ch);
$info = curl_getinfo($ch);
curl_close($ch);

if($info['http_code'] == '200') {
$topic = preg_match('`topic="([^\"]*)"`',$content);
print_r($topic);
}
  • 0

#6
Tigger93

Tigger93

    Trusted Helper

  • Topic Starter
  • Retired Staff
  • 1,870 posts
Ok, just tried it out, just shows 0 on the page. :whistling:

Here's what I got:

$url = "http://example.com/";
$ch = curl_init();
$timeout = 2; // set to zero for no timeout
curl_setopt ($ch, CURLOPT_URL, $url);
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt ($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
$content = @curl_exec($ch);
$info = curl_getinfo($ch);
curl_close($ch);

if($info['http_code'] == '200') {
$topic = preg_match('`http://example.com/topic="([^\"]*)"`',$content);
print_r($topic);
}

Any ideas?
  • 0
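A possible explanation for the 0, sketched under the assumption that the fetch itself succeeded: `preg_match` returns the number of matches (1 or 0, or false on error), and the matched text is only available through an optional third argument. The sample string and variable names below are made up for the example.

```php
<?php
// Sample content invented for this example.
$content = 'see topic="hello" here';

// The return value is a match count, not the matched text.
$count = preg_match('`topic="([^"]*)"`', $content, $found);
// $found[0] is the whole match, $found[1] the first capture group.
$captured = $count === 1 ? $found[1] : null;

// A pattern that requires text the content lacks gives a count of 0.
$missed = preg_match('`http://example.com/topic="([^"]*)"`', $content);

var_dump($count, $captured, $missed);
```

Printing the return value therefore shows 1 or 0 at best. A 0 means the pattern, including any literal prefix such as the host name, did not occur in the fetched content.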

#7
Michael

Michael

    Retired Staff

  • Retired Staff
  • 1,869 posts
You just might want to change this

$url = "http://example.com/";
  • 0

#8
Tigger93

Tigger93

    Trusted Helper

  • Topic Starter
  • Retired Staff
  • 1,870 posts
Yes, I did. :whistling: I was just using that as an example for the website.
  • 0

#9
Michael

Michael

    Retired Staff

  • Retired Staff
  • 1,869 posts
$topic = preg_match('`http://example.com/topic=(.*)`',$content);

Try that regular expression.
  • 0

#10
Tigger93

Tigger93

    Trusted Helper

  • Topic Starter
  • Retired Staff
  • 1,870 posts
Nope. :whistling: Still just shows 0 on the page.
  • 0



#11
Michael

Michael

    Retired Staff

  • Retired Staff
  • 1,869 posts
What page are you using, so I can test it?
  • 0

#12
Tigger93

Tigger93

    Trusted Helper

  • Topic Starter
  • Retired Staff
  • 1,870 posts
Here:

http://tigger93.geek...m/gettopics.php
  • 0

#13
Michael

Michael

    Retired Staff

  • Retired Staff
  • 1,869 posts
I should sleep more, my brain is not working right :whistling:

[codebox]
<?php

$url = "http://example.com/";
$ch = curl_init();
$timeout = 2; // set to zero for no timeout
curl_setopt ($ch, CURLOPT_URL, $url);
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt ($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
$content = @curl_exec($ch);
$info = curl_getinfo($ch);
curl_close($ch);

if($info['http_code'] == '200') {
$topic = preg_match_all('`http://example.com/topic=(.*)\n?"?`',$content,$matches);
print_r($matches[1]);
}

?>
[/codebox]
  • 0
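A note on the pattern in the post above: `.` does not match a newline by default, so a greedy `(.*)` captures everything up to the end of the line, not just the topic value. Below is a hedged sketch of one alternative, assuming the links look like the invented sample (the real page may differ). It drops the host prefix so relative links also match, and stops the capture at the first quote, ampersand, angle bracket, or whitespace.

```php
<?php
// Sample markup invented for this example: one absolute and one relative link.
$content = '<a href="http://example.com/index.php?topic=12">First</a>' . "\n"
         . '<a href="index.php?topic=34&amp;page=2">Second</a>';

// Greedy version: each capture runs to the end of its line.
preg_match_all('`topic=(.*)`', $content, $greedy);

// Bounded version: the capture stops at ", &, <, > or whitespace.
preg_match_all('`topic=([^"&\s<>]+)`', $content, $bounded);

print_r($greedy[1]);
print_r($bounded[1]);
```

If the bounded pattern still produces an empty array, it may be worth printing `$content` itself to check what the server actually returned.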

#14
Tigger93

Tigger93

    Trusted Helper

  • Topic Starter
  • Retired Staff
  • 1,870 posts
Closer, it looks like. Now it says Array ( ). Thanks for all the help so far. :whistling:
  • 0

#15
Michael

Michael

    Retired Staff

  • Retired Staff
  • 1,869 posts
What value are you using for $url = "http://example.com/"; ?
  • 0





