Hi, I know this is a very common topic on Stack Overflow, but I have already spent an entire week searching for a solution.

I have a URL: abc.com/default.asp?strSearch=19875379

This redirects to the URL: abc.com/default.asp?catid={170D4F36-39F9-4C48-88EB-CFC8DDF1F531}&details_type=1&itemid={49F6A281-8735-4B74-A170-B6110AF6CC2D}

I have tried to get the final URL in my PHP code using cURL, but I can't make it work.

Here is my code:

<?php
$name = "19875379";
$url = "http://www.ikea.co.il/default.asp?strSearch=" . $name;
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$a = curl_exec($ch);
curl_close($ch);
// the returned headers
$headers = explode("\n", $a);
// if there is no redirection this will be the final url
$redir = $url;
// loop through the headers and check for a Location: header
$j = count($headers);
for ($i = 0; $i < $j; $i++) {
    // if we find the Location header, strip its name and fill the redir var
    //print_r($headers);
    if (strpos($headers[$i], "Location:") !== false) {
        $redir = trim(str_replace("Location:", "", $headers[$i]));
        break;
    }
}
// do whatever you want with the result
echo $redir;
?>

It gives me the URL "abc.com/default.asp?strSearch=19875379" instead of "abc.com/default.asp?catid={170D4F36-39F9-4C48-88EB-CFC8DDF1F531}&details_type=1&itemid={49F6A281-8735-4B74-A170-B6110AF6CC2D}".

Thanks in advance for your kind help :)

Avtansh

4 Answers

Thank you, everyone, for helping me with this.

Actually, I want to develop a scraper in PHP for the IKEA website used in Israel (in Hebrew). After many hours, I realized that there is no server-side redirection for the URL I request, so curl never sees a Location header. It may be a JavaScript redirect. I have now implemented the code below, and it works for me.

<?php
$name = "19875379";
$url = "http://www.ikea.co.il/default.asp?strSearch=" . $name;

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 0); // 0 = wait indefinitely for the connection
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
$response = curl_exec($ch);
// URL after any HTTP-level redirects (this does not cover the JavaScript redirect)
$redir = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
curl_close($ch);
//print_r($response);

// The page redirects with <script>location.href='...';</script>,
// so capture the quoted target from the response body.
if ($response !== false && preg_match("/<script>location\.href='(.*?)';<\/script>/s", $response, $matches)) {
    $redirect = "http://www.ikea.co.il" . $matches[1];
    echo $redirect;
} else {
    // no JavaScript redirect found; fall back to the URL curl ended on
    echo $redir;
}
?>
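
The extraction step can also be pulled into a small function that works on any HTML string, which makes it easy to check without a network call. This is only a sketch under stated assumptions: the function name extractJsRedirect and its $base parameter are hypothetical, it assumes the page uses the <script>location.href='...'</script> form shown above (with either quote style), and it needs PHP 7.1 or later for the nullable return type.

```php
<?php
/**
 * Return the target of a <script>location.href='...'</script> redirect,
 * or null when the HTML contains no such script.
 * Hypothetical helper; relative targets are appended to $base.
 */
function extractJsRedirect(string $html, string $base): ?string
{
    // Capture the quoted value assigned to location.href inside a script tag.
    $pattern = '/<script[^>]*>\s*location\.href\s*=\s*([\'"])(.*?)\1/is';
    if (!preg_match($pattern, $html, $m)) {
        return null;
    }
    $target = $m[2];
    // Absolute URLs are returned unchanged.
    if (preg_match('#^https?://#i', $target)) {
        return $target;
    }
    return rtrim($base, '/') . '/' . ltrim($target, '/');
}

// Usage with a made-up page body:
$html = "<html><script>location.href='/default.asp?catid=1&itemid=2';</script></html>";
echo extractJsRedirect($html, 'http://www.ikea.co.il'), "\n";
```

Returning null instead of an empty string lets the caller tell "no JavaScript redirect" apart from a redirect to the site root.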

Thanks again everyone :)

Avtansh

The accepted answer applies to a very specific scenario, so most readers will be better off with a more general one. The general part can be extracted from the accepted answer, but having it on its own may be more helpful.

So, if you just want the last URL in a chain of HTTP redirects, this code will help.

<?php

function redirectedUrl($url) {
    $ch = curl_init();

    curl_setopt($ch, CURLOPT_URL, $url);
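    // send the visitor's browser info to avoid old-browser warnings; fall back to a generic value when it is not set (e.g. on the CLI)
    curl_setopt($ch, CURLOPT_USERAGENT, isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : 'Mozilla/5.0');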
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // allow url redirects
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // get the return value of curl execution as a string
    
    $html = curl_exec($ch);
    
    // store last redirected url in a variable before closing the curl session 
    $lastUrl = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
    
    curl_close($ch);

    return $lastUrl;
}
arafatgazi

You can use curl_getinfo() ...

http://php.net/manual/en/function.curl-getinfo.php

Muhammad Hassaan
Faizan
  • Thanks a lot for reading. I have made the changes you suggested and got the following output: – Avtansh Dec 14 '13 at 16:18

First of all, I did not see any redirection when I ran your code. Anyway, here are a few things you can do (keeping your approach intact):

First, make sure the headers are included in the curl output (in this case, $a).

curl_setopt($ch, CURLOPT_HEADER, true);

Next, separate the header portion from the rest of the HTTP response. The headers end at the first blank line, which is "\r\n\r\n" in HTTP.

// The header block will be at index 0 and the HTML at index 1.
$header = explode("\r\n\r\n", $a, 2);

Then explode the header string into an array of headers.

$headers = explode("\r\n", $header[0]);
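
Put together, these steps might look like the following sketch. The function name findLocationHeader is hypothetical, and the code assumes the response contains a single header block separated from the body by "\r\n\r\n" (the separator HTTP uses), which holds when CURLOPT_FOLLOWLOCATION is off. It needs PHP 7.1 or later for the nullable return type.

```php
<?php
/**
 * Find the Location header in a raw HTTP response (headers + body),
 * as returned by curl_exec() when CURLOPT_HEADER is enabled.
 * Hypothetical helper; returns null when no Location header is present.
 */
function findLocationHeader(string $response): ?string
{
    // Headers end at the first blank line; everything after it is the body.
    $parts = explode("\r\n\r\n", $response, 2);
    foreach (explode("\r\n", $parts[0]) as $line) {
        // Header names are case-insensitive, so match "location:" in any case,
        // and only at the start of the line.
        if (stripos($line, 'Location:') === 0) {
            return trim(substr($line, strlen('Location:')));
        }
    }
    return null;
}

// Usage with a made-up response:
$raw = "HTTP/1.1 302 Found\r\nLocation: /default.asp?catid=1\r\nContent-Length: 0\r\n\r\n";
var_dump(findLocationHeader($raw));
```

Searching only the header block avoids a false match when the page body happens to contain the text "Location:".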
Sabuj Hassan