Unable to get Content Encoding using PHP cURL


I am using cURL with PHP to get the content type and content encoding of a URL. The content type is returned correctly, but the content encoding value is always empty.

function get_content_type_curl($url_content_type) {
    
    $agent_content_type = $_SERVER['HTTP_USER_AGENT'];
    $ch_content_type = curl_init();

    curl_setopt($ch_content_type, CURLOPT_URL, $url_content_type);
    curl_setopt($ch_content_type, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch_content_type, CURLOPT_HEADER, 0);
    curl_setopt($ch_content_type, CURLOPT_NOBODY, 1);
    curl_setopt($ch_content_type, CURLOPT_USERAGENT, $agent_content_type);
    curl_setopt($ch_content_type, CURLOPT_FOLLOWLOCATION, 1);

    curl_exec($ch_content_type);
    $content_type = curl_getinfo($ch_content_type, CURLINFO_CONTENT_TYPE);
    $content_encoding = defined('CURLINFO_CONTENT_ENCODING') ? curl_getinfo($ch_content_type, CURLINFO_CONTENT_ENCODING) : '';
    //$content_encoding = curl_getinfo($ch_content_type, CURLINFO_CONTENT_ENCODING);

    curl_close($ch_content_type);

    return array("content_type" => $content_type, "content_encoding" => $content_encoding);
}

$result = get_content_type_curl("https://affiliatefix.com/sitemap-1.xml");

echo $result["content_type"] . "\n";
if (!empty($result["content_encoding"])) {
    echo $result["content_encoding"] . "\n";
}

/**if (strpos($result["content_encoding"], "gzip") !== false) {
    echo $result["content_encoding"] . "\n";
} else {
    echo "No encoding".$result["content_encoding"] . "\n";
}**/

Expected output for https://affiliatefix.com/sitemap-1.xml:

Content Type: application/xml; charset=utf-8 (returned correctly)

Content Encoding: gzip (expected, but I get an empty value)


Answer by shingo:

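I'm not sure where you found the constant CURLINFO_CONTENT_ENCODING. It does not appear in the PHP documentation or in the cURL documentation, so defined('CURLINFO_CONTENT_ENCODING') is false and your code always falls back to the empty string. To read a response header, you need to register a callback function like this: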

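$content_encoding = '';
curl_setopt($ch_content_type, CURLOPT_HEADERFUNCTION, function ($ch, $header) use (&$content_encoding) {
    // cURL calls this once per header line, e.g. "Content-Encoding: gzip\r\n".
    if (stripos($header, 'content-encoding:') === 0) {
        $content_encoding = trim(substr($header, strlen('content-encoding:')));
    }
    // The callback must return the number of bytes it handled.
    return strlen($header);
});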
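To show how the callback fits into a complete request, here is a minimal sketch. The helper names make_header_collector and get_type_and_encoding are made up for this example and are not part of cURL. It also sets CURLOPT_ENCODING to an empty string, on the assumption that the server only compresses when the request carries an Accept-Encoding header; that may or may not be the cause in your case.

```php
<?php
// Hypothetical helper: builds a CURLOPT_HEADERFUNCTION callback that stores
// each "Name: value" header line in $headers under a lower-cased name.
function make_header_collector(array &$headers): callable
{
    return function ($ch, string $line) use (&$headers): int {
        // A status line starts a new response (e.g. after a redirect),
        // so drop the headers collected from the previous one.
        if (strncasecmp($line, 'HTTP/', 5) === 0) {
            $headers = [];
        }
        $parts = explode(':', $line, 2);
        if (count($parts) === 2) {
            $headers[strtolower(trim($parts[0]))] = trim($parts[1]);
        }
        // cURL expects the number of bytes handled by the callback.
        return strlen($line);
    };
}

// Hypothetical wrapper around the request from the question.
function get_type_and_encoding(string $url): array
{
    $headers = [];
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_NOBODY, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    // Assumption: the server only compresses when asked. An empty string makes
    // cURL send an Accept-Encoding header listing every encoding it supports.
    curl_setopt($ch, CURLOPT_ENCODING, '');
    curl_setopt($ch, CURLOPT_HEADERFUNCTION, make_header_collector($headers));
    curl_exec($ch);

    return [
        'content_type'     => $headers['content-type'] ?? '',
        'content_encoding' => $headers['content-encoding'] ?? '',
    ];
}
```

Calling get_type_and_encoding("https://affiliatefix.com/sitemap-1.xml") would then return both values in one array. Note that CURLOPT_NOBODY turns the request into a HEAD request, and some servers may leave out Content-Encoding for HEAD, so an empty value is still possible.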

Another way is to set CURLOPT_HEADER and then cut the header off manually. Of course, since you don't need the body, the returned string is the entire header:

curl_setopt($ch_content_type, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch_content_type, CURLOPT_HEADER, 1);
curl_setopt($ch_content_type, CURLOPT_NOBODY, 1);
$header_and_body = curl_exec($ch_content_type);

$header_size = curl_getinfo($ch_content_type, CURLINFO_HEADER_SIZE);
$header = substr($header_and_body, 0, $header_size);
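Once you have the raw header text in $header, you still need to pick the Content-Encoding line out of it. A minimal sketch follows, assuming a hypothetical helper named find_header_value (not a cURL or PHP built-in). The sample header text is made up for illustration.

```php
<?php
// Hypothetical helper: returns the value of the named header from raw header
// text, or '' if the header is absent.
function find_header_value(string $raw_headers, string $name): string
{
    // With CURLOPT_FOLLOWLOCATION the raw text holds one header block per
    // response, separated by blank lines; only the last block is inspected.
    $blocks = preg_split('/\r?\n\r?\n/', trim($raw_headers));
    $last_block = end($blocks);

    foreach (preg_split('/\r?\n/', $last_block) as $line) {
        // Split "Name: value" at the first colon; the status line has no colon.
        $parts = explode(':', $line, 2);
        if (count($parts) === 2 && strcasecmp(trim($parts[0]), $name) === 0) {
            return trim($parts[1]);
        }
    }
    return '';
}

// Sample header text, made up for illustration (a redirect followed by the final response).
$sample = "HTTP/1.1 301 Moved Permanently\r\nLocation: https://example.com/\r\n\r\n"
        . "HTTP/1.1 200 OK\r\nContent-Type: application/xml; charset=utf-8\r\n"
        . "Content-Encoding: gzip\r\n\r\n";

echo find_header_value($sample, 'Content-Encoding'), "\n"; // prints "gzip"
```

In the snippet above you would call find_header_value($header, 'Content-Encoding') after the substr() line. Header names are compared case-insensitively because servers may send them in any case.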