Measuring a duration with microtime() randomly results in zero

I have a loop like this:

<?php
ini_set('memory_limit', '16024M');
set_time_limit(9999); // set_time_limit() is a function, not an ini directive
ini_set('max_execution_time', 9999);
ini_set('display_errors',  TRUE);
ini_set('error_reporting',  E_ALL);

for ($k = 1; $k <= 50; $k++) {

    $haystack = array();

    for ($i = 1; $i <= 100; $i++) {

        $randomChar = substr(md5(microtime()),rand(0,26), 1);

        $haystack[] = $randomChar;

    }

    $haystack[] = 'X';

    $startTime = microtime(true);

    // sleep(0);

    $result = in_array('X', $haystack);

    $endTime = microtime(true);

    echo number_format(1000000 * ($endTime - $startTime), 20, ",", " ") . ' ';

}

And these are the first values from the output (in microseconds):

1,90734863281250000000 0,95367431640625000000 1,19209289550781250000 1,90734863281250000000 1,19209289550781250000 0,95367431640625000000 0,95367431640625000000 1,90734863281250000000 0,95367431640625000000 20,02716064453125000000 0,95367431640625000000 1,19209289550781250000 0,95367431640625000000 0,95367431640625000000 0,00000000000000000000 0,95367431640625000000 0,95367431640625000000 0,95367431640625000000 0,00000000000000000000 0,95367431640625000000 0,00000000000000000000

As you can see, several entries report a duration of "0", which should not be possible. If I uncomment the line containing the sleep(0) call, no zero durations appear.

System-Setup

  • PHP 7.0 with FPM
  • nginx 1.10.3
  • Ubuntu 16.04

I am running the loop on the CLI and also calling it via the browser.

There are 2 best solutions below

num8er (best answer)

An array of 101 items is small enough for PHP, with its static optimization tricks, and a powerful CPU to search almost instantly.

If you want to see the zeros disappear, generate 1000 items:

for ($i = 1; $i <= 1000; $i++) {
    $haystack[] = substr(md5(microtime()),rand(0,26), 1);
}
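
If the goal is to time in_array() on the original 101-item array, another option is to repeat the call many times and divide the total by the number of repetitions, so that the measured span is well above what microtime() can resolve. A minimal sketch; the repetition count of 100000 is an arbitrary assumption, and the loop overhead is included in the result:

```php
<?php
// Build the same kind of haystack as in the question: 100 random hex characters plus 'X'.
$haystack = array();
for ($i = 1; $i <= 100; $i++) {
    $haystack[] = substr(md5(microtime()), rand(0, 26), 1);
}
$haystack[] = 'X';

$repetitions = 100000; // arbitrary value chosen for this sketch
$found = false;

// Time the whole batch instead of a single call.
$startTime = microtime(true);
for ($r = 0; $r < $repetitions; $r++) {
    $found = in_array('X', $haystack);
}
$elapsed = microtime(true) - $startTime;

// Average time per call in microseconds (includes the loop overhead).
$averageMicroseconds = 1000000 * $elapsed / $repetitions;
echo number_format($averageMicroseconds, 6, ",", " ") . " microseconds per call\n";
```

The single-call timing from the question stays useful for spotting outliers, while the averaged figure gives a steadier estimate.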

P.S. I ran your code on both PHP 7.1 and 5.6, and there are big differences:

php7.1 vs php5.6

n.r.

Just in addition to @num8er's answer, which seems to be THE answer, I tried to find out more, because this really cost me some sleepless nights. I extended the script above a bit and ran some additional measurements:

ini_set('memory_limit', '16024M');
set_time_limit(0); // set_time_limit() is a function, not an ini directive; 0 means no limit
ini_set('display_errors', TRUE);
ini_set('error_reporting', E_ALL);

echo "<table>";
echo "<tr>";
    echo "<th>duration</th>";
    echo "<th>position</th>";
    echo "<th>fake</th>";
    echo "<th>found</th>";
    echo "<th>optimized</th>";
echo "</tr>";

$endPosition = TRUE;

$fake = false;

for ($k = 1; $k <= 10000; $k++) {

    $haystack = array();

    for ($i = 1; $i <= 50000; $i++) {

        $randomChar = substr(md5(microtime()),rand(0,26), 1);

        $haystack[] = $randomChar;

    }

    if ($fake) {

        $needle = NULL;


    } else {

        if ($endPosition) {

            $needle = $haystack[sizeof($haystack) - 1];

        } else {

            $needle = $haystack[floor(sizeof($haystack)/ 2)];

        }

    }

    $startTime = microtime(true);

    //sleep(0);

    $result = in_array($needle, $haystack);

    $endTime = microtime(true);

    $duration = ($endTime - $startTime);

    echo "<tr>";
        echo "<td>";
        echo number_format($duration, 30, ",", " ");
        echo "</td>";
        echo "<td>";
        echo ($endPosition) ? "end": "middle";
        echo "</td>";
        echo "<td>";
        echo ($fake) ? "fake": "no fake";
        echo "</td>";
        echo "<td>";
        echo ($result) ? "found": "not found";
        echo "</td>";
        echo "<td>";
        echo ($duration == 0) ? "optimized": "---";
        echo "</td>";
    echo "</tr>";

    $endPosition = (rand(0,100) < 50) ? TRUE : FALSE;
    $fake = (rand(0,100) < 25) ? TRUE : FALSE;

}

echo "</table>";

I added a random "fake" feature: in roughly 25% of the iterations the search should not return a positive result. In roughly 50% of the iterations, the needle is taken from the middle of the haystack instead of the end. I ran this script several times with different settings (iterations, array length) and ended up with around 225.000 result rows. A quick pivot table shows where PHP (7.0.32 FPM) and the CPU (Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz) reach a limit:

minimum durations when searching arrays, lengths vs iterations

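The durations are in seconds, since the script prints the microtime(true) difference unscaled. So even the hard cases (like 500.000 keys, 1.000 iterations) had a minimum of 0,000000953674 seconds, which is just under one microsecond, thanks to the optimization. That's impressive.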

What is also interesting: the minimum durations, if not "0", are always the same value (0,000000953674) or double it (0,000001907349), even for different iteration counts! So my assumption, and this is pretty naive thinking, is that if I ran a test with bigger arrays or more iterations, the next minimum to show up would be 0,00000381469 seconds.
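
A side note on these recurring values, offered as one possible reading and not as something the measurements establish: microtime(true) returns a float, and for Unix timestamps between 2^30 and 2^31 seconds neighbouring floats are 2^-22 seconds (about 0,238 microseconds) apart. Since microtime() itself counts whole microseconds, a difference of 1 microsecond ends up as 4 or 5 such steps and a difference of 2 microseconds as 8 or 9, which would match 0,95367431640625, 1,1920928955078125 and 1,9073486328125 in the output of the question. The helper name stepsToMicroseconds is made up for this sketch:

```php
<?php
// Assumption: the timestamp lies between 2^30 and 2^31 seconds (years 2004 to 2038).
// In that range a 64-bit float has exponent 30 and 52 fraction bits,
// so adjacent representable values are 2^(30-52) = 2^-22 seconds apart.
function stepsToMicroseconds(int $steps): float
{
    return $steps * pow(2, -22) * 1e6;
}

// 4, 5 and 8 steps correspond to the small values in the question's output,
// 16 steps to the "next minimum" guessed above, 84 steps to the one outlier.
foreach ([4, 5, 8, 16, 84] as $steps) {
    printf("%2d steps = %.14f microseconds\n", $steps, stepsToMicroseconds($steps));
}
```

Under this reading, a reported duration of exactly 0 would mean that both microtime() calls fell into the same microsecond.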

As you can also see, and as num8er already pointed out, the potential for optimization grows the harder the job is.

Top 10 of quickest durations for 50.000-key-arrays

Searching 50.000-key arrays 10 times is even slower than doing so 100 or 1.000 times. With 1.000 iterations, more than 10% of the results were delivered in the "optimized" time.

Finally, I want to point out that there seems to be no difference whether the needle is in the middle of the haystack or at the end. The next chart shows the minimum durations for 10, 100 and 1.000 iterations when searching a 500.000-key array. As you can see, the minimum is always the "magical" 0,000000953674:

Minimum durations for 500.000-key-arrays with different needle positions
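
One thing that may explain why the needle position makes no difference, stated as an assumption and not as something the measurements prove: the haystack only ever contains the 16 hexadecimal characters, so whichever element is picked as the needle almost certainly also occurs within the first few dozen entries, and in_array() stops at the first match. A minimal sketch that checks where the first match sits, using array_search() (the variable names are illustrative):

```php
<?php
// Build a haystack the same way as the script above: random hex characters.
$haystack = array();
for ($i = 1; $i <= 50000; $i++) {
    $haystack[] = substr(md5(microtime()), rand(0, 26), 1);
}

// Take the needle from the very last position, as the "end" case does.
$lastIndex = count($haystack) - 1;
$needle = $haystack[$lastIndex];

// array_search() returns the key of the first matching element.
$firstIndex = array_search($needle, $haystack);

echo "needle taken from index " . $lastIndex . ", first match at index " . $firstIndex . "\n";
```

To make the search actually walk the whole array, the needle would have to be a value that occurs only once, like the 'X' appended in the question.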

Needless to say, every iteration returned the correct result: in_array() never returned a positive result when it searched a haystack containing no needle.

This may not add deeper technical details about PHP's optimization feature, but I think it's interesting to see its impact.