Facebook Social Icon Twitter Social Icon YouTube Social Icon

Misleading Output of MySQL Profiling Feature

From M.Eng. René Schwarz, Bremen/Merseburg
Jump to: navigation, search

During the ongoing work on my Master's Thesis I noticed a bug in the MySQL profiling feature: The output of the SHOW PROFILE statement is misleading, since the presented durations are not given for the status in the same row, but the durations are given for the previous status!

Issue Description

I have a MySQL database test on a MySQL 5.5.20 x64 Windows Community server. In this database resides one large MyISAM table test_table with the following structure (the table uses MySQL Spatial Extensions since it contains a global, high resolution digital elevation model of the Moon's surface):

CREATE TABLE IF NOT EXISTS `test_table` (
  `point_id` bigint(20) unsigned NOT NULL AUTO_INCREMENT COMMENT 'point ID',
  `line` int(10) unsigned NOT NULL,
  `sample` int(10) unsigned NOT NULL,
  `lat` double NOT NULL COMMENT 'latitude [deg]',
  `lon` double NOT NULL,
  `height` double NOT NULL,
  `planetary_radius` double NOT NULL,
  `x` double DEFAULT NULL,
  `y` double DEFAULT NULL,
  `z` double DEFAULT NULL,
  `point` point NOT NULL,
  PRIMARY KEY (`point_id`),
  SPATIAL KEY `point_index` (`point`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=1;

This table contains over 265 millions of data rows, consuming approx. 27 GiB for data and 35 GiB for index storage. One of my tasks is to query this table for spatial data in a very performant way. In order to optimize my database queries, I profiled some common spatial queries to the database by using the MySQL profiling feature.

One query, selecting points within a given polygon, looks like:

SELECT lat, lon, x, y, z
FROM test_table
WHERE
    MBRContains(GeomFromText('POLYGON((-0.1 0, -0.1 5.1, 5.1 5.1, 5.1 0, -0.1 0))'), point) OR
    MBRContains(GeomFromText('POLYGON((-0.1 359.9, -0.1 360, 5.1 360, 5.1 359.9, -0.1 359.9))'), point)
ORDER BY point_id ASC;

The profiling is done with a quick & dirty written PHP script. The following code snippet shall only demonstrate the functional principle; the real script has some code more for a nice formatted output:

$query = "
    SELECT lat, lon, x, y, z
    FROM test_table
    WHERE
        MBRContains(GeomFromText('POLYGON((-0.1 0, -0.1 5.1, 5.1 5.1, 5.1 0, -0.1 0))'), point) OR
        MBRContains(GeomFromText('POLYGON((-0.1 359.9, -0.1 360, 5.1 360, 5.1 359.9, -0.1 359.9))'), point)
    ORDER BY point_id ASC;
";

$mysqli = new mysqli($mysql_host, $mysql_user, $mysql_pwd, $mysql_db);
$mysqli->query("SET profiling=1;");
$res = $mysqli->query($query);
$stat = $mysqli->query("SHOW PROFILE CPU FOR QUERY 1;");
// some code here for formatting and
// displaying the profiling information
$mysqli->close();

The profiling of the aforementioned SQL statement gives the following output:


MySQL Profiling Spatial Query with Order Statement.png


As you can see, nearly all the query time is spent to sorting the result (381.274899 seconds). So I tried to optimize my MySQL server instance configuration, modifying the MySQL server variables — but nothing happend. There was no acceleration.

My next idea was to remove the ORDER BY statement in the query; that would be possible but I had to change my other software to cope with the unsorted data, which would have been not a great deal. So I profiled the following MySQL query without the ORDER BY statement:

SELECT lat, lon, x, y, z
FROM test_table
WHERE
    MBRContains(GeomFromText('POLYGON((-0.1 0, -0.1 5.1, 5.1 5.1, 5.1 0, -0.1 0))'), point) OR
    MBRContains(GeomFromText('POLYGON((-0.1 359.9, -0.1 360, 5.1 360, 5.1 359.9, -0.1 359.9))'), point);

Profiling gives:


MySQL Profiling Spatial Query without Order Statement.png


According to the MySQL profiling, the process of sending the data to the client, which previously took about 5 seconds now took 438 seconds! What the hell?!

Now things got clear to me: The output of the MySQL profiling information is shifted line by line! The execution of the statement took 438 seconds, not sending the data to the client. And for the first statement, the process of sorting the result took only 5 seconds.

After some research I noticed, that nearly all instructions and optimization tutorials available on the web are not aware of this fact, which seems to be known to the MySQL developers since almost two years (see MySQL bug #52492: "Issue with output of SHOW PROFILE").

The MySQL documentation states, that the "STATE is the profiling state to which the row measurements apply". So this is clearly a bug in MySQL you should be aware of, if you are unwilling to spent unnecessary time to the optimization of MySQL statements/the MySQL server configuration or parts of it!