I have a weather database table with columns for TIMESTAMP, VELOCITY, and BEARING. Each minute, new records are added, and the TIMESTAMP column is indexed.
For calculating hourly wind averages, I use PHP as shown below:
$query = "SELECT * FROM weather_readings WHERE TIMESTAMP BETWEEN startTime AND endTime ORDER BY TIMESTAMP ASC";
$data = $database->query($query)->fetchAll(PDO::FETCH_ASSOC);
foreach($data as $record)
{
$count++;
$velocity = $record['VELOCITY'];
$radians = deg2rad($record['BEARING']);
$xComponent += $velocity * sin($radians);
$yComponent += $velocity * cos($radians);
}
if ($count > 0)
{
$meanVelocity = sqrt($xComponent * $xComponent + $yComponent * $yComponent) / $count;
$meanBearing = rad2deg(atan2($xComponent, $yComponent));
if ($meanBearing < 0) $meanBearing += 360;
}
This method is efficient and runs quickly. To expand my database knowledge, I created a stored procedure named GetAverageWind that processes the same data:
DECLARE MEANVELOCITY, MEANBEARING, vel, bear FLOAT;
DECLARE finished BOOL DEFAULT FALSE;
DECLARE dataCursor CURSOR FOR SELECT VELOCITY, BEARING FROM weather_readings WHERE TIMESTAMP >= startTime AND TIMESTAMP <= endTime;
DECLARE CONTINUE HANDLER FOR NOT FOUND SET finished = TRUE;
OPEN dataCursor;
process_loop: LOOP
FETCH dataCursor INTO vel, bear;
IF finished THEN
LEAVE process_loop;
END IF;
SET @angleRad = RADIANS(bear);
SET @sinValue = SIN(@angleRad);
SET @cosValue = COS(@angleRad);
SET @xSum = @xSum + vel * @sinValue;
SET @ySum = @ySum + vel * @cosValue;
SET @total = @total + 1;
END LOOP;
CLOSE dataCursor;
IF @total > 0 THEN
SET MEANVELOCITY = SQRT(@xSum * @xSum + @ySum * @ySum) / @total;
SET MEANBEARING = DEGREES(ATAN2(@xSum, @ySum));
WHILE MEANBEARING < 0 DO SET MEANBEARING = MEANBEARING + 360; END WHILE;
END IF;
SELECT MEANVELOCITY, MEANBEARING;
While both methods yield the same results, the stored procedure is much slower. I notice that the CPU load approaches 100% when using the stored procedure, while the PHP method barely impacts performance.
Given that I need to analyze each entry for vector computations, can anyone explain the performance discrepancies and suggest ways to optimize the stored procedure?