I’m working with a database table named route_data
that contains approximately 1 million records. I need help optimizing a query that’s currently taking several seconds to execute.
My current query looks like this:
SELECT * FROM (
SELECT * FROM route_data WHERE station_id='DEF' AND status=1
UNION
SELECT * FROM route_data WHERE station_id='GHI'
UNION
SELECT * FROM route_data WHERE station_id='JKL'
) AS combined_results
GROUP BY combined_results.train_id
ORDER BY departure_time
The problem is that this query keeps showing using temporary, using filesort
in the execution plan. I’ve tried creating various indexes but nothing seems to help. The query combines multiple rows from the same table based on different station_id
values, then groups by train_id
.
Here’s my table structure:
CREATE TABLE `route_data` (
`train_id` varchar(1000) NOT NULL,
`sequence_num` int(11) NOT NULL,
`station_id` varchar(1000) NOT NULL,
`location_name` varchar(1000) NOT NULL,
`arrival_time` time NOT NULL,
`departure_time` time NOT NULL,
`stop_duration` varchar(1000) NOT NULL,
`platform` varchar(1000) NOT NULL,
`travel_day` int(11) NOT NULL,
`distance` varchar(1000) NOT NULL,
`velocity` varchar(1000) NOT NULL,
`altitude` varchar(1000) NOT NULL,
`region` varchar(1000) NOT NULL,
`location` varchar(1000) NOT NULL,
`status` int(11) DEFAULT '1',
KEY `composite_idx1` (`train_id`(767),`station_id`(767)),
KEY `seq_idx` (`train_id`(767),`sequence_num`),
KEY `status_time_idx` (`station_id`(767),`status`,`departure_time`),
KEY `station_status_idx` (`station_id`(767),`status`),
KEY `train_time_idx` (`train_id`(767),`departure_time`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
What would be the best approach to rewrite this query and create proper indexes to get millisecond response times?