Should I include an auto-increment ID column in MySQL junction tables for performance?

I’m working with a MySQL database that has these three tables:

  • articles
  • tags
  • article_tag

The relationship is many-to-many: each article can have multiple tags, and each tag can apply to multiple articles. The article_tag table is my junction table, with these fields:

  • primary_id (auto-increment primary key)
  • tag_id
  • article_id

With around 1 million records in the junction table, I’m wondering if keeping that auto-increment primary_id column actually helps with performance or if it’s just wasting storage space? Would it be better to use a composite primary key instead?

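Having dealt with junction tables in production, I've found that an auto-increment ID usually adds cost without benefit in a many-to-many table. A composite primary key on (article_id, tag_id) is typically more efficient. Assuming InnoDB (the default engine), rows are stored in primary key order, so fetching the tags for a given article reads adjacent rows, and the key itself rejects duplicate pairs. With a surrogate ID you would still need a separate unique index on (article_id, tag_id) to get the same guarantee, so you end up maintaining one more index on every insert and delete. The storage cost is real but modest at your scale: an INT is 4 bytes per row, so about 4 MB of raw column data at 1 million rows, plus a copy in each secondary index because InnoDB stores the primary key in every secondary index entry. Keep the auto-increment ID only if another table needs a foreign key to individual junction rows or your ORM requires a single-column key. Either way, test both layouts against your own query patterns before committing.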
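
To make the comparison concrete, here is a rough sketch of both layouts. The column types, the index names, and the table name article_tag_surrogate are my assumptions for illustration, not something taken from your schema, so adjust them to match your articles and tags tables. The second index in the composite layout is only worth having if you also look up articles by tag.

```sql
-- Layout A: composite primary key (sketch; INT UNSIGNED is an assumed type).
CREATE TABLE article_tag (
    article_id INT UNSIGNED NOT NULL,
    tag_id     INT UNSIGNED NOT NULL,
    -- The primary key doubles as the uniqueness rule and serves
    -- "tags for an article" lookups.
    PRIMARY KEY (article_id, tag_id),
    -- Assumed extra index for the reverse lookup, "articles for a tag".
    KEY idx_tag_article (tag_id, article_id)
) ENGINE=InnoDB;

-- Layout B: surrogate key, shown under a hypothetical name for comparison.
CREATE TABLE article_tag_surrogate (
    primary_id INT UNSIGNED NOT NULL AUTO_INCREMENT,
    article_id INT UNSIGNED NOT NULL,
    tag_id     INT UNSIGNED NOT NULL,
    PRIMARY KEY (primary_id),
    -- Still needed to prevent duplicate (article, tag) pairs.
    UNIQUE KEY uq_article_tag (article_id, tag_id),
    KEY idx_tag_article (tag_id, article_id)
) ENGINE=InnoDB;

-- Compare the plans for your common queries on both tables
-- (42 is an arbitrary example ID).
EXPLAIN SELECT tag_id FROM article_tag WHERE article_id = 42;
EXPLAIN SELECT tag_id FROM article_tag_surrogate WHERE article_id = 42;
```

Load a realistic copy of your data into both tables before reading too much into the EXPLAIN output, since the optimizer's choices depend on table statistics.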