Is there a regex-based string replacement function in MySQL?

I have a database with about 500,000 entries. One of the fields, a varchar(255) UTF8 column named file_name, contains file names.

I want to remove unusual characters from these file names using a pattern similar to [^a-zA-Z0-9()_ .\-].

Is there a MySQL function that allows regex-based string replacement? I am looking for something that behaves like the REPLACE() function but utilizes regular expressions.

SELECT REPLACE('example', 'ex', 're');
-- Output: "remample"

/* Does such a function exist? */
SELECT REGEX_REPLACE('Example','/[A-Za-z]/','*'); 
-- Expected: "*x*mpl*"

I am familiar with REGEXP/RLIKE, which only checks for matches and doesn’t actually perform replacements.

Alternatively, I could execute a query like SELECT id,file_name FROM my_table WHERE file_name RLIKE '[^a-zA-Z0-9()_ .\-]' from a PHP script and use preg_replace, but that feels like a clunky solution.

MySQL 8.0 has REGEXP_REPLACE which is perfect for this. Syntax is REGEXP_REPLACE(expr, pattern, replacement [, position [, occurrence [, match_type]]]). You’d use something like SELECT REGEXP_REPLACE(file_name, '[^a-zA-Z0-9()_ .\-]', '', 1, 0, 'g') FROM my_table to strip out unwanted characters. The ‘g’ flag replaces all matches, not just the first one. If you’re stuck on MySQL 5.7 or older, this won’t work - you’ll need to upgrade or stick with your PHP approach. With 500k records, expect it to be slower than basic REPLACE operations. Run it during off-peak hours or in batches if performance becomes a problem.

This topic was automatically closed 24 hours after the last reply. New replies are no longer allowed.