I’m working on a REST web service for file downloads using Java. The issue I’m facing is that when users download files through their browser, Spanish accented characters get removed from the filename.
@GET
@Path("/getfile")
@Produces(MediaType.APPLICATION_OCTET_STREAM)
public Response retrieveFile(@QueryParam("fileId") String fileId) throws IOException {
    try {
        DocumentStream result = DocumentService.fetchDocument(Integer.parseInt(fileId));
        ResponseBuilder builder = Response.ok((Object) result.getFileStream());
        System.out.println("Original name: " + result.getDocumentData().getFileName());
        System.out.println("Encoded name: " + DocumentModel.convertToISO(result.getDocumentData().getFileName()));
        builder.header("Content-Disposition", "attachment;filename=" + result.getDocumentData().getFileName());
        return builder.build();
    } catch (Exception ex) {
        return Response.status(500).entity("File download failed").build();
    }
}
Here’s my encoding utility:
public static String convertToISO(String text) {
    return new String(text.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
}
The database returns the correct filename: documentó año 2021.xlsx
But when I apply encoding: DocumentModel.convertToISO(...) I get: document� a�o 2021.xlsx
In the browser download, both approaches result in: document a o 2021.xlsx
I’ve tested different charset configurations including UTF-8 and ISO-8859-1 but the accented characters still disappear. What configuration am I missing to preserve Spanish characters in downloaded filenames?
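Your convertToISO method corrupts the filename: getBytes(StandardCharsets.ISO_8859_1) produces single bytes such as 0xF3 for ó and 0xF1 for ñ, and those are not valid UTF-8 sequences, so decoding them as UTF-8 yields the replacement character �. Separately, the Content-Disposition header cannot carry raw non-ASCII characters reliably; it needs the RFC 5987 filename* encoding.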
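To make the corruption concrete, here is a small standalone sketch that reuses the same conversion logic as the question. The class name CharsetRoundTripDemo is made up for illustration, and the accented letters are written as Unicode escapes so the result does not depend on the source file encoding.

```java
import java.nio.charset.StandardCharsets;

final class CharsetRoundTripDemo {
    // Same logic as convertToISO in the question: encode as ISO-8859-1,
    // then (wrongly) decode those bytes as UTF-8.
    static String convertToISO(String text) {
        return new String(text.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "documentó año 2021.xlsx" with the accented letters escaped
        String original = "document\u00F3 a\u00F1o 2021.xlsx";
        String mangled = convertToISO(original);

        // The lone bytes 0xF3 and 0xF1 are malformed UTF-8, so the decoder
        // substitutes U+FFFD (the replacement character) for each of them.
        System.out.println("contains U+FFFD: " + mangled.contains("\uFFFD"));
        System.out.println("still has accents: " + mangled.contains("\u00F3"));
    }
}
```

A Java String is already Unicode, so no charset conversion is needed before building the header value; only the header encoding itself matters.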
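Use the RFC 5987 filename* parameter (the HTTP profile of RFC 2231 encoding) instead:
// URLEncoder turns spaces into "+", which filename* does not decode as a space, so replace it with %20.
// The (String, Charset) overload of URLEncoder.encode requires Java 10 or later.
builder.header("Content-Disposition", "attachment; filename*=UTF-8''" + URLEncoder.encode(result.getDocumentData().getFileName(), StandardCharsets.UTF_8).replace("+", "%20"));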
This tells the browser the filename is UTF-8 encoded. I had the same problem with Portuguese characters and this fixed it completely. Just remove that convertToISO method entirely - it’s what’s causing the corruption in your logs.
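For anyone who prefers not to lean on URLEncoder (which targets HTML form encoding), the percent-encoding can be done by hand. The following is only a sketch: the class Rfc5987 and its method names are invented for this example, and the set of unescaped characters is the attr-char set from RFC 5987.

```java
import java.nio.charset.StandardCharsets;

final class Rfc5987 {
    // attr-char from RFC 5987: characters that may appear unescaped in filename*
    private static final String ATTR_CHARS =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789!#$&+-.^_`|~";
    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    // Percent-encodes every UTF-8 byte that is not an attr-char.
    static String encode(String value) {
        StringBuilder out = new StringBuilder();
        for (byte b : value.getBytes(StandardCharsets.UTF_8)) {
            int unsigned = b & 0xFF;
            if (unsigned < 0x80 && ATTR_CHARS.indexOf(unsigned) >= 0) {
                out.append((char) unsigned);
            } else {
                out.append('%').append(HEX[unsigned >> 4]).append(HEX[unsigned & 0x0F]);
            }
        }
        return out.toString();
    }

    // Builds a header value that carries only the filename* parameter.
    static String contentDisposition(String fileName) {
        return "attachment; filename*=UTF-8''" + encode(fileName);
    }

    public static void main(String[] args) {
        System.out.println(contentDisposition("document\u00F3 a\u00F1o 2021.xlsx"));
    }
}
```

With the builder from the question, this would be used as builder.header("Content-Disposition", Rfc5987.contentDisposition(result.getDocumentData().getFileName())).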
Your charset handling is the problem. For Content-Disposition headers with international characters, you need both a fallback filename and RFC 5987 encoding to work across all browsers. I ran into the same issue with German umlauts in my enterprise app. The fix is using two filename parameters - an ASCII fallback and a proper UTF-8 version:
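String originalName = result.getDocumentData().getFileName();
// ASCII-only fallback for clients that ignore filename*
String asciiName = originalName.replaceAll("[^\\x00-\\x7F]", "");
// URLEncoder writes spaces as "+", which filename* does not decode as a space, so use %20
String encodedName = URLEncoder.encode(originalName, StandardCharsets.UTF_8).replace("+", "%20");
builder.header("Content-Disposition",
    "attachment; filename=\"" + asciiName + "\"; filename*=UTF-8''" + encodedName);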
This gives you backward compatibility and makes sure modern browsers show accented characters correctly. Ditch your convertToISO method completely - it encodes the name as ISO-8859-1 and then decodes those bytes as UTF-8, which turns every accented character into �.
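A variation on the two-parameter approach, as a rough sketch: instead of deleting accented letters from the fallback, it strips only the accents via Unicode normalization, so the ASCII name reads "documento ano" rather than "document ao". That transliteration step is a swapped-in design choice, not part of the answer above, and the class DownloadHeaders and its method names are made up for this example. It assumes Java 10 or later for URLEncoder.encode(String, Charset).

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.text.Normalizer;

final class DownloadHeaders {
    // ASCII fallback: decompose accented letters, drop the combining marks,
    // and replace anything still unsafe inside a quoted string with "_".
    static String asciiFallback(String name) {
        String decomposed = Normalizer.normalize(name, Normalizer.Form.NFD);
        return decomposed
                .replaceAll("\\p{M}", "")            // remove combining accents
                .replaceAll("[^\\x20-\\x7E]", "_")   // remaining non-ASCII and control chars
                .replaceAll("[\"\\\\]", "_");        // double quote and backslash
    }

    // Percent-encoded UTF-8 value for filename*.
    static String extValue(String name) {
        return URLEncoder.encode(name, StandardCharsets.UTF_8)
                .replace("+", "%20")   // URLEncoder writes spaces as "+"
                .replace("*", "%2A");  // URLEncoder leaves "*" as is, but RFC 5987 requires escaping it
    }

    static String contentDisposition(String name) {
        return "attachment; filename=\"" + asciiFallback(name)
                + "\"; filename*=UTF-8''" + extValue(name);
    }

    public static void main(String[] args) {
        System.out.println(contentDisposition("document\u00F3 a\u00F1o 2021.xlsx"));
    }
}
```

In the resource method this would replace the manual string concatenation: builder.header("Content-Disposition", DownloadHeaders.contentDisposition(originalName)).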
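Had the same issue last month with French filenames. You’re mixing ISO and UTF-8 the wrong way. Try URLEncoder.encode(filename, "UTF-8").replace("+", "%20") and set your header like "attachment; filename=\"" + encodedName + "\"" - worked perfectly for château.pdf and other accented files. Note that this relies on the browser percent-decoding the plain filename parameter, which is not standardized behavior, so the filename* form is the safer choice.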