I’m working with an Airtable base that contains approximately 24,000 website URLs. Many of these URLs have formatting issues like missing slashes or extra spaces that cause them to break. I need to identify which URLs are problematic so I can fix them manually.
My current approach
I’ve been using a fetch-based script to test each URL and check its status:
const config = input.config();
const websiteUrl = config.websiteUrl;
let responseStatus;
try {
  const result = await fetch(websiteUrl);
  responseStatus = result.status;
} catch (err) {
  responseStatus = 'failed';
}
output.set('responseStatus', responseStatus);
Problems I’m facing
- My script doesn’t handle redirects properly - it returns ‘failed’ even when the URL works but redirects to another page
- I only get either ‘200’ for working URLs or ‘failed’ for broken ones. I want to see the actual HTTP status codes like 404, 301, 500, etc.
How can I modify this script to handle redirects correctly and capture specific error codes? Any suggestions would be really helpful!
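Two things are going on in your script: fetch follows redirects by default, and your catch block lumps every failure together. When a URL redirects, fetch reports the status of the final destination, not the 301 or 302 along the way. Anything that stops a response from arriving at all, such as a network error or a CORS block, throws and ends up in your catch block as ‘failed’.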
To improve this, check response.ok first and then inspect the specific status code. Remember, fetch will not throw exceptions for HTTP errors like 404 or 500; it only throws for network failures.
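Here is a rough sketch of that separation. The names `checkUrl` and `fetchFn` are made up for illustration; `fetchFn` only exists so the function can be tried with a stand-in instead of the real fetch.

```javascript
// Sketch: report the real HTTP status, and reserve a separate label
// for cases where no response arrived at all.
// `checkUrl` and `fetchFn` are hypothetical names, not part of any API.
async function checkUrl(url, fetchFn = globalThis.fetch) {
  try {
    const response = await fetchFn(url);
    // fetch resolves for any HTTP response, including 404 and 500,
    // so the status is available here without an exception.
    return { ok: response.ok, status: response.status };
  } catch (err) {
    // Reached only when no HTTP response came back
    // (DNS failure, refused connection, malformed URL, and so on).
    return { ok: false, status: 'network-error: ' + err.message };
  }
}
```

In your script this would presumably be called as `const result = await checkUrl(config.websiteUrl);` followed by `output.set('responseStatus', result.status);`.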
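Last year, I faced a similar issue validating URLs in a CMS, and adding a timeout was extremely beneficial. Standard fetch has no timeout option of its own, so this is usually done by aborting the request with an AbortController. Additionally, logging response.url can help identify redirects, as some URLs that appear broken may actually redirect through multiple hops. Lastly, consider using HEAD requests instead of GET, as you only need the status codes, which can save bandwidth. Be aware that some servers answer HEAD differently from GET (for example with 405), so treat an unexpected HEAD result as a reason to retry with GET.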
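A minimal sketch of the timeout plus HEAD idea, assuming the runtime provides `AbortController` (I have not verified this for Airtable). `headWithTimeout`, `timeoutMs`, and `fetchFn` are illustrative names.

```javascript
// Sketch: HEAD request that gives up after `timeoutMs` milliseconds.
// Assumes AbortController and setTimeout are available in the runtime.
async function headWithTimeout(url, timeoutMs = 10000, fetchFn = globalThis.fetch) {
  const controller = new AbortController();
  // Abort the request if it has not finished in time.
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const response = await fetchFn(url, { method: 'HEAD', signal: controller.signal });
    // response.url is the URL after any redirects were followed.
    return { status: response.status, finalUrl: response.url };
  } catch (err) {
    // An aborted fetch rejects with an error whose name is 'AbortError'.
    const status = err.name === 'AbortError' ? 'timeout' : 'network-error';
    return { status, finalUrl: null };
  } finally {
    clearTimeout(timer);
  }
}
```

If a server returns 405 or something else odd for HEAD, retrying that one URL with a plain GET is a reasonable fallback.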
The problem is how fetch handles different response types. Network failures (DNS issues, timeouts) throw exceptions that your catch block grabs. But HTTP errors like 404 or 500 don’t throw exceptions - they return normal Response objects with that status code.
I hit this same issue auditing legacy URLs during a client migration. You need to separate network errors from HTTP errors. Keep a try-catch for the network failures, but record the actual status from the Response object for everything else instead of collapsing it all into one value. Also, fetch automatically follows redirects by default (up to 20, the limit in the Fetch standard), so you’re seeing the final destination’s status, not the original redirect.
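If you want the 301 or 302 itself rather than the final status, the standard fetch option `redirect: 'manual'` stops fetch from following the redirect. What you get back depends on the runtime: browser-style environments return an opaque response with the status hidden, while server-side implementations such as Node's expose the real 3xx code. I don't know which way Airtable behaves, so the sketch below handles both. `checkWithoutFollowing` and `fetchFn` are made-up names.

```javascript
// Sketch: ask fetch not to follow redirects, then report what is visible.
// Network failures still reject, so the caller keeps its try-catch.
async function checkWithoutFollowing(url, fetchFn = globalThis.fetch) {
  const response = await fetchFn(url, { redirect: 'manual' });
  if (response.type === 'opaqueredirect') {
    // Browser-style runtimes hide the redirect status (reported as 0)
    // and the Location header.
    return { status: 'redirect (code hidden by runtime)', location: null };
  }
  const isRedirect = response.status >= 300 && response.status < 400;
  return {
    status: response.status,
    // The Location header names the redirect target when it is exposed.
    location: isRedirect ? response.headers.get('location') : null,
  };
}
```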
For debugging, add response.redirected and response.url to your output. This shows when redirects happen and what the final URL is. With 24,000 URLs to check, proper error categorization will save you tons of manual work.
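As a sketch of what that output could look like, the helper below turns a Response into a small summary with a rough category. The function name `summarizeResponse` and the category labels are my own invention; `status`, `redirected`, and `url` are standard Response properties.

```javascript
// Sketch: summarize a fetch Response for later triage.
// `summarizeResponse` and the category labels are hypothetical.
function summarizeResponse(requestedUrl, response) {
  const status = response.status;
  let category;
  if (status >= 200 && status < 300) {
    // A 2xx reached via a redirect is worth flagging separately,
    // since the stored URL may need updating.
    category = response.redirected ? 'ok-after-redirect' : 'ok';
  } else if (status >= 400 && status < 500) {
    category = 'client-error';
  } else if (status >= 500) {
    category = 'server-error';
  } else {
    category = 'other';
  }
  return {
    requestedUrl,
    status,
    redirected: response.redirected,
    finalUrl: response.url, // URL after redirects were followed
    category,
  };
}
```

Each field can then go to its own output, for example `output.set('finalUrl', summary.finalUrl)`, so the base can be filtered by category instead of checking rows one by one.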