How to validate HTTP status codes for website URLs in Airtable records

I’m working with an Airtable base that contains approximately 24,000 website URL records. Many of these URLs have formatting issues like missing slashes or extra spaces that cause broken links. I need to identify which URLs are problematic so I can fix them manually.

My current approach

I’ve been using a fetch-based script to test each URL and check its status:

const config = input.config();
const websiteUrl = config.websiteUrl;
let responseStatus;

try {
    const result = await fetch(websiteUrl);
    responseStatus = result.status;
} catch (err) {
    responseStatus = 'failed';
}

output.set('responseStatus', responseStatus);

Problems I’m facing

  1. Redirects aren’t being handled properly - the script returns “failed” even when the URL works after a redirect
  2. I only get “200” for working URLs or “failed” for broken ones. I’d prefer to see the actual HTTP status codes (like 404, 500, etc.) to better understand what’s wrong

Any suggestions on how to improve this approach would be really helpful!

try adding redirect: 'manual' in your fetch options so you can handle redirects yourself, then check the response's Location header to see where it points. also remember to separate network errors from HTTP errors - fetch only throws for the former. for non-success status codes, check whether result.ok is false before calling it a fail.
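
rough sketch if you go that route (reusing websiteUrl from your script) - heads up that in a browser-style fetch a manual redirect can come back as an opaque response with status 0 and no readable headers, so test it on a url you know redirects:

const probe = await fetch(websiteUrl, { redirect: 'manual' });

if (probe.status >= 300 && probe.status < 400) {
    // where the redirect points, if the environment exposes response headers
    console.log(`redirects to: ${probe.headers.get('location')}`);
} else if (!probe.ok) {
    // fetch doesn't throw for HTTP errors, so check ok/status yourself
    console.log(`http error: ${probe.status}`);
}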

I’ve hit the same URL validation issues before. Your error handling needs work - don’t lump everything into ‘failed’. Check result.status properly since 404s and 500s actually tell you something useful. Only catch actual network failures like CORS or DNS problems in your try-catch block. With 24,000 records, you’ll want delays between requests or you’ll get rate limited. Many sites will block you for hammering them too fast.
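
For the delays, a tiny sleep helper between requests is usually enough - rough sketch, assuming you're iterating over the URLs yourself (urls below is a placeholder for however you pull the URL field values out of your table):

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

for (const url of urls) {
    // ...fetch and record the status for url here...
    await sleep(250); // ~4 requests/second; tune to what the target sites tolerate
}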

also worth mentioning - send a sensible User-Agent header or some sites will block you outright. check result.redirected to see if the url redirected, then grab result.url for the final destination. super helpful when you’re fixing broken links later.
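
quick sketch of both (the user-agent string is just an example, and depending on where the script runs the environment may treat User-Agent as a forbidden header and override it):

const result = await fetch(websiteUrl, {
    headers: {
        // example UA only - some servers reject requests with no or default user agents
        'User-Agent': 'Mozilla/5.0 (compatible; LinkChecker/1.0)'
    }
});

if (result.redirected) {
    // result.url is the final destination after redirects
    console.log(`${websiteUrl} now lives at ${result.url}`);
}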

For large scale URL validation, I split error handling into specific categories instead of using one catch-all block.

Here’s my approach:

const config = input.config();
const websiteUrl = config.websiteUrl;
let responseStatus;

// fetch has no built-in timeout option, so use an AbortController
const controller = new AbortController();
const timer = setTimeout(() => controller.abort(), 10000);

try {
    const result = await fetch(websiteUrl, {
        method: 'HEAD', // faster than GET (a few servers reject HEAD with 405)
        redirect: 'follow', // explicitly follow redirects (also the default)
        signal: controller.signal
    });
    
    responseStatus = result.status;
    
    // Log redirect info if needed
    if (result.redirected) {
        console.log(`Redirected from ${websiteUrl} to ${result.url}`);
    }
    
} catch (err) {
    // Network errors only
    if (err.name === 'TypeError') {
        responseStatus = 'network_error';
    } else if (err.name === 'AbortError') {
        responseStatus = 'timeout';
    } else {
        responseStatus = 'unknown_error';
    }
} finally {
    clearTimeout(timer);
}

output.set('responseStatus', responseStatus);

Key changes: HEAD requests are faster than GET, an explicit timeout via AbortController (fetch doesn’t take a timeout option directly), and proper error categorization.

With 24k records, batch them in chunks of 50-100 and add delays between batches. Most servers will throttle you otherwise.
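
Rough sketch of that batching, assuming a scripting-extension style loop over the whole table (urls and checkUrl below are placeholders for your own list and whichever fetch logic you settle on):

const BATCH_SIZE = 50;
const BATCH_DELAY_MS = 2000;

for (let i = 0; i < urls.length; i += BATCH_SIZE) {
    const batch = urls.slice(i, i + BATCH_SIZE);

    // run one batch concurrently, then pause before starting the next
    await Promise.all(batch.map((url) => checkUrl(url)));
    await new Promise((resolve) => setTimeout(resolve, BATCH_DELAY_MS));
}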

Your fetch is missing proper error handling - you’re treating network timeouts, DNS failures, and HTTP errors all the same as ‘failed’. Don’t catch everything the same way. When fetch succeeds but returns non-200 status, you can still grab result.status to see if it’s a 404, 500, etc. Only mark true network failures as ‘failed’. Fetch handles 301/302 redirects automatically unless you override it. Add a timeout option too - some URLs will hang forever and slow down your dataset processing.
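
If it helps, a small helper like this (the names are just illustrative) turns the raw code into something readable in your status field:

// Map an HTTP status code to a label for the Airtable field
function classifyStatus(status) {
    if (status >= 200 && status < 300) return 'ok';
    if (status >= 300 && status < 400) return `redirect_${status}`;
    if (status >= 400 && status < 500) return `client_error_${status}`; // e.g. 404
    return `server_error_${status}`; // e.g. 500
}

// e.g. output.set('responseStatus', classifyStatus(result.status));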