I’m new to Zapier and am using it with a webhook for my Discord channel to pull feeds from a particular website via RSS. The problem is that the data I receive is raw HTML. My goal is to extract the source URL from the <img> element, which sits inside the <td> elements of a <table> structure. I would like to understand how to properly scrape the contents of those <td> elements.
Can anyone give me some advice on how to approach this?
If you’re handling HTML content in Zapier, a good approach is to add a custom code step to your Zap, especially when the data sits in a nested structure like a <table>. Add a “Run JavaScript” or “Run Python” step after the RSS trigger and map the raw HTML field into it as input. One caveat: as far as I know, Zapier’s code steps can’t install third-party packages, so a library like ‘cheerio’ is unlikely to be available there. In Python you can use the built-in html.parser module instead (or a regular expression for simple cases) to pull the src attribute of each <img> inside the <td> elements. Return the extracted URLs from the code step so that later steps, such as the Discord webhook, can use them. This approach requires a basic understanding of coding, but it works well for complex data extraction tasks in Zapier.
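As a rough sketch of the custom code approach, here is a Python version that relies only on the standard library's html.parser, on the assumption that third-party packages can't be installed in the code step. The names TdImageCollector and extract_img_srcs, and the "html" input field mentioned in the comments, are made up for illustration; adjust them to match whatever you map into the step.

```python
from html.parser import HTMLParser


class TdImageCollector(HTMLParser):
    """Collect the src of every <img> that appears inside a <td>."""

    def __init__(self):
        super().__init__()
        self.td_depth = 0    # greater than 0 while inside at least one <td>
        self.sources = []    # image URLs found so far, in document order

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self.td_depth += 1
        elif tag == "img" and self.td_depth > 0:
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)

    def handle_endtag(self, tag):
        if tag == "td" and self.td_depth > 0:
            self.td_depth -= 1


def extract_img_srcs(html):
    """Return a list of <img> src values found inside <td> elements."""
    parser = TdImageCollector()
    parser.feed(html)
    parser.close()
    return parser.sources


# Inside a "Run Python" step you would presumably call it like this
# (assumption: the raw HTML was mapped to an input field named "html"):
#
#   urls = extract_img_srcs(input_data["html"])
#   output = {"image_urls": ",".join(urls)}
```

Images outside a table cell are ignored, so a feed item with a banner image above the table would not pollute the result. If each feed item only ever contains one image, returning just the first URL may be simpler for the Discord step to consume.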