So here is an interesting one.
I would like to get relevant image URLs (GIF, JPEG, WEBP, etc.) from a number of websites. These websites will be different every time, so I cannot build a site-specific scraper.
This is the automation I tried:
- Get Page as HTML
- Prompt the HTML with OpenAI using this prompt:
Review the HTML code of a concert website and identify the highest-resolution image that most likely features musicians, artists, or bands. Prioritize images with filenames, URLs, alt text, or surrounding metadata containing keywords like ‘concert,’ ‘musician,’ ‘artist,’ ‘band,’ ‘live,’ ‘performance,’ or the names of jazz musicians, artists, or bands. Ensure the image has the largest pixel dimensions or file size and is in a preferred format such as JPEG, PNG, or WEBP. Also, give priority to images with artist names, concert dates, or other relevant context in the surrounding text. Exclude decorative images such as logos, icons, or any images not relevant to the concert content. If multiple images have equal resolution, select the first occurrence with the most relevant metadata. Extract and output only the direct URL of the single highest-resolution image, with no additional text or metadata. If no image is found meeting these criteria, output: no image found.
This doesn't work because the raw HTML is usually too long to fit in the prompt (a rough sketch of the equivalent code is below).
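For clarity, this is roughly what the scenario boils down to in code. This is only a minimal sketch to illustrate the flow, not my actual modules; the requests/openai calls and the model name are assumptions.

```python
# Minimal sketch of the attempted flow (not the actual automation modules).
import requests
from openai import OpenAI

# The full prompt quoted above goes here.
PROMPT = "Review the HTML code of a concert website and identify ..."

def find_concert_image(url: str) -> str:
    # Step 1: "Get Page as HTML"
    html = requests.get(url, timeout=30).text

    # Step 2: send the whole HTML to OpenAI -- this is where it breaks,
    # because the raw HTML of most pages is far longer than the prompt allows.
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system", "content": PROMPT},
            {"role": "user", "content": html},
        ],
    )
    return response.choices[0].message.content.strip()

print(find_concert_image("https://example.com/some-concert-page"))
```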
Any thoughts or other approaches that may work?