Automatically Tagging Promotions and Deals with AI and Webhooks¶
This guide shows how to configure Sosse to automatically generate ⭐ Tags for web pages — such as promotions, discounts, or other valuable offers — using AI. It uses 📡 Webhooks and ChatGPT (or other AI services) to analyze page content and produce relevant tags. For demonstration purposes, this guide uses Spree, a popular open-source eCommerce platform.
Note
While this demo uses OpenAI’s ChatGPT API, other AI providers such as Claude (Anthropic), PaLM (Google), or Cohere can be used. Webhooks are fully customizable.
Warning
To use ChatGPT, you’ll need an OpenAI API key.
Set Up a Collection¶
First, define a Collection to handle the web pages you want to analyze. See the ⚡ Collections for more details.
Navigate to
⚡ Collectionsin the admin panel.Create a Collection:
Set the
Unlimited depth URL regexto match URLs of interest, ensuring that only relevant pages are processed for tag generation:^https://(www\.)?(example-shop|forum|blog)\.com/.*/(promo|deal|product).*
Under the
🌍 Browsertab, chooseFirefox, orChromiumas the browser.In the
Scriptfield, add a script to extract the HTML description of the product and save it in the document’smetadatafield. For example, for a Spree product page, you might use:const detailsText = document.getElementById('product-details-page'); const text = detailsText.innerHTML; return {metadata: {productDetails: text}};
Under the
🕑 Recurrence, select wanted crawl frequency (e.g.,Daily).
Note
While sending plain text to ChatGPT is more efficient, we opt to send the full HTML in this case. This allows ChatGPT to interpret visual elements, such as a strikethrough price, ensuring better contextual understanding of the data.
Page Crawling and Webhook Results¶
Navigate to the Crawl a new URL page and paste the product pages you want to index.
Click Add to Crawl Queue to queue the crawl jobs.
After the crawl jobs are completed, review the results on the 🔤 Documents page.
Access the full webhook response under the
📡 Webhookstab.
View the metadata generated by the webhook under the
📊 Metadatatab, which includes details like the product name and price.