Sep 1, 2024

Slack Scout

Sarah Chieng (@SarahChieng) and Alex Phan

Slack Scout sends a Slack notification every time your keywords are mentioned on Twitter, Hacker News, or Reddit. Get notified whenever you, your company, or topics of interest are mentioned online.

Built with Browserbase and Val Town. Inspired by f5bot.com.

What this tutorial covers

  • Access and scrape website posts and content using Browserbase

  • Write scheduled functions and APIs with Val Town

  • Send automated Slack messages via webhooks

Getting Started

For this tutorial, you'll need:

  • Browserbase API key

  • Val Town account

  • Slack Webhook URL: create it here

Browserbase

Browserbase is a developer platform to run, manage, and monitor headless browsers at scale. We'll use Browserbase to navigate and scrape the different news sources, and its Proxies to simulate authentic user interactions across multiple browser sessions.

Sign up for free to get started!

Val Town

Val Town is a platform to write and deploy JavaScript. We'll use Val Town for three things.

  1. Create HTTP scripts that run Browserbase sessions. These Browserbase sessions will execute web automation tasks, such as navigating Hacker News and Reddit.

  2. Write Cron Functions (like Cron Jobs, but more flexible) that periodically run our HTTP scripts.

  3. Store persistent data in Val Town's built-in SQLite database. This lets us track search results, so we only send Slack notifications for new, unrecorded keyword mentions.

Sign up for free to get started!

Twitter (X)

For this tutorial, we’ll use the Twitter API to include Twitter post results.

To use the API, you'll need a Twitter developer account; the Basic tier costs $100/month.

Once you have the SLACK_WEBHOOK_URL, BROWSERBASE_API_KEY, and TWITTER_BEARER_TOKEN, input all of these as Val Town Environment Variables.
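These can then be read inside any val with Deno.env.get. A minimal sketch (the variable names match the ones above):

// Read the secrets configured as Val Town environment variables
const SLACK_WEBHOOK_URL = Deno.env.get("SLACK_WEBHOOK_URL") ?? "";
const BROWSERBASE_API_KEY = Deno.env.get("BROWSERBASE_API_KEY") ?? "";
const TWITTER_BEARER_TOKEN = Deno.env.get("TWITTER_BEARER_TOKEN") ?? "";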

Creating our APIs

We’ll use a similar method to create scripts to search and scrape Reddit, Hacker News, and Twitter. First, let’s start with Reddit.

To create a new script, go to Val Town → New → HTTP Val. Our script will take in a keyword, and return all Reddit posts from the last day that include our keyword.

For each Reddit post, we want the output to include the URL, date published, and post title.

For example:

{
  source: 'Reddit', // or 'Hacker News' or 'Twitter'
  url: 'https://www.reddit.com/r/browsers/comments/vdhge5/browserbase_launched/',
  date_published: 'Aug 30, 2024',
  title: 'Browserbase just launched'
}
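Since all three scripts return this same shape, it helps to give it a shared type. A minimal sketch of the Website type referenced later in slackScout (the finished vals may define it slightly differently):

// Shared shape of a scraped post, returned by redditSearch, hackerNewsSearch, and twitterSearch
interface Website {
  source: string; // 'Reddit', 'Hacker News', or 'Twitter'
  url: string;
  title: string;
  date_published: string;
}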

In our new redditSearch script, we start by importing Puppeteer and creating a Browserbase session with proxies enabled (enableProxy=true). Be sure to get your BROWSERBASE_API_KEY from your Browserbase settings.

import { PuppeteerDeno } from "https://deno.land/x/puppeteer@16.2.0/src/deno/Puppeteer.ts";

// Grab the API key from our Val Town environment variables
const apiKey = Deno.env.get("BROWSERBASE_API_KEY");

const puppeteer = new PuppeteerDeno({ productName: "chrome" });
const browser = await puppeteer.connect({
  browserWSEndpoint: `wss://connect.browserbase.com?apiKey=${apiKey}&enableProxy=true`,
  ignoreHTTPSErrors: true,
});
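With the browser connected, we open a page to drive (a minimal sketch; the finished val also closes the browser with browser.close() once scraping is done):

// Open a tab in the remote Browserbase session
const page = await browser.newPage();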

Next, we want to

  1. Navigate to Reddit and do a keyword search

  2. Scrape each resulting post

To navigate to a Reddit URL that already has our keyword and search time frame encoded, let’s write a helper function that encodes the query and sets search parameters for data collection.

// Build a Reddit search URL for the keyword, restricted to link posts from the past day
function constructSearchUrl(query: string): string {
  const encodedQuery = encodeURIComponent(query).replace(/%20/g, "+");
  return `https://www.reddit.com/search/?q=${encodedQuery}&type=link&t=day`;
}

const url = constructSearchUrl(query);
await page.goto(url, { waitUntil: "networkidle0" });

Once we’ve navigated to the constructed URL, we can scrape each search result. For each post, we select the title, date_published, and url.

const results = await page.evaluate(() => {
  // Each search result card is a "search-post-unit" div
  const posts = document.querySelectorAll('div[data-testid="search-post-unit"]');
  return Array.from(posts).map(post => {
    const titleElement = post.querySelector('a[id^="search-post-title"]');
    const timeElement = post.querySelector("faceplate-timeago");
    return {
      source: "Reddit",
      title: titleElement?.textContent?.trim() || "",
      url: titleElement?.href || "",
      date_published: timeElement?.textContent?.trim() || "",
    };
  });
});

// Example result
{
  source: 'Reddit',
  url: 'https://www.reddit.com/r/browsers/comments/vdhge5/browserbase_launched/',
  date_published: '1 day ago',
  title: 'Browserbase just launched'
}

You'll notice that Reddit returns date_published in the format '1 day ago' instead of 'Aug 29, 2024.' To make date handling consistent, we create a reusable helper script, convertRelativeDateToString, that converts relative dates to a uniform format. We import this at the top of our redditSearch script.

import { convertRelativeDateToString } from "https://esm.town/v/sarahxc/convertRelativeDateToString";
const date_published = await convertRelativeDateToString({ relativeDate: post.date_published });
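If you're curious what that helper does, here's a minimal sketch of the idea (the published val's implementation may differ): parse the relative string, subtract it from the current time, and format the result.

// Minimal sketch: turn strings like "1 day ago" or "3 hours ago" into a date like "Aug 29, 2024"
export async function convertRelativeDateToString(
  { relativeDate }: { relativeDate: string },
): Promise<string> {
  const match = relativeDate.match(/(\d+)\s+(minute|hour|day|week|month|year)s?/i);
  const now = new Date();
  if (match) {
    const amount = parseInt(match[1], 10);
    switch (match[2].toLowerCase()) {
      case "minute": now.setMinutes(now.getMinutes() - amount); break;
      case "hour": now.setHours(now.getHours() - amount); break;
      case "day": now.setDate(now.getDate() - amount); break;
      case "week": now.setDate(now.getDate() - amount * 7); break;
      case "month": now.setMonth(now.getMonth() - amount); break;
      case "year": now.setFullYear(now.getFullYear() - amount); break;
    }
  }
  return now.toLocaleDateString("en-US", { year: "numeric", month: "short", day: "numeric" });
}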

You can see the finished redditSearch code here.

We follow a similar process to create hackerNewsSearch, and use the Twitter API to create twitterSearch.

See all three scripts here:

Reddit → redditSearch

Hacker News → hackerNewsSearch

Twitter → twitterSearch
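Since Twitter is queried through its API rather than scraped, twitterSearch boils down to a recent-search request with the bearer token. A minimal sketch (the query parameters and response handling in the finished val may differ):

// Minimal sketch: search recent tweets for a keyword via the Twitter v2 API
async function searchRecentTweets(query: string, bearerToken: string) {
  const params = new URLSearchParams({
    query,
    max_results: "10",
    "tweet.fields": "created_at",
  });
  const response = await fetch(`https://api.twitter.com/2/tweets/search/recent?${params}`, {
    headers: { Authorization: `Bearer ${bearerToken}` },
  });
  if (!response.ok) {
    throw new Error(`Twitter API error: ${response.status} ${response.statusText}`);
  }
  const { data = [] } = await response.json();
  return data.map((tweet: { id: string; text: string; created_at: string }) => ({
    source: "Twitter",
    url: `https://twitter.com/i/web/status/${tweet.id}`,
    title: tweet.text,
    date_published: tweet.created_at,
  }));
}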

Creating the Cron Function

For our last step, we create a slackScout cron job that runs every hour and calls redditSearch, hackerNewsSearch, and twitterSearch. To create the cron file, go to Val Town → New → Cron Val.

In our new slackScout file, let’s import our HTTP scripts.

import { hackerNewsSearch } from "https://esm.town/v/alexdphan/hackerNewsSearch";
import { twitterSearch } from "https://esm.town/v/alexdphan/twitterSearch";
import { redditSearch } from "https://esm.town/v/sarahxc/redditSearch";

And create helper functions that call our Reddit, Hacker News, and Twitter HTTP scripts.

// Fetch Reddit, Hacker News, and Twitter results
async function fetchRedditResults(topic: string): Promise<Website[]> {
  return redditSearch({ query: topic });
}
  
async function fetchHackerNewsResults(topic: string): Promise<Website[]> {
  return hackerNewsSearch({
    query: topic,
    pages: 2,
    apiKey: Deno.env.get("BROWSERBASE_API_KEY") ?? "",
  });
}

async function fetchTwitterResults(topic: string): Promise<Website[]> {
  return twitterSearch({
    query: topic,
    maxResults: 10,
    daysBack: 1,
    apiKey: Deno.env.get("TWITTER_BEARER_TOKEN") ?? "",
  });
}

Next, to store our website results, let's set up Val Town's SQLite database. To do this, we import SQLite and write three helper functions.

  1. createTable: creates the new SQLite table

  2. isURLInTable: for each new website returned, checks if the website is already in our table

  3. addWebsiteToTable: if isURLInTable returns false, we add the new website to our table

const { sqlite } = await import("https://esm.town/v/std/sqlite");
const TABLE_NAME = "slack_scout_browserbase";

// Create an SQLite table
async function createTable(): Promise<void> {
  await sqlite.execute(`
    CREATE TABLE IF NOT EXISTS ${TABLE_NAME} (
      source TEXT NOT NULL,
      url TEXT PRIMARY KEY,
      title TEXT NOT NULL,
      date_published TEXT NOT NULL
    )
  `);
}
  
async function isURLInTable(url: string): Promise<boolean> {
  const result = await sqlite.execute({
    sql: `SELECT 1 FROM ${TABLE_NAME} WHERE url = :url LIMIT 1`,
    args: { url },
  });
  return result.rows.length > 0;
}

async function addWebsiteToTable(website: Website): Promise<void> {
  await sqlite.execute({
    sql: `INSERT INTO ${TABLE_NAME} (source, url, title, date_published) 
          VALUES (:source, :url, :title, :date_published)`,
    args: website,
  });
}

Finally, we write a function to send a Slack notification for each new website.

async function sendSlackMessage(message: string): Promise<Response> {
  const slackWebhookUrl = Deno.env.get("SLACK_WEBHOOK_URL");
  if (!slackWebhookUrl) {
    throw new Error("SLACK_WEBHOOK_URL environment variable is not set");
  }

  const response = await fetch(slackWebhookUrl, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      blocks: [
        {
          type: "section",
          text: { type: "mrkdwn", text: message },
        },
      ],
    }),
  });

  if (!response.ok) {
    throw new Error(`Slack API error: ${response.status} ${response.statusText}`);
  }

  return response;
}
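The main function below also relies on a processResults helper that ties these pieces together. A minimal sketch based on the helpers above (the finished slackScout may differ):

// For each result: skip anything we've already recorded, otherwise store it and notify Slack
async function processResults(results: Website[]): Promise<void> {
  for (const website of results) {
    if (await isURLInTable(website.url)) continue;
    await addWebsiteToTable(website);
    await sendSlackMessage(
      `New mention on ${website.source}: *${website.title}*\n${website.url} (${website.date_published})`,
    );
  }
}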

The main function initiates our workflow, calling helper functions to fetch and process data from multiple sources.


export default async function(interval: Interval): Promise<void> {
  try {
    await createTable();
    for (const topic of KEYWORDS) {
      const results = await Promise.allSettled([
        fetchHackerNewsResults(topic),
        fetchTwitterResults(topic),
        fetchRedditResults(topic),
      ]);

      const validResults = results
        .filter((result): result is PromiseFulfilledResult<Website[]> => result.status === "fulfilled")
        .flatMap(result => result.value);

      await processResults(validResults);
    }
    console.log("Cron job completed successfully.");
  } catch (error) {
    console.error("An error occurred during the cron job:", error);
  }
}
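KEYWORDS is simply the list of terms you want to monitor, defined near the top of slackScout. For example (placeholder values):

// Keywords to watch for; replace with your own terms
const KEYWORDS = ["browserbase", "val town", "headless browser"];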

And we're done! You can see the final slackScout here.

Optionally, you can use Browserbase and Val Town to create additional HTTP scripts that monitor other websites like Substack, Medium, WSJ, etc. Browserbase has a list of Vals you can use to get started on your own projects. If you have any questions, concerns, or feedback, please let us know :)

support@browserbase.com

