Integrations

Overview

We provide integrations with a variety of platforms to enable analytics and bot forwarding to your TollBit subdomain.

Integrating Analytics with TollBit

We provide a way for you to forward logs to our platform so that we can provide analytics on bot traffic and more. This is a great way to consolidate your logs and gain fast insights into what your bot traffic looks like. It is essential that the logs forwarded to us are server-side logs from your CDN/edge, because client-side JavaScript plugins are only triggered if the JavaScript is run, which most bots do not do. Server-side logs ensure that we have the cleanest view of the traffic hitting your site.

Setting up Bot Paywall with TollBit

Once TollBit is set up for your website, you can configure bot deterrence on your existing cloud cybersecurity platform to forward known bot traffic to your new TollBit subdomain.

At a high level, you are modifying your existing bot-blocking solution so that, instead of returning an error response when it detects a bad bot, it forwards that traffic to us through your TollBit subdomain.

The example solutions we provide here assume that you do not currently have bot detection and blocking in place. It should be straightforward to adapt these examples so that your current blocking solution forwards detected bots to your TollBit subdomain instead. Forwarded bots will see a message like the following:

{
  "message": "You are not authorized to access this content without a valid TollBit Token. Please follow this URL to find out more.",
  "url": "https://tollbit.com"
}

Next Steps

Please visit the tabbed layout on the left to directly view the steps to integrate TollBit analytics and bot paywall with your chosen partner.

Fastly

Follow these steps to set up an integration into our platform if you use Fastly.

Get Service ID and API Key from Fastly

Go to your Fastly Dashboard and pick the domain associated with your property.

Right under your service name, you’ll see an alphanumeric string. It should be the same alphanumeric string that completes the URL string for the page. See the highlight below for reference.

Next, hover over Account on the main navigation bar on the left and choose API tokens > personal tokens.

Setup Integration in TollBit

Go to your TollBit dashboard and pick the Integrations tab in the main navigation menu. Input your Fastly API key and service ID in the form and click Save.

Enable Analytics

Ensure that you have saved your Fastly API key and service ID in the integration settings. Once that is saved, within the same page, click on “Enable” next to the Analytics section.

Enable Bot Paywall

Ensure that you have saved your Fastly API key and service ID in the integration settings. Once that is saved, within the same page, toggle on “Block” for each agent you would like to forward to your TollBit subdomain.

Scrolling further down on the page allows you to “Block” all bots, which forwards every bot listed on the page to your TollBit subdomain.

Note: If you have used our legacy integration of Fastly (using VCL scripts), you should automatically see the updates transition into the new UI.

Fastly (legacy)

This is the documentation for the legacy Fastly integration that involves implementing VCL scripts to enable TollBit analytics and bot paywall. VCL scripts can allow for additional customizations for implementing analytics and bot forwarding. Please reach out to team@tollbit.com if you'd like to discuss this implementation route considering your use case.

Create a new Logging Configuration

Go to your Fastly Dashboard and pick the correct domain. Click “Edit Configuration”, and clone your current configuration. This saves a new configuration version as a draft and allows you to roll back if necessary. This should bring you to a new screen. On the sidebar, scroll down until you see Logging and click on that. Then, click “Create Endpoint”.

Configure your logs to be sent to our logging endpoint

Find the HTTP logging endpoint and click “Create endpoint”. You can set the name to anything descriptive (e.g. tollbit-prod). Keep the placement option as the default selection. Make sure your log format is exactly as follows, without extra trailing spaces or newlines:

{ "timestamp": "%{strftime(\{"%Y-%m-%dT%H:%M:%S%z"\}, time.start)}V", "geo_country": "%{client.geo.country_name}V", "geo_city": "%{client.geo.city}V", "geo_postal_code":"%{client.geo.postal_code}V", "geo_latitude":"%{client.geo.latitude}V", "geo_longitude":"%{client.geo.longitude}V", "host": "%{if(req.http.Fastly-Orig-Host, req.http.Fastly-Orig-Host, req.http.Host)}V", "url": "%{json.escape(req.url)}V", "request_method": "%{json.escape(req.method)}V", "request_protocol": "%{json.escape(req.proto)}V", "request_referer": "%{json.escape(req.http.referer)}V", "request_user_agent": "%{json.escape(req.http.User-Agent)}V", "request_latency":"%{time.elapsed.usec}V", "response_state": "%{json.escape(fastly_info.state)}V", "response_status": %{std.itoa(resp.status)}V, "response_reason": %{if(resp.response, "%22"+json.escape(resp.response)+"%22", "null")}V, "response_body_size": %{resp.body_bytes_written}V, "fastly_server": "%{json.escape(server.identity)}V", "fastly_is_edge": %{if(fastly.ff.visits_this_service == 0, "true", "false")}V, "signature": "%{req.http.signature}V", "signature_agent": "%{req.http.signature-agent}V", "signature_input": "%{req.http.signature-input}V" }

Finally, set the URL to https://log.tollbit.com/log.

Ensure that your Requests are Authenticated

Go into Advanced Options and set the “Custom header name” field to “TollbitKey”. You must set the custom header value to your secret key: log into your TollBit portal, go to the API key tab, and copy your secret key. Paste it into the “Custom header value” field with no trailing spaces. Keep all the other settings as default, scroll to the bottom, and save.

Once you are ready to publish these changes, click the “Activate” button. Keep in mind that if you have other unpublished changes in Fastly, this may publish those as well.

Fastly Bot Paywall

Fastly allows you to set up redirects using VCL snippets. In this document, we will go over forwarding requests from known bots to your TollBit subdomain.

Go to the Deliver tab and select the domain you wish to add bot forwarding to. On the right side of the screen, click the Edit configuration button and choose to clone your current active version.

On the left hand sidebar, click "VCL Snippets".

Create a snippet and name it something like tollbit-bot-forwarding-recv. This is the VCL code that will detect if a bot is using one of our known bad user agents, and will forward it to your subdomain. Put the following logic into the snippet. Make sure that the placement of the snippet is within the recv subroutine.

Copy and paste the following code block into the VCL input field and save. Don't worry, this VCL script will not actually apply until you activate the current Fastly version that you are editing.

if (req.http.user-agent ~ "(?i)chatgpt-user|perplexitybot|gptbot|anthropic-ai|ccbot|claude-web|claudebot|cohere-ai|youbot|diffbot|oai-searchbot|meta-externalagent|timpibot|amazonbot|bytespider|perplexity-user") {
  if (std.prefixof(req.http.host, "www.")) {
    set req.http.host = std.replace_prefix(req.http.host, "www.", "tollbit.");
  } else {
    set req.http.host = "tollbit." + req.http.host;
  }
  error 600;
}
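Before activating the snippet, it can help to check which user agents the condition would catch. The following is a minimal JavaScript sketch, not part of the Fastly configuration: it reuses the same alternation as the VCL condition above, and the function name and sample user agents are hypothetical.

```javascript
// Same alternation as the VCL condition; the "i" flag mirrors VCL's (?i).
const BOT_PATTERN =
  /chatgpt-user|perplexitybot|gptbot|anthropic-ai|ccbot|claude-web|claudebot|cohere-ai|youbot|diffbot|oai-searchbot|meta-externalagent|timpibot|amazonbot|bytespider|perplexity-user/i

// Returns true when the recv snippet would forward a request with this User-Agent.
function wouldForward(userAgent) {
  return BOT_PATTERN.test(userAgent || '')
}

// Hypothetical sample user agents, for illustration only.
const samples = [
  'Mozilla/5.0 (compatible; GPTBot/1.0)',
  'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_3) Safari/605.1.15',
]
for (const ua of samples) {
  console.log(wouldForward(ua), ua)
}
```

Because the match is a substring match, any user agent that merely contains one of these names is forwarded, so it is worth testing a few of your own legitimate clients as well.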

Next, create another VCL snippet. This time, call it something like tollbit-bot-forwarding-error. This time, make sure that the placement is within the error subroutine.

Paste the following code in this snippet. This will set the correct headers and status code for the redirection done in the previous snippet.

if (obj.status == 600) {
  set obj.status = 307;
  set obj.response = "Temporary Redirect";
  set obj.http.Location = req.protocol + "://" + req.http.host + req.url;
  set obj.http.cache-control = "max-age=0";
  return (deliver);
}

This should now be all you need to forward known bot traffic to your TollBit subdomain! You can activate these changes by clicking "Activate".
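To see what the two snippets produce together, the host rewrite and the Location header can be sketched in plain JavaScript. This is only an illustration of the VCL logic above, and the function names and example hosts are hypothetical.

```javascript
// Mirrors the recv snippet: swap a leading "www." for "tollbit.",
// otherwise prepend "tollbit." to the host.
function tollbitHost(host) {
  return host.startsWith('www.')
    ? 'tollbit.' + host.slice('www.'.length)
    : 'tollbit.' + host
}

// Mirrors the error snippet: build the Location header sent with the 307 response.
function redirectLocation(protocol, host, url) {
  return protocol + '://' + tollbitHost(host) + url
}

console.log(redirectLocation('https', 'www.example.com', '/articles/1?ref=x'))
console.log(redirectLocation('https', 'example.com', '/'))
```

Both calls resolve to the same tollbit.example.com host, and the path and query string are carried over unchanged.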

CloudFlare

We provide a way for all CloudFlare customers, regardless of plan, to forward HTTP logs to our platform for analytics. We recommend this method over others like LogPush as CloudFlare Enterprise is not required to create workers, and you have much more control over how logs are sent.

CloudFlare Enterprise

If you are on the Enterprise plan, you should have access to CloudFlare's Logpush feature. You may already be pushing logs to an S3, R2 or GCP bucket. If this is the case, we are able to ingest your logs from where they are already being stored.

One small update you may need to make is adding the location response header and the signature-agent, signature-input and signature request headers to the logs. Follow the steps in Cloudflare's documentation to add these headers. You will want to select "Response Header" as the field type and type in location, and select "Request Header" and type in signature-agent, signature-input and signature.

If your logs are already being sent to an S3 bucket, add the following IAM policy to your bucket to enable TollBit to process your logs:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowTollbitAccountsAccess",
      "Effect": "Allow",
      "Principal": {
        "AWS": [
          "arn:aws:iam::339712821696:root",
          "arn:aws:iam::654654318267:root"
        ]
      },
      "Action": ["s3:GetObject*", "s3:ListBucket*"],
      "Resource": [
        "arn:aws:s3:::YOUR-BUCKET-NAME",
        "arn:aws:s3:::YOUR-BUCKET-NAME/*"
      ]
    }
  ]
}

Once you have done that, reach out to team@tollbit.com and provide the path to your logs in your bucket and we will be able to quickly enable TollBit analytics for your site.
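If you manage several buckets, the policy above can be generated rather than edited by hand. The following is a minimal sketch: the function name and the example bucket name are hypothetical, the account ARNs, Sid and actions are copied from the policy above, and the Version value is the standard IAM policy language version.

```javascript
// Build the bucket policy for a given bucket name (hypothetical helper).
function buildTollbitBucketPolicy(bucketName) {
  return {
    Version: '2012-10-17', // IAM policy language version, not a date you choose
    Statement: [
      {
        Sid: 'AllowTollbitAccountsAccess',
        Effect: 'Allow',
        Principal: {
          AWS: [
            'arn:aws:iam::339712821696:root',
            'arn:aws:iam::654654318267:root',
          ],
        },
        Action: ['s3:GetObject*', 's3:ListBucket*'],
        // ListBucket applies to the bucket itself, GetObject to the objects in it.
        Resource: [
          `arn:aws:s3:::${bucketName}`,
          `arn:aws:s3:::${bucketName}/*`,
        ],
      },
    ],
  }
}

// 'my-log-bucket' is a placeholder; substitute your own bucket name.
console.log(JSON.stringify(buildTollbitBucketPolicy('my-log-bucket'), null, 2))
```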

If you are not on Enterprise, read on to set up a worker to forward logs.

Create new Worker

If you already have an existing worker that is intercepting requests for your site, or you already set up a worker in the Bot Paywall section below, you will need to integrate this logging code with that worker. If you just have a bot deterrence worker set up, see that section to get a code snippet that also pushes logs.

Log into your CloudFlare Dashboard and click on the "Compute (Workers)" tab to have it open as a dropdown, and click on "Workers & Pages".

Click on the blue "Create" button near the top.

This will take you to a get started screen. Choose the option to create a hello world worker, as we will be overwriting all the worker code in the next few steps anyway.

Next will be a screen where you can name your worker and see the initial code that it will be running. Set the name to something TollBit related such as tollbit-worker, and click deploy. We will be modifying the worker code shortly.

Updating the Worker Code

Once your worker has finished deploying, click "Edit code".

In the worker.js file, delete everything and copy the following code over exactly, making sure to replace YOUR_SECRET_KEY_HERE with the secret key you can find in your portal.

const CF_APP_VERSION = '1.0.0'

const tollbitLogEndpoint = 'https://log.tollbit.com/log'
const tollbitToken = 'YOUR_SECRET_KEY_HERE'

const sleep = (ms) => {
  return new Promise((resolve) => {
    setTimeout(resolve, ms)
  })
}

const makeid = (length) => {
  let text = ''
  const possible = 'ABCDEFGHIJKLMNPQRSTUVWXYZ0123456789'
  for (let i = 0; i < length; i += 1) {
    text += possible.charAt(Math.floor(Math.random() * possible.length))
  }
  return text
}

const buildLogMessage = (request, response) => {
  const logObject = {
    timestamp: new Date().toISOString(),
    client_ip: '', // worker only is able to get cloudflare edge IP, leaving blank
    geo_country: request.cf['country'],
    geo_city: request.cf['city'],
    geo_postal_code: request.cf['postalCode'],
    geo_latitude: request.cf['latitude'],
    geo_longitude: request.cf['longitude'],
    host: request.headers.get('host'),
    url: request.url.replace('https://' + request.headers.get('host'), ''),
    request_method: request.method,
    request_protocol: request.cf['httpProtocol'],
    request_user_agent: request.headers.get('user-agent'),
    request_latency: null, // cloudflare does not have latency information
    request_referer: request.headers.get('referer'),
    response_state: null,
    response_status: response.status,
    response_reason: response.statusText,
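    // Response objects have no contentLength property, so read the header instead
    response_body_size: Number(response.headers.get('content-length')) || null,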
    signature: request.headers.get('signature'),
    signature_agent: request.headers.get('signature-agent'),
    signature_input: request.headers.get('signature-input'),
  }
  return logObject
}

// Batching
const BATCH_INTERVAL_MS = 20000 // 20 seconds
const MAX_REQUESTS_PER_BATCH = 500 // 500 logs
const WORKER_ID = makeid(6)

let workerTimestamp

let batchTimeoutReached = true
let logEventsBatch = []

// Backoff
const BACKOFF_INTERVAL = 10000
let backoff = 0

async function addToBatch(body, event) {
  logEventsBatch.push(body)

  if (logEventsBatch.length >= MAX_REQUESTS_PER_BATCH) {
    event.waitUntil(postBatch(event))
  }

  return true
}

async function handleRequest(event) {
  const { request } = event

  const response = await fetch(request)
  const rCf = request.cf
  delete rCf.tlsClientAuth
  delete rCf.tlsExportedAuthenticator

  const eventBody = buildLogMessage(request, response)
  event.waitUntil(addToBatch(eventBody, event))

  return response
}

const fetchAndSetBackOff = async (lfRequest, event) => {
  if (backoff <= Date.now()) {
    const resp = await fetch(tollbitLogEndpoint, lfRequest)
    if (resp.status === 403 || resp.status === 429) {
      backoff = Date.now() + BACKOFF_INTERVAL
    }
  }

  event.waitUntil(scheduleBatch(event))

  return true
}

const postBatch = async (event) => {
  const batchInFlight = [...logEventsBatch.map((e) => JSON.stringify(e))]
  logEventsBatch = []
  const body = batchInFlight.join('\n')
  const request = {
    method: 'POST',
    headers: {
      TollbitKey: `${tollbitToken}`,
      'Content-Type': 'application/json',
    },
    body,
  }
  event.waitUntil(fetchAndSetBackOff(request, event))
}

const scheduleBatch = async (event) => {
  if (batchTimeoutReached) {
    batchTimeoutReached = false
    await sleep(BATCH_INTERVAL_MS)
    if (logEventsBatch.length > 0) {
      event.waitUntil(postBatch(event))
    }
    batchTimeoutReached = true
  }
  return true
}

addEventListener('fetch', (event) => {
  event.passThroughOnException()

  if (!workerTimestamp) {
    workerTimestamp = new Date().toISOString()
  }

  event.waitUntil(scheduleBatch(event))
  event.respondWith(handleRequest(event))
})

Hit "Deploy" on the upper righthand corner once you are finished, and then navigate out of the editor with the little back arrow on the upper left side of the page, next to the name of the worker.
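The worker above sends its batch as newline-delimited JSON, one log object per line, with the secret key in the TollbitKey header. The following minimal sketch isolates that step so the payload shape is easier to inspect. The helper names and the sample events are hypothetical, and the events carry only a subset of the real fields.

```javascript
// Serialize a batch the way postBatch does: one JSON object per line.
function serializeBatch(events) {
  return events.map((e) => JSON.stringify(e)).join('\n')
}

// Build the fetch options that would be sent to the log endpoint.
function buildLogRequest(events, secretKey) {
  return {
    method: 'POST',
    headers: {
      TollbitKey: secretKey,
      'Content-Type': 'application/json',
    },
    body: serializeBatch(events),
  }
}

// Hypothetical events, for illustration only.
const events = [
  { url: '/a', response_status: 200 },
  { url: '/b', response_status: 404 },
]
console.log(buildLogRequest(events, 'YOUR_SECRET_KEY_HERE').body)
```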

Link worker to CloudFlare HTTP Logs

Click on "Account Home" on the left pane and select the website that you would like to forward logs for, and click into it. On the left panel, click into "Worker Routes", and then click "Add route". Set the route to *.<your_site.com>/*, or a custom path if you only want to forward logs for certain URL patterns. Under workers, choose the worker that you just created.

Once you are ready, click "Save", and you are all set!

Minimizing Worker Usage

By default, the above configuration will have every request to your site run through your worker. To reduce the number of requests for workers, keep in mind that our analytics platform works best with logs that correspond to page views, not with logs for static assets or JavaScript files.

To successfully minimize worker usage, investigate your directory structure and see if you have common paths for static assets. For example, some CMS frameworks will have a directory similar to example.com/assets for assets. To avoid running the worker on these request paths, create a new route for your worker for that path, in this example *.example.com/assets* and example.com/assets*, and set the worker for that route to be "Empty".

Your route page will then look something like the following.

If you aren't sure which route to disable, consider running the worker on your full site and then using the top pages chart in our analytics platform to understand any routes you wish to filter out.
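Where static assets are spread across many paths and route exclusions are hard to express, one option is to filter inside the worker before a log is added to the batch. This is an assumption rather than part of the TollBit snippet, and it only reduces the logs sent, not the number of worker invocations. The extension list, prefix list and function name below are hypothetical.

```javascript
// Hypothetical lists; adjust them to match your own site.
const STATIC_EXTENSIONS = ['.js', '.css', '.png', '.jpg', '.jpeg', '.gif', '.svg', '.ico', '.woff', '.woff2']
const STATIC_PREFIXES = ['/assets']

// Returns false for requests that look like static assets, true otherwise.
function isLikelyPageView(requestUrl) {
  const lower = new URL(requestUrl).pathname.toLowerCase()
  // Match the prefix itself or anything underneath it, but not "/assetsfoo".
  if (STATIC_PREFIXES.some((p) => lower === p || lower.startsWith(p + '/'))) {
    return false
  }
  return !STATIC_EXTENSIONS.some((ext) => lower.endsWith(ext))
}

console.log(isLikelyPageView('https://example.com/news/story'))
console.log(isLikelyPageView('https://example.com/assets/app.js'))
```

In the logging worker, a check like this could wrap the call that adds the event to the batch, so that only page-like requests are logged.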

CloudFlare Bot Paywall

There are several levels of bot detection and forwarding that you can configure for CloudFlare, depending on whether or not you are on their Enterprise plan.

Bot Paywall on any Plan (Including Free)

Follow the steps described above (within the Cloudflare Analytics section) up until you have created a new worker. Name this worker something that helps you keep track of its function (such as bot-forwarding-worker). Once you've created this worker, click into edit code and do the following to set up your forwarding worker.

If you have set up log forwarding, copy and replace your worker.js file with this code instead. Make sure that you keep your TollBit token copied over into the code.

// this is a non-exhaustive list of agents that we recommend you get started with first
// Add any other agents you would like to forward into this list.
const botList = [
  'ChatGPT-User',
  'PerplexityBot',
  'GPTBot',
  'anthropic-ai',
  'CCBot',
  'Claude-Web',
  'ClaudeBot',
  'cohere-ai',
  'YouBot',
  'Diffbot',
  'OAI-SearchBot',
  'meta-externalagent',
  'Timpibot',
  'Amazonbot',
  'Bytespider',
  'Perplexity-User',
]

const CF_APP_VERSION = '1.0.0'

const tollbitLogEndpoint = 'https://log.tollbit.com/log'
const tollbitToken = 'YOUR_SECRET_KEY_HERE'

const sleep = (ms) => {
  return new Promise((resolve) => {
    setTimeout(resolve, ms)
  })
}

const makeid = (length) => {
  let text = ''
  const possible = 'ABCDEFGHIJKLMNPQRSTUVWXYZ0123456789'
  for (let i = 0; i < length; i += 1) {
    text += possible.charAt(Math.floor(Math.random() * possible.length))
  }
  return text
}

const buildLogMessage = (request, response) => {
  const logObject = {
    timestamp: new Date().toISOString(),
    client_ip: '', // worker only is able to get cloudflare edge IP, leaving blank
    geo_country: request.cf['country'],
    geo_city: request.cf['city'],
    geo_postal_code: request.cf['postalCode'],
    geo_latitude: request.cf['latitude'],
    geo_longitude: request.cf['longitude'],
    host: request.headers.get('host'),
    url: request.url.replace('https://' + request.headers.get('host'), ''),
    request_method: request.method,
    request_protocol: request.cf['httpProtocol'],
    request_user_agent: request.headers.get('user-agent'),
    request_latency: null, // cloudflare does not have latency information
    request_referer: request.headers.get('referer'),
    response_state: null,
    response_status: response.status,
    response_reason: response.statusText,
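    // Response objects have no contentLength property, so read the header instead
    response_body_size: Number(response.headers.get('content-length')) || null,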
    signature: request.headers.get('signature'),
    signature_agent: request.headers.get('signature-agent'),
    signature_input: request.headers.get('signature-input'),
  }
  return logObject
}

// Batching
const BATCH_INTERVAL_MS = 20000 // 20 seconds
const MAX_REQUESTS_PER_BATCH = 500 // 500 logs
const WORKER_ID = makeid(6)

let workerTimestamp

let batchTimeoutReached = true
let logEventsBatch = []

// Backoff
const BACKOFF_INTERVAL = 10000
let backoff = 0

async function addToBatch(body, event) {
  logEventsBatch.push(body)

  if (logEventsBatch.length >= MAX_REQUESTS_PER_BATCH) {
    event.waitUntil(postBatch(event))
  }

  return true
}

async function handleRequest(event) {
  const { request } = event
  const isBotRequest = checkIfBotRequest(request)

  // if bot request, immediately forward to subdomain
  if (isBotRequest) {
    const path = request.url.replace(
      'https://' + request.headers.get('host'),
      '',
    )
    let host = request.headers.get('host') || ''
    if (host.startsWith('www.')) {
      // remove www
      host = host.slice(4)
    }
    return Response.redirect('https://tollbit.' + host + path, 302)
  } else {
    const response = await fetch(request)
    // otherwise add to log batch and return response
    const eventBody = buildLogMessage(request, response)
    event.waitUntil(addToBatch(eventBody, event))
    return response
  }
}

const fetchAndSetBackOff = async (lfRequest, event) => {
  if (backoff <= Date.now()) {
    const resp = await fetch(tollbitLogEndpoint, lfRequest)
    if (resp.status === 403 || resp.status === 429) {
      backoff = Date.now() + BACKOFF_INTERVAL
    }
  }

  event.waitUntil(scheduleBatch(event))

  return true
}

const postBatch = async (event) => {
  const batchInFlight = [...logEventsBatch.map((e) => JSON.stringify(e))]
  logEventsBatch = []
  const body = batchInFlight.join('\n')
  const request = {
    method: 'POST',
    headers: {
      TollbitKey: `${tollbitToken}`,
      'Content-Type': 'application/json',
    },
    body,
  }
  event.waitUntil(fetchAndSetBackOff(request, event))
}

const scheduleBatch = async (event) => {
  if (batchTimeoutReached) {
    batchTimeoutReached = false
    await sleep(BATCH_INTERVAL_MS)
    if (logEventsBatch.length > 0) {
      event.waitUntil(postBatch(event))
    }
    batchTimeoutReached = true
  }
  return true
}

const checkIfBotRequest = (request) => {
  const userAgent = request.headers.get('User-Agent') || ''

  for (var i = 0; i < botList.length; i++) {
    if (userAgent.toLowerCase().includes(botList[i].toLowerCase())) {
      return true
    }
  }
  return false
}

addEventListener('fetch', (event) => {
  event.passThroughOnException()

  if (!workerTimestamp) {
    workerTimestamp = new Date().toISOString()
  }

  event.waitUntil(scheduleBatch(event))
  event.respondWith(handleRequest(event))
})

This code checks the User-Agent of every request against the bot list at the top of the script. Matching requests are redirected to your TollBit subdomain; all other requests are passed through to your site and added to the log batch.

If you have not set up log forwarding and just want to forward bot traffic, put this code in your worker.js file.

// this is a non-exhaustive list of agents that we recommend you get started with first
// Add any other agents you would like to forward into this list.
const botList = [
  'ChatGPT-User',
  'PerplexityBot',
  'GPTBot',
  'anthropic-ai',
  'CCBot',
  'Claude-Web',
  'ClaudeBot',
  'cohere-ai',
  'YouBot',
  'Diffbot',
  'OAI-SearchBot',
  'meta-externalagent',
  'Timpibot',
  'Amazonbot',
  'Bytespider',
  'Perplexity-User',
]

export default {
  fetch(request) {
    const userAgent = request.headers.get('User-Agent') || ''
    const path = request.url.replace(
      'https://' + request.headers.get('host'),
      '',
    )
    let host = request.headers.get('host') || ''
    if (host.startsWith('www.')) {
      // remove www
      host = host.slice(4)
    }
    for (var i = 0; i < botList.length; i++) {
      if (userAgent.toLowerCase().includes(botList[i].toLowerCase())) {
        return Response.redirect('https://tollbit.' + host + path, 302)
      }
    }

    // Default behaviour
    return fetch(request)
  },
}

CloudFlare Enterprise and Bot Management

If you are on Enterprise and are using Bot Management, you should have access to a bot score for each request. You can replace the checkIfBotRequest function in the previous worker scripts with something similar to the following, which reads the score from request.cf.botManagement.score. Define a BOT_SCORE_THRESHOLD constant in your worker to determine how strict your forwarding is: requests that score below the threshold are forwarded. CloudFlare lists what each score range means.

const checkIfBotRequest = (request) => {
  const userAgent = request.headers.get('User-Agent') || '';

  // Check for known AI agents
  for (let i = 0; i < botList.length; i++) {
    if (userAgent.toLowerCase().includes(botList[i].toLowerCase())) {
      return true;
    }
  }

  // Check bot score
  const botScore = request.cf?.botManagement?.score;
  if (botScore !== undefined && botScore < BOT_SCORE_THRESHOLD) {
    return true;
  }

  return false;
};
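The function above relies on a BOT_SCORE_THRESHOLD constant and can be exercised locally without deploying. The following is a minimal sketch under stated assumptions: the threshold of 30 is only an illustrative value (choose yours from CloudFlare's score ranges), the bot list is shortened, and fakeRequest is a hypothetical stand-in for a Workers request object.

```javascript
const botList = ['GPTBot', 'ClaudeBot'] // shortened list, for illustration
const BOT_SCORE_THRESHOLD = 30 // assumption: requests scoring below this are forwarded

const checkIfBotRequest = (request) => {
  const userAgent = request.headers.get('User-Agent') || ''
  // Known AI agents are forwarded regardless of score.
  if (botList.some((bot) => userAgent.toLowerCase().includes(bot.toLowerCase()))) {
    return true
  }
  // Otherwise fall back to the Bot Management score, when present.
  const botScore = request.cf?.botManagement?.score
  return botScore !== undefined && botScore < BOT_SCORE_THRESHOLD
}

// Hypothetical stand-in exposing only what checkIfBotRequest reads.
const fakeRequest = (userAgent, score) => ({
  headers: { get: (name) => (name.toLowerCase() === 'user-agent' ? userAgent : null) },
  cf: score === undefined ? {} : { botManagement: { score } },
})

console.log(checkIfBotRequest(fakeRequest('GPTBot/1.0', 99))) // listed agent
console.log(checkIfBotRequest(fakeRequest('Mozilla/5.0', 5))) // low score
console.log(checkIfBotRequest(fakeRequest('Mozilla/5.0', 80))) // high score
```

A request with no score at all is treated as not a bot, so the worker behaves the same as the basic version on plans without Bot Management.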

Enterprise

If you have CloudFlare enterprise, you should be able to use the Bot Management product to get a bot score for each request. You can add logic in the above code's checkIfBotRequest function to also return true if the bot score is lower than a certain threshold.

Akamai

We provide a way for all Akamai customers to stream logs to our platform.

Create a Stream with DataStream 2

You will need to first create a stream by going to your Akamai Control Center. Follow these instructions on how to create your stream.

Choose Data Parameters

When choosing data parameters, make sure to select parameters that cover at least everything in the following sample log JSON. Also, please ensure that your log format is JSON.

{
  "version": 1,
  "streamId": "12345",
  "cp": "123456",
  "reqId": "1239f220",
  "reqTimeSec": "1573840000",
  "bytes": "4995",
  "cliIP": "128.147.28.68",
  "statusCode": "206",
  "proto": "HTTPS",
  "reqHost": "test.hostname.net",
  "reqMethod": "GET",
  "reqPath": "/path1/path2/file.ext",
  "reqPort": "443",
  "rspContentLen": "5000",
  "rspContentType": "text/html",
  "UA": "Mozilla%2F5.0+%28Macintosh%3B+Intel+Mac+OS+X+10_14_3%29",
  "tlsOverheadTimeMSec": "0",
  "tlsVersion": "TLSv1",
  "objSize": "484",
  "uncompressedSize": "484",
  "overheadBytes": "232",
  "totalBytes": "0",
  "queryStr": "param=value",
  "breadcrumbs": "//BC/%5Ba=23.33.41.20,c=g,k=0,l=1%5D",
  "accLang": "en-US",
  "cookie": "cookie-content",
  "range": "37334-42356",
  "referer": "https%3A%2F%2Ftest.referrer.net%2Fen-US%2Fdocs%2FWeb%2Ftest",
  "xForwardedFor": "8.47.28.38",
  "maxAgeSec": "3600",
  "reqEndTimeMSec": "3",
  "errorCode": "ERR_ACCESS_DENIED|fwd_acl",
  "turnAroundTimeMSec": "11",
  "transferTimeMSec": "125",
  "dnsLookupTimeMSec": "50",
  "lastByte": "1",
  "country": "IN",
  "state": "Virginia",
  "city": "HERNDON"
}
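Several values in the sample above, such as UA and referer, are URL-encoded, and reqTimeSec is an epoch timestamp in seconds stored as a string. If you want to inspect your own stream output before sending it, a minimal decoding sketch could look like this. The helper names are hypothetical, and treating "+" as a space is an assumption based on the sample values.

```javascript
// Decode a URL-encoded field; "+" is assumed to stand for a space.
function decodeField(value) {
  return decodeURIComponent(value.replace(/\+/g, ' '))
}

// Convert reqTimeSec (epoch seconds as a string) to an ISO 8601 timestamp.
function toIsoTimestamp(reqTimeSec) {
  return new Date(Number(reqTimeSec) * 1000).toISOString()
}

// Values copied from the sample log above.
const sample = {
  reqTimeSec: '1573840000',
  UA: 'Mozilla%2F5.0+%28Macintosh%3B+Intel+Mac+OS+X+10_14_3%29',
  referer: 'https%3A%2F%2Ftest.referrer.net%2Fen-US%2Fdocs%2FWeb%2Ftest',
}

console.log(toIsoTimestamp(sample.reqTimeSec))
console.log(decodeField(sample.UA))
console.log(decodeField(sample.referer))
```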

Stream to Endpoint

To forward your logs to us, follow the steps outlined here. The endpoint url that you should be streaming to is https://log.tollbit.com/log/akamai.

Select none for authentication for now, as we will be setting up custom authentication. To do so, go to "Custom header". For the content type, you can select application/json. Add a new header value with the key TollbitKey and the value as your secret key from your dashboard.

Finally, you can review and activate your stream!

Akamai Bot Paywall

Akamai allows you to set up redirection rules at the edge using Cloudlets. Specifically, they provide Edge Redirector Cloudlets that help you manage redirection using certain matching rules.

We want to first start by creating an Edge Redirector policy. Follow the documentation here to do so in accordance with how your Akamai instance is set up.

Once you have set up your policy, follow the documentation here to set up rules for your Edge Redirector. Because we want to be redirecting based on the User-Agent header, we will need to create a redirector with advanced matching rules. You will want to create a match type based on the request header. The name of the header should be User-Agent, and the value should be a tab separated list of bad user agents. You can use the following list:

ChatGPT-User PerplexityBot GPTBot anthropic-ai CCBot Claude-Web ClaudeBot cohere-ai YouBot Diffbot OAI-SearchBot meta-externalagent Timpibot Amazonbot Bytespider Perplexity-User

For the operator value, use is one of without case sensitivity. These settings should let you match our known bad user agents. In the redirection rule, you can set the redirect URL to your TollBit subdomain. Ensure that the path is preserved in the redirect. Akamai has an example of this in their docs, and you should be able to follow it by setting the redirect URL to https://tollbit.<your_site>/\2. The \2 should preserve the path.

Click save rule to save your changes, and you should be ready to activate! Follow the steps here to do so.

Vercel

To forward logs from Vercel, follow the instructions for Log Drains here.

To properly authenticate and verify your logs, you should use the endpoint https://log.tollbit.com/log/vercel.

In order to verify the endpoint through Vercel, pass the header x-vercel-tollbit-verify as a custom header, along with your organization's secret key as the custom header TollbitKey. See the following screenshots for an example of the configuration. Once you have added these headers, you should be able to click the Verify button and add your log drain.

Vercel Bot Paywall

To redirect bots to your TollBit subdomain you can use Vercel's Custom WAF rules.

Create a new rule. Set the rule to look at the User Agent and select Matches expression. Copy the following regex as the expression to match. Feel free to modify it to add or remove bots.

(ChatGPT-User|PerplexityBot|GPTBot|anthropic-ai|CCBot|Claude-Web|ClaudeBot|cohere-ai|YouBot|Diffbot|OAI-SearchBot|meta-externalagent|Timpibot|Amazonbot|Bytespider|Perplexity-User)

Set the rule to redirect to your TollBit subdomain by changing the Then option to Redirect and copying in your TollBit subdomain. The rule should look like this when you're finished.

Finally, click save rule. The change won't go into effect until you publish the change to go live.

Amazon (AWS)

Forwarding Logs with ALB

To forward logs from an ALB, follow these steps outlined in the AWS docs here.

Once you have started forwarding your logs to an S3 bucket, add the following IAM policy to your bucket to enable TollBit to process your logs:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowTollbitAccountsAccess",
      "Effect": "Allow",
      "Principal": {
        "AWS": [
          "arn:aws:iam::339712821696:root",
          "arn:aws:iam::654654318267:root"
        ]
      },
      "Action": ["s3:GetObject*", "s3:ListBucket*"],
      "Resource": [
        "arn:aws:s3:::YOUR-BUCKET-NAME",
        "arn:aws:s3:::YOUR-BUCKET-NAME/*"
      ]
    }
  ]
}

Once you have created the policy, reach out to team@tollbit.com to coordinate with our engineering team on the rest of the TollBit Analytics setup.

To finalize your setup, we will need access to the directory in your S3 bucket where your logs are stored, along with the pattern for how the logs are stored, for instance /service/logs/2024/12/04/log-file.

Forwarding Logs with CloudFront

To forward logs from CloudFront, follow these steps:

Enable standard logging for your CloudFront distribution following the AWS docs here.

Point your logs at an S3 bucket. Note that we currently only support the default W3C, tab-delimited format with the default 33 fields that are included in the logs. If you wish to use JSON and/or modify the fields that CloudFront logs, please reach out to team@tollbit.com and we can get that set up for you.

Add the following IAM policy to your bucket to allow TollBit to process your logs:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowTollbitAccountsAccess",
      "Effect": "Allow",
      "Principal": {
        "AWS": [
          "arn:aws:iam::339712821696:root",
          "arn:aws:iam::654654318267:root"
        ]
      },
      "Action": ["s3:GetObject*", "s3:ListBucket*"],
      "Resource": [
        "arn:aws:s3:::YOUR-BUCKET-NAME",
        "arn:aws:s3:::YOUR-BUCKET-NAME/*"
      ]
    }
  ]
}

Once you have started forwarding your logs to an S3 bucket and granted TollBit access, reach out to team@tollbit.com to coordinate with our engineering team on the rest of the TollBit Analytics setup. To finalize your setup, we will need access to the directory in your S3 bucket where your logs are stored, along with the pattern for how the logs are stored, for instance /service/logs/2024/12/04/log-file.

AWS WAF + CloudFront Bot Paywall

You can use a combination of AWS Web ACLs and CloudFront to detect and redirect bots. This example will use a Web ACL with a WAF rule to detect bots, and then have CloudFront redirect bot traffic.

First, go to the WAF & Shield console and create a new Web ACL. Ensure that the ACL being created is for CloudFront distributions. Add your existing CloudFront distribution to this ACL under the "Associated AWS resources" section of the page.

Once you've created the ACL, you can choose any rules you'd like to enable bot detection. AWS Marketplace has managed bot detection rules that you can add to your ACL. We will provide our own WAF rule as well. To use our WAF rule, select the option for using your own rules and rule groups, and use the JSON editor. Copy and paste the following rule:

{
  "Name": "cloudfront-agent-rule",
  "Priority": 0,
  "Statement": {
    "OrStatement": {
      "Statements": [
        {
          "ByteMatchStatement": {
            "SearchString": "chatgpt-user",
            "FieldToMatch": {
              "SingleHeader": {
                "Name": "user-agent"
              }
            },
            "TextTransformations": [
              {
                "Priority": 0,
                "Type": "LOWERCASE"
              }
            ],
            "PositionalConstraint": "CONTAINS"
          }
        },
        {
          "ByteMatchStatement": {
            "SearchString": "perplexitybot",
            "FieldToMatch": {
              "SingleHeader": {
                "Name": "user-agent"
              }
            },
            "TextTransformations": [
              {
                "Priority": 0,
                "Type": "NONE"
              }
            ],
            "PositionalConstraint": "CONTAINS"
          }
        },
        {
          "ByteMatchStatement": {
            "SearchString": "gptbot",
            "FieldToMatch": {
              "SingleHeader": {
                "Name": "user-agent"
              }
            },
            "TextTransformations": [
              {
                "Priority": 0,
                "Type": "LOWERCASE"
              }
            ],
            "PositionalConstraint": "CONTAINS"
          }
        },
        {
          "ByteMatchStatement": {
            "SearchString": "anthropic-ai",
            "FieldToMatch": {
              "SingleHeader": {
                "Name": "user-agent"
              }
            },
            "TextTransformations": [
              {
                "Priority": 0,
                "Type": "LOWERCASE"
              }
            ],
            "PositionalConstraint": "CONTAINS"
          }
        },
        {
          "ByteMatchStatement": {
            "SearchString": "ccbot",
            "FieldToMatch": {
              "SingleHeader": {
                "Name": "user-agent"
              }
            },
            "TextTransformations": [
              {
                "Priority": 0,
                "Type": "LOWERCASE"
              }
            ],
            "PositionalConstraint": "CONTAINS"
          }
        },
        {
          "ByteMatchStatement": {
            "SearchString": "amazonbot",
            "FieldToMatch": {
              "SingleHeader": {
                "Name": "user-agent"
              }
            },
            "TextTransformations": [
              {
                "Priority": 0,
                "Type": "LOWERCASE"
              }
            ],
            "PositionalConstraint": "CONTAINS"
          }
        },
        {
          "ByteMatchStatement": {
            "SearchString": "claude-web",
            "FieldToMatch": {
              "SingleHeader": {
                "Name": "user-agent"
              }
            },
            "TextTransformations": [
              {
                "Priority": 0,
                "Type": "LOWERCASE"
              }
            ],
            "PositionalConstraint": "CONTAINS"
          }
        },
        {
          "ByteMatchStatement": {
            "SearchString": "cohere-ai",
            "FieldToMatch": {
              "SingleHeader": {
                "Name": "user-agent"
              }
            },
            "TextTransformations": [
              {
                "Priority": 0,
                "Type": "LOWERCASE"
              }
            ],
            "PositionalConstraint": "CONTAINS"
          }
        },
        {
          "ByteMatchStatement": {
            "SearchString": "omgilibot",
            "FieldToMatch": {
              "SingleHeader": {
                "Name": "user-agent"
              }
            },
            "TextTransformations": [
              {
                "Priority": 0,
                "Type": "LOWERCASE"
              }
            ],
            "PositionalConstraint": "CONTAINS"
          }
        },
        {
          "ByteMatchStatement": {
            "SearchString": "omgili",
            "FieldToMatch": {
              "SingleHeader": {
                "Name": "user-agent"
              }
            },
            "TextTransformations": [
              {
                "Priority": 0,
                "Type": "LOWERCASE"
              }
            ],
            "PositionalConstraint": "CONTAINS"
          }
        },
        {
          "ByteMatchStatement": {
            "SearchString": "youbot",
            "FieldToMatch": {
              "SingleHeader": {
                "Name": "user-agent"
              }
            },
            "TextTransformations": [
              {
                "Priority": 0,
                "Type": "LOWERCASE"
              }
            ],
            "PositionalConstraint": "CONTAINS"
          }
        },
        {
          "ByteMatchStatement": {
            "SearchString": "bytespider",
            "FieldToMatch": {
              "SingleHeader": {
                "Name": "user-agent"
              }
            },
            "TextTransformations": [
              {
                "Priority": 0,
                "Type": "LOWERCASE"
              }
            ],
            "PositionalConstraint": "CONTAINS"
          }
        },
        {
          "ByteMatchStatement": {
            "SearchString": "diffbot",
            "FieldToMatch": {
              "SingleHeader": {
                "Name": "user-agent"
              }
            },
            "TextTransformations": [
              {
                "Priority": 0,
                "Type": "LOWERCASE"
              }
            ],
            "PositionalConstraint": "CONTAINS"
          }
        },
        {
          "ByteMatchStatement": {
            "SearchString": "oai-searchbot",
            "FieldToMatch": {
              "SingleHeader": {
                "Name": "user-agent"
              }
            },
            "TextTransformations": [
              {
                "Priority": 0,
                "Type": "LOWERCASE"
              }
            ],
            "PositionalConstraint": "CONTAINS"
          }
        },
        {
          "ByteMatchStatement": {
            "SearchString": "meta-externalagent",
            "FieldToMatch": {
              "SingleHeader": {
                "Name": "user-agent"
              }
            },
            "TextTransformations": [
              {
                "Priority": 0,
                "Type": "LOWERCASE"
              }
            ],
            "PositionalConstraint": "CONTAINS"
          }
        },
        {
          "ByteMatchStatement": {
            "SearchString": "timpibot",
            "FieldToMatch": {
              "SingleHeader": {
                "Name": "user-agent"
              }
            },
            "TextTransformations": [
              {
                "Priority": 0,
                "Type": "LOWERCASE"
              }
            ],
            "PositionalConstraint": "CONTAINS"
          }
        },
        {
          "ByteMatchStatement": {
            "SearchString": "perplexity-user",
            "FieldToMatch": {
              "SingleHeader": {
                "Name": "user-agent"
              }
            },
            "TextTransformations": [
              {
                "Priority": 0,
                "Type": "LOWERCASE"
              }
            ],
            "PositionalConstraint": "CONTAINS"
          }
        }
      ]
    }
  },
  "Action": {
    "Allow": {
      "CustomRequestHandling": {
        "InsertHeaders": [
          {
            "Name": "Bot",
            "Value": "true"
          }
        ]
      }
    }
  },
  "VisibilityConfig": {
    "SampledRequestsEnabled": true,
    "CloudWatchMetricsEnabled": true,
    "MetricName": "cloudfront-agent-rule"
  }
}

This rule detects the top known AI bots. For the action, be sure to choose "Allow" and to add a custom header. Ours is called Bot, but feel free to use any unique name.
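
Because every entry in the rule above has the same shape, you may prefer to generate the JSON from a list of user-agent substrings rather than edit it by hand. The following is a sketch under that assumption: buildByteMatch and buildBotRule are hypothetical helper names, and the output mirrors the structure of the rule shown above.

```javascript
// Build one ByteMatchStatement per user-agent substring, mirroring the rule above.
function buildByteMatch(searchString) {
  return {
    ByteMatchStatement: {
      // The LOWERCASE transformation lowercases the header, so the search
      // string must be lowercase too.
      SearchString: searchString.toLowerCase(),
      FieldToMatch: { SingleHeader: { Name: "user-agent" } },
      TextTransformations: [{ Priority: 0, Type: "LOWERCASE" }],
      PositionalConstraint: "CONTAINS",
    },
  };
}

// Assemble the full rule: match any listed agent, allow it, and tag it with a header.
function buildBotRule(name, agents) {
  return {
    Name: name,
    Priority: 0,
    Statement: { OrStatement: { Statements: agents.map(buildByteMatch) } },
    Action: {
      Allow: {
        CustomRequestHandling: { InsertHeaders: [{ Name: "Bot", Value: "true" }] },
      },
    },
    VisibilityConfig: {
      SampledRequestsEnabled: true,
      CloudWatchMetricsEnabled: true,
      MetricName: name,
    },
  };
}

const rule = buildBotRule("cloudfront-agent-rule", ["gptbot", "ccbot", "PerplexityBot"]);
console.log(JSON.stringify(rule, null, 2));
```

The printed JSON can be pasted into the rule JSON editor. An OrStatement needs at least two nested statements, so pass two or more agents.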

Next, navigate to the CloudFront console and open the "Functions" tab. Create a new function and paste in the following JavaScript:

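function handler(event) {
  // AWS WAF adds the x-amzn-waf- prefix to inserted headers, so the "Bot"
  // header set by the rule arrives here as x-amzn-waf-bot.
  if (event.request.headers['x-amzn-waf-bot'] !== undefined) {
    const host = event.request.headers.host.value
    const uri = event.request.uri
    // Rebuild the same path on the tollbit subdomain of the requested host.
    const newurl = `https://tollbit.${host}${uri}`
    const response = {
      statusCode: 302,
      statusDescription: 'Found',
      headers: { location: { value: newurl } },
    }
    return response
  }
  // Not flagged as a bot: pass the request through unchanged.
  return event.request
}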

Earlier, our WAF rule set a header called Bot on the request if it matched the rule. AWS WAF automatically prepends x-amzn-waf- to the names of inserted headers, so the actual header to look for is x-amzn-waf-bot. If this header exists, our WAF rule detected that the request came from a bot, so we redirect it to the tollbit subdomain. Once you are ready, save the changes and publish the function. On the publish tab, you will then need to associate the function with your existing CloudFront distribution.
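
Before publishing, you can exercise the redirect logic locally. The sketch below repeats the handler so that it runs on its own in Node.js and feeds it two hand-built events. makeEvent is a hypothetical helper, and the event objects include only the fields the handler reads, not the full CloudFront Functions event.

```javascript
// Same logic as the function above, repeated so this snippet is self-contained.
function handler(event) {
  if (event.request.headers["x-amzn-waf-bot"] !== undefined) {
    const host = event.request.headers.host.value;
    const uri = event.request.uri;
    return {
      statusCode: 302,
      statusDescription: "Found",
      headers: { location: { value: `https://tollbit.${host}${uri}` } },
    };
  }
  return event.request;
}

// Hypothetical helper that builds a minimal viewer-request event for testing.
function makeEvent(extraHeaders) {
  return {
    request: {
      method: "GET",
      uri: "/articles/example",
      headers: Object.assign({ host: { value: "example.com" } }, extraHeaders),
    },
  };
}

// A request tagged by the WAF rule is redirected; an untagged one passes through.
const botResult = handler(makeEvent({ "x-amzn-waf-bot": { value: "true" } }));
const humanResult = handler(makeEvent({}));

console.log(botResult.statusCode, botResult.headers.location.value);
// prints: 302 https://tollbit.example.com/articles/example
console.log(humanResult.uri);
// prints: /articles/example
```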

Google (GCP)

Setting up Logging with CDN/Cloud Load Balancer

To set up logging for your Google Cloud Load Balancer instance (if you are using Google CDN, it should be backed by a Cloud Load Balancer), you can forward logs to a GCP Storage Bucket.

First, create a bucket you would like to use to hold the logs.

If your load balancer is backed by a non-static backend (you are using another domain or IP address as an origin, and not a Storage Bucket), you may need to edit your load balancer's configuration and enable a 100% sampling rate for backend logging.

Next, go to the Logs Explorer page and, in the left-hand navigation bar, click "Log router".

On the top bar, click "Create sink".

Go through the sink creation flow, making sure to set the Storage Bucket you created earlier as the destination. You should set an inclusion filter to ensure that only traffic logs for your load balancer get stored. Some fields to use for the inclusion filter could be the ID of the load balancer, the underlying domain or IP address, the URL that routes to the load balancer, etc.

Once this sink is created, you may need to wait up to an hour for logs to start appearing in your bucket. Once you've verified that this is set up correctly, please contact us at team@tollbit.com to share your bucket with us.
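
As an illustration of the inclusion filter mentioned above, the sketch below assembles a Cloud Logging filter string for a single load balancer. It assumes your entries use the http_load_balancer resource type with a forwarding_rule_name label. buildLoadBalancerFilter and the sample names are hypothetical, so compare the result against an actual entry in the Logs Explorer before relying on it.

```javascript
// Build a Cloud Logging inclusion filter that keeps only one load balancer's
// request logs. Resource type and label names are assumptions to verify
// against your own log entries.
function buildLoadBalancerFilter(forwardingRuleName, host) {
  const clauses = [
    'resource.type="http_load_balancer"',
    `resource.labels.forwarding_rule_name="${forwardingRuleName}"`,
  ];
  if (host) {
    // ":" is the substring ("has") operator in the Logging query language.
    clauses.push(`httpRequest.requestUrl:"${host}"`);
  }
  return clauses.join(" AND ");
}

const filter = buildLoadBalancerFilter("my-lb-rule", "example.com");
console.log(filter);
```

The printed string can be pasted into the inclusion filter box of the sink creation flow.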

Google Cloud Armor Bot Paywall

Google's Cloud Armor allows you to set up some simple redirection rules for user agents.

Note that the full solution, which preserves the path of the requested content, requires a separate backend service that handles the redirect and keeps the path. Alternatively, you can simply redirect to the root of your tollbit subdomain and still get most of the functionality.

First, navigate to Cloud Armor policies and create a new one (or add this to your existing policy). Set the default rule to allow.

Next, add more rules and select "Advanced mode". You can add preferred user agents that you want to redirect in the match rules box.

Next, select "Redirect" as the action for the rule, and if you do have a redirection backend service that preserves path, put the URL to that service. Otherwise, put the root tollbit subdomain for your site (tollbit.yoursite.com).

Save and activate your policy.
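
The match expression for the "Advanced mode" box can be generated from a list of user agents. The sketch below assumes Cloud Armor's rules language accepts request.headers['user-agent'].lower().contains(...) and that a single rule may hold only a limited number of subexpressions. The limit of 5 used here is an assumption to check against your current quotas, and buildExpressions is a hypothetical helper.

```javascript
// Assumed per-rule limit on subexpressions; confirm against Cloud Armor quotas.
const SUBEXPRESSIONS_PER_RULE = 5;

// Split the agent list into groups and build one match expression per rule.
function buildExpressions(agents, perRule) {
  const expressions = [];
  for (let i = 0; i < agents.length; i += perRule) {
    const group = agents.slice(i, i + perRule);
    expressions.push(
      group
        .map((agent) => `request.headers['user-agent'].lower().contains('${agent.toLowerCase()}')`)
        .join(" || ")
    );
  }
  return expressions;
}

const agents = ["gptbot", "chatgpt-user", "ccbot", "perplexitybot", "claudebot", "bytespider", "amazonbot"];
const expressions = buildExpressions(agents, SUBEXPRESSIONS_PER_RULE);
expressions.forEach((expression, i) => console.log(`rule ${i + 1}: ${expression}`));
```

Each printed expression would go into its own rule, with "Redirect" as the action for all of them.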

Microsoft (Azure)

Setting up Logging with Azure Front Door

To set up logging for Azure Front Door, first navigate to your specific Front Door instance. On the left sidebar, open the dropdown for "Monitoring" and select the "Diagnostics settings" tab.

This will take you to a screen where you can create a new Diagnostics Setting.

Go ahead and create a new setting. Within the settings page, select the options that will send all access logs to a Storage account. If you don't already have a storage bucket in place for these logs, please create one.

Once you've confirmed that logs are being stored in your chosen bucket, please reach out to team@tollbit.com to coordinate with our engineering team on the rest of the TollBit Analytics setup. To finalize your setup, we will need access to the directory in your Storage bucket where your logs are stored, along with the pattern used to store them, for instance /service/logs/2024/12/04/log-file.

Azure Front Door Bot Paywall

Azure's CDN lets you easily set up redirection rules for different bots. We'll explore how to do this in the standard tier Front Door as well as the premium tier.

Standard

Navigate to your Front Door instance and click the dropdown for "Settings" on the left navbar.

You can create a new rule set using the button at the top.

Then you can add rules for the bots you want to forward off to the TollBit Bot Paywall. We recommend the following list:

chatgpt-user, perplexitybot, gptbot, anthropic-ai, ccbot, claude-web, claudebot, cohere-ai, youbot, diffbot, oai-searchbot, meta-externalagent, timpibot, amazonbot, bytespider, perplexity-user

You can add to or remove from this list as befits your bot strategy. Keep the entries lowercase, since the rules compare against the lowercased user agent. Click save to save these changes.

Once you've created this rule set, you need to associate your Front Door route with it. Click the three horizontal dots to the right of the rule set, and click "Associate a route".

Choose the route to the relevant Front Door instance and go through the flow of associating this route.
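
To check which requests a list like the one above would catch, you can replay the same comparison locally: lowercase the user agent, then test whether it contains any listed substring. This is only a sketch of the matching logic, not Azure's implementation. matchedBot is a hypothetical helper and the user-agent strings are illustrative samples.

```javascript
// The recommended list from this section, kept lowercase.
const BOT_SUBSTRINGS = [
  "chatgpt-user", "perplexitybot", "gptbot", "anthropic-ai", "ccbot",
  "claude-web", "claudebot", "cohere-ai", "youbot", "diffbot",
  "oai-searchbot", "meta-externalagent", "timpibot", "amazonbot",
  "bytespider", "perplexity-user",
];

// Return the first listed substring found in the lowercased user agent, or null.
function matchedBot(userAgent, substrings) {
  const lowered = userAgent.toLowerCase();
  const hit = substrings.find((substring) => lowered.includes(substring));
  return hit === undefined ? null : hit;
}

// Illustrative user-agent strings, not captured traffic.
const botUa = "Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)";
const browserUa = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/120.0 Safari/537.36";

console.log(matchedBot(botUa, BOT_SUBSTRINGS));     // prints: gptbot
console.log(matchedBot(browserUa, BOT_SUBSTRINGS)); // prints: null
```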

Premium

If you have a Premium tier Front Door instance, contact our team at team@tollbit.com and we can connect with you and evaluate the best path forward.

Datadome

Follow these steps to set up an integration into our platform if you use Datadome. Note that enabling this integration will only allow you to forward specified bots/user agents to your TollBit subdomain. To enable analytics, please choose another integration (via another partner, API endpoint, or cloud storage bucket).

Setup integration in Datadome

On your Datadome dashboard, open the Management tab in the bottom left of the navigation bar. Choose Monetize. Under TollBit, choose “Enable”.

Under Access Controls > AI agents, you can switch each agent from the default Allow option to the Monetize option. From then on, that bot is forwarded to your TollBit subdomain.

WordPress VIP

The integration supports TollBit's AI bot monitoring, management and monetization capabilities as well as seamless connections with MCP and NLWeb.

Follow these steps to set up an integration into our platform if you use WordPress VIP. Note that enabling this integration will only allow you to forward specified bots/user agents to your TollBit subdomain. To enable analytics, please choose another integration (via another partner, API endpoint, or cloud storage bucket).

Setup integration in WordPress VIP

Log in to the WordPress VIP Dashboard and select the banner navigation menu item labeled “Integrations Center” at the top of the VIP Dashboard.

Select TollBit from the Integrations Center list to access the information page for TollBit. An overview of the Integration’s features and functionality is provided, as well as information about the publisher and links to support documentation. To add the Integration to an organization, select the button labeled “+ Add to Organization” located at the top of the right-hand column. If the current user has an Org admin role for more than one organization, the organization to which the Integration will be added must be selected from the auto-generated option dropdown after the button is selected.

Other

We support other methods of log ingestion besides the integrations that we listed above.

Log Sink Forwarding

You can forward your logs to our log sink endpoint at https://log.tollbit.com/log as long as you include the header TollbitKey and set the value to your secret key in your dashboard. The logs must conform to the following JSON format. Not all fields are required, but we need at least the timestamp, host, url, request_user_agent, response_status, request_referer and request_method.

{
  timestamp: string, // can be ISO 8601 format or unix timestamp
  geo_country: string,
  geo_city: string,
  geo_postal_code: string,
  geo_latitude: float,
  geo_longitude: float,
  host: string,
  url: string,
  request_method: string,
  request_protocol: string,
  request_user_agent: string,
  request_latency: int/string,
  request_referer: string,
  response_state: string,
  response_status: int/string,
  response_reason: string,
  response_body_size: int/string,
  signature: string,
  signature_agent: string,
  signature_input: string
}

When streaming the logs to the endpoint, please ensure that you are batching logs as much as possible. Each log should be a single line, separated from the other logs by a newline.
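
A batch in this format is newline-delimited JSON. The sketch below shows one way to assemble and send it. toNdjson, missingFields and sendBatch are hypothetical helpers, and the use of POST is an assumption, since this guide does not state the HTTP method. The sendBatch function relies on the global fetch available in recent Node.js versions and is defined here but not called.

```javascript
// Fields this guide lists as the minimum needed for each log.
const REQUIRED_FIELDS = [
  "timestamp", "host", "url", "request_user_agent",
  "response_status", "request_referer", "request_method",
];

// Return the names of any required fields that a log object lacks.
function missingFields(log) {
  return REQUIRED_FIELDS.filter((name) => log[name] === undefined);
}

// Serialize a batch as newline-delimited JSON: one log per line.
function toNdjson(logs) {
  return logs.map((log) => JSON.stringify(log)).join("\n");
}

// Send one batch to the log sink. POST is an assumption; confirm with TollBit.
async function sendBatch(logs, secretKey) {
  const response = await fetch("https://log.tollbit.com/log", {
    method: "POST",
    headers: { TollbitKey: secretKey },
    body: toNdjson(logs),
  });
  return response.status;
}

// Illustrative log entry, not real traffic.
const exampleLog = {
  timestamp: "2024-12-04T10:15:00Z",
  host: "example.com",
  url: "/articles/example",
  request_method: "GET",
  request_user_agent: "GPTBot/1.0",
  request_referer: "",
  response_status: 200,
};

console.log(missingFields(exampleLog)); // prints: []
console.log(toNdjson([exampleLog, exampleLog]).split("\n").length); // prints: 2
```

Checking missingFields before a batch is sent keeps incomplete entries out of the stream. Collecting many logs into one sendBatch call follows the batching advice above.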

Ingesting from file storage

We are currently able to support log ingestion from S3, R2 and GCS. Please ensure that your log files are prefixed by date and time, and that the logs within the files are in JSON format (ideally as similar to the above as possible), and each log is a single line and all logs are newline separated.
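
As a sketch of a date-and-time-prefixed layout, the helper below builds an object key modeled on the /service/logs/2024/12/04/log-file example used elsewhere in this guide. The exact naming scheme is not specified here, so logObjectKey and its path layout are illustrative assumptions to agree on with the TollBit team.

```javascript
// Build an object key whose path encodes the UTC date and time of the file.
// The layout (prefix/YYYY/MM/DD/HHMM-name) is an illustrative assumption.
function logObjectKey(prefix, date, name) {
  const pad = (n) => String(n).padStart(2, "0");
  const year = date.getUTCFullYear();
  const month = pad(date.getUTCMonth() + 1); // getUTCMonth() is zero-based
  const day = pad(date.getUTCDate());
  const time = pad(date.getUTCHours()) + pad(date.getUTCMinutes());
  return `${prefix}/${year}/${month}/${day}/${time}-${name}`;
}

const key = logObjectKey("service/logs", new Date(Date.UTC(2024, 11, 4, 10, 15)), "log-file");
console.log(key); // prints: service/logs/2024/12/04/1015-log-file
```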

If you are already forwarding logs to an S3 bucket, you can get a head start on the setup by adding the following IAM policy to your bucket to enable TollBit to process your logs:

{
  "Version": "2025-05-07",
  "Statement": [
    {
      "Sid": "AllowTollbitAccountsAccess",
      "Effect": "Allow",
      "Principal": {
        "AWS": [
          "arn:aws:iam::339712821696:root",
          "arn:aws:iam::654654318267:root"
        ]
      },
      "Action": ["s3:GetObject*", "s3:ListBucket*"],
      "Resource": [
        "arn:aws:s3:::YOUR-BUCKET-NAME",
        "arn:aws:s3:::YOUR-BUCKET-NAME/*"
      ]
    }
  ]
}

Please contact us at team@tollbit.com to complete the setup so you can get access to TollBit Analytics.
