Content
Retrieve content by URL, self-report usage, list content catalog, and index content with tokens.
Content
Content is the actual payload and data you are looking for as a developer. You get content by making a request for a specific webpage. Every webpage has its own rate.
Content Intent
Content Intent represents your use case for the content you are requesting. TollBit currently supports two different intents: Content Retrieval and Indexing.
Indexing
Indexing content (sometimes called crawling) is used when bots retrieve content to build indexes or analyze content at scale. If you are scanning pages to prebuild search indexes or running NLP to understand content, you are using indexing intent.
If you plan to display the content to end users or use it as model training data, use content retrieval intent instead.
Content Retrieval
Content Retrieval allows bots to fetch content to directly show or summarize for end users (subject to the license). For example, if you are pulling the latest news for a user or researching trips, you are using content retrieval.
Get Content
Request content for a webpage. The content path is the full URL of the intended webpage (e.g. example.com/find/my/page). Use a token from Generate Content Token (Tokens API) for paid access.
Endpoint
GET /dev/v2/content/{content_path}
Base URL: https://gateway.tollbit.com
Authentication: Requires a one-time token via the TollbitToken header and your User-Agent (must contain your registered AgentID).
Request headers
- TollbitToken (string, required): The generated one-time token used to access data (from Generate Content Token).
- User-Agent (string, required): Unique User-Agent string for who is allowed to access the page. Must contain your AgentID that you previously registered.
- Tollbit-Accept-Content (string, optional): The MIME content type you want in the content response. Accepted: text/html, text/markdown. Default: text/markdown.
Path parameters
- content_path (string, required): The full URL of the intended webpage. URL-encode when used in the path.
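As a minimal sketch of the URL-encoding step, the snippet below builds the request URL and headers for Get Content using only the Python standard library. The helper name `build_content_request` and the placeholder token are assumptions made for illustration; the path, base URL, and header names come from this page.

```python
from urllib.parse import quote

BASE_URL = "https://gateway.tollbit.com"


def build_content_request(page_url, token, user_agent, accept=None):
    """Return (url, headers) for GET /dev/v2/content/{content_path}.

    Hypothetical helper: it only assembles the request, it does not send it.
    """
    # safe="" also percent-encodes ":" and "/", so the page URL stays one path segment.
    url = f"{BASE_URL}/dev/v2/content/{quote(page_url, safe='')}"
    headers = {"TollbitToken": token, "User-Agent": user_agent}
    if accept is not None:
        # Documented values: text/html or text/markdown (the default).
        headers["Tollbit-Accept-Content"] = accept
    return url, headers


url, headers = build_content_request(
    "https://example.com/article", "YOUR_ONE_TIME_TOKEN", "my-agent/1.0"
)
print(url)
```

The same pair can be passed to any HTTP client. Because the token is one-time, a fresh one would be generated for each request.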
Response format
{
  "content": {
    "header": "...",
    "body": "...",
    "footer": "..."
  },
  "metadata": {
    "author": "string",
    "description": "string",
    "imageUrl": "string",
    "modified": "string",
    "published": "string",
    "title": "string"
  },
  "rate": {
    "license": { "id": "string", "licensePath": "string", "licenseType": "string", "permissions": [] },
    "price": { "currency": "string", "priceMicros": 0 }
  }
}
Response fields
- content (object): Sections of the page.
  - header (string): Navigation and auxiliary info.
  - body (string): Main content in markdown, excluding header/nav/footer.
  - footer (string): Follow-up articles, terms, social links.
- metadata (object): Article metadata (author, description, imageUrl, modified, published, title).
- rate (object): Rate data for this page. See Rates for details.
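One way to read these fields defensively is sketched below. The function name `summarize_content_response` and the sample values are hypothetical; only the field names come from the response format above. The sketch treats `rate` as optional, since indexing responses typically omit it.

```python
def summarize_content_response(payload):
    """Pick out the main body, title, and price from a parsed content response."""
    content = payload.get("content") or {}
    metadata = payload.get("metadata") or {}
    rate = payload.get("rate") or {}  # may be absent, e.g. for indexing responses
    price = rate.get("price") or {}
    license_info = rate.get("license") or {}
    return {
        "title": metadata.get("title"),
        "body": content.get("body", ""),
        "license_type": license_info.get("licenseType"),
        "price_micros": price.get("priceMicros"),  # None when no rate is returned
        "currency": price.get("currency"),
    }


# Illustrative payload only; the values are made up for this sketch.
sample = {
    "content": {"header": "nav", "body": "# Title\n\nText", "footer": "links"},
    "metadata": {"title": "Title"},
    "rate": {
        "license": {"licenseType": "ON_DEMAND_LICENSE"},
        "price": {"currency": "USD", "priceMicros": 5000},
    },
}
print(summarize_content_response(sample))
```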
Example request
curl --location 'https://gateway.tollbit.com/dev/v2/content/https%3A%2F%2Fexample.com%2Farticle' \
  --header 'TollbitToken: eyJhbGciOiJF...' \
  --header 'User-Agent: my-agent/1.0'
Index Content
Request content for a webpage for indexing/crawl purposes. The path and method are the same as Get Content; the difference is the token: use a token from Generate Indexing Token (Tokens API). Content providers may allow or disallow access for indexing.
Endpoint
GET /dev/v2/content/{content_path}
Base URL: https://gateway.tollbit.com
Authentication: Same as Get Content: TollbitToken (from Generate Indexing Token) and User-Agent.
Request headers
- TollbitToken (string, required): The generated one-time token from Generate Indexing Token.
- User-Agent (string, required): Must contain your registered AgentID.
Path parameters
- content_path (string, required): The full URL of the intended webpage. URL-encode when used in the path.
Response format
Same structure as Get Content, but typically without rate (indexing use):
{
  "content": {
    "header": "...",
    "body": "...",
    "footer": "..."
  },
  "metadata": {
    "author": "string",
    "description": "string",
    "imageUrl": "string",
    "modified": "string",
    "published": "string",
    "title": "string"
  }
}
Example request
curl --location 'https://gateway.tollbit.com/dev/v2/content/https%3A%2F%2Fexample.com%2Farticle' \
  --header 'TollbitToken: eyJhbGciOiJF...' \
  --header 'User-Agent: my-agent/1.0'
Self Report Usage
Self-report your usage asynchronously. Use your API key and User-Agent (cURL examples use TollbitKey).
Endpoint
POST /tollbit/dev/v2/transactions/selfReport
Base URL: https://gateway.tollbit.com
Authentication: API key via TollbitKey header; User-Agent via UserAgent header (per cURL example).
Request headers
- TollbitKey (string, required): Your API key.
- UserAgent (string, required): Your registered AgentID / user agent.
Request body (JSON)
- idempotencyId (string, required): A unique ID you generate for idempotency when reporting usage. Useful for deduplication and retries.
- usage (array of objects, required): List of usage entries. Each entry:
  - url (string, required): The URL you wish to report usage on.
  - timesUsed (integer, required): Number of times you used this URL.
  - licenseType (string, required): One of ON_DEMAND_LICENSE, ON_DEMAND_FULL_USE_LICENSE, CUSTOM_LICENSE. For CUSTOM_LICENSE include licenseCuid (string) with the license ID.
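The request body rules above can be sketched as a small builder. The helper names `usage_entry` and `self_report_body` are hypothetical, and generating the idempotency ID with `uuid4` is an assumption of this sketch; the page only requires a unique string that you generate.

```python
import uuid

LICENSE_TYPES = {"ON_DEMAND_LICENSE", "ON_DEMAND_FULL_USE_LICENSE", "CUSTOM_LICENSE"}


def usage_entry(url, times_used, license_type, license_cuid=None):
    """Build one usage entry, enforcing the documented licenseType rules."""
    if license_type not in LICENSE_TYPES:
        raise ValueError(f"unknown licenseType: {license_type}")
    entry = {"url": url, "timesUsed": times_used, "licenseType": license_type}
    if license_type == "CUSTOM_LICENSE":
        if not license_cuid:
            raise ValueError("CUSTOM_LICENSE entries need licenseCuid")
        entry["licenseCuid"] = license_cuid
    return entry


def self_report_body(entries, idempotency_id=None):
    """Build the JSON body for POST /tollbit/dev/v2/transactions/selfReport."""
    # Assumption: a random UUID is used when the caller supplies no ID.
    # Reuse the same ID when retrying the same report so it can be deduplicated.
    return {
        "idempotencyId": idempotency_id or str(uuid.uuid4()),
        "usage": list(entries),
    }


body = self_report_body(
    [usage_entry("https://example.com/article", 2, "ON_DEMAND_LICENSE")],
    idempotency_id="1234",
)
print(body)
```

Serializing `body` with `json.dumps` gives the same payload as the cURL example in this section.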
Response format
The API returns an array of transaction results, one per usage entry:
[
  {
    "url": "https://example.com/article",
    "perUnitPriceMicros": 5000,
    "totalUsePriceMicros": 10000,
    "currency": "USD",
    "license": {
      "cuid": "ji73lwbqjnhfgpu1bk3sjzj1",
      "licenseType": "ON_DEMAND_LICENSE",
      "licensePath": "<license_url>",
      "permissions": [{ "name": "PARTIAL_USE" }]
    }
  }
]
Example request
curl --location 'https://gateway.tollbit.com/tollbit/dev/v2/transactions/selfReport' \
  --header 'UserAgent: my-agent/1.0' \
  --header 'TollbitKey: YOUR_API_KEY_HERE' \
  --header 'Content-Type: application/json' \
  --data '{
    "idempotencyId": "1234",
    "usage": [
      {
        "url": "https://example.com/article",
        "timesUsed": 2,
        "licenseType": "ON_DEMAND_LICENSE"
      }
    ]
  }'
List Content Catalog
Paginate through a flattened sitemap of a particular website.
Endpoint
GET /dev/v2/content/{base_url}/catalog/list
Base URL: https://gateway.tollbit.com
Authentication: API key via TollbitKey header (per cURL example).
Request headers
- TollbitKey (string, required): Your API key.
Path parameters
- base_url (string, required): The URL root domain you want to list the content catalog from (e.g. example.com). URL-encode when used in the path.
Optional query parameters
- pageToken (string, optional): Pagination token to fetch the next page (from previous response).
- pageSize (integer, optional): Number of items per page (if supported).
Response format
{
  "pageToken": "example_token",
  "pages": [
    {
      "lastMod": "2025-12-06T02:17:08Z",
      "pageUrl": "https://example.com/page-1",
      "propertyId": "example_id"
    },
    {
      "lastMod": "2025-12-06T02:11:49Z",
      "pageUrl": "https://example.com/page-2",
      "propertyId": "example_id"
    }
  ]
}
Response fields
- pageToken (string): Pagination token to fetch the next page. Omit or empty when no more pages.
- pages (array): Pages in the flattened sitemap, ordered by last modified date (if present) descending.
  - lastMod (string): Last modified date (ISO 8601).
  - pageUrl (string): Full page URL.
  - propertyId (string): Property identifier.
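The pagination contract above (follow `pageToken` until it is missing or empty) might be sketched as follows. `fetch_catalog_page` and `iter_catalog` are hypothetical helpers built on the standard library. The loop takes the fetch function as a parameter, so it is exercised here against in-memory pages rather than the live API.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request, urlopen

BASE_URL = "https://gateway.tollbit.com"


def fetch_catalog_page(base_url, api_key, page_token=None):
    """Fetch one catalog page (performs a network call when invoked)."""
    url = f"{BASE_URL}/dev/v2/content/{quote(base_url, safe='')}/catalog/list"
    if page_token:
        url += "?" + urlencode({"pageToken": page_token})
    request = Request(url, headers={"TollbitKey": api_key})
    with urlopen(request) as response:
        return json.load(response)


def iter_catalog(fetch_page):
    """Yield every page entry, following pageToken until it is missing or empty."""
    page_token = None
    while True:
        page = fetch_page(page_token)
        yield from page.get("pages", [])
        page_token = page.get("pageToken")
        if not page_token:
            break


# In-memory stand-in for the API, keyed by the token used to request each page.
fake_pages = {
    None: {"pageToken": "t2", "pages": [{"pageUrl": "https://example.com/page-1"}]},
    "t2": {"pages": [{"pageUrl": "https://example.com/page-2"}]},
}
urls = [entry["pageUrl"] for entry in iter_catalog(fake_pages.get)]
print(urls)
```

Against the real endpoint, the loop could be driven with `iter_catalog(lambda token: fetch_catalog_page("example.com", "YOUR_API_KEY_HERE", token))`.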
Example request
curl --location 'https://gateway.tollbit.com/dev/v2/content/example.com/catalog/list' \
  --header 'TollbitKey: YOUR_API_KEY_HERE'
Error responses
- 400 Bad Request: Invalid request (e.g. invalid token, invalid parameters, access not allowed).
- 401 Unauthorized: Missing or invalid TollbitKey, or invalid/missing token.
Error responses follow the standard format with detail, instance, status, title, and type.
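A minimal sketch of reading such an error body is shown below. It assumes the body is a JSON object carrying the five named fields; the helper names and the sample values are hypothetical, and the field types are not specified on this page.

```python
import json

ERROR_FIELDS = ("detail", "instance", "status", "title", "type")


def parse_error_body(raw_body):
    """Return the five documented error fields, using None for any that are missing."""
    try:
        data = json.loads(raw_body)
    except (TypeError, ValueError):
        data = {}  # body was empty or not valid JSON
    if not isinstance(data, dict):
        data = {}
    return {field: data.get(field) for field in ERROR_FIELDS}


def describe_error(raw_body):
    """Format an error body as a one-line message for logs."""
    error = parse_error_body(raw_body)
    return f"{error['status']} {error['title']}: {error['detail']}"


# Illustrative body only; the values are made up for this sketch.
sample_error = '{"status": 401, "title": "Unauthorized", "detail": "invalid token"}'
print(describe_error(sample_error))
```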
Notes
- Get Content vs Index Content: Same path and method; the token type (content token vs indexing token) determines the intent. Indexing access may be restricted by the content provider.
- License types: ON_DEMAND_LICENSE, ON_DEMAND_FULL_USE_LICENSE, CUSTOM_LICENSE. For CUSTOM_LICENSE include licenseCuid where required (e.g. Self Report Usage).