SearchCans

API Documentation

Complete guide to SearchCans SERP API and Reader API

API Overview - Dual Engine Platform

SearchCans provides two powerful APIs for AI applications: the SERP API for real-time Google and Bing search results, and the Reader API for converting any URL into clean, LLM-ready Markdown.

Quick Start

Get started with SearchCans API in minutes:

  1. Sign up for a free account
  2. Get your API key from the Dashboard
  3. Make your first API call

Authentication

SearchCans API uses Bearer Token authentication. Include your API key in the request header:

Authorization: Bearer YOUR_API_KEY
            
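In Python, for example, the header can be built as a plain dict and passed to your HTTP client (the key below is a placeholder):

```python
API_KEY = "YOUR_API_KEY"  # placeholder; use your real key from the Dashboard

# Every SearchCans request carries the key as a Bearer token
headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json",
}

# Then pass it to your client, e.g. requests.post(url, headers=headers, json=payload)
```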

SERP API - Google & Bing Search API

Endpoint

POST https://www.searchcans.com/api/search
GET https://www.searchcans.com/api/search

Request Parameters

Parameter | Type | Required | Description
s | string | Yes | Search keyword
t | string | No | Search engine: "google" or "bing" (see examples)
d | number | No | Timeout in milliseconds, default 10000. Recommended: 20000ms for production
w | number | No | Maximum API wait time (ms), default 10000. Recommended: 3000-5000ms. Note: the request continues to occupy a Parallel Lane while waiting.
p | number | No | Page number, default 1

Tip: Set your client-side timeout to 25s+ to prevent premature disconnects.

Request Examples

curl -X POST https://www.searchcans.com/api/search \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "s": "SpaceX Starship",
    "t": "google",
    "p": 1,
    "d": 20000,
    "w": 3000
  }'

Response Example

{
  "code": 0,
  "msg": "Success",
  "data": [
    {
      "title": "Search result title",
      "url": "https://example.com",
      "content": "Search result summary content..."
    }
  ]
}
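A minimal sketch of consuming this response shape in Python; the field names come from the example above, and the helper name is illustrative:

```python
def extract_results(resp: dict) -> list[dict]:
    """Return title/url pairs from a successful SERP API response."""
    if resp.get("code") != 0:
        raise RuntimeError(f"API error: {resp.get('msg')}")
    return [
        {"title": item.get("title", ""), "url": item.get("url", "")}
        for item in resp.get("data", [])
    ]

# Sample response taken from the documentation above
sample = {
    "code": 0,
    "msg": "Success",
    "data": [{"title": "Search result title", "url": "https://example.com",
              "content": "Search result summary content..."}],
}
print(extract_results(sample))
```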
                    

Production-Ready Code Example

Batch keyword search tool with automatic retry and result saving.

Python - Batch Search Tool
import requests
import json
import time
import os
from datetime import datetime

# Configuration
API_KEY = "YOUR_API_KEY"
API_URL = "https://www.searchcans.com/api/search"
KEYWORDS_FILE = "keywords.txt"  # One keyword per line
OUTPUT_DIR = "serp_results"
SEARCH_ENGINE = "google"  # or "bing"
MAX_RETRIES = 3

def load_keywords(filepath):
    """Load keywords from file"""
    if not os.path.exists(filepath):
        print(f" Error: {filepath} not found")
        return []
    
    keywords = []
    with open(filepath, 'r', encoding='utf-8') as f:
        for line in f:
            keyword = line.strip()
            if keyword and not keyword.startswith('#'):
                keywords.append(keyword)
    return keywords

def search_keyword(keyword, page=1):
    """Search single keyword with SERP API"""
    headers = {
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json"
    }
    
    payload = {
        "s": keyword,
        "t": SEARCH_ENGINE,
        "d": 20000,  # 20s server timeout
        "p": page
    }
    
    try:
        response = requests.post(API_URL, headers=headers, json=payload, timeout=30)
        result = response.json()
        
        if result.get("code") == 0:
            print(f" Success: {len(result.get('data', []))} results")
            return result
        else:
            print(f" Failed: {result.get('msg')}")
            return None
            
    except Exception as e:
        print(f" Error: {e}")
        return None

def search_with_retry(keyword):
    """Search with automatic retry"""
    for attempt in range(MAX_RETRIES):
        if attempt > 0:
            print(f"  Retry {attempt}/{MAX_RETRIES-1}...")
            time.sleep(2)
        
        result = search_keyword(keyword)
        if result:
            return result
    
    print(f"  Failed after {MAX_RETRIES} attempts")
    return None

def save_result(keyword, result, output_dir):
    """Save search result"""
    # Create safe filename
    safe_name = "".join(c if c.isalnum() or c in (' ', '-', '_') else '_' for c in keyword)
    safe_name = safe_name[:50]
    
    # Save as individual JSON
    json_file = os.path.join(output_dir, f"{safe_name}.json")
    with open(json_file, 'w', encoding='utf-8') as f:
        json.dump(result, f, ensure_ascii=False, indent=2)
    
    # Also save to summary JSONL
    jsonl_file = os.path.join(output_dir, "all_results.jsonl")
    with open(jsonl_file, 'a', encoding='utf-8') as f:
        record = {
            "keyword": keyword,
            "timestamp": datetime.now().isoformat(),
            "result": result
        }
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
    
    print(f"  Saved: {safe_name}.json")

def main():
    """Main execution"""
    print(" SearchCans SERP API Batch Search Tool")
    print("=" * 60)
    
    # Load keywords
    keywords = load_keywords(KEYWORDS_FILE)
    if not keywords:
        return
    
    total = len(keywords)
    print(f" Loaded {total} keywords\n")
    
    # Create output directory
    timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')
    output_dir = f"{OUTPUT_DIR}_{timestamp}"
    os.makedirs(output_dir, exist_ok=True)
    print(f" Output: {output_dir}/\n")
    
    # Process each keyword
    completed = 0
    for index, keyword in enumerate(keywords, 1):
        print(f"[{index}/{total}] Keyword: {keyword}")
        
        result = search_with_retry(keyword)
        
        if result:
            save_result(keyword, result, output_dir)
            
            # Extract and display URLs
            urls = [item.get("url", "") for item in result.get("data", [])]
            if urls:
                print(f"  {len(urls)} URLs found")
                for i, url in enumerate(urls[:3], 1):
                    print(f"     {i}. {url[:70]}...")
            
            completed += 1
        
        # Concurrency Control (Free tier has 1 lane, so we process serially)
        if index < total:
            time.sleep(1)
    
    # Summary
    print("\n" + "=" * 60)
    print(f" Summary: {completed}/{total} successful")
    print(f" Results saved to: {output_dir}/")

if __name__ == "__main__":
    main()
              
Usage:
  1. Create keywords.txt with one keyword per line
  2. Replace YOUR_API_KEY with your actual API key
  3. Run: python serp_search.py
  4. Results saved as individual JSON files + summary JSONL

Features: Automatic retry on failure, progress tracking, batch result export.

Related Guides & Tutorials

Also Available:

  • Reader API →
  • Turn any URL into clean, LLM-ready Markdown. Perfect for RAG pipelines and AI training data.

Reader API - URL to Markdown Conversion

Why Reader API? The LLM-Ready Solution

Our URL to Markdown solution provides clean, structured, LLM-ready Markdown - the universal format for AI applications:

LLM-ready Markdown output - Perfect for RAG and AI agents
Lossless conversion - Preserves all semantic information
Structured data extraction - Title, author, date, images, links
Enterprise-grade at $0.56/1K - cost-effective and scalable

→ Why Markdown is the universal language for AI applications

Endpoint

POST https://www.searchcans.com/api/url
GET https://www.searchcans.com/api/url

Request Parameters

Parameter | Type | Required | Description
s | string | Yes | URL to extract content from
t | string | Yes | Type, fixed value: url
w | number | No | Wait time after opening URL (ms), default 3000. Recommended: 5000ms+ for slow, JavaScript-heavy sites. Note: the request continues to occupy a Parallel Lane while waiting.
d | number | No | Maximum wait time for API (ms), default 20000. Recommended: 20000-30000ms for complex pages
b | boolean | No | Use browser rendering, default true. Set to true for JavaScript-heavy sites (SPAs, React, Vue apps)
proxy | number | No | Proxy Pool: 0 disabled (default), 1 Shared Pool (+2 credits), 2 Datacenter (+5 credits), 3 Residential (+10 credits). Try standard mode first; escalate proxy tier only when needed.
file | number | No | Document Parsing (Coming Soon). Set to 1 to parse document files (.pdf, .doc, .docx, .xls, .xlsx) into Markdown. +10 credits
image | number | No | Page Screenshot (Coming Soon). 0 disabled (default), 1 capture above-the-fold viewport, 2 capture full-page screenshot. Returns a base64 image in the response. +10 credits
proxyUrl | string | No | Custom Proxy (Coming Soon). Provide your own proxy URL (e.g., http://user:pass@host:port). No additional credits
country | string | No | Geo Targeting (Coming Soon). ISO 3166-1 alpha-2 country code (e.g., "US", "JP", "DE"). Routes the request through a geo-located IP. +1 credit
captcha | number | No | CAPTCHA Solving (Coming Soon). Set to 1 to enable automatic CAPTCHA resolution. +10 credits
waitCSS | string[] | No | Wait for Selector (Coming Soon). Array of CSS selectors; the engine waits until all specified elements appear in the DOM before extraction. No additional credits
excludeCSS | string[] | No | Exclude Selector (Coming Soon). Array of CSS selectors to remove from the page before content extraction. Useful for stripping ads, navbars, or cookie banners. No additional credits
cookies | object[] | No | Inject Cookies (Coming Soon). Array of cookie objects [{name, value, domain}] to inject before page load. Useful for authenticated content extraction. No additional credits
mode | number | No | Render Mode (Coming Soon). 0 Auto (default), 1 Fast (skip JS rendering), 2 DOM Ready (DOMContentLoaded), 3 Page Load (load event), 4 Full Render, 5 Network Idle. Controls rendering depth. No additional credits

Proxy Pool & Access Control

When standard requests are blocked, escalate through our multi-tier proxy infrastructure. Each tier offers progressively higher success rates on restricted sites.

Proxy Tiers

Tier | Parameter | Credits | Use Case
Standard | proxy: 0 | 2 credits | Default, no proxy
Shared Pool | proxy: 1 | +2 credits | Light anti-bot protection
Datacenter | proxy: 2 | +5 credits | Medium restrictions, 98% success
Residential | proxy: 3 | +10 credits | Heavy anti-bot, highest success rate

Migration Note

In earlier versions, proxy: 1 activated Bypass Mode (5 credits). That behavior now maps to proxy: 2 (Datacenter). Please update your integration if you were using the previous binary flag.
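For integrations written against the old binary flag, a hypothetical shim can apply the mapping described above before sending the request (the function name is illustrative, not part of the API):

```python
def migrate_proxy_flag(payload: dict) -> dict:
    """Map the legacy Bypass Mode flag (proxy: 1) to the new Datacenter tier (proxy: 2)."""
    migrated = dict(payload)  # avoid mutating the caller's payload
    if migrated.get("proxy") == 1:
        migrated["proxy"] = 2
    return migrated

old = {"s": "https://example.com", "t": "url", "proxy": 1}
print(migrate_proxy_flag(old))  # proxy becomes 2 (Datacenter)
```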

Best Practice

Always start with standard mode (proxy: 0). Only escalate to a higher proxy tier when you receive an error. This approach maximizes success while minimizing credit consumption.

Example: Progressive Proxy Escalation (Python)

import requests

api_url = "https://www.searchcans.com/api/url"
headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
}

# Start with standard mode (2 credits)
payload = {
    "s": "https://example.com",
    "t": "url",
    "b": True,
    "w": 3000,
    "d": 20000
}

response = requests.post(api_url, json=payload, headers=headers)
result = response.json()

# Escalate through proxy tiers if needed
if result.get("code") != 0:
    for tier in [1, 2, 3]:  # Shared → Datacenter → Residential
        payload["proxy"] = tier
        response = requests.post(api_url, json=payload, headers=headers)
        result = response.json()
        if result.get("code") == 0:
            break

Advanced Extraction Coming Soon

The following capabilities are under active development and will be released as add-on parameters to the Reader API endpoint.

Document Parsing

Parse PDF, Word, and Excel files directly into clean Markdown.

Parameter: file: 1 | +10 credits

Page Screenshot

Capture above-the-fold or full-page screenshots as base64.

Parameter: image: 1|2 | +10 credits

CAPTCHA Solving

Automatic CAPTCHA resolution for protected pages.

Parameter: captcha: 1 | +10 credits

Geo Targeting

Route requests through country-specific IPs (ISO 3166 codes).

Parameter: country: "US" | +1 credit

Render Mode

Control rendering depth: Fast, DOM Ready, Page Load, Full Render, or Network Idle.

Parameter: mode: 0-5 | free

Fine-Grained Control

CSS selectors, cookie injection, and custom proxy support.

Parameters: waitCSS, excludeCSS, cookies, proxyUrl | free

Request Examples

curl -X POST https://www.searchcans.com/api/url \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "s": "https://www.searchcans.com/?utm_source=api_docs&utm_medium=demo_code&utm_campaign=reader_api_test",
    "t": "url",
    "w": 3000,
    "d": 20000,
    "b": true
  }'

Response Example

{
  "code": 0,
  "msg": "Success",
  "data": {
    "title": "SearchCans: Google SERP API ($0.56/1K)",
    "description": "The only SERP API with Parallel Lanes. Scrape Google/Bing & convert URLs to Markdown.",
    "markdown": "[![SearchCans Logo](https://www.searchcans.com/logo/logo.svg)](https://www.searchcans.com/)\n\n# SERP API + Reader API\n\nOur dual-engine platform offers a complete solution for AI Agents.\n\n## Key Features\n\n* **Enterprise-Grade Infrastructure**: 99.99% Uptime SLA at $0.56 per 1,000 searches.\n* **URL to Markdown**: Converts messy web pages into clean structured data.\n\n[Get Started Free](https://www.searchcans.com/register/)\n\n...(content truncated for brevity)",
    "html": "<!DOCTYPE html><html lang=\"en-US\">...</html>"
  },
  "requestId": "req_7f8a9b2c3d"
}
                

Parameter Usage Tips

  • w (wait time): 3000ms (Default) is suitable for most modern websites. Increase to 5000ms+ for heavy SPAs (React/Vue) that load slowly.
  • d (timeout): Set to 20000-30000ms for complex pages with slow loading. If you encounter timeout errors, increase this value.
  • b (browser mode):
    • true: Full JavaScript rendering, slower but complete (2-5s). Recommended for SPAs and dynamic content.
    • false: Fast static extraction, may miss JS-generated content (0.5-1s). Use for simple HTML pages.
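The tips above can be folded into a small helper that picks payload defaults; the `heavy_js` flag and the helper name are illustrative, not part of the API:

```python
def build_reader_payload(url: str, heavy_js: bool = False) -> dict:
    """Build a Reader API payload following the tuning tips above."""
    return {
        "s": url,
        "t": "url",
        "b": True,                          # browser rendering for dynamic content
        "w": 5000 if heavy_js else 3000,    # longer post-load wait for heavy SPAs
        "d": 30000 if heavy_js else 20000,  # more headroom for slow pages
    }

print(build_reader_payload("https://example.com", heavy_js=True))
```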

Production-Ready Example with Error Handling

Python
import requests
import json
import time

def extract_content(url, retry=3):
    """
    Extract content with optimal settings and error handling
    """
    api_url = "https://www.searchcans.com/api/url"
    headers = {
        "Authorization": "Bearer YOUR_API_KEY",
        "Content-Type": "application/json"
    }
    
    payload = {
        "s": url,
        "t": "url",
        "w": 3000,    # Wait 3s for JS (use 5000ms+ for heavy sites)
        "d": 20000,   # 20s server timeout
        "b": True     # Enable browser mode
    }
    
    for attempt in range(retry):
        try:
            # Set timeout higher than 'd' parameter (30s) to avoid premature disconnection
            response = requests.post(
                api_url, 
                headers=headers, 
                json=payload, 
                timeout=35
            )
            result = response.json()
            
            if result["code"] == 0:
                # Parse the data (may be string or object)
                data = result["data"]
                if isinstance(data, str):
                    data = json.loads(data)
                
                return {
                    "title": data.get("title", ""),
                    "markdown": data.get("markdown", ""),
                    "html": data.get("html", ""),
                    "description": data.get("description", "")
                }
            else:
                print(f"API Error: {result.get('msg')}")
                if attempt < retry - 1:
                    time.sleep(2)  # Wait before retry
                    continue
                return None
                
        except requests.exceptions.Timeout:
            print(f"Timeout on attempt {attempt + 1}/{retry}")
            if attempt < retry - 1:
                time.sleep(2)
                continue
            return None
        except Exception as e:
            print(f"Error: {str(e)}")
            return None
    
    return None

# Usage Example
if __name__ == "__main__":
    url = "https://example.com/article"
    content = extract_content(url)
    
    if content:
        print(f"[SUCCESS] Title: {content['title']}")
        print(f"[INFO] Content length: {len(content['markdown'])} characters")
        
        # Save to file
        with open("output.md", "w", encoding="utf-8") as f:
            f.write(f"# {content['title']}\n\n")
            f.write(content['markdown'])
        print("[SUCCESS] Saved to output.md")
    else:
        print("[ERROR] Extraction failed")
              
Complete Data Structure

The Reader API returns a rich data structure with multiple formats:

{
  "title": "Article Title",           // Page title
  "description": "Meta description",  // SEO description
  "markdown": "# Clean content...",   // LLM-ready Markdown
  "html": "<html>...</html>"          // Full HTML source
}

Save in Multiple Formats:

Markdown (.md)

Perfect for LLM training, RAG pipelines, and documentation

HTML (.html)

Full page source for archiving and analysis

JSON (.json)

Structured data with all metadata for databases

Pro Tip: Save all three formats for maximum flexibility. Markdown for AI, HTML for backup, JSON for metadata.
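One way to persist all three formats, assuming a `content` dict shaped like the structure above (the helper name and output layout are illustrative):

```python
import json
from pathlib import Path

def save_all_formats(content: dict, basename: str, out_dir: str = ".") -> list[str]:
    """Write Markdown, HTML, and JSON variants of one extraction result."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    # Markdown for LLM training and RAG pipelines
    md = out / f"{basename}.md"
    md.write_text(f"# {content.get('title', '')}\n\n{content.get('markdown', '')}",
                  encoding="utf-8")
    paths.append(str(md))
    # HTML for archiving and analysis
    html = out / f"{basename}.html"
    html.write_text(content.get("html", ""), encoding="utf-8")
    paths.append(str(html))
    # JSON for metadata and databases
    js = out / f"{basename}.json"
    js.write_text(json.dumps(content, ensure_ascii=False, indent=2), encoding="utf-8")
    paths.append(str(js))
    return paths
```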

Common Errors & Solutions

Error: "Timeout" or "Request timeout"
  • Increase the d parameter to 25000-30000ms
  • Check if the target website is accessible
  • Set your code's timeout higher than the d parameter
Error: "Data parsing failed"
  • Check if response.data is a string; if so, parse it with JSON.parse() (or json.loads() in Python)
  • Validate the response structure before accessing fields
Issue: "Missing content or empty markdown"
  • Set b: true to enable browser rendering
  • Increase w to 5000ms for heavy JavaScript sites
  • Note: Some websites have advanced anti-scraping protection. For sites with high-level anti-bot measures, we cannot guarantee a 100% success rate in content extraction.

Related Guides & Tutorials

Also Available:

  • SERP API (Search API) →
  • Get real-time search results from Google and Bing. Perfect for SEO tools and market research.

Response Codes & Error Handling

All API responses include a code field in the JSON body to indicate the request status. Understanding these codes helps you handle errors effectively.

Code | Meaning | Description
200 | Success | Request processed successfully.
1010 | Lane Limit Exceeded | Concurrency limit reached. Solution: pause briefly (100-500ms) and retry (client-side queueing).
1001 | Response Timeout | Request exceeded wait time. Solution: increase the `d` parameter (e.g., 20000ms).
401 | Unauthorized | Missing or invalid API key. Check the Authorization header.
403 | Forbidden | Quota exceeded or insufficient permissions. Check the Dashboard.
443 | Connection Error | SSL/network handshake failed. Verify network connectivity.
10054 | Connection Reset | Remote host closed the connection. Solution: retry the request.
1009 | Endpoint Lane Limit | Too many requests to this specific endpoint. Reduce parallelism here.
1006 | User Frequency Limit | Burst speed too high. Add a small delay between requests.
1004 | API Frequency Limit | Global API protection triggered. Retry later.
1003 | IP Frequency Limit | IP request rate too high. Rotate IPs or slow down.
1005 | IP Endpoint Limit | IP rate limit for this endpoint.
1002 | System Busy | System under high load. Please retry later.
500 | Internal Server Error | Unexpected server error. Retry or contact support.
9999 | No Results Found | Google hasn't returned any results for this query. Try different keywords or check if the query is too specific.

Pro Tip: Handling Concurrency Limits (1010)

If you receive error 1010, it means your AI agent is running faster than your plan's Parallel Lanes allow. You have two options:

Option A: Client-side Queueing (Free)
  • Don't Stop: This is not a fatal error.
  • Retry: Pause for 100ms–500ms and retry.
  • Result: Fully utilizes 100% of your lane capacity without dropping requests.
Option B: Lane Stacking (Scale Up)
  • Buy More Lanes: Purchase additional plans to increase total concurrency.
  • Additive: one $99 plan (3 lanes) + one $597 plan (22 lanes) = 25 lanes total.
  • Result: Eliminate 1010 errors entirely for your workload. See Pricing.

Best Practices for Error Handling

  • Always check the code field before processing data
  • Implement retry logic with exponential backoff for timeout errors (1001)
  • Log error codes and messages for debugging
  • Set appropriate timeout values based on your use case (SERP API: 10-15s, Reader API: 15-30s)
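A hedged sketch of the retry-with-backoff practice above, treating 1010 as a queueing signal rather than a fatal error; `call_api` stands in for any of the request functions in this guide, and the retryable-code set is an assumption drawn from the table above:

```python
import time
import random

RETRYABLE = {1010, 1001, 1002, 10054}  # lane limit, timeout, system busy, reset

def with_backoff(call_api, max_attempts: int = 5):
    """Retry a SearchCans call on transient error codes with exponential backoff."""
    delay = 0.2  # start near the 100-500ms range suggested for 1010
    for attempt in range(max_attempts):
        result = call_api()
        code = result.get("code")
        if code == 0:
            return result
        if code not in RETRYABLE or attempt == max_attempts - 1:
            return result  # non-retryable, or out of attempts
        time.sleep(delay + random.uniform(0, 0.1))  # jitter avoids synchronized retries
        delay *= 2
```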

Technical Support

If you encounter any issues, please contact us.

Related Resources