SEO Research and Analysis#

The cuery.seo subpackage is a toolkit for SEO research and analysis that combines data from the Google Ads API, Apify web-scraping actors, and AI-powered content analysis. It supports end-to-end research workflows: keyword discovery, SERP analysis, traffic estimation, and competitive intelligence.

Overview#

The SEO subpackage consists of several interconnected modules:

  • Keywords (cuery.seo.keywords): Google Ads API integration for keyword research and historical metrics

  • SERPs (cuery.seo.serps): SERP data collection and analysis using Apify actors

  • Traffic (cuery.seo.traffic): Domain traffic analysis using Similarweb data

  • Tasks (cuery.seo.tasks): AI-powered topic extraction and search intent classification

  • SEO (cuery.seo.seo): High-level orchestrator combining all components

Key Features#

🔍 Keyword Research
  • Generate keyword ideas from seed keywords or landing pages

  • Retrieve historical search volume and trend data

  • Geographic and language targeting

  • Competition analysis and cost-per-click metrics

📊 SERP Analysis
  • Real-time search results scraping for any keyword

  • Competitor presence tracking in search results

  • Organic result analysis with titles, URLs, and snippets

  • Brand monitoring across search results

🚦 Traffic Intelligence
  • Domain-level traffic estimation and trends

  • Traffic source breakdown (direct, search, social, referrals)

  • Engagement metrics (bounce rate, time on site, pages per visit)

  • Global ranking and competitive positioning

🤖 AI-Powered Insights
  • Automated topic extraction from SERP content

  • Search intent classification (informational, navigational, transactional, commercial)

  • Content gap analysis and opportunity identification

  • Hierarchical topic clustering for content strategy

Authentication Setup#

The SEO subpackage requires API credentials for Google Ads, Apify, and AI models. You can configure these using environment variables or configuration files.

Environment Variables#

import os
import json
import cuery.utils

# Google Ads API
os.environ["GOOGLE_ADS_DEVELOPER_TOKEN"] = "your_developer_token"
os.environ["GOOGLE_ADS_LOGIN_CUSTOMER_ID"] = "your_login_customer_id"
os.environ["GOOGLE_ADS_USE_PROTO_PLUS"] = "true"
os.environ["GOOGLE_ADS_CUSTOMER_ID"] = "your_customer_id"

# For service account authentication
with open("path/to/service-account-key.json") as f:
    json_key = json.load(f)
os.environ["GOOGLE_ADS_JSON_KEY"] = json.dumps(json_key)

# Apify for SERP and traffic data
os.environ["APIFY_TOKEN"] = "your_apify_token"

# AI model API keys
cuery.utils.set_api_keys({
    "OpenAI": "your_openai_key",
    "Google": "your_google_key",
})
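Before starting a long run, it can help to confirm that the variables above are actually set. The following is a minimal sketch: the helper `missing_credentials` is hypothetical (not part of cuery), and the list of names simply mirrors the variables shown above.

```python
import os

# Names copied from the environment-variable example above.
# GOOGLE_ADS_JSON_KEY only matters for service-account authentication.
REQUIRED_VARS = (
    "GOOGLE_ADS_DEVELOPER_TOKEN",
    "GOOGLE_ADS_LOGIN_CUSTOMER_ID",
    "GOOGLE_ADS_CUSTOMER_ID",
    "GOOGLE_ADS_JSON_KEY",
    "APIFY_TOKEN",
)


def missing_credentials(env=None, required=REQUIRED_VARS):
    """Return the required variable names that are unset or blank.

    Hypothetical helper for illustration; not a cuery function.
    """
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_credentials()
    if missing:
        print("Missing credentials:", ", ".join(missing))
    else:
        print("All credentials are set.")
```

Passing a plain dictionary as `env` makes the check easy to exercise without touching the real environment.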

Configuration Files#

Alternatively, you can pass credential file paths directly in the configuration:

from cuery.seo import SeoConfig

config = SeoConfig(
    kwd_cfg={
        "google_ads_config": "path/to/google-ads-config.yaml",
        "keywords": ["your keywords"],
    },
    serp_cfg={
        "apify_token": "path/to/apify-token.txt",
    },
    traffic_cfg={
        "apify_token": "path/to/apify-token.txt",
    }
)

Quick Start#

Here’s a simple example to get started with SEO analysis:

from cuery.seo import SeoConfig, seo

# Configure the SEO analysis
config = SeoConfig(
    kwd_cfg={
        "keywords": ["machine learning", "data science"],
        "language": "en",
        "country": "us",
        "ideas": True,
        "max_ideas": 50,
    },
    serp_cfg={
        "resultsPerPage": 10,
        "country": "us",
        "brands": ["your_brand"],
        "competitors": ["competitor1", "competitor2"],
    },
    traffic_cfg={
        "batch_size": 25,
    }
)

# Run the complete SEO analysis
# (top-level await requires a notebook or an async REPL)
result = await seo.seo_data(config)

# The result contains keyword data, SERP results, and traffic insights
print(result.head())
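In a plain Python script there is no running event loop, so the coroutine needs a synchronous entry point. A minimal sketch of that pattern follows. It uses a stand-in coroutine, `fake_seo_data` (hypothetical, so the block runs without credentials), where real code would call `seo.seo_data`.

```python
import asyncio


async def fake_seo_data(config):
    """Hypothetical stand-in for seo.seo_data: one row per configured keyword."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return [{"keyword": kw} for kw in config["keywords"]]


async def main(config):
    # In real use, replace fake_seo_data with seo.seo_data.
    return await fake_seo_data(config)


def run(config):
    """Synchronous entry point for scripts."""
    return asyncio.run(main(config))


if __name__ == "__main__":
    print(run({"keywords": ["machine learning", "data science"]}))
```

`asyncio.run` creates the event loop, runs `main` to completion, and closes the loop again, so it should be called once at the top level of the script.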

Keyword Research#

The keywords module provides access to Google Ads keyword planning data:

from cuery.seo.keywords import GoogleKwdConfig, keywords

# Configure keyword research
kwd_config = GoogleKwdConfig(
    keywords=["SEO tools", "keyword research"],
    language="en",
    country="us",
    ideas=True,
    max_ideas=100,
    metrics_start="2023-01",
    metrics_end="2024-12",
)

# Get keyword ideas and historical metrics
keyword_data = await keywords(kwd_config)
print(keyword_data.columns)
# Output: ['keyword', 'avg_monthly_searches', 'competition', 'low_bid', 'high_bid', ...]

Key features:

  • Keyword Ideas: Generate related keywords from seed terms

  • Historical Metrics: Monthly search volume over time

  • Competition Data: Competition level and bid estimates

  • Geographic Targeting: Country and language specific data

  • Trend Analysis: Search volume trends and seasonality
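To make the trend and seasonality point concrete, here is a small sketch that summarizes a series of monthly search volumes. The input shape (a mapping from "YYYY-MM" strings to volumes) and the function name `summarize_trend` are assumptions for illustration; the column layout actually returned by `keywords` may differ.

```python
def summarize_trend(monthly_volumes):
    """Summarize a {"YYYY-MM": volume} mapping (assumed input shape).

    Returns the peak month, the mean volume, and the relative change
    between the first and last month (None if the first month is zero).
    """
    if not monthly_volumes:
        raise ValueError("monthly_volumes must not be empty")
    months = sorted(monthly_volumes)  # "YYYY-MM" strings sort chronologically
    first = monthly_volumes[months[0]]
    last = monthly_volumes[months[-1]]
    peak_month = max(months, key=lambda m: monthly_volumes[m])
    mean = sum(monthly_volumes.values()) / len(months)
    change = (last - first) / first if first else None
    return {"peak_month": peak_month, "mean": mean, "change": change}


if __name__ == "__main__":
    volumes = {"2023-01": 1000, "2023-02": 1500, "2023-03": 1250}
    print(summarize_trend(volumes))
```

A peak that recurs in the same calendar month across years suggests seasonality, whereas a steady first-to-last change points to a longer-term trend.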

SERP Analysis#

The serps module collects and analyzes search engine results:

from cuery.seo.serps import SerpConfig, serps

# Configure SERP analysis
serp_config = SerpConfig(
    resultsPerPage=20,
    country="us",
    searchLanguage="en",
    brands=["your_brand"],
    competitors=["competitor1", "competitor2"],
    topic_model="google/gemini-2.5-flash-preview-05-20",
)

# Analyze SERPs for keywords
keywords_list = ["SEO analysis", "SERP tracking"]
serp_data = await serps(keywords_list, serp_config)

Key features:

  • Real-time SERP Data: Fresh search results for any keyword

  • Competitor Tracking: Monitor competitor presence in results

  • Brand Monitoring: Track your brand’s search visibility

  • AI Topic Analysis: Automated topic extraction from SERP content

  • Intent Classification: Categorize search intent automatically
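As an illustration of competitor tracking, the sketch below finds the best rank at which each competitor appears in a ranked list of organic results. The result shape (a list of dictionaries with a `url` key), the substring matching rule, and the function name `competitor_ranks` are all assumptions; the serps module may represent and match results differently.

```python
from urllib.parse import urlparse


def competitor_ranks(organic_results, competitors):
    """Map each competitor name to the best (lowest) rank where it appears.

    organic_results: list of {"url": ...} dicts in ranked order (assumed shape).
    competitors: names matched as substrings of each result's hostname.
    Competitors that never appear are omitted from the result.
    """
    ranks = {}
    for position, result in enumerate(organic_results, start=1):
        host = (urlparse(result["url"]).hostname or "").lower()
        for name in competitors:
            if name.lower() in host and name not in ranks:
                ranks[name] = position
    return ranks


if __name__ == "__main__":
    results = [
        {"url": "https://www.example.com/a"},
        {"url": "https://blog.hubspot.com/x"},
        {"url": "https://moz.com/learn"},
    ]
    print(competitor_ranks(results, ["hubspot", "moz", "ahrefs"]))
```

Substring matching is deliberately crude (for example, "moz" also matches "mozilla.org"), so a stricter comparison on the registered domain is advisable for production use.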

Traffic Analysis#

The traffic module provides domain-level traffic insights:

import pandas as pd
from cuery.seo.traffic import TrafficConfig, keyword_traffic

# Configure traffic analysis
traffic_config = TrafficConfig(
    batch_size=50,
)

# Get traffic data for domains from SERP results
keywords_series = pd.Series(["keyword1", "keyword2"])
domain_lists = [["example.com", "competitor.com"], ["another.com"]]

traffic_data = await keyword_traffic(keywords_series, domain_lists, traffic_config)

Key features:

  • Traffic Estimation: Monthly visitor estimates for domains

  • Source Breakdown: Direct, search, social, and referral traffic

  • Engagement Metrics: Bounce rate, time on site, pages per visit

  • Global Rankings: Worldwide traffic rankings

  • Competitive Analysis: Compare traffic across multiple domains
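The pairing of keywords with lists of ranking domains can be pictured as a simple aggregation. The sketch below is not how `keyword_traffic` is implemented; it only illustrates the idea under assumed inputs, with a hypothetical `visits_by_domain` mapping standing in for the traffic data fetched per domain.

```python
def keyword_traffic_summary(keywords, domain_lists, visits_by_domain):
    """For each keyword, aggregate visits over the domains ranking for it.

    visits_by_domain: {"domain": monthly_visits} (hypothetical input).
    Domains without traffic data are skipped. Totals and maxima are None
    when no domain of a keyword has data.
    """
    if len(keywords) != len(domain_lists):
        raise ValueError("keywords and domain_lists must have the same length")
    rows = []
    for keyword, domains in zip(keywords, domain_lists):
        visits = [visits_by_domain[d] for d in domains if d in visits_by_domain]
        rows.append({
            "keyword": keyword,
            "visits_total": sum(visits) if visits else None,
            "visits_max": max(visits) if visits else None,
        })
    return rows


if __name__ == "__main__":
    print(keyword_traffic_summary(
        ["keyword1", "keyword2"],
        [["example.com", "competitor.com"], ["another.com"]],
        {"example.com": 100, "competitor.com": 300},
    ))
```

Comparing the maximum with the total gives a rough sense of whether one dominant site or many mid-sized sites hold the results for a keyword.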

AI-Powered Analysis#

The tasks module provides intelligent analysis of SERP data:

from cuery.seo.tasks import SerpTopicExtractor, SerpTopicAndIntentAssigner

# Extract topics from SERP data
topic_extractor = SerpTopicExtractor(
    model="google/gemini-2.5-flash-preview-05-20",
    n_topics=10,
    n_subtopics=5,
)

# Assign topics and intent to keywords
intent_assigner = SerpTopicAndIntentAssigner(
    model="openai/gpt-4.1-mini",
)

Key features:

  • Topic Extraction: Hierarchical topic identification from SERP content

  • Intent Classification: Automatic categorization into informational, navigational, transactional, or commercial intent

  • Content Analysis: Analysis of page titles, domains, and breadcrumbs

  • Semantic Understanding: AI-powered understanding of search context
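To show what the four intent labels mean in practice, here is a deliberately simple rule-based baseline. It is not the library's method, which uses AI models. The cue words and the function name `guess_intent` are illustrative assumptions, and a heuristic like this is best treated as a sanity check for model output, not a replacement.

```python
# Illustrative cue words only; checked in this order, first match wins.
INTENT_CUES = {
    "transactional": ("buy", "order", "coupon", "discount", "price"),
    "commercial": ("best", "top", "review", "vs", "compare"),
    "navigational": ("login", "sign in", "website", "official"),
}


def guess_intent(keyword):
    """Guess search intent from cue words; default to 'informational'."""
    words = keyword.lower().split()
    text = " ".join(words)
    for intent, cues in INTENT_CUES.items():
        for cue in cues:
            # Multi-word cues match as phrases, single words as whole tokens.
            if (" " in cue and cue in text) or cue in words:
                return intent
    return "informational"


if __name__ == "__main__":
    for kw in ["buy running shoes", "best seo tools", "github login", "what is seo"]:
        print(kw, "->", guess_intent(kw))
```

Disagreements between such a baseline and the model's labels are a cheap way to find keywords worth reviewing by hand.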

Complete Workflow Example#

Here’s a comprehensive example combining all SEO components:

import pandas as pd
from cuery.seo import SeoConfig, seo

# Define your research parameters
target_keywords = [
    "content marketing strategy",
    "SEO best practices",
    "digital marketing tools"
]

# Configure the complete SEO analysis
seo_config = SeoConfig(
    kwd_cfg={
        "keywords": target_keywords,
        "ideas": True,
        "max_ideas": 200,
        "language": "en",
        "country": "us",
        "metrics_start": "2023-01",
        "metrics_end": "2024-12",
    },
    serp_cfg={
        "resultsPerPage": 20,
        "country": "us",
        "searchLanguage": "en",
        "brands": ["your_company"],
        "competitors": [
            "hubspot",
            "semrush",
            "ahrefs",
            "moz"
        ],
        "topic_model": "google/gemini-2.5-flash-preview-05-20",
        "assignment_model": "openai/gpt-4.1-mini",
    },
    traffic_cfg={
        "batch_size": 50,
    }
)

# Run the complete analysis
results = await seo.seo_data(seo_config)

# Analyze the results
print("Keyword Analysis:")
print(f"Total keywords analyzed: {len(results)}")
print(f"Average monthly searches: {results['avg_monthly_searches'].mean():.0f}")

print("\nTop performing competitors:")
competitor_presence = results.groupby('competitor_domains')['avg_monthly_searches'].sum().sort_values(ascending=False)
print(competitor_presence.head())

print("\nSearch intent distribution:")
intent_dist = results['search_intent'].value_counts()
print(intent_dist)

# Save results for further analysis
results.to_csv("seo_analysis_results.csv", index=False)

Data Export and Integration#

Results from SEO analysis can be easily exported and integrated with other tools:

# Export to various formats
results.to_csv("seo_data.csv", index=False)
results.to_parquet("seo_data.parquet", index=False)  # requires pyarrow or fastparquet
results.to_excel("seo_data.xlsx", index=False)  # requires openpyxl

# Integration with visualization tools
import plotly.express as px

# Search volume trends
monthly_data = results.groupby('metrics_month')['avg_monthly_searches'].sum().reset_index()
fig = px.line(monthly_data, x='metrics_month', y='avg_monthly_searches',
              title='Search Volume Trends')
fig.show()

# Competitor analysis
competitor_data = results.groupby('competitor_domains')['traffic_visits_max'].sum().reset_index()
fig = px.bar(competitor_data.head(10), x='competitor_domains', y='traffic_visits_max',
             title='Top Competitors by Traffic')
fig.show()

Best Practices#

Rate Limiting and Quotas#

  • Google Ads API: Respect daily quota limits and implement exponential backoff

  • Apify: Monitor credit usage and implement batch processing for large datasets

  • AI Models: Use appropriate models for different tasks (fast models for classification, powerful models for content generation)
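Exponential backoff can be sketched in a few lines of asyncio. The helper below is a generic illustration, not part of cuery: the name `retry_with_backoff` and its parameters are hypothetical, and real code would restrict `retry_on` to the specific rate-limit or quota exceptions raised by the client in use.

```python
import asyncio
import random


async def retry_with_backoff(call, retries=5, base_delay=1.0, max_delay=60.0,
                             retry_on=(Exception,), sleep=asyncio.sleep):
    """Await call() and retry on failure with exponential backoff and jitter.

    The delay before retry n (0-based) is drawn uniformly from
    [0, min(max_delay, base_delay * 2**n)], often called "full jitter".
    """
    for attempt in range(retries + 1):
        try:
            return await call()
        except retry_on:
            if attempt == retries:
                raise  # out of retries: surface the last error
            cap = min(max_delay, base_delay * 2 ** attempt)
            await sleep(random.uniform(0, cap))


if __name__ == "__main__":
    async def demo():
        attempts = {"n": 0}

        async def flaky():
            attempts["n"] += 1
            if attempts["n"] < 3:
                raise RuntimeError("simulated rate limit")
            return "ok"

        print(await retry_with_backoff(flaky, base_delay=0.01))

    asyncio.run(demo())
```

The random jitter spreads retries from concurrent tasks over time, so they do not all hit the API again at the same moment.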

Data Quality#

  • Keyword Validation: Clean and normalize keywords before analysis

  • Domain Cleaning: Use the built-in domain normalization functions

  • Result Filtering: Filter out irrelevant or low-quality results
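The cleaning steps above might look like the following sketch. Both helpers are hypothetical stand-ins written with the standard library only; where cuery's built-in domain normalization is available, prefer it over `normalize_domain` below.

```python
import re
from urllib.parse import urlparse


def normalize_keywords(keywords):
    """Lower-case, collapse whitespace, and drop blanks and duplicates.

    The order of first occurrences is preserved.
    """
    seen, cleaned = set(), []
    for kw in keywords:
        norm = re.sub(r"\s+", " ", kw).strip().lower()
        if norm and norm not in seen:
            seen.add(norm)
            cleaned.append(norm)
    return cleaned


def normalize_domain(url_or_domain):
    """Reduce a URL or bare domain to a lower-case hostname without 'www.'."""
    value = url_or_domain.strip()
    if "://" not in value:
        value = "//" + value  # lets urlparse treat a bare domain as the netloc
    host = (urlparse(value).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


if __name__ == "__main__":
    print(normalize_keywords(["  SEO  Tools ", "seo tools", "Keyword\tResearch"]))
    print(normalize_domain("https://www.Example.com/path?q=1"))
```

Normalizing before analysis prevents near-duplicate keywords and domain variants from splitting counts across several rows.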

Performance Optimization#

  • Batch Processing: Use appropriate batch sizes for your use case

  • Concurrent Requests: Leverage async processing for faster execution

  • Caching: Implement caching for repeated analyses

  • Data Storage: Use efficient formats like Parquet for large datasets
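Batch processing and concurrent requests combine naturally: split the input into batches, then cap how many batches are in flight at once. The sketch below uses hypothetical helper names (`batched`, `process_in_batches`) and a caller-supplied `worker` coroutine; it does not reflect how cuery schedules requests internally.

```python
import asyncio


def batched(items, batch_size):
    """Split a list into consecutive chunks of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


async def process_in_batches(items, worker, batch_size=25, max_concurrent=3):
    """Run worker(batch) for every batch, at most max_concurrent at a time.

    Results are returned in batch order.
    """
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(batch):
        async with semaphore:  # blocks while max_concurrent batches are running
            return await worker(batch)

    return await asyncio.gather(*(guarded(b) for b in batched(items, batch_size)))


if __name__ == "__main__":
    async def lookup(batch):
        await asyncio.sleep(0.01)  # stands in for a network request
        return [domain.upper() for domain in batch]

    domains = ["a.com", "b.com", "c.com", "d.com", "e.com"]
    print(asyncio.run(process_in_batches(domains, lookup, batch_size=2)))
```

Smaller batches with a modest concurrency limit tend to be gentler on rate limits, while larger batches reduce per-request overhead; the right balance depends on the quota of the service being called.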

Error Handling#

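import logging
from cuery.seo import seo

logger = logging.getLogger(__name__)

try:
    results = await seo.seo_data(config)  # config: SeoConfig from Quick Start
except Exception as e:
    logger.error(f"SEO analysis failed: {e}")
    # Implement fallback or retry logic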

Troubleshooting#

Common Issues#

Authentication Errors
  • Verify API credentials are correctly set

  • Check quota limits and billing status

  • Ensure service accounts have proper permissions

Rate Limiting
  • Reduce batch sizes

  • Implement delays between requests

  • Use exponential backoff for retries

Data Quality Issues
  • Validate input keywords for special characters

  • Check geographic and language settings

  • Filter results based on relevance scores

Performance Issues
  • Optimize batch sizes for your infrastructure

  • Use async processing for I/O bound operations

  • Consider data sampling for large datasets

API Reference#

For detailed API documentation, see the auto-generated reference documentation for the cuery.seo modules.