cuery.seo.traffic
=================

.. py:module:: cuery.seo.traffic

.. autoapi-nested-parse::

   Domain traffic analysis and aggregation using Similarweb data via Apify actors.

   This module provides comprehensive website traffic analysis capabilities by integrating
   with Similarweb data through Apify's web scraping infrastructure. It enables large-scale
   collection of domain-level traffic metrics including visitor counts, engagement metrics,
   traffic sources, and global rankings. The module is particularly useful for competitive
   analysis, market research, and understanding traffic patterns across multiple domains.

   Key features include batch processing of domain URLs for efficient data collection,
   automatic domain extraction and normalization from various URL formats, traffic source
   breakdown (direct, search, social, referrals), and aggregation functions for keyword-based
   traffic analysis. The module handles rate limiting and error recovery to ensure reliable
   data collection, making it suitable for analyzing hundreds or thousands of domains
   in SEO and competitive intelligence workflows.


Classes
-------

.. autoapisummary::

   cuery.seo.traffic.TrafficConfig


Functions
---------

.. autoapisummary::

   cuery.seo.traffic.domain
   cuery.seo.traffic.fetch_batch
   cuery.seo.traffic.fetch_domain_traffic
   cuery.seo.traffic.normalize_traffic
   cuery.seo.traffic.aggregate_traffic
   cuery.seo.traffic.keyword_traffic


Module Contents
---------------

.. py:class:: TrafficConfig(/, **data)

   Bases: :py:obj:`cuery.utils.Configurable`


   Configuration for fetching domain traffic data using the Apify Similarweb scraper actor.


   .. py:attribute:: batch_size
      :type:  int
      :value: 100


      Number of domain URLs to fetch in a single batch.


   .. py:attribute:: apify_token
      :type:  str | pathlib.Path | None
      :value: None


      Path to Apify API token file.
      If not provided, the ``APIFY_TOKEN`` environment variable is used.


.. py:function:: domain(url)

   Extract a clean domain name from a URL.
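
The following is a minimal sketch of what domain cleaning could look like, using only the standard library. It is not the implementation of ``domain``; the helper name ``clean_domain`` and its exact rules (lowercasing, dropping the port and a leading ``www.``) are assumptions made for illustration.

```python
from urllib.parse import urlparse


def clean_domain(url: str) -> str:
    """Hypothetical helper: reduce a URL to a bare, lowercase host name."""
    url = url.strip()
    # urlparse only fills in the network location when "//" is present,
    # so bare inputs such as "example.com/path" get a "//" prefix first.
    if "://" not in url:
        url = "//" + url
    # .hostname is lowercased and excludes any port or credentials.
    host = urlparse(url).hostname or ""
    # Assumption: "www.example.com" and "example.com" count as the same domain.
    if host.startswith("www."):
        host = host[len("www."):]
    return host


print(clean_domain("https://www.Example.com:8080/pricing?ref=1"))
print(clean_domain("example.com/blog"))
```

Normalizing before fetching matters because, as noted for ``fetch_domain_traffic``, traffic data is available at the domain level only, so several URLs on one site collapse to a single lookup.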


.. py:function:: fetch_batch(urls, client, **kwargs)
   :async:


   Process a single batch of URLs.


.. py:function:: fetch_domain_traffic(urls, cfg)
   :async:


   Fetch traffic data for a DataFrame of organic SERP results.

   Note that free Similarweb crawlers only fetch data at the domain level, not for specific URLs.

   Actor: https://apify.com/tri_angle/fast-similarweb-scraper
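
The batching pattern implied by ``batch_size`` and ``fetch_batch`` can be sketched as below. This is an illustration under stated assumptions, not the module's code: ``fake_fetch_batch`` stands in for the real Apify call, ``chunks`` and ``fetch_all`` are hypothetical names, and running batches concurrently is a choice of this sketch (the real function may process them sequentially).

```python
import asyncio


def chunks(items, size):
    """Split a list into consecutive slices of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


async def fake_fetch_batch(urls):
    """Stand-in for a real actor call; returns one placeholder row per URL."""
    await asyncio.sleep(0)  # yield control, as a network call would
    return [{"domain": url, "visits": None} for url in urls]


async def fetch_all(urls, batch_size=100):
    """Fetch all batches concurrently and flatten the results in order."""
    batches = chunks(urls, batch_size)
    results = await asyncio.gather(*(fake_fetch_batch(b) for b in batches))
    return [row for batch in results for row in batch]


rows = asyncio.run(fetch_all([f"site{i}.com" for i in range(250)], batch_size=100))
print(len(rows))
```

``asyncio.gather`` returns results in the order the awaitables were passed, so the flattened rows line up with the input URLs.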


.. py:function:: normalize_traffic(df)

   Process traffic data into a flat DataFrame containing only the relevant fields.


.. py:function:: aggregate_traffic(df, by)

   Aggregate traffic data for each keyword's top domains.

   Note: for now we don't keep Similarweb's categorization of domains or top keyword data.
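
As a rough illustration of per-keyword aggregation, the sketch below groups plain dictionaries by a key and summarizes one metric. It is an assumption-laden stand-in: the real ``aggregate_traffic`` operates on a DataFrame, and the column names (``keyword``, ``visits``), the helper name ``aggregate_by``, and the chosen statistics are all hypothetical.

```python
from collections import defaultdict


def aggregate_by(rows, by, metric="visits"):
    """Hypothetical helper: total, mean and count of `metric` per `by` value."""
    grouped = defaultdict(list)
    for row in rows:
        value = row.get(metric)
        # Skip rows with no data for the metric (e.g. domains that were not found).
        if value is not None:
            grouped[row[by]].append(value)
    return {
        key: {"total": sum(vals), "mean": sum(vals) / len(vals), "n": len(vals)}
        for key, vals in grouped.items()
    }


rows = [
    {"keyword": "running shoes", "domain": "a.com", "visits": 100},
    {"keyword": "running shoes", "domain": "b.com", "visits": 200},
    {"keyword": "trail shoes", "domain": "a.com", "visits": 50},
    {"keyword": "trail shoes", "domain": "c.com", "visits": None},
]
summary = aggregate_by(rows, by="keyword")
print(summary["running shoes"])
```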


.. py:function:: keyword_traffic(kwds, urls, cfg)
   :async:


   Fetch and aggregate traffic data for lists of URLs associated with the given keywords.