CosmosQL
Guide

Aggregations

Complete guide to using aggregation functions in CosmosQL - count, sum, average, min, max, and group by operations with full type safety.

Complete guide to using aggregation functions in CosmosQL - count, sum, average, min, max, and group by operations with full type safety.

Table of Contents


Overview

CosmosQL provides a complete set of aggregation functions that are:

  • Fully type-safe - TypeScript infers exact result types
  • SQL-compatible - Translates to efficient Cosmos DB SQL
  • Partition-aware - Optimized for Cosmos DB's partitioning
  • Composable - Combine multiple aggregations in one query

Available Aggregations

OperationDescriptionWorks With
_countCount documentsAll fields
_sumSum valuesNumber fields
_avgAverage valuesNumber fields
_minFind minimumAll fields
_maxFind maximumAll fields

Count Operations

Basic Count

Count documents in a partition:

import { createClient, container, field } from 'cosmosql';

const orderSchema = container('orders', {
  id: field.string(),
  customerId: field.string(),
  amount: field.number(),
  status: field.string(),
}).partitionKey('customerId');

const client = createClient({
  connectionString: process.env.COSMOS_CONNECTION_STRING!,
  database: 'mydb',
});

const db = await client.withContainers({ orders: orderSchema });

// Count orders for a customer
const count = await db.orders.count({
  partitionKey: 'customer-123',
});
// Result: 42

Count with Filters

Add WHERE conditions to count specific documents:

// Count only completed orders
const completedCount = await db.orders.count({
  partitionKey: 'customer-123',
  where: {
    status: 'completed',
  },
});

// Count with complex filters
const recentHighValueCount = await db.orders.count({
  partitionKey: 'customer-123',
  where: {
    status: 'completed',
    amount: { gte: 100 },
    createdAt: { gte: new Date('2024-01-01') },
  },
});

Cross-Partition Count

⚠️ Use sparingly - Cross-partition queries are expensive in Cosmos DB.

// Count ALL orders across all customers
const totalOrders = await db.orders.count({
  enableCrossPartitionQuery: true,
});

// Cross-partition count with filter
const allCompletedOrders = await db.orders.count({
  enableCrossPartitionQuery: true,
  where: { status: 'completed' },
});

Count Specific Fields

Count non-null values in specific fields:

const result = await db.orders.aggregate({
  partitionKey: 'customer-123',
  _count: {
    select: {
      amount: true,        // Count orders with amount
      discount: true,      // Count orders with discount (might be optional)
      trackingNumber: true,
    },
  },
});

// Result type is inferred:
// {
//   _count: {
//     amount: number;
//     discount: number;
//     trackingNumber: number;
//   }
// }

Aggregate Operations

Single Aggregation

// Sum all order amounts for a customer
const result = await db.orders.aggregate({
  partitionKey: 'customer-123',
  _sum: {
    amount: true,
  },
});

// Result: { _sum: { amount: 1250.50 | null } }

Multiple Aggregations

Combine multiple operations in one query (more efficient than separate queries):

const stats = await db.orders.aggregate({
  partitionKey: 'customer-123',
  _count: true,                    // Total count
  _sum: { amount: true },          // Total revenue
  _avg: { amount: true },          // Average order value
  _min: { amount: true },          // Smallest order
  _max: { amount: true },          // Largest order
});

// Result type (fully inferred):
// {
//   _count: number;
//   _sum: { amount: number | null };
//   _avg: { amount: number | null };
//   _min: { amount: number | null };
//   _max: { amount: number | null };
// }

console.log(`
  Orders: ${stats._count}
  Total Revenue: $${stats._sum.amount}
  Average Order: $${stats._avg.amount}
  Range: $${stats._min.amount} - $${stats._max.amount}
`);

Aggregations with Filters

// Stats for completed orders only
const completedStats = await db.orders.aggregate({
  partitionKey: 'customer-123',
  where: { status: 'completed' },
  _count: true,
  _sum: { amount: true },
  _avg: { amount: true },
});

// Year-to-date statistics
const ytdStats = await db.orders.aggregate({
  partitionKey: 'customer-123',
  where: {
    createdAt: { gte: new Date('2024-01-01') },
  },
  _sum: { amount: true, tax: true, shipping: true },
  _count: true,
});

Multiple Field Aggregations

const productSchema = container('products', {
  id: field.string(),
  category: field.string(),
  price: field.number(),
  cost: field.number(),
  stock: field.number(),
  weight: field.number(),
}).partitionKey('category');

const stats = await db.products.aggregate({
  partitionKey: 'electronics',
  _sum: {
    stock: true,    // Total inventory
    price: true,    // Total value at retail
    cost: true,     // Total cost basis
  },
  _avg: {
    price: true,    // Average price
    weight: true,   // Average weight
  },
  _min: {
    price: true,    // Cheapest product
  },
  _max: {
    price: true,    // Most expensive product
  },
});

// Calculate profit margin
const totalProfit = stats._sum.price! - stats._sum.cost!;
const margin = (totalProfit / stats._sum.price!) * 100;
console.log(`Profit margin: ${margin.toFixed(2)}%`);

findMany with Aggregations

The findMany method supports aggregations, allowing you to fetch data and calculate statistics in a single, efficient query. This combines the power of data retrieval with aggregation operations.

Return Types

Without Aggregations (Backward Compatible):

// No aggregate parameter
const users = await db.users.findMany({ partitionKey: "john@example.com" });
// Type: User[]

const selectedUsers = await db.users.findMany({ 
  partitionKey: "john@example.com",
  select: { id: true, name: true }
});
// Type: Array<{ id: string, name: string }>

With Aggregations:

// With aggregate parameter
const result = await db.users.findMany({
  partitionKey: "john@example.com",
  aggregate: { _count: true }
});
// Type: { data: User[], _count: number }

Basic Usage

const result = await db.users.findMany({
  partitionKey: "john@example.com",
  where: { isActive: true },
  aggregate: {
    _count: true,
    _avg: { age: true }
  }
});

console.log(`Found ${result._count} active users`);
console.log(`Average age: ${result._avg.age}`);
console.log(`Users:`, result.data);

Multiple Aggregations

Combine data retrieval with comprehensive statistics:

const stats = await db.orders.findMany({
  partitionKey: "customer-123",
  aggregate: {
    _count: true,
    _sum: { totalAmount: true },
    _avg: { totalAmount: true },
    _min: { createdAt: true },
    _max: { createdAt: true }
  }
});

console.log(`Order Statistics:`);
console.log(`  Total orders: ${stats._count}`);
console.log(`  Total revenue: ${stats._sum.totalAmount}`);
console.log(`  Average order: ${stats._avg.totalAmount}`);
console.log(`  First order: ${stats._min.createdAt}`);
console.log(`  Latest order: ${stats._max.createdAt}`);
console.log(`  Orders: ${stats.data.length} items`);

With Filtering and Selection

const result = await db.products.findMany({
  partitionKey: "category-electronics",
  where: { 
    price: { gte: 100 },
    inStock: true 
  },
  select: { id: true, name: true, price: true },
  aggregate: {
    _count: true,
    _avg: { price: true },
    _min: { price: true },
    _max: { price: true }
  }
});

// Only selected fields in data
console.log(result.data); // Array<{ id: string, name: string, price: number }>

// All aggregations available
console.log(`Expensive products: ${result._count}`);
console.log(`Price range: ${result._min.price} - ${result._max.price}`);

Paginated Results with Total Count

Perfect for building paginated UIs with accurate page counts:

const result = await db.users.findMany({
  partitionKey: "john@example.com",
  take: 10,  // Get first 10 users
  skip: 0,
  aggregate: {
    _count: true,  // Count ALL users matching query
  }
});

console.log(`Showing ${result.data.length} of ${result._count} total users`);

// Calculate total pages
const totalPages = Math.ceil(result._count / 10);
console.log(`Page 1 of ${totalPages}`);

Cross-Partition Queries

const result = await db.products.findMany({
  where: { category: "electronics" },
  enableCrossPartitionQuery: true,
  aggregate: {
    _count: true,
    _avg: { price: true }
  }
});

console.log(`Electronics across all partitions:`);
console.log(`  Total products: ${result._count}`);
console.log(`  Average price: ${result._avg.price}`);

Behavior & Performance

Parallel Execution:

  • Data query and aggregation query execute in parallel
  • Both queries use the same where, partitionKey, and enableCrossPartitionQuery filters
  • Results are combined into a single response object

Filtering Rules:

  • where clause: Applied to both data and aggregation queries
  • orderBy, take, skip: Applied only to data query (aggregations count all matching records)
  • select: Applied only to data query (aggregations work on full records)

Comparison with Separate Methods

// Before: Two separate queries
const users = await db.users.findMany({ partitionKey: "john@example.com" });
const stats = await db.users.aggregate({
  partitionKey: "john@example.com",
  _count: true,
  _avg: { age: true }
});

// After: Single efficient query
const result = await db.users.findMany({
  partitionKey: "john@example.com",
  aggregate: {
    _count: true,
    _avg: { age: true }
  }
});
// result.data contains users
// result._count and result._avg contain aggregations

When to Use Each Approach:

Use findMany with aggregate:

  • When you need both data and aggregations
  • For paginated results with total counts
  • When filtering is shared between data and aggregations

Use separate aggregate() method:

  • When you only need aggregations (no data)
  • For complex group-by operations
  • When data and aggregations have different filters

All Aggregation Operations

Count Records:

const result = await db.users.findMany({
  partitionKey: "john@example.com",
  aggregate: { _count: true }
});
// result._count: number

// Count specific fields (non-null values)
const result = await db.users.findMany({
  partitionKey: "john@example.com",
  aggregate: {
    _count: { select: { name: true, email: true } }
  }
});
// result._count: { name: number, email: number }

Sum, Average, Min, Max:

const result = await db.orders.findMany({
  partitionKey: "customer-123",
  aggregate: {
    _sum: { amount: true, tax: true },
    _avg: { amount: true },
    _min: { amount: true, createdAt: true },
    _max: { amount: true, createdAt: true }
  }
});

// All typed correctly based on field types
// number fields return number | null
// date fields return Date | null

Full TypeScript Support

const result = await db.users.findMany({
  partitionKey: "john@example.com",
  aggregate: {
    _count: true,
    _avg: { age: true },
    _sum: { score: true }
  }
});

// Type: { 
//   data: User[], 
//   _count: number,
//   _avg: { age: number | null },
//   _sum: { score: number | null }
// }

Group By Operations

Group By is the most powerful aggregation feature - it splits data into groups and performs aggregations on each group.

Group by Single Field

const salesSchema = container('sales', {
  id: field.string(),
  storeId: field.string(),
  category: field.string(),
  amount: field.number(),
  quantity: field.number(),
}).partitionKey('storeId');

// Sales by category
const categoryStats = await db.sales.groupBy({
  by: 'category',
  enableCrossPartitionQuery: true,
  _count: true,
  _sum: { amount: true, quantity: true },
  _avg: { amount: true },
});

// Result type (fully inferred):
// Array<{
//   category: string;
//   _count: number;
//   _sum: { amount: number | null; quantity: number | null };
//   _avg: { amount: number | null };
// }>

// Use the results
for (const group of categoryStats) {
  console.log(`${group.category}:
    Sales: $${group._sum.amount}
    Units: ${group._sum.quantity}
    Avg: $${group._avg.amount}
  `);
}

Group by Multiple Fields

Use an array to group by multiple fields:

// Sales by store AND category
const detailedStats = await db.sales.groupBy({
  by: ['storeId', 'category'] as const,  // 'as const' for better type inference
  enableCrossPartitionQuery: true,
  _count: true,
  _sum: { amount: true },
});

// Result type:
// Array<{
//   storeId: string;
//   category: string;
//   _count: number;
//   _sum: { amount: number | null };
// }>

// Find best performing store-category combination
const topPerformer = detailedStats.reduce((max, group) => 
  (group._sum.amount || 0) > (max._sum.amount || 0) ? group : max
);

console.log(`Top: ${topPerformer.storeId} - ${topPerformer.category}`);

Group By with Ordering

Sort results by aggregation values:

// Top categories by revenue
const topCategories = await db.sales.groupBy({
  by: 'category',
  enableCrossPartitionQuery: true,
  _count: true,
  _sum: { amount: true },
  orderBy: {
    _sum_amount: 'desc',  // Order by aggregated field
  },
  take: 10,  // Top 10
});

// Categories with most transactions
const popularCategories = await db.sales.groupBy({
  by: 'category',
  enableCrossPartitionQuery: true,
  _count: true,
  orderBy: {
    _count: 'desc',
  },
  take: 5,
});

Group By with Pagination

// Paginated category report
const page1 = await db.sales.groupBy({
  by: 'category',
  enableCrossPartitionQuery: true,
  _sum: { amount: true },
  orderBy: { category: 'asc' },
  take: 20,
  skip: 0,
});

const page2 = await db.sales.groupBy({
  by: 'category',
  enableCrossPartitionQuery: true,
  _sum: { amount: true },
  orderBy: { category: 'asc' },
  take: 20,
  skip: 20,
});

Group By with Filters

Combine WHERE clauses with GROUP BY:

const userSchema = container('analytics', {
  id: field.string(),
  userId: field.string(),
  eventType: field.string(),
  country: field.string(),
  timestamp: field.date(),
  duration: field.number(),
}).partitionKey('userId');

// Events by type for a specific user in the last 30 days
const recentEvents = await db.analytics.groupBy({
  by: 'eventType',
  partitionKey: 'user-123',
  where: {
    timestamp: { gte: new Date(Date.now() - 30 * 24 * 60 * 60 * 1000) },
  },
  _count: true,
  _avg: { duration: true },
});

// Cross-partition: Events by country for premium users
const countryStats = await db.analytics.groupBy({
  by: 'country',
  enableCrossPartitionQuery: true,
  where: {
    eventType: 'purchase',
  },
  _count: true,
  _sum: { duration: true },
});

Convenience Methods

For simple single-field aggregations, use convenience methods:

sum()

// Quick sum - cleaner than aggregate()
const totalRevenue = await db.orders.sum('amount', {
  partitionKey: 'customer-123',
});
// Returns: number | null

// With filter
const completedRevenue = await db.orders.sum('amount', {
  partitionKey: 'customer-123',
  where: { status: 'completed' },
});

avg()

// Average order value
const avgOrderValue = await db.orders.avg('amount', {
  partitionKey: 'customer-123',
});
// Returns: number | null

// Average rating for a product
const avgRating = await db.reviews.avg('rating', {
  partitionKey: 'product-456',
  where: { verified: true },
});

min()

// Find cheapest product
const cheapestPrice = await db.products.min('price', {
  partitionKey: 'electronics',
});
// Returns: number | null

// Earliest order date
const firstOrder = await db.orders.min('createdAt', {
  partitionKey: 'customer-123',
});
// Returns: Date | null (type inferred from field!)

max()

// Most expensive product
const highestPrice = await db.products.max('price', {
  partitionKey: 'electronics',
});

// Latest activity
const lastSeen = await db.users.max('lastLoginAt', {
  partitionKey: 'tenant-1',
});
// Returns: Date | null

Type Safety

CosmosQL provides complete type inference for all aggregation operations.

Automatic Type Inference

const orderSchema = container('orders', {
  id: field.string(),
  customerId: field.string(),
  amount: field.number(),
  tax: field.number().optional(),
  status: field.string(),
  createdAt: field.date(),
}).partitionKey('customerId');

// TypeScript knows exact types
const stats = await db.orders.aggregate({
  partitionKey: 'customer-123',
  _count: true,
  _sum: { amount: true, tax: true },
  _min: { createdAt: true },
  _max: { createdAt: true },
});

// ✅ All these are correctly typed:
const count: number = stats._count;
const total: number | null = stats._sum.amount;
const taxTotal: number | null = stats._sum.tax;
const firstOrder: Date | null = stats._min.createdAt;
const lastOrder: Date | null = stats._max.createdAt;

// ❌ TypeScript errors:
// stats._sum.status;  // Error: 'status' is not a number field
// stats._avg.createdAt;  // Error: 'createdAt' is not a number field

Group By Type Inference

// Single field grouping
const result1 = await db.orders.groupBy({
  by: 'status',
  enableCrossPartitionQuery: true,
  _count: true,
  _sum: { amount: true },
});
// Type: Array<{ status: string; _count: number; _sum: { amount: number | null } }>

// Multiple field grouping
const result2 = await db.orders.groupBy({
  by: ['customerId', 'status'] as const,
  enableCrossPartitionQuery: true,
  _count: true,
});
// Type: Array<{ customerId: string; status: string; _count: number }>

// Access fields with full type safety
for (const group of result1) {
  console.log(group.status);      // ✅ string
  console.log(group._count);      // ✅ number
  console.log(group._sum.amount); // ✅ number | null
}

Optional Fields

const productSchema = container('products', {
  id: field.string(),
  category: field.string(),
  price: field.number(),
  salePrice: field.number().optional(),  // Optional field
}).partitionKey('category');

// TypeScript knows salePrice might be undefined
const result = await db.products.aggregate({
  partitionKey: 'electronics',
  _sum: { price: true, salePrice: true },  // ✅ Works with optional fields
  _avg: { salePrice: true },
});

// All aggregation results are nullable
const total: number | null = result._sum.price;
const saleTotal: number | null = result._sum.salePrice;

Type-Safe Field Selection

// Only number fields can be summed/averaged
await db.orders.aggregate({
  partitionKey: 'customer-123',
  _sum: {
    amount: true,    // ✅ number field
    // status: true,  // ❌ Error: 'status' is not a number
  },
  _avg: {
    amount: true,    // ✅ number field
    // createdAt: true,  // ❌ Error: dates can't be averaged
  },
});

// Any field can be used with min/max/count
await db.orders.aggregate({
  partitionKey: 'customer-123',
  _min: {
    amount: true,     // ✅ number
    status: true,     // ✅ string
    createdAt: true,  // ✅ date
  },
});

Real-World Examples

E-Commerce Analytics Dashboard

// Schema
const orderSchema = container('orders', {
  id: field.string(),
  customerId: field.string(),
  status: field.string(),
  totalAmount: field.number(),
  itemCount: field.number(),
  discount: field.number().optional(),
  createdAt: field.date(),
}).partitionKey('customerId');

// Customer lifetime value
async function getCustomerLTV(customerId: string) {
  const stats = await db.orders.aggregate({
    partitionKey: customerId,
    where: { status: 'completed' },
    _count: true,
    _sum: { totalAmount: true, itemCount: true, discount: true },
    _avg: { totalAmount: true },
    _min: { createdAt: true },
    _max: { createdAt: true },
  });

  return {
    totalOrders: stats._count,
    lifetimeValue: stats._sum.totalAmount || 0,
    totalItems: stats._sum.itemCount || 0,
    totalSavings: stats._sum.discount || 0,
    avgOrderValue: stats._avg.totalAmount || 0,
    firstOrderDate: stats._min.createdAt,
    lastOrderDate: stats._max.createdAt,
    customerSince: stats._min.createdAt 
      ? Math.floor((Date.now() - stats._min.createdAt.getTime()) / (1000 * 60 * 60 * 24))
      : 0,
  };
}

// Sales by status report
async function getSalesStatusReport() {
  const statusReport = await db.orders.groupBy({
    by: 'status',
    enableCrossPartitionQuery: true,
    _count: true,
    _sum: { totalAmount: true, itemCount: true },
    _avg: { totalAmount: true },
  });

  return statusReport.map(group => ({
    status: group.status,
    count: group._count,
    revenue: group._sum.totalAmount || 0,
    items: group._sum.itemCount || 0,
    avgValue: group._avg.totalAmount || 0,
  }));
}

// Top customers
async function getTopCustomers(limit: number = 10) {
  const customers = await db.orders.groupBy({
    by: 'customerId',
    enableCrossPartitionQuery: true,
    _count: true,
    _sum: { totalAmount: true },
    orderBy: { _sum_totalAmount: 'desc' },
    take: limit,
  });

  return customers.map(c => ({
    customerId: c.customerId,
    orders: c._count,
    spent: c._sum.totalAmount || 0,
  }));
}

Analytics Platform

const eventSchema = container('events', {
  id: field.string(),
  userId: field.string(),
  eventType: field.string(),
  page: field.string(),
  duration: field.number(),
  timestamp: field.date(),
}).partitionKey('userId');

// User engagement metrics
async function getUserEngagement(userId: string, days: number = 30) {
  const since = new Date(Date.now() - days * 24 * 60 * 60 * 1000);
  
  const metrics = await db.events.aggregate({
    partitionKey: userId,
    where: { timestamp: { gte: since } },
    _count: true,
    _sum: { duration: true },
    _avg: { duration: true },
  });

  const eventBreakdown = await db.events.groupBy({
    by: 'eventType',
    partitionKey: userId,
    where: { timestamp: { gte: since } },
    _count: true,
    _avg: { duration: true },
  });

  return {
    totalEvents: metrics._count,
    totalTime: metrics._sum.duration || 0,
    avgSessionTime: metrics._avg.duration || 0,
    eventTypes: eventBreakdown,
  };
}

// Platform-wide statistics
async function getPlatformStats() {
  // Events by type
  const eventTypes = await db.events.groupBy({
    by: 'eventType',
    enableCrossPartitionQuery: true,
    where: {
      timestamp: { gte: new Date(Date.now() - 24 * 60 * 60 * 1000) },
    },
    _count: true,
    _avg: { duration: true },
    orderBy: { _count: 'desc' },
  });

  // Page engagement
  const pages = await db.events.groupBy({
    by: 'page',
    enableCrossPartitionQuery: true,
    where: {
      timestamp: { gte: new Date(Date.now() - 24 * 60 * 60 * 1000) },
    },
    _count: true,
    _sum: { duration: true },
    orderBy: { _count: 'desc' },
    take: 10,
  });

  return { eventTypes, topPages: pages };
}

Inventory Management

const productSchema = container('products', {
  id: field.string(),
  warehouseId: field.string(),
  category: field.string(),
  stock: field.number(),
  reorderPoint: field.number(),
  cost: field.number(),
  price: field.number(),
}).partitionKey('warehouseId');

// Warehouse inventory summary
async function getWarehouseSummary(warehouseId: string) {
  const summary = await db.products.aggregate({
    partitionKey: warehouseId,
    _count: true,
    _sum: { stock: true, cost: true, price: true },
  });

  const categoryBreakdown = await db.products.groupBy({
    by: 'category',
    partitionKey: warehouseId,
    _count: true,
    _sum: { stock: true },
    _avg: { price: true },
  });

  const inventoryValue = (summary._sum.cost || 0) * (summary._sum.stock || 0);
  const retailValue = (summary._sum.price || 0) * (summary._sum.stock || 0);

  return {
    totalProducts: summary._count,
    totalStock: summary._sum.stock || 0,
    inventoryValue,
    retailValue,
    potentialProfit: retailValue - inventoryValue,
    byCategory: categoryBreakdown,
  };
}

// Low stock alerts
async function getLowStockItems(warehouseId: string) {
  const items = await db.products.findMany({
    partitionKey: warehouseId,
    where: {
      stock: { lte: field.number() }, // Stock <= reorderPoint
    },
  });

  // Get summary stats
  const stats = await db.products.aggregate({
    partitionKey: warehouseId,
    where: {
      stock: { lte: field.number() },
    },
    _count: true,
    _sum: { stock: true },
  });

  return {
    items,
    totalLowStockItems: stats._count,
    totalLowStockUnits: stats._sum.stock || 0,
  };
}

Performance & Best Practices

✅ DO: Use Partition Keys

Always provide a partition key when possible:

// ✅ GOOD - Single partition query (fast & cheap)
const count = await db.orders.count({
  partitionKey: 'customer-123',
});

// ⚠️ AVOID - Cross-partition query (slow & expensive)
const count = await db.orders.count({
  enableCrossPartitionQuery: true,
});

✅ DO: Combine Multiple Aggregations

More efficient to get multiple aggregations in one query:

// ✅ GOOD - Single query
const stats = await db.orders.aggregate({
  partitionKey: 'customer-123',
  _count: true,
  _sum: { amount: true },
  _avg: { amount: true },
});

// ❌ BAD - Three separate queries
const count = await db.orders.count({ partitionKey: 'customer-123' });
const sum = await db.orders.sum('amount', { partitionKey: 'customer-123' });
const avg = await db.orders.avg('amount', { partitionKey: 'customer-123' });

✅ DO: Use WHERE Clauses

Filter data to reduce computation:

// ✅ GOOD - Only aggregate what you need
const completedStats = await db.orders.aggregate({
  partitionKey: 'customer-123',
  where: { status: 'completed' },
  _sum: { amount: true },
});

// ⚠️ AVOID - Aggregating everything then filtering
const allStats = await db.orders.aggregate({
  partitionKey: 'customer-123',
  _sum: { amount: true },
});
// Then filtering in application code

✅ DO: Use Pagination for Large Results

// ✅ GOOD - Paginate large group by results
async function* getAllGroups() {
  let skip = 0;
  const pageSize = 100;
  
  while (true) {
    const page = await db.sales.groupBy({
      by: 'category',
      enableCrossPartitionQuery: true,
      _sum: { amount: true },
      take: pageSize,
      skip,
    });
    
    if (page.length === 0) break;
    yield* page;
    skip += pageSize;
  }
}

⚠️ AVOID: Unnecessary Cross-Partition Queries

// ❌ BAD - Expensive cross-partition query
const total = await db.orders.sum('amount', {
  enableCrossPartitionQuery: true,
});

// ✅ BETTER - Store aggregates separately if needed frequently
// Use a separate container or materialized views

⚠️ AVOID: Aggregating Large Date Ranges Without Filters

// ❌ BAD - Scanning entire history
const allTimeStats = await db.orders.aggregate({
  enableCrossPartitionQuery: true,
  _sum: { amount: true },
});

// ✅ BETTER - Limit to relevant time period
const recentStats = await db.orders.aggregate({
  enableCrossPartitionQuery: true,
  where: {
    createdAt: { gte: new Date('2024-01-01') },
  },
  _sum: { amount: true },
});

Understanding NULL Results

Aggregations return null when there are no rows:

const stats = await db.orders.aggregate({
  partitionKey: 'new-customer',
  _sum: { amount: true },
  _avg: { amount: true },
});

// stats._sum.amount might be null (no orders)
// Always handle null case:
const total = stats._sum.amount ?? 0;
const average = stats._avg.amount ?? 0;

// Or use nullish coalescing in display:
console.log(`Total: $${stats._sum.amount || 0}`);

Cosmos DB RU Costs

Aggregation operations consume Request Units (RUs):

  • Single Partition: ~2-5 RUs (very efficient)
  • Cross-Partition: 5-100+ RUs depending on data size
  • Group By: Higher RU cost, especially cross-partition

Monitor your RU consumption and optimize queries accordingly.


Advanced Patterns

Combining Aggregations with Find

// Get both detail and summary
async function getOrderReport(customerId: string) {
  const [orders, stats] = await Promise.all([
    db.orders.findMany({
      partitionKey: customerId,
      orderBy: { createdAt: 'desc' },
      take: 10,
    }),
    db.orders.aggregate({
      partitionKey: customerId,
      _count: true,
      _sum: { amount: true },
      _avg: { amount: true },
    }),
  ]);

  return { recentOrders: orders, summary: stats };
}

Calculating Percentages

const statusBreakdown = await db.orders.groupBy({
  by: 'status',
  enableCrossPartitionQuery: true,
  _count: true,
});

const total = statusBreakdown.reduce((sum, g) => sum + g._count, 0);

const withPercentages = statusBreakdown.map(group => ({
  status: group.status,
  count: group._count,
  percentage: ((group._count / total) * 100).toFixed(2) + '%',
}));

Time-Series Aggregations

// Daily sales for last 30 days
async function getDailySales(days: number = 30) {
  const results = [];
  
  for (let i = 0; i < days; i++) {
    const date = new Date();
    date.setDate(date.getDate() - i);
    const startOfDay = new Date(date.setHours(0, 0, 0, 0));
    const endOfDay = new Date(date.setHours(23, 59, 59, 999));

    const stats = await db.orders.aggregate({
      enableCrossPartitionQuery: true,
      where: {
        createdAt: { gte: startOfDay, lte: endOfDay },
        status: 'completed',
      },
      _count: true,
      _sum: { amount: true },
    });

    results.push({
      date: startOfDay,
      orders: stats._count,
      revenue: stats._sum.amount || 0,
    });
  }

  return results;
}

Error Handling

import { CosmosError } from 'cosmosql';

try {
  const stats = await db.orders.aggregate({
    // partitionKey missing!
    _sum: { amount: true },
  });
} catch (error) {
  if (error instanceof CosmosError) {
    if (error.statusCode === 429) {
      // Rate limited - retry with backoff
      console.log('Rate limited, retry after:', error.retryAfter);
    } else if (error.code === 'PARTITION_KEY_REQUIRED') {
      // Missing partition key
      console.error('Must provide partition key or enable cross-partition query');
    }
  }
  throw error;
}

Migration from Raw Cosmos DB

If you're migrating from raw Cosmos DB SDK:

Before (Raw SDK)

const querySpec = {
  query: 'SELECT COUNT(1) as count, SUM(c.amount) as total FROM c WHERE c.customerId = @customerId',
  parameters: [{ name: '@customerId', value: 'customer-123' }],
};

const { resources } = await container.items.query(querySpec).fetchAll();
const result = resources[0];

After (CosmosQL)

const result = await db.orders.aggregate({
  partitionKey: 'customer-123',
  _count: true,
  _sum: { amount: true },
});
// Result is fully typed!

Summary

CosmosQL's aggregation functions provide:

  1. Type Safety - Full TypeScript inference for all operations
  2. Clean API - Intuitive, composable operations
  3. Performance - Optimized queries with partition awareness
  4. Flexibility - From simple counts to complex group by operations
  5. Developer Experience - IntelliSense, autocompletion, and compile-time checks

Start with simple count() operations, then explore aggregate() for multiple operations, and finally groupBy() for advanced analytics. Always consider partition keys and cross-partition costs when designing your queries.

For more examples, see the API reference or join our community for support.