AI Agents for E-Commerce: A Buyer's Map

AI agents for e-commerce in five categories: support, sales, ops, discovery, and the one nobody else covers, agents arriving as buyers.

Max Tsygankov · Founder, Crawloria

Published June 7, 2026 · 12 min read

Intro

Search "AI agents for e-commerce" and the top results are vendor listicles naming sixteen products with no framework for choosing between them. The names rotate by author. The claims do not. Everyone is the "leading" agent. Everyone has the "highest" containment rate. The actual decision a DTC merchant has to make (which kind of agent do I need first, and what changes after I install it) gets skipped.

This piece is a map, not a ranking. Four categories of merchant-run agents, plus the fifth category that the listicles miss: AI agents that show up at your store as buyers. Vendors are named only as category illustrations. Pricing claims are deliberately left out, because vendor tiers change quarterly and any specific number in a 2026 article is wrong by 2027. The selection framework, KPI hooks, and what doesn't work are the durable parts.

Why "AI agents" needs a category split

The term "AI agent" gets stretched to cover anything from a scripted chatbot with one fallback intent to an autonomous system that can complete a full purchase. Without a category split, comparisons go sideways. A merchant evaluating Gorgias against Constructor against ChatGPT Operator is comparing three different products solving three different problems. They cannot all be "the best AI agent for e-commerce."

A working definition: an AI agent does multi-step work toward a goal, holds context across turns, and can act on systems other than the chat surface (calling APIs, writing data, escalating, handing off). Anything that does not meet those three conditions is a feature, not an agent. The four merchant-run categories below all meet them; the fifth category meets them and faces your store from the outside.

The four merchant-run categories

Support agents

These resolve customer-service workflows: order status, returns, refunds, shipping questions, product compatibility. They live in the helpdesk or storefront chat. Their KPIs are containment rate (share of tickets resolved without human handoff) and first-contact resolution. The integration audit that matters: catalog sync (live or batch), order data access, refund/return write permissions, and a clean escalation path to humans for the cases the agent should not be deciding alone.
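As a rough illustration of the two KPIs above, here is a minimal Python sketch. The `Ticket` fields and the choice of denominator (all tickets in the period) are assumptions made for this example; vendors calculate these numbers differently, which is why it pays to ask how.

```python
from dataclasses import dataclass


@dataclass
class Ticket:
    """Hypothetical ticket record; field names are illustrative only."""
    ticket_id: str
    resolved: bool          # ticket reached a resolved state
    handed_to_human: bool   # the agent escalated or a human took over
    contacts: int           # customer contacts needed before it closed


def containment_rate(tickets: list[Ticket]) -> float:
    """Share of all tickets resolved without a human handoff."""
    if not tickets:
        return 0.0
    contained = sum(1 for t in tickets if t.resolved and not t.handed_to_human)
    return contained / len(tickets)


def first_contact_resolution(tickets: list[Ticket]) -> float:
    """Share of all tickets resolved on the customer's first contact."""
    if not tickets:
        return 0.0
    first_touch = sum(1 for t in tickets if t.resolved and t.contacts == 1)
    return first_touch / len(tickets)


sample = [
    Ticket("T1", resolved=True, handed_to_human=False, contacts=1),
    Ticket("T2", resolved=True, handed_to_human=True, contacts=2),
    Ticket("T3", resolved=True, handed_to_human=False, contacts=3),
    Ticket("T4", resolved=False, handed_to_human=True, contacts=1),
]

print(containment_rate(sample))          # 0.5
print(first_contact_resolution(sample))  # 0.25
```

Note that the two numbers diverge on the same data: T3 counts toward containment but not toward first-contact resolution, because the agent needed three contacts to close it.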

Category illustrations: Gorgias AI Agent (Shopify-first helpdesk), Fin by Intercom (omnichannel across messenger, email, voice, and inside Zendesk/Salesforce/HubSpot/Freshdesk), Ada (high-volume cross-channel automation), Zendesk AI (helpdesk augmentation), Salesforce Agentforce (CRM/service orchestration). Verify each vendor's current platform fit and tier positioning before shortlisting. The category is the oldest of the four. The bar is high. Most general-purpose chatbots have been re-marketed as agents in this space without adding real agentic behavior.

Sales and concierge agents

These work on the conversion side, not the support side. They appear in storefront chat or product pages, guide discovery, do conversational recommendations, recover carts, and in some cases complete checkout. Their KPIs are assisted-conversion lift, average order value, and cart-recovery rate. The integration audit that matters: catalog feed quality (a sales agent's recommendations are only as good as your product data), pricing/availability accuracy at the second the recommendation is shown, and the handoff to your existing checkout.

Category illustrations: Rep AI (sales concierge for Shopify), Alhena AI (all-in-one with sales focus), Zipchat AI (storefront chat with sales orientation). This category overlaps with support agents at the vendor level (several vendors do both), but the work, the KPIs, and the integration shape are not the same.

Operations agents

These do back-of-house and routing work: ticket routing, intelligent escalation, return processing, inventory orchestration, supplier or warranty workflows. The customer-facing surface is thinner than for support or sales agents. Their KPI is cost-per-resolution or hours saved per workflow. The integration audit that matters: how cleanly they slot into the helpdesk you already run, whether they need their own UI, and whether their automation can be audited (you will need to see what the agent did, not just what it concluded).
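To make "see what the agent did, not just what it concluded" concrete, here is a minimal sketch of an append-only audit record in Python. Every name in it (`AgentAuditRecord`, `append_record`, `replay`, the field names) is hypothetical; real vendors expose audit data in their own formats, if they expose it at all.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class AgentAuditRecord:
    """One step of agent work: what it saw, what it did, what it concluded."""
    workflow: str        # e.g. "return_processing"
    inputs_seen: dict    # the data the agent had in front of it
    action_taken: str    # the write or call the agent actually made
    conclusion: str      # the agent's stated outcome
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def append_record(log: list[str], record: AgentAuditRecord) -> None:
    """Store each record as one JSON line so it can be replayed later."""
    log.append(json.dumps(asdict(record), sort_keys=True))


def replay(log: list[str], workflow: str) -> list[dict]:
    """Return every recorded step for one workflow, in the order written."""
    return [r for r in map(json.loads, log) if r["workflow"] == workflow]


audit_log: list[str] = []
append_record(audit_log, AgentAuditRecord(
    workflow="return_processing",
    inputs_seen={"order_id": "A-100", "days_since_delivery": 12},
    action_taken="created_return_label",
    conclusion="return approved",
))
append_record(audit_log, AgentAuditRecord(
    workflow="ticket_routing",
    inputs_seen={"ticket_id": "T-7", "topic": "warranty"},
    action_taken="assigned_to_warranty_queue",
    conclusion="routed",
))

for step in replay(audit_log, "return_processing"):
    print(step["action_taken"], "<-", step["inputs_seen"])
```

The point of the shape is that `inputs_seen` and `action_taken` are stored separately from `conclusion`. A log that keeps only the conclusion cannot answer why the agent acted.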

Category illustrations: Yuma AI, CoSupport AI, Maven AGI, and the ops-side capabilities inside broader platforms (some vendors straddle support and ops — verify which side a given vendor leans toward at evaluation time). Smaller category, but high ROI when the routing or back-office problem is real.

Discovery and personalization agents

These improve on-site product discovery: conversational search, guided selling, personalized recommendations, shop-by-image. They face customers in the discovery funnel rather than the support funnel. Their KPIs are search-to-purchase conversion and recommendation click-through. The integration audit that matters: catalog completeness (attributes, taxonomy, images), how the agent treats out-of-stock items, and whether the search ranking it applies is auditable.

Category illustrations: Constructor, Klevu, Searchspring with AI-enhanced search, Findify. The category predates the AI-agent label. These companies were running ML-driven search for years before "agent" became the marketing word. The current generation does real multi-turn conversational discovery, which is what qualifies it as agentic under the working definition above.

The fifth category: agents arriving as buyers

Every listicle of e-commerce AI agents stops at four categories. The fifth one is the agents your store does not run but receives.

ChatGPT Operator browses storefronts on behalf of users. Perplexity Comet does the same. Claude for Chrome runs in the user's browser as an agent. When a user asks an AI assistant to "find me a cordless drill that works with the batteries I already own and ships by Friday," the assistant may do the shopping itself, on behalf of the user. Your store becomes a website that an agent visits, not a website that a human visits.

The implications are different from the four merchant-run categories. This is not a vendor you pick; it is traffic you receive, whether or not you optimized for it. The integration audit becomes: can these agents reach your site without being blocked by the bot defense layer (see Crawloria's piece on Cloudflare Bot Fight Mode blocking AI agents), and can they complete checkout if they get there (see Shopify ChatGPT checkout walls).
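One small part of the reachability question can be checked offline: whether your robots.txt disallows the user-agent tokens that AI crawlers and assistants announce. The sketch below uses Python's standard `urllib.robotparser`. The token list is an assumption for illustration and should be verified against each vendor's current documentation. Two caveats: browser-based agents may present an ordinary browser user agent, and a bot-defense layer can block traffic that robots.txt allows, so a pass here is necessary but not sufficient.

```python
from urllib.robotparser import RobotFileParser

# Assumed user-agent tokens; confirm against each vendor's published docs.
ASSUMED_AGENT_TOKENS = ["GPTBot", "ChatGPT-User", "PerplexityBot", "ClaudeBot"]


def robots_access(robots_txt: str, url: str,
                  tokens: list[str] = ASSUMED_AGENT_TOKENS) -> dict[str, bool]:
    """Report, per token, whether robots.txt permits fetching the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {token: parser.can_fetch(token, url) for token in tokens}


# Hypothetical robots.txt: one crawler blocked outright, checkout off-limits to all.
sample_robots = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /checkout
"""

result = robots_access(sample_robots,
                       "https://example.com/products/cordless-drill")
for token, allowed in result.items():
    print(f"{token}: {'allowed' if allowed else 'blocked'}")
```

In this sample, the product page is blocked for `GPTBot` and open to the other three tokens, while `/checkout` would be disallowed for all of them. That second rule is worth a deliberate decision if you want buyer-agents to complete purchases.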

The reason this category gets skipped in listicles is structural. The listicles are paid placements or affiliate-shaped roundups, and there is no vendor on the buyer-agent side to pay for inclusion. The omission means most merchants spend on categories 1 through 4 without instrumenting whether category 5 traffic is converting or bouncing. Both halves matter. Picking a sales-concierge agent that requires JS-heavy chat to function, while your bot-defense layer blocks the buyer-agent visiting the same page, is the kind of mismatch that a buyer's framework catches and a listicle does not.

If you want the broader frame on which agents visit and what they identify as, our piece on the four classes of AI bots covers the discovery-side picture in detail. For the discovery-signal side of buyer-agents specifically, see ecommerce GEO.

How to actually pick: a selection framework

Five questions to answer before evaluating any specific vendor. Answer them in this order; do not skip ahead.

  1. Which category do you need first? Match the bottleneck. If support volume is the pain, you need a support agent. If conversion at PDP is flat, you need a sales/concierge agent. If your catalog is messy and discovery is the bottleneck, no agent fixes that; fix data first.
  2. What is the integration depth? Native to your platform (Shopify, BigCommerce, Salesforce Commerce) usually beats a generic API integration. A native integration means catalog, order, and customer data sync without you maintaining the plumbing. A generic API means you maintain it.
  3. What KPI hook does the vendor offer? "Containment rate," "assisted conversion," "cost per resolution" are not interchangeable. Ask the vendor which KPI they report on, how it is calculated, and against what baseline. If they cannot answer in two sentences, the KPI is marketing.
  4. What is the audit story? When the agent makes a decision a human disagrees with (refunds the wrong order, recommends an out-of-stock SKU, escalates the wrong way), can you replay what it saw and what it did? Agents without audit trails are operational risk.
  5. What is the off-ramp? If you switch vendors in eighteen months, what comes with you and what stays? Catalog enrichments are usually portable; trained conversation flows are usually not.
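The "in this order, do not skip ahead" rule can be expressed as a simple gate. This is a sketch only; the keys and the `next_open_question` helper are made up for illustration, and the question wording is shortened from the list above.

```python
from typing import Optional

# The five questions, in the order they should be answered.
QUESTIONS = [
    ("category", "Which category do you need first?"),
    ("integration_depth", "What is the integration depth?"),
    ("kpi_hook", "What KPI hook does the vendor offer?"),
    ("audit_story", "What is the audit story?"),
    ("off_ramp", "What is the off-ramp?"),
]


def next_open_question(answers: dict[str, str]) -> Optional[str]:
    """Return the first unanswered question, or None when all five are done.

    Later answers are ignored until every earlier question has one,
    which is what enforces the ordering.
    """
    for key, text in QUESTIONS:
        if not answers.get(key, "").strip():
            return text
    return None


# A merchant who jumped ahead to the off-ramp is sent back to question 2.
answers = {"category": "support", "off_ramp": "catalog enrichments export"}
print(next_open_question(answers))
```

Vendor evaluation starts only once `next_open_question` returns `None`.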

Tier signals without naming prices: vendors with enterprise sales motion and contracted SSO/SAML are aimed at larger brands with formal procurement. Vendors with self-serve signup and per-seat or per-ticket pricing are aimed at SMB through mid-market. Vendors with usage-based pricing are aimed at variable-volume operations. Match tier to your operation; do not buy enterprise tooling for an SMB store.

What doesn't work

Don't pick by demo flash. The demo is built for the demo. The integration audit is what matters. A vendor with a polished sales loop and a thin integration will burn three months of merchant time before the gap becomes obvious.

Don't trust "supports your platform" without specifics. Ask exactly which API endpoints the vendor calls, how often catalog syncs, what happens when sync fails, and how returns are written back. If the vendor says "we have a Shopify app," ask whether the app is using the public API or whether they are pulling data through a custom data warehouse pipeline. The difference shows up in catalog freshness.

Don't deploy an agent without instrumenting a baseline first. Two weeks of measurement before the agent goes live, on whichever KPI is relevant for the category (average response time, containment rate, conversion at PDP, hours per workflow), gives you something to compare against. Without a baseline, you cannot tell whether the agent worked.
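The comparison itself is simple arithmetic once the baseline exists. A minimal sketch, with invented daily PDP conversion figures and a hypothetical `relative_change` helper (the series are shortened here; the text above suggests two weeks of baseline data):

```python
from statistics import mean


def relative_change(baseline: list[float], post_launch: list[float]) -> float:
    """Relative change of the post-launch mean against the baseline mean."""
    if not baseline or not post_launch:
        raise ValueError("both periods need at least one measurement")
    base = mean(baseline)
    if base == 0:
        raise ValueError("baseline mean is zero; relative change is undefined")
    return (mean(post_launch) - base) / base


# Invented daily PDP conversion rates, for illustration only.
baseline_days = [0.020, 0.022, 0.018, 0.020]
post_launch_days = [0.024, 0.026, 0.022, 0.024]

change = relative_change(baseline_days, post_launch_days)
print(f"{change:+.1%}")  # +20.0%
```

A plain before-and-after comparison like this does not separate the agent's effect from seasonality, promotions, or traffic mix, so treat the result as a first read rather than proof.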

Don't ignore category 5. When a sales-concierge agent gates discovery behind a JS-heavy chat widget that buyer-agents cannot render, you are trading category-5 revenue for category-1-through-4 revenue. The two should be sized against each other, not picked independently.

Where to start

If you have not picked a category yet, spend a week instrumenting which workflow is the actual bottleneck. The categories rank-order themselves once you have data on tickets per week, cart abandonment rate, support cost per resolution, and PDP conversion. Pick the category with the worst metric, not the category with the loudest vendor.
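Because these metrics come in different units, "worst" only makes sense relative to a target you set yourself. The sketch below ranks categories by relative shortfall against such targets. All figures, the targets, and the mapping from metric to category are assumptions for illustration, not benchmarks.

```python
from dataclasses import dataclass


@dataclass
class Metric:
    category: str            # which agent category this metric points to
    name: str
    observed: float
    target: float            # your own target, not an industry benchmark
    higher_is_better: bool


def shortfall(m: Metric) -> float:
    """Relative gap between observed and target; positive means worse."""
    if m.target == 0:
        raise ValueError(f"target for {m.name} must be non-zero")
    gap = (m.target - m.observed) if m.higher_is_better else (m.observed - m.target)
    return gap / m.target


def worst_category(metrics: list[Metric]) -> str:
    """Category whose metric misses its target by the largest relative margin."""
    return max(metrics, key=shortfall).category


# Invented numbers from a hypothetical week of instrumentation.
week = [
    Metric("operations", "support cost per resolution", 6.0, 5.0, False),
    Metric("sales/concierge", "PDP conversion", 0.015, 0.03, True),
    Metric("sales/concierge", "cart abandonment rate", 0.72, 0.70, False),
]

print(worst_category(week))  # sales/concierge
```

Here PDP conversion misses its target by half, far more than the other two metrics, so the sales and concierge category ranks first.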

For DTC merchants on Shopify specifically, our AI shopping monitoring for Shopify guide covers the discovery-side instrumentation in detail; pair it with whichever merchant-run category you pick first.

Run a Crawloria audit on your store before any agent integration. The audit returns whether AI search and shopping agents can reach your site, what schema is exposed, and which blockers might gate either merchant-run agents or buyer-side agents. It costs nothing and takes five minutes. It will not pick your vendor, but it will tell you whether your store is in shape to host an agent at all.