A vs B is the A/B testing and feature-flag platform for teams who’d rather measure than meet. Build variations in JavaScript, TypeScript, CSS, or SCSS, bucket users with an 18.8 KB gzipped snippet, and read results through a Bayesian, Frequentist, or Sequential engine.
Winning probability, observed lift, projected revenue impact, and days to your sample-size target — up top.
Slice it any way
Filter by date range, baseline, and segment — device, country, browser, platform, language, or new vs returning.
Three engines, one dataset
Read the same experiment through Bayesian, Frequentist, and Sequential side by side — agreement reassures, disagreement informs.
Every arm, in detail
Visitors, conversions, conversion rate, lift, confidence interval, and significance for every variation.
Trust the number
Segment lift across six dimensions, plus SRM, statistical-confidence, and traffic-health guardrails.
Ship it your way
Traffic allocation, A/A validation, and a code editor for each variation.
You have opinions.
Your users have behaviors.
Let them vote.
A vs B · operating manual · page 001
02 — Walkthrough
See it work. Scene by scene.
01 — Overview
Every experiment and flag in one place.
The dashboard surfaces all your experiments and feature flags at a glance — lifecycle status (Draft, Scheduled, Running, Paused, Completed), key metrics, and quick actions, so nothing slips through.
02 — Results
Statistical readout, no guessing.
The results page shows per-variation visitor counts, conversion rates, and a time-series chart bucketed at 4-hour intervals (ClickHouse INTERVAL 4 HOUR). Health guardrails surface sample-ratio mismatches before you draw conclusions.
03 — Targeting
Precise audiences, nested rules.
Define who sees each variation using the five-step guided builder. Targeting is step one: compose AND / OR audience conditions, reuse segments across experiments, and ship to exactly the right users.
04 — Variations
Code-block builder. JS, TS, CSS, or SCSS.
Write control and variant code directly in Monaco with full TypeScript support. The five-step builder walks you through Targeting → Variations → Metrics → Analysis → Review before anything goes live.
03 — Experiment builder
Build any variation. Point-and-click or pure code.
The visual editor browser extension lets non-engineers create variations without touching code. The code builder gives engineers full TypeScript control — with linting, SCSS, and draft history — in the same experiment.
6 change types
5 viewport targets
4 max variations
3 browsers supported
AvsB Visual Editor: Chrome · Firefox · Edge
Viewport targets: All · Mobile · Tablet · Desktop · Custom
Change types: Text · Style · Visibility · Image · Reorder · Insert
Point-and-click modifications across text, styles, visibility, images, reordering, and HTML insertion — no code required.
Draft history preserves version snapshots by experiment, author, and base version — restore to any previous state.
Preview links share any variation via token-based URL (30–90 day expiry, revocable) — no account required.
Each variation gets its own files — JavaScript or TypeScript for behaviour, CSS or SCSS for styling, compiled in-browser.
Your code lives in initVariation(options) — options gives you self-cleaning waitUntil, timers, and listeners, plus onRemove for teardown on SPA navigation.
Linting: Off (no checks), On (advisory type + syntax squiggles), or Strict (blocks save and publish on type errors).
Up to 4 variations per experiment — 1 control and 3 challenger variants.
04 — Feature flags
Not just experiments. Feature flags, first-class.
A vs B is a feature-flag platform too. Boolean, string, number, and JSON flags — with targeted-delivery or A/B-test rules evaluated by the same audience engine as your experiments, in the browser snippet or your server SDKs. Per-user overrides, environment config, and stale-flag detection are built in.
4 flag types
∞ environments
4 variations / rule
14d stale-flag check
Flags
Boolean · String · Number · JSON
new-checkout · Targeted delivery · 100% · prod
ai-autocomplete · A/B test · 25% · staging
legacy-pricing · Stale (flagged) · 0% · archived
Create your own environments — prod, staging, dev, and beyond, each with its own SDK key. Instant rollout and rollback, no redeploy.
05 — Statistical rigor
Three engines. One platform.
Engine comparison

| Engine | Bayesian | Frequentist | Sequential |
|---|---|---|---|
| Method | Beta-Binomial model | Two-proportion z-test | Asymptotic Confidence Sequences |
| Prior | Beta(1,1), flat and non-informative | None (pooled variance) | None (AsympCS, Howard et al. 2021) |
| Primary output | Probability to beat control | p-value + confidence interval | Always-valid p-value |
| Interval | 95% credible interval | 95% confidence interval | Anytime confidence sequence |
| Peeking penalty | None | Yes (fixed horizon required) | None (stop any time) |
| CUPED / variance reduction | AUTO or OFF | AUTO or OFF | AUTO or OFF |
| Multiple-comparison correction | Bonferroni · Holm · BH · Tiered · None | Bonferroni · Holm · BH · Tiered · None | Bonferroni · Holm · BH · Tiered · None |
| ROPE | Optional, with configurable bounds | n/a | n/a |
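As a rough illustration of the Frequentist engine's two-proportion z-test with pooled variance, here is a minimal sketch. It is not A vs B's implementation: the `Arm` shape, the function names, and the unpooled 95% interval on the absolute difference are assumptions made for this example, and the normal CDF relies on the Abramowitz-Stegun approximation of erf.

```typescript
// Hypothetical input shape: raw counts for one arm of an experiment.
interface Arm {
  visitors: number;
  conversions: number;
}

interface ZTestResult {
  absoluteLift: number; // variant rate minus control rate
  z: number;
  pValue: number; // two-sided
  ci95: [number, number]; // interval on the absolute difference
}

// Abramowitz-Stegun 7.1.26 approximation of erf (absolute error below ~1.5e-7).
function erf(x: number): number {
  const sign = x < 0 ? -1 : 1;
  const ax = Math.abs(x);
  const t = 1 / (1 + 0.3275911 * ax);
  const poly =
    ((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t - 0.284496736) * t +
      0.254829592) * t;
  return sign * (1 - poly * Math.exp(-ax * ax));
}

function normalCdf(z: number): number {
  return 0.5 * (1 + erf(z / Math.SQRT2));
}

function twoProportionZTest(control: Arm, variant: Arm): ZTestResult {
  const pA = control.conversions / control.visitors;
  const pB = variant.conversions / variant.visitors;
  const diff = pB - pA;

  // Pooled rate under the null hypothesis that both arms convert equally.
  const pooled =
    (control.conversions + variant.conversions) / (control.visitors + variant.visitors);
  const sePooled = Math.sqrt(
    pooled * (1 - pooled) * (1 / control.visitors + 1 / variant.visitors)
  );
  const z = sePooled === 0 ? 0 : diff / sePooled;
  const pValue = 2 * (1 - normalCdf(Math.abs(z)));

  // Unpooled standard error for the interval on the difference.
  const seDiff = Math.sqrt(
    (pA * (1 - pA)) / control.visitors + (pB * (1 - pB)) / variant.visitors
  );
  return {
    absoluteLift: diff,
    z,
    pValue,
    ci95: [diff - 1.96 * seDiff, diff + 1.96 * seDiff],
  };
}
```

Because this test assumes a fixed horizon, the p-value is only meaningful once the planned sample size has been reached.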
Metric aggregation measures
binary
Unique Conversions Per Visitor
One conversion counted per visitor regardless of how many times the event fires. Classic click-through and sign-up metric.
count
Total Events
Raw count of all qualifying events, including multiple per visitor. Good for page-view and engagement depth signals.
binary
Unique Visitors Who Fired
Distinct visitor count across any matching event. Useful for reach and funnel-entry measures.
continuous
Total Value Per Visitor
Sum of a numeric property divided by exposed visitors. Revenue per visitor, pages per session.
continuous
Total Value
Raw sum across all events. Use for absolute revenue or engagement lift rather than per-head rates.
advanced
Percentile
p0–p100 quantiles via ClickHouse quantileTDigest. Confidence intervals via bias-corrected bootstrap (default 1,000 resamples).
advanced
Rate (ratio)
Numerator divided by denominator across all visitors. Delta-method variance estimation for correct standard errors on ratio metrics.
advanced
Composite (weighted)
Weighted sum of multiple metric bindings with per-component covariance handling. Model a revenue-weighted engagement score in one metric.
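To make the delta-method idea behind Rate (ratio) metrics concrete, here is a minimal sketch. It assumes one numerator value and one denominator value per visitor and uses the standard first-order approximation Var(R) ≈ (s_x² − 2·R·s_xy + R²·s_y²) / (n·ȳ²). The function names and input layout are hypothetical, not the platform's API.

```typescript
function mean(xs: number[]): number {
  let sum = 0;
  for (let i = 0; i < xs.length; i++) sum += xs[i];
  return sum / xs.length;
}

// Sample covariance with the n - 1 denominator; covariance(xs, xs) is the sample variance.
function covariance(xs: number[], ys: number[]): number {
  const mx = mean(xs);
  const my = mean(ys);
  let acc = 0;
  for (let i = 0; i < xs.length; i++) acc += (xs[i] - mx) * (ys[i] - my);
  return acc / (xs.length - 1);
}

interface RatioEstimate {
  ratio: number;
  variance: number;
  standardError: number;
}

// numerators[i] and denominators[i] belong to the same visitor.
function ratioWithDeltaVariance(numerators: number[], denominators: number[]): RatioEstimate {
  if (numerators.length !== denominators.length || numerators.length < 2) {
    throw new Error("Need at least two paired observations");
  }
  const n = numerators.length;
  const meanY = mean(denominators);
  const ratio = mean(numerators) / meanY;

  const varX = covariance(numerators, numerators);
  const varY = covariance(denominators, denominators);
  const covXY = covariance(numerators, denominators);

  // First-order Taylor expansion of mean(x) / mean(y) around the two means.
  const raw = (varX - 2 * ratio * covXY + ratio * ratio * varY) / (n * meanY * meanY);
  const variance = Math.max(raw, 0); // guard against tiny negative rounding error
  return { ratio, variance, standardError: Math.sqrt(variance) };
}
```

The covariance term is what separates this from treating the numerator and denominator as independent: when the two move together, the ratio is more stable than either component alone.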
All continuous and ratio metrics support winsorization: extreme values are capped at a configurable upper percentile (default p99) and, optionally, at a lower percentile before statistical computation.
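A minimal sketch of winsorization as described here, assuming a linear-interpolation percentile over the raw values. The platform's exact quantile method is not stated in this section, and the function names are made up for illustration.

```typescript
// Linear-interpolation percentile on an ascending-sorted array (p in [0, 100]).
function percentile(sorted: number[], p: number): number {
  const rank = (p / 100) * (sorted.length - 1);
  const lo = Math.floor(rank);
  const hi = Math.ceil(rank);
  return sorted[lo] + (rank - lo) * (sorted[hi] - sorted[lo]);
}

// Caps values above the upper percentile and, if given, below the lower percentile.
function winsorize(values: number[], upperPct: number = 99, lowerPct?: number): number[] {
  if (values.length === 0) return [];
  const sorted = values.slice().sort((a, b) => a - b);
  const upper = percentile(sorted, upperPct);
  const lower = lowerPct === undefined ? -Infinity : percentile(sorted, lowerPct);
  return values.map((v) => Math.min(Math.max(v, lower), upper));
}
```

Values are capped rather than dropped, so the visitor count stays the same and only the influence of outliers shrinks.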
Health guardrails (illustrative)

| Guardrail | Green | Yellow | Red |
|---|---|---|---|
| SRM (chi-square) | p ≥ 0.01 | 0.001–0.01 | p < 0.001 |
| Statistical confidence | ≥ 95% | 80–94% | < 80% |
| Traffic health | 1,000+ | 100–999 | < 100 |
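The SRM guardrail can be sketched for the simplest case, two arms, where the chi-square test has one degree of freedom and the p-value reduces to erfc(√(χ²/2)). The band cut-offs below are copied from the illustrative thresholds above; the function names and the two-arm restriction are assumptions for this example, not the platform's implementation.

```typescript
type Band = "green" | "yellow" | "red";

interface SrmResult {
  chiSquare: number;
  pValue: number;
  band: Band;
}

// Complementary error function for x >= 0 (Abramowitz-Stegun 7.1.26 approximation).
function erfc(x: number): number {
  const t = 1 / (1 + 0.3275911 * x);
  const poly =
    ((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t - 0.284496736) * t +
      0.254829592) * t;
  return poly * Math.exp(-x * x);
}

// Compares observed visitor counts in two arms with the intended traffic split.
function srmCheck(observedA: number, observedB: number, expectedShareA: number = 0.5): SrmResult {
  const total = observedA + observedB;
  const expectedA = total * expectedShareA;
  const expectedB = total - expectedA;
  const dA = observedA - expectedA;
  const dB = observedB - expectedB;
  const chiSquare = (dA * dA) / expectedA + (dB * dB) / expectedB;

  // One degree of freedom: P(chi2 > x) = erfc(sqrt(x / 2)).
  const pValue = erfc(Math.sqrt(chiSquare / 2));
  const band: Band = pValue >= 0.01 ? "green" : pValue >= 0.001 ? "yellow" : "red";
  return { chiSquare, pValue, band };
}
```

For example, a 5,200 / 4,800 split on an intended 50/50 allocation gives χ² = 16, which lands in the red band.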
Analysis plans
Pre-register hypotheses before launch. Plans seal at launch (locked read-only) and every amendment records the timestamp, actor, field name, before/after values, and reason.
A/A test mode
Validate your statistical engine and experiment setup via the isAATest flag, which runs a control-to-control comparison to confirm calibration before a live experiment.
Sample size calculator
Supports binary conversions (Frequentist, Bayesian, and Sequential engines), ratio metrics (delta-method variance), quantile percentiles (bootstrap simulation), and composite metrics (weighted pairwise correlation).
06 — The platform
Everything you need. Nothing you don’t.
Code-block builder. JS or TS, CSS or SCSS.
Define control.ts and variant.ts plus a shared triggers.ts. Monaco ships TypeScript types for window.avsb.*, so autocomplete works out of the box. SCSS compiles in-browser — no bundler required.
variant.ts
```typescript
// variant.ts: runs when triggers.ts calls activate()
```
Define click, pageview, and custom metrics at the project level — instrument once, reuse across every experiment and flag without re-instrumentation.
Click: CSS selector-based
Pageview: URL pattern-based
Custom: Application code-fired
Flag rules. Two types.
Boolean, String, Number, and JSON flags. AB_TEST rules split traffic and wire to metrics. Targeted Delivery rules deterministically route by audience. Up to 4 variations. Per-user overrides always win.
Drop the script in your <head>. Loads from CDN as a single 60 KB minified file (18.8 KB gzipped). Anti-flicker hides the document via opacity:0 with a 3-second timeout. MurmurHash3 sticky bucketing into a 0–9999 integer space.
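Sticky bucketing with MurmurHash3 can be sketched as follows. The hash is the standard 32-bit x86 variant; everything else (the `experimentId:visitorId` key layout, the seed of 0, and the cumulative-weight allocation) is an assumption for illustration, since the snippet's actual hash input is not documented here.

```typescript
// MurmurHash3 (x86, 32-bit). For brevity each UTF-16 code unit is truncated to one
// byte, so the output matches the reference implementation for ASCII keys only.
function murmur3_32(key: string, seed: number = 0): number {
  const c1 = 0xcc9e2d51;
  const c2 = 0x1b873593;
  const len = key.length;
  const blocks = len >>> 2;
  const byteAt = (i: number): number => key.charCodeAt(i) & 0xff;
  let h = seed | 0;

  for (let b = 0; b < blocks; b++) {
    const i = b * 4;
    let k = byteAt(i) | (byteAt(i + 1) << 8) | (byteAt(i + 2) << 16) | (byteAt(i + 3) << 24);
    k = Math.imul(k, c1);
    k = (k << 15) | (k >>> 17);
    k = Math.imul(k, c2);
    h ^= k;
    h = (h << 13) | (h >>> 19);
    h = (Math.imul(h, 5) + 0xe6546b64) | 0;
  }

  // Mix in the 1-3 trailing bytes that do not fill a whole 4-byte block.
  const tail = blocks * 4;
  const rem = len & 3;
  let kTail = 0;
  if (rem === 3) kTail ^= byteAt(tail + 2) << 16;
  if (rem >= 2) kTail ^= byteAt(tail + 1) << 8;
  if (rem >= 1) {
    kTail ^= byteAt(tail);
    kTail = Math.imul(kTail, c1);
    kTail = (kTail << 15) | (kTail >>> 17);
    kTail = Math.imul(kTail, c2);
    h ^= kTail;
  }

  // Finalization mix so that every input bit affects every output bit.
  h ^= len;
  h ^= h >>> 16;
  h = Math.imul(h, 0x85ebca6b);
  h ^= h >>> 13;
  h = Math.imul(h, 0xc2b2ae35);
  h ^= h >>> 16;
  return h >>> 0;
}

// Hypothetical key layout; returns an integer bucket in the 0-9999 range.
function bucketFor(visitorId: string, experimentId: string): number {
  return murmur3_32(experimentId + ":" + visitorId) % 10000;
}

// Maps a bucket onto a variation index by cumulative traffic share (weights sum to 1).
function assignVariation(bucket: number, weights: number[]): number {
  let upper = 0;
  for (let i = 0; i < weights.length; i++) {
    upper += weights[i] * 10000;
    if (bucket < upper) return i;
  }
  return weights.length - 1; // fallback if rounding leaves a sliver unassigned
}
```

Because the bucket depends only on the two IDs, the same visitor lands in the same variation on every page load without any server round trip.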
Boolean, string, number, and JSON flags. Targeted delivery or A/B test rules. Up to 4 variations per rule. Create your own environments. Per-user overrides. Stale-flag detection after 14 days of no change.
new-checkout · 100% · prod
ai-autocomplete · 25% · staging
legacy-pricing · 0% · archived
CLI for local dev.
Published as @avsbhq/cli v3.2.0. Clone an experiment, edit locally, preview live via the browser extension, push when ready. avsb dev opens a WebSocket server with live reloading.
```bash
$ npm i -g @avsbhq/cli
$ avsb clone <projectId>
$ avsb dev # WebSocket + live reload
$ avsb push
```
Browser extension.
Chrome MV3. Preview variations on the live page. Toggle Page Reload (safe) or Hot Inject (experimental) mode in the popup. Watch events stream in real time.
signup-cta · variant B
hero-copy · control
● EXPOSURE · hero-copy · variant A
Server-side SDKs.
@avsbhq/node for Node 18+ with Express and Fastify middleware, InMemory and Redis sticky bucketing, SSE streaming. @avsbhq/react wraps useSyncExternalStore for React 18. @avsbhq/next covers App Router and Pages Router.
```typescript
import { AvsBServer } from '@avsbhq/node';

// sdkKey and ctx (the evaluation context) are supplied by your application.
const avsb = new AvsBServer({ sdkKey });
const value = await avsb.evalFlag('new-checkout', false, ctx);
```
9 integrations + webhooks.
Send exposure and event data to your analytics stack. One-click setup with API keys.
Google Analytics
Mixpanel
Segment
Adobe Analytics
Amplitude
Heap
FullStory
Contentsquare
Custom + webhooks
Audiences, nested.
10 condition types — nested AND / OR at arbitrary depth. Reusable segments across experiments and flags.
Location
Device
Browser
Platform (OS)
Language
Query param
Cookie
New vs returning
Custom attribute
Custom JavaScript
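Nested AND / OR targeting can be pictured as a small recursive tree. The sketch below is one way to evaluate such a tree; the `Rule` shape and the single `is` comparison are hypothetical stand-ins, since the platform's actual rule schema and its ten condition types are not specified here.

```typescript
// Hypothetical rule shape: groups nest to arbitrary depth, leaves compare one attribute.
type Rule =
  | { op: "and"; rules: Rule[] }
  | { op: "or"; rules: Rule[] }
  | { op: "is"; attribute: string; value: string };

type VisitorAttributes = { [attribute: string]: string | undefined };

function matchesAudience(rule: Rule, visitor: VisitorAttributes): boolean {
  if (rule.op === "and") {
    return rule.rules.every((child) => matchesAudience(child, visitor));
  }
  if (rule.op === "or") {
    return rule.rules.some((child) => matchesAudience(child, visitor));
  }
  return visitor[rule.attribute] === rule.value;
}

// Example segment: mobile visitors located in the US or Canada.
const mobileNorthAmerica: Rule = {
  op: "and",
  rules: [
    { op: "is", attribute: "device", value: "mobile" },
    {
      op: "or",
      rules: [
        { op: "is", attribute: "country", value: "US" },
        { op: "is", attribute: "country", value: "CA" },
      ],
    },
  ],
};
```

Keeping a segment as plain data like this is what makes it reusable: the same tree can be attached to an experiment or a flag rule and evaluated the same way in both.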
Governance built in.
Custom roles with granular permissions. Audit logs on every change. 2FA. Per-environment config across every environment you create. API tokens. Webhooks on key events.
Custom roles
Audit logs
2FA
Custom environments
API tokens
Webhooks
Stop shipping
on a hunch.
Ship on evidence.
07 — The loop
Four steps. Zero guessing.
01
Hypothesize.
State what you expect. Attach the target audience and the primary metric.
H₁: B > A by ≥ 10%
02
Build it your way.
Visual editor for non-engineers — or JS / TypeScript + CSS / SCSS code blocks for engineers. Up to 4 variations per rule, Monaco autocomplete included.
03
Run — safely.
Snippet buckets the visitor. Anti-flicker hides the page until variations apply. Exposure + conversion events stream to ClickHouse.
04
Decide.
Bayesian, Frequentist, or Sequential — your call. Probability to beat control, credible or confidence interval, always-valid bound, SRM check, traffic-health band.
✓ Ship variant B
08 — Governance
Auditable by design. Not by accident.
Analysis plans — sealed at launch
Before your experiment goes live, commit your primary metric, statistical engine, and confidence target. The plan locks the moment traffic starts — it becomes read-only. Every subsequent amendment is timestamped, attributed to an actor, and records the field name, the before value, and the after value with a mandatory reason.
Pre-registration · Sealed at launch · Amendment tracking · Read-only history
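One way to picture an amendment record is as an append-only entry holding the six pieces of information listed above. The types and the `amendSealedPlan` helper below are purely illustrative assumptions, not the platform's schema or API.

```typescript
// Hypothetical record shape mirroring the fields named in the text.
interface Amendment {
  timestamp: string; // ISO 8601
  actor: string;
  field: string;
  before: unknown;
  after: unknown;
  reason: string;
}

interface AnalysisPlan {
  fields: { [name: string]: unknown };
  amendments: Amendment[];
}

// Returns a new plan with the change applied and the amendment appended.
// The input plan is never mutated, so earlier history stays intact.
function amendSealedPlan(
  plan: AnalysisPlan,
  field: string,
  after: unknown,
  actor: string,
  reason: string,
  now: Date = new Date()
): AnalysisPlan {
  if (reason.trim() === "") {
    throw new Error("An amendment requires a reason");
  }
  const entry: Amendment = {
    timestamp: now.toISOString(),
    actor,
    field,
    before: plan.fields[field],
    after,
    reason,
  };
  const fields = { ...plan.fields, [field]: after };
  return { fields, amendments: [...plan.amendments, entry] };
}
```

Recording the before value alongside the after value means the original pre-registered plan can always be reconstructed from the history.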
Frequentist experiments get three interlocking guards against peeking bias.
01 Sample progress banner
A status strip shows sample collection progress so you know when you can trust the numbers.
02 Blocking modal
Pause, stop, or declare-winner actions raise a modal that forces acknowledgment before proceeding.
03 Override audit stamp
When a team member overrides the guard, the decision is recorded in the audit log with a timestamp and actor.
Decision logging — @avsbhq/node
Server-side SDK ships a createDecisionLog helper. Wire it into your own audit store or any structured log sink. Every decision — which variant was served, to which user, under which experiment — is captured as a structured record.
```typescript
import { AvsBServer, createDecisionLog } from '@avsbhq/node';

// sdkKey, ctx, and db (your own audit store) come from your application.
const avsb = new AvsBServer({ sdkKey });

// returns the chosen variant + a structured log entry
const { variant, log } = await avsb.decide('checkout-cta', ctx);

await db.auditLogs.create({
  data: createDecisionLog(log),
});
```
Auto-pause on error
When a variation’s error rate exceeds a configurable threshold (default 25%), the experiment pauses automatically. The error log tracks JavaScript exceptions, failed applies, and selector misses — with severity levels and error rates shown once the sample exceeds 50 exposed visitors.
Variation B error rate: 31% (auto-paused)
Threshold: 25% · 142 exposed visitors
error: Cannot read properties of null (.querySelector)
warning: Selector miss: .checkout-hero-cta, 0 elements found
error: Failed apply: variation JS threw on DOMContentLoaded
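The auto-pause rule can be sketched as a simple threshold check. Two assumptions are baked in and may not match the product: the error rate is taken as errored visitors divided by exposed visitors, and the pause decision waits for the same 50-visitor minimum sample that gates the displayed error rate.

```typescript
// Hypothetical counters for one variation.
interface VariationHealth {
  exposedVisitors: number;
  erroredVisitors: number;
}

function errorRate(health: VariationHealth): number {
  return health.exposedVisitors === 0 ? 0 : health.erroredVisitors / health.exposedVisitors;
}

// True when the variation should be paused: enough sample and error rate above the threshold.
function shouldAutoPause(
  health: VariationHealth,
  threshold: number = 0.25,
  minExposed: number = 50
): boolean {
  if (health.exposedVisitors <= minExposed) return false; // sample must exceed the minimum
  return errorRate(health) > threshold;
}
```

With 44 errored visitors out of 142 exposed, the rate is about 31%, which exceeds the default 25% threshold and would trigger a pause.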
Lifecycle with guard rails
Pausing stops new visitor bucketing and reverts active visitors to control — data is preserved and bucketing restarts on resume. Stopping is permanent: the experiment moves to Completed, all variation code is removed from production immediately, and historical results are preserved.
Running: bucketing visitors, collecting data
pause → Paused: reverts to control, resumable
stop → Completed: irreversible, code removed
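The lifecycle rules above amount to a small state machine. The sketch below models only the transitions this section states (pause and stop from Running, resume from Paused, nothing out of Completed). Whether a Paused experiment can be stopped directly is not stated, so that transition is left out. The type and function names are illustrative.

```typescript
type Status = "Running" | "Paused" | "Completed";
type Action = "pause" | "resume" | "stop";

// Allowed transitions; a missing entry means the action is rejected in that state.
const transitions: Record<Status, Partial<Record<Action, Status>>> = {
  Running: { pause: "Paused", stop: "Completed" },
  Paused: { resume: "Running" },
  Completed: {}, // stopping is permanent, so no action leads out of Completed
};

function applyAction(status: Status, action: Action): Status {
  const next = transitions[status][action];
  if (next === undefined) {
    throw new Error(`Cannot ${action} an experiment that is ${status}`);
  }
  return next;
}
```

Encoding Completed as a state with no outgoing transitions makes the irreversibility explicit instead of relying on callers to remember it.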
Audit log: every change, timestamped and attributed
Roles: granular access per action
Members: invite, seat, and manage your team