The 12 Analysis Algorithms Every Data Team Needs (And Shouldn't Build Themselves)
The Build Trap
Every mature data team eventually faces the same temptation: "We should just build this ourselves." The logic is appealing. You have smart engineers. You know your data. How hard could it be to write some analysis logic?
The answer, as anyone who has tried will tell you, is: the first version is easy. The second version is hard. And maintaining it while your business changes underneath you is a full-time job that never ends.
Fig ships 12 analysis algorithms that cover the most common — and most valuable — analytical patterns in business intelligence. Each one has been refined across hundreds of deployments and edge cases. Here's what they do, when to use them, and what the alternative looks like.
1. Root Cause Analysis (RCA)
Use this when a metric changes unexpectedly and you need to understand the causal chain behind it.
So that you can skip the 2-day manual investigation and go straight to the "what do we do about it" conversation.
RCA traverses your knowledge graph from the affected metric upstream through its causal drivers, testing each node for anomalies and quantifying its contribution to the downstream change. It doesn't just find correlations — it follows the known causal structure of your business.
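The traversal idea fits in a few lines. Everything below — the driver graph, the period-over-period changes, the 5% anomaly threshold — is invented for illustration; the real implementation handles far more than this sketch:

```python
from collections import deque

# Hypothetical causal graph: metric -> its upstream drivers.
drivers = {
    "revenue": ["deals_won", "avg_deal_size"],
    "deals_won": ["pipeline", "win_rate"],
}

# Hypothetical period-over-period changes (fractional) per node.
change = {
    "revenue": -0.12, "deals_won": -0.10, "avg_deal_size": -0.02,
    "pipeline": -0.01, "win_rate": -0.09,
}

def root_cause_candidates(metric, threshold=0.05):
    """Walk upstream from `metric`; when an anomalous driver has an
    anomalous driver of its own, keep walking — the deepest anomalous
    nodes are the root-cause candidates."""
    candidates = []
    queue = deque(drivers.get(metric, []))
    while queue:
        node = queue.popleft()
        if abs(change[node]) < threshold:
            continue  # this driver didn't move; not part of the chain
        anomalous_upstream = [u for u in drivers.get(node, [])
                              if abs(change[u]) >= threshold]
        if anomalous_upstream:
            queue.extend(anomalous_upstream)  # cause lies further upstream
        else:
            candidates.append(node)           # deepest anomalous node
    return sorted(candidates, key=lambda n: abs(change[n]), reverse=True)

print(root_cause_candidates("revenue"))  # win_rate is the deepest anomalous driver
```

The point of the sketch: the search follows edges in the declared causal graph, not correlations, so an unrelated metric that happened to move in the same week never shows up as a candidate.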
The alternative: Write custom SQL queries for each possible driver. Manually compare time periods. Build a spreadsheet to weight contributions. Repeat every time something changes.
2. Concentration Analysis
Use this when you need to understand whether a metric change is concentrated in a few entities or spread across many.
So that you can distinguish between systemic problems (market shifts, product issues) and isolated ones (key account churn, regional disruptions).
Concentration Analysis applies Pareto decomposition across any dimension — customers, products, regions, channels, sales reps. It identifies the minimum set of entities that explain the majority of a metric movement.
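The Pareto idea is simple enough to sketch with made-up numbers — the customer names and deltas below are hypothetical:

```python
def concentration(contributions, share=0.8):
    """Return the smallest set of entities whose combined (absolute)
    contribution explains at least `share` of the total movement."""
    total = sum(abs(v) for v in contributions.values())
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    picked, running = [], 0.0
    for name, value in ranked:
        picked.append(name)
        running += abs(value)
        if running / total >= share:
            break
    return picked

# Hypothetical per-customer revenue deltas for one month.
deltas = {"acme": -50_000, "globex": -4_000, "initech": -3_000,
          "umbrella": -2_000, "hooli": -1_000}
print(concentration(deltas))  # one account explains over 80% of the decline
```

A concentrated result (one or two names) points at account-level follow-up; a long list points at something systemic.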
The alternative: Export data to a spreadsheet. Sort by contribution. Manually calculate cumulative percentages. Build a chart. Do this again for every dimension you want to check.
3. Trend Detection
Use this when you want to identify metrics that are trending in a direction that matters before they hit a crisis threshold.
So that you can catch slow-moving problems early — the kind that dashboards miss because no single data point looks alarming.
Trend Detection fits statistical models to your metric time series and identifies sustained directional movements, separating real trends from noise and seasonality. It flags metrics where the trajectory, if continued, will hit a meaningful threshold within a defined time horizon.
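The core move — fit a trend line, project it forward to a threshold — can be sketched with ordinary least squares. The weekly NPS numbers and the floor of 30 are hypothetical:

```python
from statistics import mean

def steps_to_threshold(series, threshold):
    """Fit a least-squares line to the series and return how many steps
    until the fitted trend crosses `threshold` (None if it never will)."""
    n = len(series)
    xs = list(range(n))
    x_bar, y_bar = mean(xs), mean(series)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, series)) / \
            sum((x - x_bar) ** 2 for x in xs)
    intercept = y_bar - slope * x_bar
    current = intercept + slope * (n - 1)  # fitted value at the latest point
    if slope == 0 or (threshold - current) / slope < 0:
        return None  # flat, or trending away from the threshold
    return (threshold - current) / slope

# Hypothetical weekly NPS readings drifting toward a floor of 30.
nps = [42, 41, 41, 40, 39, 38, 38, 37]
print(round(steps_to_threshold(nps, 30), 1))  # weeks until the fit hits 30
```

No single week looks alarming here, which is exactly the kind of decline a static alert misses; the fitted trajectory still gives a concrete time-to-threshold.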
The alternative: Set static threshold alerts that either fire too often (noisy) or too late (useless). Manually review dozens of charts in a weekly meeting, hoping someone notices the slow decline.
4. Anomaly Detection
Use this when you need to know whether a metric value falls outside its expected range based on historical patterns, seasonality, and known business cycles.
So that you can react to unusual events immediately rather than discovering them days later during a reporting cycle.
Fig's Anomaly Detection goes beyond simple standard deviation thresholds. It accounts for seasonality, day-of-week patterns, growth trends, and known events. A 20% spike in support tickets on Black Friday isn't an anomaly. A 20% spike on a random Tuesday in March probably is.
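The seasonality point can be illustrated with the simplest possible version: score a new value only against history from the same weekday. The ticket counts below are hypothetical, and a production system would model much more than day-of-week:

```python
from statistics import mean, stdev

def is_anomaly(history, value, weekday, z_cut=3.0):
    """Compare `value` only against past observations from the same
    weekday, so routine day-of-week peaks are not flagged."""
    peers = [v for d, v in history if d == weekday]
    mu, sigma = mean(peers), stdev(peers)
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > z_cut

# Hypothetical (weekday, ticket_count) history: Mondays always run hot.
history = [("mon", 200), ("tue", 100), ("mon", 210), ("tue", 104),
           ("mon", 195), ("tue", 98), ("mon", 205), ("tue", 102)]

print(is_anomaly(history, 208, "mon"))  # normal for a Monday
print(is_anomaly(history, 208, "tue"))  # wildly high for a Tuesday
```

Same number, opposite verdicts — which is why a single global threshold on the raw value can't work.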
The alternative: Build a statistical model for each metric. Maintain those models as data patterns change. Handle edge cases like holidays, product launches, and data pipeline delays. Staff someone to triage the alerts.
5. Correlation Discovery
Use this when you suspect two metrics are related but don't know the strength, direction, or lag time of the relationship.
So that you can build hypotheses about causal relationships that can be tested and added to your knowledge graph.
Correlation Discovery tests pairwise relationships across your metric catalog, scanning multiple lag periods to find the strongest association. It distinguishes between leading, lagging, and coincident relationships.
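The lag-scanning step looks like this in miniature. The two series are synthetic — revenue here is constructed to follow signups by exactly two periods — and a real scan would also correct for multiple comparisons:

```python
from statistics import mean

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    mx, my = mean(xs), mean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = (sum((x - mx) ** 2 for x in xs) *
           sum((y - my) ** 2 for y in ys)) ** 0.5
    return num / den

def best_lag(leader, follower, max_lag=4):
    """Shift `leader` forward 0..max_lag steps and return the lag with
    the strongest absolute correlation against `follower`."""
    scored = {}
    for lag in range(max_lag + 1):
        a = leader[:-lag] if lag else leader
        b = follower[lag:]
        scored[lag] = pearson(a, b)
    return max(scored, key=lambda k: abs(scored[k])), scored

# Synthetic series: signups lead revenue by exactly 2 periods.
signups = [10, 12, 15, 14, 18, 20, 22, 25, 24, 28]
revenue = [0, 0] + [s * 3 for s in signups[:-2]]

lag, scores = best_lag(signups, revenue)
print(lag)  # the scan recovers the 2-period lead
```

A zero-lag comparison would have understated this relationship; the scan is what surfaces signups as a *leading* indicator.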
The alternative: Write correlation queries in SQL or Python for each metric pair. Test multiple lag periods manually. Build a matrix of results. Try to filter signal from noise across hundreds of combinations.
6. Cohort Analysis
Use this when you need to compare how different groups of customers, users, or deals behave over time.
So that you can understand whether changes in aggregate metrics are driven by behavioral shifts in existing cohorts or by differences in the composition of new cohorts.
Cohort Analysis segments entities by their start date (or any defining characteristic) and tracks their behavior through a standardized timeline. This is essential for separating "our product is getting worse" from "we're acquiring a different type of customer."
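The time-alignment logic — the part that's tedious to hand-roll — reduces to re-indexing each entity's activity to "periods since start." Customers, dates, and spend below are all hypothetical:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical events: (customer, signup_month, activity_month, spend).
events = [
    ("a", 0, 0, 100), ("a", 0, 1, 80), ("a", 0, 2, 60),
    ("b", 0, 0, 120), ("b", 0, 1, 90),
    ("c", 3, 3, 100), ("c", 3, 4, 95),
]

def cohort_curves(events):
    """Re-index each customer's activity to 'months since signup' and
    average spend per (cohort, aligned month) cell."""
    buckets = defaultdict(list)
    for _, signup, month, spend in events:
        buckets[(signup, month - signup)].append(spend)
    return {k: mean(v) for k, v in sorted(buckets.items())}

print(cohort_curves(events))  # each cohort on a comparable month-0, month-1, ... axis
```

Once every cohort sits on the same axis, "month-1 retention is falling for newer cohorts" becomes a direct comparison instead of a modeling exercise.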
The alternative: Write cohort queries from scratch. Manage the time-alignment logic. Handle edge cases around cohort boundaries. Rebuild when someone wants a different cohort definition.
7. Comparative Analysis
Use this when you need to rigorously compare two time periods, two segments, two products, or any two groups across multiple metrics simultaneously.
So that you can make apples-to-apples comparisons that account for seasonality, sample size, and statistical significance rather than eyeballing two bar charts.
Comparative Analysis normalizes for differences in scale and timing, applies appropriate statistical tests, and ranks the dimensions where the two groups differ most meaningfully.
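The "is 3% meaningful or noise" argument has a standard statistical answer. As one illustrative test among the several an analysis like this would apply, here is Welch's t statistic on two hypothetical sets of weekly conversion rates:

```python
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic: difference in means scaled by the combined
    standard error, without assuming equal variances."""
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    return (mean(a) - mean(b)) / se

# Hypothetical weekly conversion rates (%) for two landing pages.
page_a = [3.1, 3.3, 3.0, 3.2, 3.4, 3.1]
page_b = [2.6, 2.7, 2.5, 2.8, 2.6, 2.7]

t = welch_t(page_a, page_b)
print("meaningful" if abs(t) > 2 else "likely noise")
```

With consistently tight week-to-week variation, even a modest gap clears the bar; the same gap with noisy weeks would not — which is the eyeball comparison's blind spot.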
The alternative: Pull data for both groups. Compute percentage changes. Argue about whether a 3% difference is meaningful or noise. Build a presentation that doesn't answer the follow-up questions.
8. Predictive Forecasting
Use this when you need forward-looking projections for any metric, incorporating seasonality, trends, and known causal drivers.
So that you can set realistic targets, plan resource allocation, and identify when a metric is likely to deviate from plan before it actually does.
Predictive Forecasting in Fig is knowledge-graph-aware. Instead of forecasting metrics in isolation, it uses the known causal relationships to produce forecasts that are consistent across the metric hierarchy. If you forecast a 10% increase in MQLs, the downstream forecasts for SQLs, Opportunities, and Revenue adjust accordingly.
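The consistency property is the interesting part, and it falls out of propagating one assumption through the graph. The funnel stages and conversion rates below are hypothetical:

```python
# Hypothetical funnel edges with measured conversion rates / deal value.
conversion = {
    ("mqls", "sqls"): 0.4,
    ("sqls", "opps"): 0.5,
    ("opps", "revenue"): 25_000,  # average revenue per opportunity
}

def propagate(forecast_mqls):
    """Push an MQL forecast down the funnel so every stage's forecast
    stays consistent with the same top-of-funnel assumption."""
    sqls = forecast_mqls * conversion[("mqls", "sqls")]
    opps = sqls * conversion[("sqls", "opps")]
    revenue = opps * conversion[("opps", "revenue")]
    return {"mqls": forecast_mqls, "sqls": sqls, "opps": opps, "revenue": revenue}

base = propagate(1000)
lifted = propagate(1100)  # a 10% MQL lift flows through automatically
print(lifted["revenue"] / base["revenue"] - 1)  # the revenue forecast moves 10% too
```

Forecast each stage independently instead, and the four numbers drift out of agreement the first time any one model is retrained.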
The alternative: Build a forecasting model per metric (usually in a spreadsheet). Manually ensure consistency across related metrics. Rerun models quarterly. Explain why your pipeline forecast doesn't match your revenue forecast.
9. Segmentation Analysis
Use this when you want to discover natural groupings within your data that behave differently from each other.
So that you can tailor strategies to distinct segments rather than treating your entire customer base or product portfolio as monolithic.
Segmentation Analysis identifies clusters based on behavioral patterns rather than predefined categories. It might reveal that your "Enterprise" segment actually contains two distinct behavioral groups that respond to entirely different value propositions.
The alternative: Define segments based on intuition. Validate with ad hoc queries. Miss the segments you didn't think to look for. Operate on assumptions that may have been true two years ago.
10. Attribution Analysis
Use this when you need to understand which inputs, channels, or activities contributed to an outcome and in what proportion.
So that you can allocate resources to the channels and activities that actually drive results rather than the ones that merely correlate with them.
Attribution Analysis uses the knowledge graph to weight contributions based on causal strength and timing, going beyond simplistic first-touch or last-touch models to distribute credit based on measured impact.
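Contrasted with first-touch or last-touch, impact-weighted credit looks like this in miniature. The channels, lift figures, and deal value are hypothetical stand-ins for what the knowledge graph would actually measure:

```python
# Hypothetical measured lift (causal strength) per channel.
lift = {"webinar": 0.30, "email": 0.10, "paid_search": 0.20}

def attribute(touches, revenue):
    """Split deal revenue across touches in proportion to each channel's
    measured lift, instead of giving all credit to first or last touch."""
    weights = [lift[t] for t in touches]
    total = sum(weights)
    return {t: revenue * w / total for t, w in zip(touches, weights)}

credit = attribute(["webinar", "email", "paid_search"], 60_000)
print(credit)  # webinar earns half the credit; last touch earns a third
```

Last-touch would have handed all $60k to paid_search; weighting by measured lift gives the webinar — the strongest driver — its share.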
The alternative: Implement a rules-based attribution model. Argue endlessly about the rules. Settle on last-touch because it's easiest. Make suboptimal allocation decisions based on an attribution model everyone knows is wrong.
11. Elasticity Measurement
Use this when you need to quantify how sensitive one metric is to changes in another — for example, how much revenue changes when you increase price by 1%.
So that you can make quantitative predictions about the impact of proposed changes before you implement them.
Elasticity Measurement fits response curves to historical data, accounting for confounding variables and non-linear relationships. It produces specific coefficients: "A 1% increase in X historically produces a 0.7% change in Y, with a 2-week lag."
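The classic way to get such a coefficient is a log-log regression: the slope of ln(y) on ln(x) is the elasticity. The price/volume observations below are synthetic (generated from a known −0.7 curve, so the sketch is checkable); a real fit would also control for confounders:

```python
from math import log
from statistics import mean

def elasticity(xs, ys):
    """Slope of ln(y) on ln(x): the % change in y per 1% change in x."""
    lx, ly = [log(v) for v in xs], [log(v) for v in ys]
    mx, my = mean(lx), mean(ly)
    return sum((a - mx) * (b - my) for a, b in zip(lx, ly)) / \
           sum((a - mx) ** 2 for a in lx)

# Synthetic (price, units_sold) observations following y ~ x^-0.7.
prices = [10, 12, 14, 16, 18]
units = [round(5000 * p ** -0.7) for p in prices]
print(round(elasticity(prices, units), 2))  # recovers roughly -0.7
```

Read directly: a 1% price increase historically coincides with about a 0.7% drop in units — the kind of specific coefficient the prose above describes.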
The alternative: Run A/B tests for everything (expensive and slow). Or rely on gut feel (cheap and unreliable). Or build econometric models in R or Python that require a PhD to maintain.
12. Scenario Modeling
Use this when you want to understand the downstream impact of a proposed change across your entire business.
So that you can evaluate strategic decisions — pricing changes, market expansions, product launches — with a quantified view of expected outcomes and risk.
Scenario Modeling propagates a hypothetical change through your knowledge graph, using measured elasticities and lag times to project the cascading effects. "If we increase marketing spend by 20%, what happens to pipeline in 6 weeks, and revenue in 12 weeks?"
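That marketing-spend question can be sketched directly: multiply elasticities and sum lags along each causal path. The graph edges and coefficients are hypothetical:

```python
# Hypothetical edges: upstream -> [(downstream, elasticity, lag_weeks)].
edges = {
    "marketing_spend": [("pipeline", 0.5, 6)],
    "pipeline": [("revenue", 0.8, 6)],
}

def scenario(start, pct_change):
    """Propagate a fractional change through the graph, multiplying
    elasticities and accumulating lags along each causal path."""
    effects, frontier = {}, [(start, pct_change, 0)]
    while frontier:
        node, change, lag = frontier.pop()
        for child, elasticity, edge_lag in edges.get(node, []):
            child_change = change * elasticity
            child_lag = lag + edge_lag
            effects[child] = (child_change, child_lag)
            frontier.append((child, child_change, child_lag))
    return effects

print(scenario("marketing_spend", 0.20))
# pipeline moves ~+10% at week 6; revenue moves ~+8% at week 12
```

The spreadsheet version of this buries the same elasticities in cell formulas; here they are declared once, on the edges, and every scenario reuses them.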
The alternative: Build a spreadsheet model. Fill it with assumptions. Present it to leadership as a "model" while knowing that the assumptions are the entire ballgame.
Why You Shouldn't Build These Yourself
Each of these algorithms seems straightforward in isolation. A competent data scientist could build a basic version of any one of them in a week or two. But there are three reasons the "build it yourself" approach consistently fails:
Edge cases are the product. The basic algorithm is 20% of the work. Handling null values, sparse data, seasonality, outliers, mixed granularities, and inconsistent time zones is the other 80%.
Maintenance compounds. Twelve algorithms, each requiring updates as your data model changes, as new edge cases emerge, as team members rotate. Within a year, you're spending more time maintaining analysis code than doing analysis.
Integration is the hard part. The real value isn't in any single algorithm — it's in how they work together. RCA calls Concentration Analysis to decompose its findings. Predictive Forecasting uses Elasticity Measurements. Anomaly Detection triggers RCA automatically. Building one algorithm is a project. Building an integrated suite is a platform.
The data team's job is to generate insight and drive decisions. The algorithms are the means, not the end. Use the ones that work, and spend your time on the problems that are actually unique to your business.
Ready to see Fig in action?
Start with free credits. Connect your data warehouse. See your first causal analysis in minutes.
Start With Free Credits →