This article was written in collaboration with César Ortega, whose insights and discussions helped shape the ideas presented here.
The right data product begins with sitting down with business partners to understand day-to-day workflows, handoffs, and bottlenecks. In this article, we discuss a challenge that does not require a sophisticated solution, just a simple optimization problem. It is a good example of how basic tools can still solve high-value problems. Specifically, we focus on optimizing the assignment of online insurance policies to trusted partners (independent insurance agencies, IIAs) at a global insurance company.
Independent insurance agencies are privately owned intermediaries that sell insurance policies from multiple insurers. Unlike large insurance companies, they do not design products, set prices, underwrite risk, or pay claims; instead, they compare options across carriers and place the coverage that best fits the client’s needs, typically earning commissions for doing so. Here, the idea is to work together to deliver the best value for both the agency and the client.
Reducing complexity
Optimization in the real world is a spectrum. At one end are exact methods that can prove optimality, but they can be computationally heavy at scale and may struggle as the problem grows in size and operational detail. At the other end are heuristics, ranging from simple rule-based baselines that are easy to explain but hard to maintain as complexity grows (often living in large Excel sheets), to more advanced metaheuristics that scale well computationally but can be harder to justify, audit, or debug.
In practice, the best approach often sits in the middle: pragmatic “good-enough” formulations, built with carefully chosen constraints that reflect both business rules and real operational limits such as human workload and service quality.
The goal is not theoretical perfection, but a solution that is deliverable, comparable against baselines, and easy to iterate on. With a modular structure and a staged modeling strategy, we can start simple, measure impact with KPIs, both tangible (time to assignment, optimal agency selection, etc.) and intangible (avoiding unfair concentrations of policies in a few agencies, etc.), and evolve the system through small, safe improvements rather than waiting months for a textbook-optimal model.
That is why we chose a lightweight optimization formulation. It captures the constraints that matter (capacity, geographic eligibility, fairness, and bucket mix) and delivers a deterministic, auditable answer fast enough for real-time latency requirements. If needed, we can later extend the approach with decomposition methods, stronger solvers, or heuristics without changing the system’s core contract.
The baseline
Historically, these digital policy-to-agency assignments were done manually, guided by non-standard criteria and individual judgment. Although it sometimes worked, the process often resembled a round-robin approach: policies were distributed sequentially among available agencies (IIAs), with little consideration for differences in capacity, expertise, or expected performance.

While simple and seemingly fair, round-robin often led to delays, missed opportunities, and uncertainty about which agency (IIA) was the best fit. The process also did not scale well, creating further assignment delays, and the results did not consistently align with strategic goals such as profitability, quality, reproducibility, and transparency.
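To make the baseline concrete, here is a minimal sketch of a round-robin assignment. The function name and the policy and agency identifiers are hypothetical, chosen only for illustration; the point is that the rotation ignores capacity, expertise, and expected performance.

```python
from itertools import cycle


def round_robin(policies, agencies):
    """Assign each policy to the next agency in a fixed rotation.

    Nothing about the agency (capacity, geography, past performance)
    influences the choice: only its position in the rotation does.
    """
    rotation = cycle(agencies)
    return {policy: next(rotation) for policy in policies}


if __name__ == "__main__":
    # Hypothetical identifiers, for illustration only.
    plan = round_robin(["P1", "P2", "P3", "P4", "P5"], ["A1", "A2", "A3"])
    for policy, agency in plan.items():
        print(policy, "->", agency)
```

Any improvement over this baseline has to come from using information the rotation throws away, which is what the optimization model described next does.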
For this reason, we present how we solved an important problem using a lightweight integer programming approach that matches incoming online insurance policies to agencies in real time. The method maximizes a productivity score (reflecting how well an agency has performed in the past) while balancing agency capacity, fairness, and geographic admissibility constraints based on ZIP codes. We outline the mathematical formulation, the live-update logic, and the PuLP implementation.

What problem are we solving?
When a new online policy is sold to a client, someone still has to decide which agency should handle it. We rely on agencies because they add value beyond the usual, such as advocating at claim time, servicing changes and renewals, cross-selling, and more. Importantly, agencies also originate demand: they bring new clients (and consequently new policies) into the funnel through their relationships and local presence, which compounds growth for the insurance company.
From a customer perspective, this matters because the agency is often the main point of contact: the quality and speed of agency (IIA) service can shape the overall experience, especially during high-stress moments like claims or urgent coverage changes.
Since agencies differ in licensing, geography, product strengths, sales reach, and day-to-day capacity, the “best” agency can vary from moment to moment. A real-time assignment optimization system routes each new policy to eligible, available agencies that are most likely to deliver value to both the business and the client, are treated fairly under clear rules, and are best positioned to drive future growth.
Good Old-Fashioned optimization
To create a clear assignment process, it is essential to consider broader business goals, such as making sure the right agency handles the right type of policy to maximize key performance indicators (KPIs) like policy volume and quality. It is also important that agencies understand how these decisions are made.
So, the implemented optimization algorithm should intelligently allocate policies to agencies based on KPIs, including the volume and quality of policies they handle. Instead of relying on subjective or inconsistent human judgment, the algorithm uses real-time, data-driven decisions to optimize the policy assignment process efficiently and fairly.
The optimization model allocates policies to agencies based on measurable performance signals rather than subjective judgment. To make decisions reproducible, we translate agency performance into a numeric value the optimizer can use. This is done through productivity weights, where the key input is the swap ratio: a metric that captures how much value an agency brings per unit of policy it receives (based on, for example, loss ratio, tenure, premium, cross-selling, etc.).
In practice, the swap ratio allows the model to differentiate agencies that consistently deliver strong results from those that underperform. Higher-value policies can then be directed toward agencies that have demonstrated the ability to handle them effectively, while still respecting capacity limits, geographic eligibility, fairness requirements, and bucket-mix constraints.
Rather than relying on static rules, the system recalculates decisions as constraints change, ensuring that assignments remain aligned with current operational capacity and business priorities.
The system operates in two modes:
- Batch mode: Optimizes based on historical allowances, providing a comprehensive review of past data to improve future allocations.
- Online mode: Re-optimizes with each new incoming policy, including the new policy in the optimization process, then updates the inventory and refines the batch optimization accordingly.
In essence, batch mode handles historical data to establish baseline rules and patterns, while online mode ensures real-time adaptability by dynamically adjusting to new policies and conditions. This approach helps maintain optimal performance in a constantly changing environment.
The Solution: Optimization Algorithm
Given a set of agencies A and an incoming stream of policies P, we want to decide how many policies to assign to each agency and each policy category (Gold, Silver, Bronze) so that we maximize total productivity while adhering to certain constraints (agency capacity, ZIP code eligibility, total count, penalties, etc.).
Objective function:


- x is the decision variable of the optimization problem and represents the number of policies assigned to agency a and category c; it takes only non-negative integer values.
- A: set of agencies (size |A| = m); a ∈ A.
- C: set of categories {Gold, Silver, Bronze} (|C| = p = 3); c ∈ C.
- The productivity weight w is one number per agency that estimates the benefit of sending one more policy to that agency. It is calculated using the agency’s time (tenure) together with the swap ratio.
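From the definitions above (one weight per agency, one integer variable per agency and category), the objective can plausibly be written as follows. This is a reconstruction under those stated definitions; the subscripted symbols w_a and x_{a,c} are notation chosen here, and the exact form used in production is an assumption.

```latex
\max_{x} \; \sum_{a \in A} \sum_{c \in C} w_a \, x_{a,c}
```

Because w_a does not depend on the category in this form, the objective only sees each agency’s row total; how that total splits across Gold, Silver, and Bronze is shaped by the constraints.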
Rules we must respect (constraints):
Logical constraints:
Logical constraints are the ones required for the model to be mathematically well-defined regardless of business context (e.g., variables are integers and totals balance).
1. Integrality and non-negativity: you cannot assign negative or fractional policies.

2. Global conservation: the total number of policies assigned across all agencies and buckets must equal the total inventory available for assignment in this run (the sum of all agency capacities).
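The two logical constraints can be sketched in the same notation. The symbol N for the inventory of the current run is introduced here for illustration only; U_a denotes the capacity of agency a, as used in the business constraints.

```latex
\begin{align}
x_{a,c} &\in \mathbb{Z}_{\ge 0} && \forall\, a \in A,\; c \in C \\
\sum_{a \in A} \sum_{c \in C} x_{a,c} &= N, && N = \sum_{a \in A} U_a
\end{align}
```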

Business constraints:
Business constraints encode domain policy choices or operational rules (e.g., per-agency capacity, ZIP admissibility, bucket mix, online floors) that could change if the business rules change.
1. Per-agency capacity: an agency cannot receive more policies than it can currently handle (U_a); the bound applies to the agency’s row sum in the policy assignment matrix.


2. ZIP admissibility: agencies are only licensed or authorized to service policies in specific geographic areas.
If a ZIP is inadmissible for agency a, lock its row total.

By enforcing ZIP eligibility in the optimization, we ensure every assignment is operationally feasible and protect service quality, because agencies are strongest in the areas where they have local presence and expertise.
3. Bucket bounds: business controls that keep the monthly allocation balanced across policy tiers.


Without them, the optimizer could push almost everything into the most profitable tier, which can create risk concentration and operational strain. By setting minimums and maximums per bucket, you enforce a healthy mix that reflects risk appetite, service capacity, and strategic targets.
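The constraints above can be sketched as a small PuLP model. This is an illustrative sketch, not the production code: the agency names, weights, capacities, bucket bounds, and the function name solve_batch are all hypothetical. As a labeled assumption, the inventory is set below total capacity so that the optimizer has a real choice to make, and the ZIP rule is left out because the article treats it as a rule that applies when a single new policy arrives.

```python
import pulp

# Hypothetical toy inputs (not taken from the article's data).
AGENCIES = ["A1", "A2", "A3", "A4"]
CATEGORIES = ["Gold", "Silver", "Bronze"]
WEIGHTS = {"A1": 0.9, "A2": 0.7, "A3": 0.5, "A4": 0.3}  # productivity weight w_a
CAPACITY = {"A1": 4, "A2": 5, "A3": 6, "A4": 5}         # capacity U_a
BUCKET_MIN = {"Gold": 2, "Silver": 3, "Bronze": 3}
BUCKET_MAX = {"Gold": 5, "Silver": 6, "Bronze": 8}
INVENTORY = 12  # assumption: fewer policies than the total capacity of 20


def solve_batch():
    """Build and solve the batch allocation; return (status, objective, plan)."""
    model = pulp.LpProblem("batch_assignment", pulp.LpMaximize)

    # x[a][c]: non-negative integer number of policies for agency a, category c.
    x = pulp.LpVariable.dicts("x", (AGENCIES, CATEGORIES), lowBound=0, cat="Integer")

    # Objective: total productivity.
    model += pulp.lpSum(WEIGHTS[a] * x[a][c] for a in AGENCIES for c in CATEGORIES)

    # Global conservation: every policy in the inventory is assigned.
    model += pulp.lpSum(x[a][c] for a in AGENCIES for c in CATEGORIES) == INVENTORY

    # Per-agency capacity: the row total cannot exceed U_a.
    for a in AGENCIES:
        model += pulp.lpSum(x[a][c] for c in CATEGORIES) <= CAPACITY[a]

    # Bucket bounds: each category total stays within its min/max.
    for c in CATEGORIES:
        column_total = pulp.lpSum(x[a][c] for a in AGENCIES)
        model += column_total >= BUCKET_MIN[c]
        model += column_total <= BUCKET_MAX[c]

    model.solve(pulp.PULP_CBC_CMD(msg=False))
    plan = {a: {c: int(round(x[a][c].value())) for c in CATEGORIES} for a in AGENCIES}
    return pulp.LpStatus[model.status], pulp.value(model.objective), plan


if __name__ == "__main__":
    status, objective, plan = solve_batch()
    print(status, objective)
    for agency, row in plan.items():
        print(agency, row)
```

With these toy numbers the agencies are filled in order of weight, so the row totals are 4, 5, 3, and 0. Since the weight is per agency, several category splits give the same objective, and the solver may return any of them that respects the bucket bounds.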
What’s not in the batch
Batch mode is a full re-optimization on a fixed inventory. It finds the best baseline allocation without reacting to a single new policy event. For that reason, we exclude the following “live” constraints that are only needed when a new policy arrives:
- Per-agency floors from the previous allocation. Floors are an online safeguard that prevents any agency from losing policies when a new one arrives. In batch we are computing the baseline itself, so there is no “previous” baseline to protect.

- ZIP lock. This is a live-mode safety rule: when a single new policy arrives, if that policy’s ZIP is not allowed for agency a, we freeze agency a at cell level (Gold/Silver/Bronze) at its previous cell values, so the new policy cannot be assigned there and we do not move any existing policies away.
- No headroom (“+1”) trick. Headroom is used in online mode to keep feasibility when adding exactly one new policy. Batch mode does not add a single policy; it allocates the entire inventory at once.
- Bucket bounds still apply online: each new policy must keep Gold/Silver/Bronze totals within their min/max. These limits are updated on a monthly basis or as business requirements change.
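The live rules can be sketched as a single online step in PuLP. This is a minimal sketch under assumptions: all names and numbers are hypothetical, the previous allocation is given as a fixed dictionary, and the new policy is assumed to arrive with a known category and a set of ZIP-admissible agencies. The headroom (“+1”) trick is not reproduced, because the text does not specify exactly how it relaxes the bounds.

```python
import pulp

# Hypothetical toy inputs (not taken from the article's data).
AGENCIES = ["A1", "A2", "A3", "A4"]
CATEGORIES = ["Gold", "Silver", "Bronze"]
WEIGHTS = {"A1": 0.9, "A2": 0.7, "A3": 0.5, "A4": 0.3}
CAPACITY = {"A1": 4, "A2": 5, "A3": 6, "A4": 5}
BUCKET_MAX = {"Gold": 5, "Silver": 6, "Bronze": 8}
PREVIOUS = {
    "A1": {"Gold": 1, "Silver": 2, "Bronze": 0},  # one slot left
    "A2": {"Gold": 1, "Silver": 1, "Bronze": 3},  # at capacity
    "A3": {"Gold": 1, "Silver": 1, "Bronze": 1},
    "A4": {"Gold": 0, "Silver": 0, "Bronze": 0},
}


def assign_new_policy(prev, category, admissible):
    """Return the agency that receives one new policy, or None if none can."""
    prev_total = {c: sum(prev[a][c] for a in AGENCIES) for c in CATEGORIES}
    if prev_total[category] + 1 > BUCKET_MAX[category]:
        return None  # bucket maximum would be exceeded

    model = pulp.LpProblem("online_step", pulp.LpMaximize)
    x = pulp.LpVariable.dicts("x", (AGENCIES, CATEGORIES), lowBound=0, cat="Integer")
    model += pulp.lpSum(WEIGHTS[a] * x[a][c] for a in AGENCIES for c in CATEGORIES)

    for a in AGENCIES:
        for c in CATEGORIES:
            if a in admissible:
                model += x[a][c] >= prev[a][c]  # floor: no agency loses policies
            else:
                model += x[a][c] == prev[a][c]  # ZIP lock: freeze every cell
        model += pulp.lpSum(x[a][c] for c in CATEGORIES) <= CAPACITY[a]

    # Exactly one policy is added, and it lands in the new policy's category.
    for c in CATEGORIES:
        added = 1 if c == category else 0
        model += pulp.lpSum(x[a][c] for a in AGENCIES) == prev_total[c] + added

    model.solve(pulp.PULP_CBC_CMD(msg=False))
    if pulp.LpStatus[model.status] != "Optimal":
        return None
    for a in AGENCIES:
        if round(x[a][category].value()) > prev[a][category]:
            return a
    return None


if __name__ == "__main__":
    print(assign_new_policy(PREVIOUS, "Gold", {"A1", "A3"}))
    print(assign_new_policy(PREVIOUS, "Gold", {"A2", "A3", "A4"}))
```

Because the floors and the fixed category totals leave room for exactly one increment, the solve reduces to picking the highest-weight admissible agency with spare capacity. Keeping it as a model rather than a hand-written rule means further constraints can be added without changing the surrounding code.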
Why this works
By separating the process into batch (global balance) and online (local adjustment), the system achieves both stability and responsiveness. Batch optimization provides a consistent, auditable reference point, while live decisioning handles real-time arrivals without disrupting the overall structure. This combination allows fast operational decisions while preserving fairness, capacity control, and alignment with strategic targets.
E2E Implementation
The end-to-end process involves more than encoding rules in an optimization model. In our AWS setup, Airflow orchestrates scheduled data pipelines that refresh intermediate tables on daily, weekly, and monthly cadences. These jobs pull upstream data, build curated datasets and live inventory tables, and store them in S3. The Optimization service reads the latest inputs from S3 and, when needed, calls a SageMaker endpoint to score candidates and select the best agency under the capacity, fairness, and ZIP-code constraints described earlier. External applications send requests through an HTTPS endpoint on API Gateway, which routes them via middleware responsible for authentication, validation, and request transformation before invoking the Optimization service (and SageMaker, if required). The response (containing the selected agency and decision metadata) is returned to the Contact Center and ultimately the end user. Finally, results and logs are written back to S3, feeding Airflow-driven monitoring and retraining, and Jenkins redeploys updated components to close the loop.
Toy example
To illustrate the mechanics of the original production implementation in a simplified and self-contained way, we create a synthetic, runnable toy example demonstrating the core logic behind policy-to-agency assignment using integer linear programming with the PuLP library in Python.
The example sets up a small scenario with four agencies and three policy categories (“Gold,” “Silver,” and “Bronze”). Productivity scores and capacity limits are assigned to each agency, together with constraints such as ZIP code eligibility and minimum/maximum policy mix per category. The goal is to maximize the total productivity score while respecting these constraints.
While the example is synthetic and uses randomly generated weights and capacities, it illustrates the fundamental optimization logic and workflow, including variable construction, constraint enforcement, and solution interpretation. This approach can be scaled and adapted to real-world data and business constraints, as demonstrated in the full implementation.

In Table 1, we illustrate a simple iteration. Batch mode first computes a baseline monthly plan that allocates the initial inventory. Online mode then simulates incoming policies one at a time toward a target monthly total; each arrival triggers a re-optimization that preserves existing allocations and assigns only the incremental policy to an eligible agency (e.g., respecting ZIP admissibility). In this example, the new policy is a high-value (Gold) policy and its ZIP is admissible for A1, so the increment goes to A1. If the ZIP were inadmissible for A1, the policy would be routed to the best admissible agency instead. This process repeats until the monthly bucket target is reached.
Code
The code is available in this repository: Link to the repository
To run the experiments, set up a Python ≥3.11 environment with the required libraries (e.g., pulp). It is recommended to use a virtual environment (via venv or conda) to keep dependencies isolated.
Conclusion
Compared to a round-robin baseline that assigns policies with no intelligence, our approach uses a productivity matrix derived from a swap ratio to route policies where they are expected to create the most value. The optimization balances tangible metrics (the measurable value and capacity each agency can deliver) with intangible considerations (fairness, stability, and the trust agencies place in a predictable allocation process). In short, it replaces a blind rotation with a transparent, auditable decision rule that reflects both performance and operational constraints.
By making policy assignments more transparent and predictable, we have built trust and collaboration. Agencies (IIAs) now understand how decisions are made, which has increased their confidence in the process.
This example shows how even a relatively small optimization problem can generate meaningful improvements. By starting with a simple, well-defined formulation, we create a solid foundation that delivers immediate value while enabling future evolution. The same framework can be extended through incremental iterations, incorporating richer signals and more advanced decision logic. In practice, the greatest impact often comes not from building a complex system upfront, but from starting simple and improving continuously as the business learns and the data matures.
References
[1] PuLP documentation, “PuLP 3.3.0 documentation.” COIN-OR. https://coin-or.github.io/pulp/main/includeme.html

