Designing Trustworthy ML Models: Alan & Aida Discover Monotonicity in Machine Learning

Machine learning models are powerful, but sometimes they produce predictions that break human intuition.

Imagine this: you’re predicting house prices. A 2,000 sq. ft. home is predicted to be cheaper than a 1,500 sq. ft. home. Sounds wrong, right?

This is where monotonicity constraints step in. They make sure models follow the logical business rules we expect.

Let’s follow two colleagues, Alan and Aida, on their journey to discover why monotonicity matters in machine learning.

The Story: Alan & Aida’s Discovery

Alan is a practical engineer. Aida is a principled scientist. Together, they’re building a house price prediction model.

Alan proudly shows Aida his model results:

“Look! R² is great, the error is low. We’re ready to deploy!”

Aida takes the model out for testing:

For a house with 1,500 sq ft → Model predicts $300,000
For a house with 2,000 sq ft → Model predicts $280,000 😮

Aida frowns as she looks at the predictions:

“Wait a second… Why is this 2,000 sq. ft. home predicted to be cheaper than a 1,500 sq. ft. home? That doesn’t make sense.”

Alan shrugs:

“That’s because the model learned noise in the training data. It’s not always logical. But the accuracy is great overall. Isn’t that enough?”

Aida shakes her head:

“Not really. A trustworthy model must not only be accurate but also follow logic people can trust. Customers won’t trust us if bigger homes sometimes look cheaper. We need a guarantee. This is a monotonicity problem.”

And just like that, Alan learns his next big ML lesson: metrics aren’t everything.

What’s Monotonicity in ML?

Aida explains:

“Monotonicity means predictions move in a consistent direction as inputs change. It’s like telling the model: as square footage goes up, price should never go down. We call this monotone increasing. Or, as another example, as a house gets older, predicted prices should not go up. We call this monotone decreasing.”

Alan concludes:

“So monotonicity matters here because it:

Aligns with business logic, and
Improves trust & interpretability.”

Aida nods:

“Yes. Plus, it helps meet fairness & regulatory expectations.”
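Aida’s definition can be written down as a small check. The sketch below is only an illustration: the function name `is_monotone` and the sample numbers are assumptions made here, not part of Alan and Aida’s code. It expects predictions that are already ordered by the input feature and allows ties, matching the “never go down” and “should not go up” wording.

```python
from typing import Sequence


def is_monotone(predictions: Sequence[float], increasing: bool = True) -> bool:
    """Return True if predictions never move against the expected direction.

    `predictions` must already be ordered by the input feature (smallest first).
    Ties are allowed, so a flat stretch does not count as a violation.
    """
    pairs = zip(predictions, predictions[1:])
    if increasing:
        return all(later >= earlier for earlier, later in pairs)
    return all(later <= earlier for earlier, later in pairs)


# Hypothetical predictions ordered by increasing square footage.
print(is_monotone([250_000, 270_000, 280_000]))  # True
print(is_monotone([250_000, 270_000, 260_000]))  # False: the last value dips

# Hypothetical predictions ordered by increasing house age.
print(is_monotone([350_000, 320_000, 320_000], increasing=False))  # True
```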

Visualizing the Problem

Aida creates a toy dataset in pandas to show the problem:

import pandas as pd

# Example toy dataset
data = pd.DataFrame({
    "sqft": [1200, 1500, 1800, 2000, 2200, 2250],
    "predicted_price": [250000, 270000, 260000, 280000, 290000, 285000]  # Notice the dips at 1800 sqft and 2250 sqft
})

# Sort by sqft
data_sorted = data.sort_values("sqft")

# Check the change in predicted price between consecutive rows
data_sorted["price_diff"] = data_sorted["predicted_price"].diff()

# Find monotonicity violations (where price decreases as sqft increases)
violations = data_sorted[data_sorted["price_diff"] < 0]
print("Monotonicity violations:\n", violations)

Monotonicity violations:
    sqft  predicted_price  price_diff
2  1800           260000    -10000.0
5  2250           285000     -5000.0

And then she plots the violations:

import matplotlib.pyplot as plt

plt.figure(figsize=(7, 5))
plt.plot(data["sqft"], data["predicted_price"], marker="o", linestyle="-.", color="steelblue", label="Predicted Price")

# Highlight the dips
for sqft, price, price_diff in violations.values:
    plt.scatter(sqft, price, color="red", zorder=5)
    plt.text(x=sqft, y=price - 3000, s="Dip!", color="red", ha="center")

# Labels and title
plt.title("Predicted House Prices vs. Square Footage")
plt.xlabel("Square Footage (sqft)")
plt.ylabel("Predicted Price ($)")
plt.grid(True, linestyle="--", alpha=0.6)
plt.legend()
plt.show()

Aida points to the dips: “Here’s the problem: 1,800 sq. ft. is priced lower than 1,500 sq. ft. and 2,250 sq. ft. is priced lower than 2,200 sq. ft.”

Fixing It with Monotonicity Constraints in XGBoost

Alan retrains the model, setting a monotonic increasing constraint on square footage and a monotonic decreasing constraint on house age.
This forces the model’s predictions to always

increase (or stay the same) when square footage increases, given all other features are fixed.
decrease (or stay the same) when house age increases, given all other features are fixed.

He uses XGBoost, which makes it easy to enforce monotonicity:

import pandas as pd
import xgboost as xgb
from sklearn.model_selection import train_test_split

df = pd.DataFrame({
    "sqft": [1200, 1500, 1800, 2000, 2200],
    "house_age": [30, 20, 15, 10, 5],
    "price": [250000, 270000, 280000, 320000, 350000]
})

X = df[["sqft", "house_age"]]
y = df["price"]

# Hold out 20% of the rows (one house here) for a quick spot check
X_train, X_test, y_train, y_test = train_test_split(X, y,
                                    test_size=0.2, random_state=42)

# Constraints keyed by feature name: 1 = increasing, -1 = decreasing
monotone_constraints = {
    "sqft": 1,        # Monotone increasing
    "house_age": -1   # Monotone decreasing
}

model = xgb.XGBRegressor(
    monotone_constraints=monotone_constraints,
    n_estimators=200,
    learning_rate=0.1,
    max_depth=4,
    random_state=42
)

model.fit(X_train, y_train)

# Predict on the DataFrame itself so feature names match those seen in training
print(X_test)
print("Predicted price:", model.predict(X_test))

   sqft  house_age
1  1500         20
Predicted price: [250000.84]
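Besides the dictionary keyed by feature name used above, XGBoost also accepts the constraints in positional form, as a string such as "(1,-1)" with one entry per feature column in order. The helper below is a hypothetical convenience written for this article, not part of XGBoost; the name `constraints_to_string` and the extra `bedrooms` column are assumptions for illustration. It turns named rules into the positional string and fills in 0 (unconstrained) for any column without a rule.

```python
from typing import Dict, Sequence


def constraints_to_string(columns: Sequence[str], rules: Dict[str, int]) -> str:
    """Build a positional constraint string such as "(1,-1)".

    `columns` is the feature order used for training.
    Columns without a rule get 0, which means "unconstrained".
    """
    unknown = set(rules) - set(columns)
    if unknown:
        raise ValueError(f"Rules refer to unknown columns: {sorted(unknown)}")
    for name, direction in rules.items():
        if direction not in (-1, 0, 1):
            raise ValueError(f"Direction for {name!r} must be -1, 0, or 1")
    return "(" + ",".join(str(rules.get(col, 0)) for col in columns) + ")"


rules = {"sqft": 1, "house_age": -1}
print(constraints_to_string(["sqft", "house_age"], rules))
# (1,-1)

# A hypothetical third column with no rule is left unconstrained.
print(constraints_to_string(["sqft", "bedrooms", "house_age"], rules))
# (1,0,-1)
```

Keeping the rules keyed by name and deriving the positional form from the column order avoids silently constraining the wrong feature when columns are reordered.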

Alan hands the new model over to Aida. “Now the model respects domain knowledge. Predictions for larger houses will never dip below those for smaller ones of the same age.”

Aida tests the model again:

1500 sq ft → $300,000
2000 sq ft → $350,000
2500 sq ft → $400,000
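Aida’s spot check can be generalized into a sweep: vary one feature over a grid, hold every other feature fixed, and report any step where the prediction moves the wrong way. The sketch below is an assumption-laden illustration, not the code Aida ran. The function `find_violations` and the stand-in `noisy_model`, which reuses the numbers from the toy dataset, are hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Row = Dict[str, float]


def find_violations(
    predict: Callable[[Row], float],
    base_row: Row,
    feature: str,
    grid: Sequence[float],
    direction: int,
) -> List[Tuple[float, float]]:
    """Sweep one feature over `grid` while every other feature stays fixed.

    direction = 1 means predictions must never decrease along the grid,
    direction = -1 means they must never increase.
    Returns (previous_value, current_value) pairs where the rule is broken.
    """
    values = sorted(grid)
    if not values:
        return []
    violations = []
    previous_value = values[0]
    previous_pred = predict({**base_row, feature: previous_value})
    for value in values[1:]:
        pred = predict({**base_row, feature: value})
        # A positive product means the prediction moved against `direction`.
        if direction * (previous_pred - pred) > 0:
            violations.append((previous_value, value))
        previous_value, previous_pred = value, pred
    return violations


def noisy_model(row: Row) -> float:
    # Hypothetical stand-in that reproduces the dips from the toy dataset.
    lookup = {1200: 250_000, 1500: 270_000, 1800: 260_000,
              2000: 280_000, 2200: 290_000, 2250: 285_000}
    return lookup[row["sqft"]] - 1_000 * row["house_age"]


grid = [1200, 1500, 1800, 2000, 2200, 2250]
base_row = {"sqft": 1200, "house_age": 10}
print(find_violations(noisy_model, base_row, "sqft", grid, direction=1))
# [(1500, 1800), (2200, 2250)]
```

To run the same sweep against a fitted model, `predict` could wrap the model’s own prediction call on a one-row input; an empty result for each constrained feature is what a correctly constrained model should give.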

Now she sees a smoother plot of house prices vs. square footage.

import matplotlib.pyplot as plt

# Illustrative predictions with the dips removed
data2 = pd.DataFrame({
    "sqft": [1200, 1500, 1800, 2000, 2200, 2250],
    "predicted_price": [250000, 270000, 275000, 280000, 290000, 292000]
})

plt.figure(figsize=(7, 5))
plt.plot(data2["sqft"], data2["predicted_price"], marker="o",
         linestyle="-.", color="green", label="Predicted Price")

plt.title("Monotonic Predicted House Prices vs. Square Footage")
plt.xlabel("Square Footage (sqft)")
plt.ylabel("Predicted Price ($)")
plt.grid(True, linestyle="--", alpha=0.6)
plt.legend()
plt.show()

Aida: “Perfect! When homes are of the same age, a larger size consistently leads to a higher or equal price. Conversely, homes of the same square footage will never be priced higher if they are older.”

Alan: “Yes, we gave the model guardrails that align with domain knowledge.”

Real-World Examples

Alan: What other domains can benefit from monotonicity constraints?

Aida: Anywhere customers or money are involved, monotonicity can impact trust. Some domains where monotonicity really matters are:

House pricing → Larger homes shouldn’t be priced lower.
Loan approvals → Higher income should not reduce approval probability.
Credit scoring → Longer repayment history should not lower the score.
Customer lifetime value (CLV) → More purchases shouldn’t lower CLV predictions.
Insurance pricing → More coverage should not reduce the premium.

Takeaways

Accuracy alone doesn’t guarantee trustworthiness.
Monotonicity ensures predictions align with common sense and business rules.
Customers, regulators, and stakeholders are more likely to accept and use models that are both accurate and logical.

As Aida reminds Alan:

“Make models not just smart, but sensible.”

Closing Thoughts

Next time you build a model, don’t just ask: How accurate is it? Also ask: Does it make sense to the people who’ll use it?

Monotonicity constraints are one of many tools for designing trustworthy ML models, alongside explainability, fairness constraints, and transparency.

. . .

Thanks for reading! I often share insights on practical AI/ML techniques. Let’s connect on LinkedIn if you’d like to continue the conversation.
