
    I Won $10,000 in a Machine Learning Competition — Here’s My Complete Strategy

    By Editor Times Featured | June 18, 2025 | 7 Mins Read


    I won $10,000 in my first ML competition and honestly, I’m still a bit shocked.

    I’ve worked as a data scientist in FinTech for six years. When I saw that Spectral Finance was running a credit scoring challenge for Web3 wallets, I decided to give it a try despite having zero blockchain experience.

    Here were my limitations:

    • I used my laptop, which has no GPU
    • I only had a weekend (~10 hours) to work on it
    • I had never touched Web3 or blockchain data before
    • I had never built a neural network for credit scoring

    The competition goal was simple: predict which Web3 wallets were likely to default on loans using their transaction history. Essentially, traditional credit scoring but with DeFi data instead of bank statements.

    To my surprise, I came second and won $10k in USD Coin! Unfortunately, Spectral Finance has since taken the competition website and leaderboard down, but here’s a screenshot from when I won:

    My username was Ds-clau, second place with a score of 83.66 (image by author)

    This experience taught me that understanding the business problem really matters. In this post, I’ll show you exactly how I did it with detailed explanations and Python code snippets, so you can replicate this approach in your next machine learning project or competition.

    Getting Started: You Don’t Need Expensive Hardware

    Let me make this clear: you don’t necessarily need an expensive cloud computing setup to win ML competitions (unless the dataset is too big to fit locally).

    The dataset for this competition contained 77 features and 443k rows, which isn’t small by any means. The data came as a .parquet file that I downloaded using duckdb.

    I used my personal laptop, a MacBook Pro with 16GB RAM and no GPU. The entire dataset fit locally on my laptop, though I must admit the training process was a bit slow.

    Insight: Clever sampling techniques get you 90% of the insights without the high computational costs. Many people get intimidated by large datasets and think they need big cloud instances. You can start a project locally by sampling a portion of the dataset and examining the sample first.
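
    One way to sample without distorting the default rate is to sample within each class. The sketch below is illustrative only: the column names `wallet_id` and `target`, the 10% fraction, and the toy rows are assumptions, not details from the competition data.

```python
import random
from collections import defaultdict


def stratified_sample(rows, label_key, fraction, seed=42):
    """Return roughly `fraction` of the rows, sampled separately within each class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row in rows:
        by_class[row[label_key]].append(row)

    sample = []
    for group in by_class.values():
        k = max(1, round(len(group) * fraction))  # keep at least one row per class
        sample.extend(rng.sample(group, k))
    rng.shuffle(sample)
    return sample


# Toy rows standing in for the real table: 62% repaid (0), 38% defaulted (1).
rows = [{"wallet_id": i, "target": 1 if i % 100 < 38 else 0} for i in range(10_000)]

sample = stratified_sample(rows, "target", fraction=0.1)
default_rate = sum(r["target"] for r in sample) / len(sample)
print(len(sample), default_rate)
```

    With pandas, `df.groupby("target").sample(frac=0.1, random_state=42)` draws the same kind of per-class sample in one line.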

    EDA: Know Your Data

    Here’s where my fintech background became my superpower, and I approached this like any other credit risk problem.

    First question in credit scoring: What’s the class distribution?

    Seeing the 62/38 split made me shiver… 38% is a very high default rate from a business perspective, but luckily, the competition wasn’t about pricing this product.
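
    A quick way to check the class balance is to count the labels. This is a minimal sketch with toy labels that mimic the 62/38 split. The encoding (0 = repaid, 1 = defaulted) is an assumption.

```python
from collections import Counter


def class_distribution(labels):
    """Map each class label to its share of the dataset."""
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.items()}


# Toy labels (assumed encoding): 0 = repaid, 1 = defaulted.
labels = [0] * 62 + [1] * 38
dist = class_distribution(labels)
print(dist)
```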

    Next, I wanted to see which features actually mattered:

    This is where I got excited. The patterns were exactly what I’d expect from credit data:

    • risk_factor was the strongest predictor and showed > 0.4 correlation with the target variable (higher risk factor = more likely to default)
    • time_since_last_liquidated showed a strong negative correlation, so the more recently they last liquidated, the riskier they were. This lines up as expected, since high velocity is usually a high risk signal (recent liquidation = risky)
    • liquidation_count_sum_eth suggested that borrowers with higher liquidation counts in ETH were risk flags (more liquidations = riskier behaviour)

    Insight: Looking at Pearson correlation is a simple yet intuitive way to understand linear relationships between features and the target variable. It’s a great way to gain intuition on which features should and shouldn’t be included in your final model.
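
    The correlation ranking can be sketched in a few lines. The feature names `risk_factor` and `time_since_last_liquidated` come from the text above, but the values, the `noise` feature, and the helper names are made up for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def rank_features(features, target):
    """Sort features by absolute correlation with the target, strongest first."""
    corrs = {name: pearson(values, target) for name, values in features.items()}
    return sorted(corrs.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Toy values only; 1 = defaulted, 0 = repaid (assumed encoding).
target = [0, 0, 0, 1, 1, 1]
features = {
    "risk_factor": [0.1, 0.2, 0.3, 0.7, 0.8, 0.9],
    "time_since_last_liquidated": [90, 80, 70, 30, 20, 10],
    "noise": [5, 1, 4, 2, 5, 3],
}

ranking = rank_features(features, target)
for name, corr in ranking:
    print(f"{name}: {corr:+.2f}")
```

    Ranking by absolute value matters here, because a strong negative correlation is as informative as a strong positive one.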

    Feature Selection: Less is More

    Here’s something that always puzzles executives when I explain this to them:

    More features doesn’t always mean better performance.

    In fact, too many features usually mean worse performance and slower training, because extra features add noise. Every irrelevant feature makes your model a little bit worse at finding the real patterns.

    So, feature selection is a crucial step that I never skip. I used recursive feature elimination to find the optimal number of features. Let me walk you through my exact process:

    The sweet spot was 34 features. After this point, the model performance as measured by the AUC score didn’t improve with additional features. So, I ended up using less than half of the given features to train my model, going from 77 features down to 34.
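
    Recursive feature elimination scored by AUC can be sketched with scikit-learn's `RFECV`. This is not the author's exact code: the logistic regression estimator, the 3-fold cross-validation, and the synthetic data are all assumptions chosen to keep the example small.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold

# Synthetic stand-in for the real data: 20 features, only a few informative.
X, y = make_classification(
    n_samples=500, n_features=20, n_informative=5, n_redundant=2, random_state=0
)

# Drop one feature per round and score each feature count by cross-validated AUC.
selector = RFECV(
    estimator=LogisticRegression(max_iter=1000),
    step=1,
    cv=StratifiedKFold(n_splits=3),
    scoring="roc_auc",
    min_features_to_select=1,
)
selector.fit(X, y)

n_selected = selector.n_features_   # feature count with the best mean AUC
selected_mask = selector.support_   # boolean mask over the original columns
X_reduced = selector.transform(X)   # data restricted to the kept features
print(n_selected, X_reduced.shape)
```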

    Insight: This reduction in features eliminated noise while preserving signal from the important features, leading to a model that was both faster to train and more predictive.

    Building the Neural Network: Simple Yet Powerful Architecture

    Before defining the model architecture, I had to define the dataset properly:

    1. Split into training and validation sets (for verifying results after model training)
    2. Scale features because neural networks are very sensitive to feature scale and outliers
    3. Convert datasets to PyTorch tensors for efficient computation

    Here’s my exact data preprocessing pipeline:
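
    The three steps above can be sketched as follows. This is a minimal sketch rather than the author's pipeline: the 80/20 stratified split, the use of `StandardScaler`, and the random stand-in data (34 features, roughly 38% positives) are assumptions.

```python
import numpy as np
import torch
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Random stand-in data: 1,000 rows, 34 features, about 38% defaults.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 34)).astype(np.float32)
y = (rng.random(1000) < 0.38).astype(np.float32)

# 1. Split into training and validation sets, preserving the class balance.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# 2. Fit the scaler on the training set only, then apply it to both sets
#    so no validation statistics leak into training.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_val_scaled = scaler.transform(X_val)

# 3. Convert to PyTorch tensors; targets get shape (n, 1) to match the model output.
X_train_t = torch.tensor(X_train_scaled, dtype=torch.float32)
y_train_t = torch.tensor(y_train, dtype=torch.float32).unsqueeze(1)
X_val_t = torch.tensor(X_val_scaled, dtype=torch.float32)
y_val_t = torch.tensor(y_val, dtype=torch.float32).unsqueeze(1)
print(X_train_t.shape, y_val_t.shape)
```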

    Now comes the fun part: building the actual neural network model.

    Important context: Spectral Finance (the competition organizer) restricted model deployments to only neural networks and logistic regression because of their zero-knowledge proof system.

    ZK proofs require mathematical circuits that can cryptographically verify computations without revealing underlying data, and neural networks and logistic regression can be efficiently converted into ZK circuits.

    Since it was my first time building a neural network for credit scoring, I wanted to keep things simple but effective. Here’s my model architecture:

    Let’s walk through my architecture choice in detail:

    • 5 hidden layers: Deep enough to capture complex patterns, shallow enough to avoid overfitting
    • 64 neurons per layer: Good balance between capacity and computational efficiency
    • ReLU activation: Standard choice for hidden layers, helps mitigate vanishing gradients
    • Dropout (0.2): Reduces overfitting by randomly zeroing 20% of neurons during training
    • Sigmoid output: Ideal for binary classification, outputs probabilities between 0 and 1
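
    A network matching the bullet points above could look like the sketch below. The class name `CreditScoringNet`, the placement of dropout after every hidden layer, and the input size of 34 (the number of selected features) are assumptions, since the text does not spell them out.

```python
import torch
from torch import nn


class CreditScoringNet(nn.Module):
    """Feed-forward binary classifier: 5 hidden layers of 64 units, sigmoid output."""

    def __init__(self, n_features=34, hidden_size=64, n_hidden_layers=5, dropout=0.2):
        super().__init__()
        layers = []
        in_size = n_features
        for _ in range(n_hidden_layers):
            # Each hidden block: linear layer, ReLU, then dropout.
            layers += [nn.Linear(in_size, hidden_size), nn.ReLU(), nn.Dropout(dropout)]
            in_size = hidden_size
        # Single output unit squashed to a probability of default.
        layers += [nn.Linear(in_size, 1), nn.Sigmoid()]
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)


model = CreditScoringNet()
model.eval()  # disable dropout for inference
with torch.no_grad():
    probs = model(torch.randn(8, 34))
print(probs.shape)
```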

    Training the Model: Where the Magic Happens

    Now for the training loop that kicks off the model learning process:

    Here are some details on the model training process:

    • Early stopping: Prevents overfitting by stopping when validation performance stops improving
    • SGD with momentum: Simple but effective optimizer choice
    • Validation monitoring: Essential for tracking real performance, not just training loss
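
    The training details above can be sketched as a loop with SGD, momentum, and early stopping on validation loss. Everything specific here is an assumption: the binary cross-entropy loss, the learning rate, the patience of 10 epochs, full-batch updates, the synthetic data, and the small one-hidden-layer stand-in model used to keep the example short.

```python
import copy

import torch
from torch import nn

torch.manual_seed(0)

# Synthetic stand-in data: the label depends on the first feature plus noise.
X = torch.randn(600, 34)
y = (X[:, 0] + 0.5 * torch.randn(600) > 0.3).float().unsqueeze(1)
X_train, y_train, X_val, y_val = X[:480], y[:480], X[480:], y[480:]

# Small stand-in model; the real architecture would be plugged in here.
model = nn.Sequential(
    nn.Linear(34, 64), nn.ReLU(), nn.Dropout(0.2), nn.Linear(64, 1), nn.Sigmoid()
)
criterion = nn.BCELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.05, momentum=0.9)

patience, max_epochs = 10, 200
best_val_loss, best_state, epochs_without_improvement = float("inf"), None, 0
history = []  # (train_loss, val_loss) per epoch, useful for plotting loss curves

for epoch in range(max_epochs):
    model.train()
    optimizer.zero_grad()
    train_loss = criterion(model(X_train), y_train)
    train_loss.backward()
    optimizer.step()

    # Validation monitoring: evaluate with dropout disabled and no gradients.
    model.eval()
    with torch.no_grad():
        val_loss = criterion(model(X_val), y_val).item()
    history.append((train_loss.item(), val_loss))

    if val_loss < best_val_loss:
        best_val_loss = val_loss
        best_state = copy.deepcopy(model.state_dict())
        epochs_without_improvement = 0
    else:
        epochs_without_improvement += 1
        if epochs_without_improvement >= patience:
            break  # early stopping: no improvement for `patience` epochs

# Restore the weights from the best validation epoch.
model.load_state_dict(best_state)
print(len(history), best_val_loss)
```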

    The training curves showed steady improvements without overfitting during the training process. This is exactly what I wanted to see.

    Model training loss curves (image by author)

    The Secret Weapon: Threshold Optimization

    Here’s where I probably outperformed others with more complicated models in the competition: I bet most people submitted predictions with the default 0.5 threshold.

    But due to the class imbalance (~38% of loans defaulted), I knew that the default threshold would be suboptimal. So, I used precision-recall analysis to pick a better cutoff.

    I ended up maximizing the F1 score, which is the harmonic mean between precision and recall. The optimal threshold based on the highest F1 score was 0.35 instead of 0.5. This single change improved my competition score by several percentage points, likely the difference between placing and winning.
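
    Picking the threshold that maximizes F1 can be sketched as a simple scan over candidate cutoffs. The labels and predicted probabilities below are toy values, and the helper names are hypothetical.

```python
def f1_at_threshold(y_true, y_prob, threshold):
    """F1 score when predicting 'default' for every probability >= threshold."""
    tp = sum(1 for t, p in zip(y_true, y_prob) if p >= threshold and t == 1)
    fp = sum(1 for t, p in zip(y_true, y_prob) if p >= threshold and t == 0)
    fn = sum(1 for t, p in zip(y_true, y_prob) if p < threshold and t == 1)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_f1_threshold(y_true, y_prob, candidates=None):
    """Return the candidate threshold with the highest F1 (lowest one on ties)."""
    if candidates is None:
        candidates = [i / 100 for i in range(1, 100)]
    return max(candidates, key=lambda th: f1_at_threshold(y_true, y_prob, th))


# Toy validation labels (1 = defaulted) and predicted default probabilities.
y_true = [0, 0, 0, 0, 0, 1, 1, 1, 0, 1]
y_prob = [0.05, 0.10, 0.20, 0.30, 0.45, 0.40, 0.55, 0.60, 0.15, 0.90]

best = best_f1_threshold(y_true, y_prob)
print(best, f1_at_threshold(y_true, y_prob, best), f1_at_threshold(y_true, y_prob, 0.5))
```

    On this toy data a cutoff below 0.5 catches one more defaulter at the cost of one extra false alarm, which raises F1. scikit-learn's `precision_recall_curve` returns the precision and recall at every distinct threshold if you prefer not to scan by hand.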

    Insight: In the real world, different types of errors have different costs. Missing a default loses you money, while rejecting a good customer just loses you potential profit. The threshold should reflect this reality and shouldn’t be set arbitrarily at 0.5.
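
    To make the cost argument concrete, the threshold can be chosen to minimize total error cost instead of maximizing F1. The sketch below uses hypothetical cost figures (a missed default costing 5 units, a rejected good customer costing 1) and toy predictions. None of these numbers come from the competition.

```python
def total_cost(y_true, y_prob, threshold, cost_missed_default, cost_rejected_good):
    """Total cost of errors when flagging every probability >= threshold as a default."""
    missed = sum(1 for t, p in zip(y_true, y_prob) if t == 1 and p < threshold)
    rejected = sum(1 for t, p in zip(y_true, y_prob) if t == 0 and p >= threshold)
    return missed * cost_missed_default + rejected * cost_rejected_good


def cheapest_threshold(y_true, y_prob, cost_missed_default, cost_rejected_good):
    """Return the candidate threshold with the lowest total cost (lowest one on ties)."""
    candidates = [i / 100 for i in range(1, 100)]
    return min(
        candidates,
        key=lambda th: total_cost(
            y_true, y_prob, th, cost_missed_default, cost_rejected_good
        ),
    )


# Toy validation labels (1 = defaulted) and predicted default probabilities.
y_true = [0, 0, 0, 0, 0, 1, 1, 1, 0, 1]
y_prob = [0.05, 0.10, 0.20, 0.30, 0.45, 0.40, 0.55, 0.60, 0.15, 0.90]

# Hypothetical costs: a missed default hurts five times more than a lost customer.
lender_threshold = cheapest_threshold(y_true, y_prob, 5, 1)
# Reversed costs, for contrast: rejecting good customers is the expensive error.
growth_threshold = cheapest_threshold(y_true, y_prob, 1, 5)
print(lender_threshold, growth_threshold)
```

    When missed defaults are the expensive error, the cheapest cutoff moves down so that more borderline wallets are flagged. When rejections are expensive, it moves up.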

    Conclusion

    This competition reinforced something I’ve known for a while:

    Success in machine learning isn’t about having the fanciest tools or the most complex algorithms.

    It’s about understanding your problem, applying solid fundamentals, and focusing on what actually moves the needle.

    You don’t need a PhD to be a data scientist or win an ML competition.

    You don’t need to implement the latest research papers.

    You also don’t need expensive cloud resources.

    What you do need is domain knowledge, solid fundamentals, and attention to details that others might overlook (like threshold optimization).


    Want to build your AI skills?

    👉🏻 I run the AI Weekender, which features fun weekend AI projects and quick, practical tips to help you build with AI.


