    How Cerebras + DataRobot Accelerates AI App Development

    By Editor Times Featured · February 1, 2025 · 5 min read


    Faster, smarter, more responsive AI applications – that's what your users expect. But when large language models (LLMs) are slow to respond, the user experience suffers. Every millisecond counts.

    With Cerebras' high-speed inference endpoints, you can reduce latency, speed up model responses, and maintain quality at scale with models like Llama 3.1-70B. By following a few simple steps, you'll be able to customize and deploy your own LLMs, giving you the control to optimize for both speed and quality.

    In this blog, we'll walk you through how to:

    • Set up Llama 3.1-70B in the DataRobot LLM Playground.
    • Generate and apply an API key to leverage Cerebras for inference.
    • Customize and deploy smarter, faster applications.

    By the end, you'll be ready to deploy LLMs that deliver speed, precision, and real-time responsiveness.

    Prototype, customize, and test LLMs in one place

    Prototyping and testing generative AI models often requires a patchwork of disconnected tools. But with a unified, integrated environment for LLMs, retrieval techniques, and evaluation metrics, you can move from idea to working prototype faster and with fewer roadblocks.

    This streamlined process lets you focus on building effective, high-impact AI applications without the hassle of piecing together tools from different platforms.

    Let's walk through a use case to see how you can leverage these capabilities to develop smarter, faster AI applications.

    Use case: Speeding up LLM inference without sacrificing quality

    Low latency is essential for building fast, responsive AI applications. But accelerated responses don't have to come at the cost of quality.

    The speed of Cerebras Inference outperforms other platforms, enabling developers to build applications that feel smooth, responsive, and intelligent.

    When combined with an intuitive development experience, you can:

    • Reduce LLM latency for faster user interactions.
    • Experiment more efficiently with new models and workflows.
    • Deploy applications that respond instantly to user actions.

    The diagrams below show Cerebras' performance on Llama 3.1-70B, illustrating faster response times and lower latency than other platforms. This enables rapid iteration during development and real-time performance in production.
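    If you want to reproduce this kind of latency comparison yourself, a simple approach is to time repeated requests and compare percentiles. In the sketch below, the stats helper is plain Python; the request loop assumes an OpenAI-compatible client, the Cerebras base URL, and the `llama3.1-70b` model id, all of which should be checked against current Cerebras documentation.

```python
import os
import statistics
import time

def summarize_latencies(samples_ms):
    # Collapse raw per-request latencies into the percentiles typically
    # compared across inference providers.
    ordered = sorted(samples_ms)
    return {
        "p50_ms": statistics.median(ordered),
        "p95_ms": ordered[int(0.95 * (len(ordered) - 1))],
        "mean_ms": statistics.fmean(ordered),
    }

# The benchmarking loop runs only if you have set your API key.
if os.environ.get("CEREBRAS_API_KEY"):
    from openai import OpenAI  # third-party: pip install openai
    client = OpenAI(base_url="https://api.cerebras.ai/v1",
                    api_key=os.environ["CEREBRAS_API_KEY"])
    samples = []
    for _ in range(20):
        t0 = time.perf_counter()
        client.chat.completions.create(
            model="llama3.1-70b",
            messages=[{"role": "user", "content": "ping"}],
            max_tokens=1,
        )
        samples.append((time.perf_counter() - t0) * 1000)
    print(summarize_latencies(samples))
```

    Comparing p50 and p95 (rather than a single run) is what makes provider-to-provider numbers like the ones in these diagrams meaningful.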

    [Figure: Response time of Llama 3.1-70B on Cerebras]

    How model size impacts LLM speed and performance

    As LLMs grow larger and more complex, their outputs become more relevant and comprehensive, but this comes at a cost: increased latency. Cerebras tackles this challenge with optimized computations, streamlined data transfer, and intelligent decoding designed for speed.

    These speed improvements are already transforming AI applications in industries like pharmaceuticals and voice AI. For example:

    • GlaxoSmithKline (GSK) uses Cerebras Inference to accelerate drug discovery, driving higher productivity.
    • LiveKit has boosted the performance of ChatGPT's voice mode pipeline, achieving faster response times than traditional inference solutions.

    The results are measurable. On Llama 3.1-70B, Cerebras delivers 70x faster inference than vanilla GPUs, enabling smoother, real-time interactions and faster experimentation cycles.

    This performance is powered by Cerebras' third-generation Wafer-Scale Engine (WSE-3), a custom processor designed to optimize the tensor-based, sparse linear algebra operations that drive LLM inference.

    By prioritizing performance, efficiency, and flexibility, the WSE-3 ensures faster, more consistent inference results.

    Cerebras Inference's speed reduces the latency of AI applications powered by its models, enabling deeper reasoning and more responsive user experiences. Accessing these optimized models is easy: they're hosted on Cerebras and available via a single endpoint, so you can start leveraging them with minimal setup.
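    Because the endpoint is OpenAI-compatible, calling the hosted model takes only a few lines. The base URL and model id below are assumptions based on Cerebras's public documentation at the time of writing; verify them before use.

```python
import os

def chat_payload(prompt, model="llama3.1-70b"):
    # Request body for an OpenAI-compatible /chat/completions call.
    return {"model": model,
            "messages": [{"role": "user", "content": prompt}]}

# The live call runs only if you have set your API key.
if os.environ.get("CEREBRAS_API_KEY"):
    from openai import OpenAI  # third-party: pip install openai
    client = OpenAI(base_url="https://api.cerebras.ai/v1",
                    api_key=os.environ["CEREBRAS_API_KEY"])
    resp = client.chat.completions.create(**chat_payload("Hello!"))
    print(resp.choices[0].message.content)
```

    Any client that speaks the OpenAI chat-completions protocol should work here; only the base URL and key change.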

    [Figure: Tokens per second on Cerebras Inference]

    Step-by-step: How to customize and deploy Llama 3.1-70B for low-latency AI

    Integrating LLMs like Llama 3.1-70B from Cerebras into DataRobot lets you customize, test, and deploy AI models in just a few steps. This process supports faster development, interactive testing, and greater control over LLM customization.

    1. Generate an API key for Llama 3.1-70B in the Cerebras platform.

    [Figure: Generating an API key on Cerebras]

    2. In DataRobot, create a custom model in the Model Workshop that calls out to the Cerebras endpoint where Llama 3.1-70B is hosted.

    [Figure: The Model Workshop in DataRobot]

    3. Within the custom model, place the Cerebras API key in the custom.py file.

    [Figure: Placing the Cerebras API key in the custom.py file in DataRobot]

    4. Deploy the custom model to an endpoint in the DataRobot Console, enabling LLM blueprints to leverage it for inference.

    [Figure: Deploying Llama 3.1-70B on Cerebras in DataRobot]

    5. Add your deployed Cerebras LLM to the LLM blueprint in the DataRobot LLM Playground to start chatting with Llama 3.1-70B.

    [Figure: Adding an LLM to the Playground in DataRobot]

    6. Once the LLM is added to the blueprint, test responses by adjusting prompting and retrieval parameters, and compare outputs with other LLMs directly in the DataRobot GUI.

    [Figure: The DataRobot Playground]

    Expand the limits of LLM inference for your AI applications

    Deploying LLMs like Llama 3.1-70B with low latency and real-time responsiveness is no small task. But with the right tools and workflows, you can achieve both.

    By integrating LLMs into DataRobot's LLM Playground and leveraging Cerebras' optimized inference, you can simplify customization, speed up testing, and reduce complexity – all while maintaining the performance your users expect.

    As LLMs grow larger and more powerful, a streamlined process for testing, customization, and integration will be essential for teams looking to stay ahead.

    See for yourself: access Cerebras Inference, generate your API key, and start building AI applications in DataRobot.

    About the authors

    Kumar Venkateswar

    VP of Product, Platform and Ecosystem

    Kumar Venkateswar is VP of Product, Platform and Ecosystem at DataRobot. He leads product management for DataRobot's foundational services and ecosystem partnerships, bridging the gaps between efficient infrastructure and integrations that maximize AI outcomes. Prior to DataRobot, Kumar worked at Amazon and Microsoft, including leading product management teams for Amazon SageMaker and Amazon Q Business.




    Nathaniel Daly

    Principal Product Manager

    Nathaniel Daly is a Senior Product Manager at DataRobot focusing on AutoML and time series products. He's focused on bringing advances in data science to users so that they can leverage this value to solve real-world business problems. He holds a degree in Mathematics from the University of California, Berkeley.

