
    How ElevenLabs Voice AI Is Replacing Screens in Warehouse and Manufacturing Operations

    By Editor Times Featured | March 27, 2026 | 11 Mins Read


    A picking operation is the process of collecting items from storage locations to fulfil customer orders.

    It is one of the most labour-intensive activities in logistics, accounting for up to 55% of total warehouse operating costs.

    Example of a warehouse layout where operators need to pick in multiple locations – (Image by Samir Saci)

    For each order, an operator receives a list of items to collect from their storage locations.

    They walk to each location, identify the product, pick the right quantity, and confirm the operation before moving to the next line.

    In most warehouses, operators rely on RF scanners or handheld tablets to receive instructions and confirm each pick.

    • What happens when operators need both hands for handling?
    • How do you onboard operators who don’t read the local language?

    Voice picking solves this by replacing the screen with audio instructions: the system tells the operator where to go and what to pick, and the operator confirms verbally.

    Illustration of an operator using voice picking – (Image by Samir Saci)

    When I was designing supply chain solutions in logistics companies, vocalisation was the default choice, especially for price-sensitive projects.

    Based on my experience, with vocalisation, operators’ productivity can reach 250 boxes/hour for retail and FMCG operations.

    The concept is not new. Hardware providers and software editors have offered voice-picking solutions since the early 2000s.

    But these systems come with significant constraints:

    • Proprietary hardware at $2,000 to $5,000 per headset
    • Vendor-locked software with limited customisation
    • Long deployment cycles of 3 to 6 months per site
    • Rigid language support that requires retraining for each new language

    For a 50-FTE warehouse, the total investment reaches $150K to $300K, excluding training costs.

    That is too expensive for my customers.

    What if you could achieve similar results using a smartphone, a custom web application, and modern AI voice technology?

    In this article, I’ll show how I built a minimalist voice-picking module that integrates with Warehouse Management Systems, using ElevenLabs for text-to-speech and speech recognition.

    Example of screens of this app, designed to be used on a smartphone with a vocal interface – (Image by Samir Saci)

    This web application has been deployed in the distribution centre of a small supermarket chain with great results (the customer is happy!).

    The objective is not to design solutions that compete with market leaders, but rather to offer an alternative to logistics and manufacturing operations that lack the capacity to invest in expensive equipment and need customised solutions.

    Problem Statement

    Before we get into voice picking powered by ElevenLabs, let me introduce the logistics operations this AI-powered web application will support.

    Layout of the distribution centre – (Image by Samir Saci)

    This is the central distribution centre of a small supermarket chain that delivers to 50 stores in Central Europe.

    Layout of the warehouse with 10 aisles and 12 pallet positions displayed on the app – (Image by Samir Saci)

    The facility is organised in a grid layout with aisles (A to L) and positions along each aisle:

    • Each location stores a specific item (called an SKU) with a known quantity in boxes.
    • Operators need to know where to go and what to expect when they arrive.

    What is the objective? Boost operators’ productivity!

    They were not satisfied with the order allocation and walking paths provided by their old system.

    Solutions used to optimise picking operations for this warehouse – (Image by Samir Saci)

    They first asked to reduce operators’ walking distance and boost the number of boxes picked per hour using the solutions presented in this article.

    The solution was a web application connected to the Warehouse Management System (WMS) database that guides the operator through the warehouse.

    Operators can check their picking list as well as detailed information per location – (Image by Samir Saci)

    This visual layout provides a real-time view of what we have in the system, with a better routing solution.

    Our target is to go from a productivity of 75 boxes/hour to 200 boxes/hour with:

    • Better allocation of orders, using spatial clustering and pathfinding to minimise the walking distance per box picked
    • Voice picking to guide operators seamlessly
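    To put that target in perspective, here is a quick back-of-the-envelope calculation. It only uses the two rates quoted above; the helper names are mine and purely illustrative.

```python
# Productivity figures quoted in the article (boxes picked per operator-hour).
CURRENT_RATE = 75
TARGET_RATE = 200


def seconds_per_box(boxes_per_hour: float) -> float:
    """Average time budget for one box at a given picking rate."""
    return 3600 / boxes_per_hour


def speedup(current: float, target: float) -> float:
    """How many times faster the target rate is than the current one."""
    return target / current


if __name__ == "__main__":
    print(f"Current: {seconds_per_box(CURRENT_RATE):.0f} s per box")
    print(f"Target:  {seconds_per_box(TARGET_RATE):.0f} s per box")
    print(f"Required speed-up: x{speedup(CURRENT_RATE, TARGET_RATE):.2f}")
```

    In other words, the time budget per box (walking included) has to drop from 48 seconds to 18 seconds, which is why both the routing and the confirmation step matter.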

    How the Picking Flow Works

    Before jumping into the vocalisation of the tool, let me introduce the process of order picking.

    Three stores sent orders to the warehouse:

    • Store 1 ordered 3 boxes of Organic Green Tea 500g, which are located in Location A1
    • Store 2 ordered 2 boxes of Earl Grey Tea 250g, which are located in Location A3
    • Store 3 ordered 5 boxes of Arabica Coffee Beans 1kg, which are located in Location B2

    A picking batch is a group of store orders consolidated into a single work assignment.

    The operator will prepare the three orders in a single batch – (Image by Samir Saci)

    The system generates a batch with multiple order lines, each carrying instructions:

    • Where to go (the storage location)
    • What to pick (the SKU reference)
    • How many boxes to collect

    Picking list (left), layout (middle), details of location (right) – (Image by Samir Saci)

    The operator just has to process each line sequentially.

    Once they confirm a pick, the system advances to the next instruction.
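    The batch and the "confirm, then advance" behaviour can be modelled in a few lines. This is a minimal sketch, not the data model of the deployed app: the `PickLine` and `PickingSession` names and their fields are hypothetical, and the data is the three-store example above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class PickLine:
    store: str      # store that placed the order
    location: str   # where to go
    sku: str        # what to pick
    quantity: int   # how many boxes to collect


@dataclass
class PickingSession:
    """Walks through the lines of one batch, one instruction at a time."""
    lines: list
    cursor: int = 0
    confirmed: list = field(default_factory=list)

    def current(self) -> Optional[PickLine]:
        """Return the line to pick now, or None when the batch is finished."""
        return self.lines[self.cursor] if self.cursor < len(self.lines) else None

    def confirm(self) -> Optional[PickLine]:
        """Record the current pick and advance to the next instruction."""
        line = self.current()
        if line is not None:
            self.confirmed.append(line)
            self.cursor += 1
        return self.current()


batch = [
    PickLine("Store 1", "A1", "Organic Green Tea 500g", 3),
    PickLine("Store 2", "A3", "Earl Grey Tea 250g", 2),
    PickLine("Store 3", "B2", "Arabica Coffee Beans 1kg", 5),
]
session = PickingSession(batch)
```

    Calling `session.confirm()` after each pick returns the next line, and `None` once the batch is complete.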

    This sequential flow is important because it determines the walking path through the warehouse, as computed by the optimisation algorithms.
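    To illustrate how the order of the lines shapes the walking path, here is a textbook S-shape (serpentine) ordering. It is not the optimisation algorithm used in the deployed app, which is not detailed in this article; the function names and the assumption that a location code is one aisle letter followed by a position number (as in A1, B2) are mine.

```python
def parse_location(code: str) -> tuple:
    """Split a code such as 'B2' into (aisle index, position number)."""
    aisle = ord(code[0].upper()) - ord("A")
    position = int(code[1:])
    return aisle, position


def serpentine_order(locations: list) -> list:
    """Sort locations aisle by aisle, alternating the walking direction.

    Aisles A, C, E... are walked with increasing positions, aisles B, D, F...
    with decreasing positions, so the operator never walks back up an aisle.
    """
    def key(code: str) -> tuple:
        aisle, position = parse_location(code)
        return (aisle, position if aisle % 2 == 0 else -position)

    return sorted(locations, key=key)


if __name__ == "__main__":
    print(serpentine_order(["B2", "A3", "A1", "B7", "C4"]))
```

    With this ordering the operator walks up aisle A (A1, A3), comes back down aisle B (B7, B2), then continues to C4.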

    Example of the original pathfinding solution (bottom) and the optimised one (top)

    As this is a custom application, we could implement this optimisation without relying on an external editor.

    Why build a custom solution? Because it is cheaper and easier to implement.

    Initially, the customer planned to purchase a commercial solution and wanted me to integrate the pathfinding solution.

    After investigation, we discovered that it would have been more expensive to integrate the app into the vendor solution than to build something from scratch.

    What does the process look like without the AI-based voice feature?

    Manual Mode: The Screen-Based Baseline

    In manual mode, the operator reads each instruction on screen and confirms by tapping a button.

    Two actions are available at each step:

    • Confirm Pick: the operator collected the right quantity
    • Report Issue: the location is empty, the quantity doesn’t match, or the product is damaged

    Our operator has to press the button to confirm the picking or report an issue – (Image by Samir Saci)

    I built the manual mode as a reliable fallback in case we have issues with ElevenLabs.

    But it keeps the operator’s eyes and one hand tied to the device at every step.

    We need to add vocal commands!

    Voice Mode: Hands-Free with ElevenLabs

    Now that you know why we want the voice mode to replace screen interaction, let me explain how I added two AI-powered components.

    Technical architecture of this application – (Image by Samir Saci)

    Text-to-Speech: ElevenLabs Reads the Instructions

    When the operator starts a picking session in voice mode, each instruction is converted to speech using the ElevenLabs API.

    Instead of reading “Location A-03-2, pick 4 boxes of SKU-1042” on a screen, the operator hears a natural voice say:

    “Location Alpha Three Two. Pick 4 boxes.”
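    Turning a location code into that speech-friendly sentence is a small text transformation that happens before any API call. The sketch below is one possible way to do it, assuming codes made of letters and digits separated by hyphens; the function names, the spelling alphabet (limited to aisles A to L) and the digit-by-digit reading are my assumptions, not the app's actual rules.

```python
# Spelling alphabet for aisle letters A to L (assumption: no other aisles exist).
SPELLING = {
    "A": "Alpha", "B": "Bravo", "C": "Charlie", "D": "Delta",
    "E": "Echo", "F": "Foxtrot", "G": "Golf", "H": "Hotel",
    "I": "India", "J": "Juliett", "K": "Kilo", "L": "Lima",
}
DIGITS = ["Zero", "One", "Two", "Three", "Four",
          "Five", "Six", "Seven", "Eight", "Nine"]


def spoken_location(code: str) -> str:
    """Convert a code such as 'A-03-2' into 'Alpha Three Two'."""
    words = []
    for part in code.split("-"):
        if part.isdigit():
            # "03" -> "Three": drop leading zeros, then read digit by digit.
            words.extend(DIGITS[int(d)] for d in str(int(part)))
        else:
            words.extend(SPELLING[ch] for ch in part.upper())
    return " ".join(words)


def spoken_instruction(location: str, quantity: int) -> str:
    """Build the sentence sent to the text-to-speech engine."""
    unit = "box" if quantity == 1 else "boxes"
    return f"Location {spoken_location(location)}. Pick {quantity} {unit}."


if __name__ == "__main__":
    print(spoken_instruction("A-03-2", 4))
```

    Note that the SKU reference is deliberately left out of the spoken sentence, as in the example above: the operator only needs the location and the quantity.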

    ElevenLabs provides several advantages over basic browser-based TTS:

    • Natural intonation that is easy to understand in a noisy warehouse
    • 29+ languages available out of the box, with no retraining
    • Consistent voice quality across all instructions
    • Sub-second generation for short sentences like pick instructions
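    The article does not show the API call itself, so here is a minimal sketch using only the Python standard library. It assumes the public ElevenLabs REST endpoint `POST /v1/text-to-speech/{voice_id}` authenticated with an `xi-api-key` header and the `eleven_multilingual_v2` model; please check the current ElevenLabs API reference before relying on these details. The function names are hypothetical, and the voice ID and API key are placeholders you must supply.

```python
import json
import urllib.request

API_BASE = "https://api.elevenlabs.io/v1"


def build_tts_request(text: str, voice_id: str, api_key: str,
                      model_id: str = "eleven_multilingual_v2") -> urllib.request.Request:
    """Prepare (but do not send) the text-to-speech HTTP request."""
    payload = json.dumps({"text": text, "model_id": model_id}).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}/text-to-speech/{voice_id}",
        data=payload,
        headers={
            "xi-api-key": api_key,            # assumption: key-based auth header
            "Content-Type": "application/json",
            "Accept": "audio/mpeg",
        },
        method="POST",
    )


def synthesize(text: str, voice_id: str, api_key: str, timeout: float = 10.0) -> bytes:
    """Send the request and return the audio bytes (performs a network call)."""
    request = build_tts_request(text, voice_id, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

    Separating `build_tts_request` from `synthesize` keeps the network call isolated, which makes it easier to test the payload and to wrap the call with a fallback when the API is unreachable.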

    But what about speech recognition?

    Speech-to-Text: The Operator Confirms Verbally

    After hearing the instruction, the operator walks to the location, picks the items, and needs to confirm.

    Here, I made a deliberate design choice to rely on the speech recognition and reasoning capabilities of ElevenLabs.

    Using a single endpoint, we capture the response and match it against expected commands:

    • “Confirm” or “Done” to validate the pick
    • “Problem” or “Issue” to flag a discrepancy
    • “Repeat” to hear the instruction again

    The agentic part interprets the operator’s feedback and tries to match it to the expected interactions (CONFIRM, ISSUE, or REPEAT).
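    To make the mapping concrete, here is a deliberately simple keyword matcher that turns a transcript into one of the three interactions. It is only an English-only approximation for illustration: in the app, this interpretation is delegated to ElevenLabs, which is what makes it work across languages. The `match_intent` function and the keyword table are hypothetical.

```python
import re
from typing import Optional

# ISSUE is checked first so that an ambiguous answer such as
# "done, but there is a problem" is never silently confirmed.
INTENT_KEYWORDS = {
    "ISSUE": {"problem", "issue"},
    "REPEAT": {"repeat"},
    "CONFIRM": {"confirm", "done"},
}


def match_intent(transcript: str) -> Optional[str]:
    """Map a transcript to CONFIRM, ISSUE or REPEAT, or None if unclear."""
    words = set(re.findall(r"[a-z]+", transcript.lower()))
    for intent, keywords in INTENT_KEYWORDS.items():
        if words & keywords:
            return intent
    return None


if __name__ == "__main__":
    for sentence in ["Confirm!", "There is a problem here", "Please repeat", "Hello"]:
        print(sentence, "->", match_intent(sentence))
```

    When the function returns `None`, a sensible behaviour is to ask the operator again rather than guess.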

    The whole process from left to right: Step 1 -> Step 2 -> Step 3 – (Image by Samir Saci)

    For a multilingual warehouse, this is a significant benefit:

    • A Czech operator and a Filipino operator can both receive instructions in their native language from the same system, without any hardware change.
    • I don’t have to consider all the possible languages in the design of the solution.

    Why use ElevenLabs?

    For another feature, the inventory cycle count tool presented in this video, I’ve used n8n with AI agent nodes to perform the same task.

    n8n workflow for the voice-powered inventory cycle count tool – (Image by Samir Saci)

    This was working quite well, but it required a more complex setup:

    • Two AI nodes: one for the audio transcription using OpenAI models, and one AI agent to format the output of the transcription
    • System prompts that assumed the operator was speaking English

    I’ve replaced that with a single ElevenLabs endpoint with multilingual capabilities.

    Putting both components together, a single pick cycle looks like this:

    The Full Voice Picking Cycle – (Image by Samir Saci)

    1. The app calls ElevenLabs to generate the audio instruction
    2. The operator hears: “Location Alpha Three Two. Pick 4 boxes.”
    3. The operator walks to the location (hands free, eyes free)
    4. The operator picks the items and says, “Confirm”
    5. The speech recognition endpoint processes the confirmation and moves to the next picking location
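    The five steps above can be sketched as a small loop. This is an illustration under stated assumptions, not the app's code: `speak` and `listen` are hypothetical callables standing in for the text-to-speech and speech recognition calls, `listen` is assumed to return CONFIRM, ISSUE or REPEAT, and the retry limit and the choice to move on after an ISSUE are mine.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class PickLine:
    location: str   # spoken form of the storage location
    quantity: int   # boxes to collect


def run_voice_cycle(lines: Iterable[PickLine],
                    speak: Callable[[str], None],
                    listen: Callable[[], str],
                    max_attempts: int = 3) -> list:
    """Process each line in order and return (location, outcome) pairs."""
    outcomes = []
    for line in lines:
        outcome = "UNRESOLVED"
        for _ in range(max_attempts):
            # Steps 1-2: generate and play the instruction.
            speak(f"Location {line.location}. Pick {line.quantity} boxes.")
            # Steps 3-4 happen on the floor; then we wait for the answer.
            intent = listen()
            if intent in ("CONFIRM", "ISSUE"):
                outcome = intent
                break
            # "REPEAT" or anything unrecognised: say the instruction again.
        # Step 5: log the outcome and move to the next picking location.
        outcomes.append((line.location, outcome))
    return outcomes


if __name__ == "__main__":
    spoken = []
    answers = iter(["REPEAT", "CONFIRM", "ISSUE"])
    lines = [PickLine("Alpha One", 3), PickLine("Alpha Three", 2)]
    print(run_voice_cycle(lines, spoken.append, lambda: next(answers)))
```

    Passing `speak` and `listen` as parameters means the same loop can be exercised with scripted answers, as in the example, before plugging in the real voice services.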

    The entire interaction takes only a few seconds of system time.

    What about the costs?

    This is where the comparison with traditional systems becomes striking.

    Comparative study – (Image by Samir Saci)

    For this mid-size warehouse with 50 FTEs, they estimated that the traditional approach costs roughly $60K to $150K in the first year.

    The AI-powered approach costs only a few API calls.

    The trade-off is clear: traditional systems offer proven reliability and offline capability for high-volume operations.

    In case of failures, we have the manual solution as a fallback.
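    One way to wire that fallback is a thin wrapper around the voice confirmation. This is a sketch only: `voice_confirm` and `manual_confirm` are hypothetical callables (the voice API round trip and the on-screen buttons), and treating `OSError` as the failure signal is my assumption, since network errors and timeouts in Python derive from it.

```python
from typing import Callable


def confirm_pick(voice_confirm: Callable[[], str],
                 manual_confirm: Callable[[], str]) -> tuple:
    """Try the voice channel first; fall back to the screen if it fails.

    Returns (channel, outcome), e.g. ("voice", "CONFIRM").
    """
    try:
        return "voice", voice_confirm()
    except OSError:
        # Covers connection errors and timeouts raised by the HTTP layer.
        return "manual", manual_confirm()


if __name__ == "__main__":
    def unreachable_api() -> str:
        raise ConnectionError("voice API unreachable")

    print(confirm_pick(unreachable_api, lambda: "CONFIRM"))
```

    The returned channel name makes it possible to track how often operators end up on the manual path.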

    This AI-powered approach offers accessibility and speed for organisations that cannot justify a six-figure investment.

    What Does That Mean for Operations Managers and Decision Makers?

    Voice picking is no longer a technology reserved for the largest 3PLs and retailers with large budgets.

    If your warehouse has WiFi and your operators have smartphones, you can prototype a voice-guided picking system in days.

    It is easy to test it on a real batch to measure the impact before committing any significant budget to productisation.

    Three scenarios where this approach makes particular sense:

    • Multilingual facilities where operators struggle with screen-based instructions in a language that is not their own
    • Multi-site operations where deploying proprietary hardware to every small warehouse is not economically viable
    • High-turnover environments where training time on complex scanning systems directly impacts productivity

    What about other processes?

    Good news: the same architecture extends beyond picking.

    Voice-guided workflows can support any process where an operator needs instructions while keeping their hands free.

    You can find a live demo of an inventory cycle counting tool here:

    How to start this journey?

    As you can easily guess, the front end of these applications has been vibe-coded using Lovable and Claude Code.

    For the backend, if you have limited coding capabilities, I’d suggest starting with n8n.

    Example of n8n workflows – (Image by Samir Saci)

    n8n is a low-code automation platform that lets you connect APIs and AI models using visual workflows.

    The initial version of this solution was built with this tool:

    1. I started with a backend connected to a Telegram Bot
    2. Users were playing with the tool using this interface
    3. After validation, we moved that to a web application

    This is the best way to start, even with limited coding skills.

    I share a step-by-step tutorial with free templates to start automating from day 1 in this video:

    Let me know what you plan to build using all these nice tools!

    About Me

    Let’s connect on LinkedIn and Twitter. I’m a Supply Chain Engineer who uses data analytics to improve logistics operations and reduce costs.

    If you’re looking for tailored consulting solutions to optimise your supply chain and meet sustainability goals, please contact me.




