
    Why My Coding Assistant Started Replying in Korean When I Typed Chinese

    By Editor Times Featured | May 15, 2026 | 4 min read

    I mostly work with my coding assistant in Chinese. My writing, however, is often mixed: many engineering terms are more familiar to me in English (especially terms we use in Python, git, and so on), and some are even difficult to translate naturally into Chinese.

    Yesterday, I asked my coding assistant in Chinese: “run.py有早停吗?我在恒源云上跑,发现没有触发”, meaning “Does run.py implement early stopping? I was running the project on a shared GPU service, and I didn’t see early stopping triggered.” As usual, I typed the technical token run.py in its original English form. The model inspected the code and responded with the following:

    Image by author: Screenshot of the coding assistant replying in Korean

    All technical tokens remained in English (run.py, config.py, train_unified), while the explanatory structure shifted into Korean. This is not an isolated case. It has happened from time to time, always when I mixed Chinese with English engineering terms.

    Image by author: Another screenshot of the coding assistant replying in Korean

    This made me ask: is this a language issue, or something deeper in the embedding space?

    Hypothesis

    Embedding spaces are not primarily structured by language. Having been trained alongside language models, they tend to be organized by task registers such as academic writing, conversational text, and, in the case of coding assistants, engineering/code. Chinese, although spoken by the largest population in the world, is not a natural medium for the engineering register and has limited representation in technical corpora.

    In such a context, text may stop behaving like “Chinese” in the embedding space as soon as engineering tokens such as review / branch / commit / PR / diff appear. Instead, it may drift into an engineering attractor field.

    We will run two experiments to provide empirical evidence for this hypothesis.

    Controlled Language Drift

    We construct the following controlled sequence of sentences, in which English terms gradually replace the Chinese ones:

    Stage 0: 请帮我检查这个分支 (“Please help me check this branch”)
    Stage 1: 请帮我 review 这个分支
    Stage 2: 请帮我 review 这个 branch
    Stage 3: Please review this branch pull request commit
    Stage 4: Please review this branch pull request commit code diff

    We then compute the cosine similarity between sentence embeddings. We define the Korean and English “clusters” as the average embedding of a small set of representative engineering-related sentences in each language. We use Δ (EN − KO) to denote the difference between the English and Korean similarity scores, i.e., Δ = similarity(English) − similarity(Korean).
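The definitions above can be sketched in a few lines. The article does not name the embedding model, so the vectors below are two-dimensional placeholders rather than real sentence embeddings, and the function names (cosine, centroid, delta_en_ko) are hypothetical. This is only a minimal illustration of "similarity to a cluster" and of Δ, not the author's actual pipeline.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def centroid(vectors):
    """A 'cluster' in the article's sense: the average of its member embeddings."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def delta_en_ko(sentence_vec, english_vecs, korean_vecs):
    """Return (Korean similarity, English similarity, delta = EN - KO)."""
    sim_en = cosine(sentence_vec, centroid(english_vecs))
    sim_ko = cosine(sentence_vec, centroid(korean_vecs))
    return sim_ko, sim_en, sim_en - sim_ko

# Placeholder 2-D vectors standing in for real sentence embeddings (assumption).
english_refs = [[1.0, 0.0], [0.8, 0.2]]
korean_refs = [[0.0, 1.0], [0.2, 0.8]]
sentence = [1.0, 0.0]

sim_ko, sim_en, delta = delta_en_ko(sentence, english_refs, korean_refs)
print(f"KO={sim_ko:.4f}  EN={sim_en:.4f}  delta={delta:.4f}")
```

With real data, each placeholder list would be replaced by the output of whatever sentence-embedding model is in use; the arithmetic stays the same.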

    Stage   Korean similarity   English similarity   Δ (EN − KO)
    0       0.4783              0.5141               0.0358
    1       0.5235              0.5728               0.0492
    2       0.5474              0.6140               0.0665
    3       0.5616              0.7314               0.1698
    4       0.5427              0.7398               0.1972

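    We observed an interesting pattern. Korean similarity rises through Stage 3 (from 0.4783 to 0.5616) and then dips at Stage 4, while English similarity, which is higher at every stage, keeps climbing. The growth in English similarity is also non-linear: it gains 0.1174 between Stage 2 and Stage 3 alone, more than in the two earlier steps combined, which suggests phase-transition-like behavior rather than gradual drift.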
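The non-linearity can be checked directly from the table. The sketch below uses only the published, rounded similarity values, so a recomputed Δ may differ from the table in the last digit; the variable names are illustrative.

```python
# (stage, Korean similarity, English similarity), copied from the table above.
stages = [
    (0, 0.4783, 0.5141),
    (1, 0.5235, 0.5728),
    (2, 0.5474, 0.6140),
    (3, 0.5616, 0.7314),
    (4, 0.5427, 0.7398),
]

# Delta (EN - KO) per stage, recomputed from the rounded values.
deltas = [round(en - ko, 4) for _, ko, en in stages]

# Stage-to-stage change in each similarity score.
en_steps = [round(b[2] - a[2], 4) for a, b in zip(stages, stages[1:])]
ko_steps = [round(b[1] - a[1], 4) for a, b in zip(stages, stages[1:])]

# Index i refers to the transition from Stage i to Stage i + 1.
largest_en_jump = max(range(len(en_steps)), key=lambda i: en_steps[i])

print("delta per stage:", deltas)
print("English steps:  ", en_steps)
print("Korean steps:   ", ko_steps)
print(f"largest English jump: Stage {largest_en_jump} -> Stage {largest_en_jump + 1}")
```

The largest English increment falls on the Stage 2 to Stage 3 transition, the same point at which the sentence switches from a Chinese frame to a fully English one.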

    When projecting the embeddings into two dimensions using PCA, we observe a smooth trajectory in the early stages, followed by a sharp directional jump between Stage 2 and Stage 3, and subsequent stabilization. This pattern suggests that embeddings do not move linearly through the space; instead, they appear to transition between attractor basins.

    Image by author: Controlled drift trajectory in PCA space
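For readers who want to reproduce this kind of trajectory plot, here is a minimal, dependency-free sketch of a two-component PCA using power iteration. The article does not state which PCA implementation or embedding model was used, so everything here is an assumption: the five three-dimensional vectors are made-up placeholders chosen only to show the mechanics, not the author's embeddings, and pca_2d is a hypothetical helper. In practice a library routine would normally replace the hand-rolled eigen solver.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def top_eigenvector(matrix, iters=500):
    """Power iteration for the dominant eigenpair of a symmetric PSD matrix."""
    d = len(matrix)
    v = [1.0 / math.sqrt(d)] * d              # deterministic start vector
    for _ in range(iters):
        w = [dot(row, v) for row in matrix]
        norm = math.sqrt(dot(w, w))
        if norm < 1e-15:                       # matrix is numerically zero
            break
        v = [x / norm for x in w]
    eigenvalue = dot(v, [dot(row, v) for row in matrix])
    return v, eigenvalue

def pca_2d(rows):
    """Project rows onto their first two principal components."""
    n, d = len(rows), len(rows[0])
    mean = [sum(col) / n for col in zip(*rows)]
    centered = [[x - m for x, m in zip(r, mean)] for r in rows]
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    pc1, lam1 = top_eigenvector(cov)
    # Deflation: remove the first component, then find the next one.
    deflated = [[cov[i][j] - lam1 * pc1[i] * pc1[j] for j in range(d)]
                for i in range(d)]
    pc2, _ = top_eigenvector(deflated)
    coords = [(dot(r, pc1), dot(r, pc2)) for r in centered]
    return coords, pc1, pc2

# Placeholder vectors for Stage 0..4 (NOT real embeddings).
toy_stage_vectors = [
    [0.90, 0.10, 0.00],
    [0.80, 0.20, 0.05],
    [0.70, 0.30, 0.10],
    [0.20, 0.80, 0.30],
    [0.15, 0.85, 0.35],
]

coords, pc1, pc2 = pca_2d(toy_stage_vectors)
# Length of each move in the 2-D projection; index i is Stage i -> Stage i + 1.
steps = [math.dist(a, b) for a, b in zip(coords, coords[1:])]
largest_jump = max(range(len(steps)), key=lambda i: steps[i])
print("2-D coordinates:", [(round(x, 3), round(y, 3)) for x, y in coords])
print("step lengths:   ", [round(s, 3) for s in steps])
```

Comparing consecutive step lengths in the projection is one simple way to quantify a "jump" rather than judging it by eye from the plot.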

    Real-World Model Behavior

    Consider again the exchange mentioned at the beginning. I asked:

    A. “run.py有早停吗?我在恒源云上跑,发现没有触发”, meaning “Does run.py implement early stopping? I was running the project on a shared GPU service, and I didn’t see early stopping triggered.”

    The assistant replied in Korean:

    B. “원인을 찾았습니다. 결론: run.py에는 실제로 조기 종료가 없습니다. config.py에 USE_EARLY_STOPPING = True”

    Translated back into Chinese, this becomes:

    C. “我找到了原因。结论:run.py实际上没有早停。config.py里有 USE_EARLY_STOPPING = True。”

    In English, B and C both read: “I found the cause. Conclusion: run.py does not actually have early stopping. config.py has USE_EARLY_STOPPING = True.”

    We compute the cosine similarity between the sentence embeddings of A, B, and C and three reference clusters: the Chinese cluster, defined as the average embedding of general Chinese natural-language sentences, and the corresponding English and Korean clusters.

    Text                      Korean sim   English sim   Chinese sim
    A. (Chinese prompt)       0.2003       0.2688        0.3134
    B. (Korean response)      0.2745       0.2983        0.1641
    C. (Translated Chinese)   0.1634       0.3106        0.2798

    As you can see, translating the Korean response back into Chinese does not send the embedding back to the Chinese region. C is still closer to the English cluster (0.3106) than to the Chinese one (0.2798), and its English similarity is even higher than that of the Korean response (0.2983).

    This suggests that translation may restore language form, but probably not embedding location.
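The same reading can be made mechanical by picking, for each text, the reference cluster with the highest similarity. The sketch below uses only the values from the table; the dictionary layout and names are illustrative.

```python
# Similarity of each text to the three reference clusters, from the table above.
similarities = {
    "A (Chinese prompt)":     {"Korean": 0.2003, "English": 0.2688, "Chinese": 0.3134},
    "B (Korean response)":    {"Korean": 0.2745, "English": 0.2983, "Chinese": 0.1641},
    "C (Translated Chinese)": {"Korean": 0.1634, "English": 0.3106, "Chinese": 0.2798},
}

# For each text, the cluster it is most similar to.
nearest = {text: max(sims, key=sims.get) for text, sims in similarities.items()}

for text, cluster in nearest.items():
    print(f"{text}: nearest cluster = {cluster}")
```

Only the original prompt A lands nearest the Chinese cluster. Both the Korean response B and its Chinese translation C are nearest the English cluster, regardless of the language they are written in.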

    Conclusion

    Both experiments point to the same conclusion: the embedding space is not organized by language boundaries. Instead, it is more likely structured by the nature of the task, with engineering English dominating.
    When a sentence enters this region, its language form may change, but its embedding stays in the engineering basin, leading to odd behaviors such as replies in Korean even if you are not a Korean speaker at all.


