Table of Contents
Python Notebook
Introduction
Example ABC Agent Search Progress
Agent Lifecycle in Swarm Optimization
The 3 Bee Agent Roles
Iris Dataset
Clustering - No labels? No problem!
Fitness Model for Clustering
Confusion Matrix as a Diagnostic Tool
Running the Agentic AI Loop
Reporting Results
Designing Agent Prompts for Gemini
Gemini Agentic AI Issues
Agentic AI Competitive Landscape towards 2026
Conclusion and Future Work
Python Notebook
Explore my interactive notebook on Google Colab, and feel free to connect with me on LinkedIn for any questions or suggestions.
Introduction
With the incredible innovation happening around Agentic AI, I wanted to get hands-on with a project that integrates LLM prompts into a Data Science workflow. The Artificial Bee Colony (ABC) algorithm is inspired by honey bees' foraging behavior, which works remarkably well in nature. It belongs to the family of swarm intelligence algorithms, designed for decentralized decision-making processes in which "bee agents" pursue their individual goals autonomously while collectively improving the quality of the overall solution (the "honeypot").
This popular technique has been widely applied in many fields, notably scheduling, routing, energy optimization, resource allocation and anomaly detection. Researchers often combine ABC with neural networks in a hybrid approach, for example, using ABC to tune hyperparameters or optimize model weights. The algorithm is particularly relevant when data is scarce or when the problem is combinatorial, that is, when the solution space grows exponentially (or even factorially) with the number of features.
In this project, my approach has been to mimic Swarm Optimization for an Adaptive Grid Search. The creative twist is that I used Google's new Agentic AI tools to implement the bee agents. In the ABC algorithm, there are three types of autonomous bee agents, and I defined their roles using text prompts powered by the latest Gemini LLMs.
Each foraging cycle (algorithm iteration) proceeds as follows:
- Scout bees explore: they discover new food sources (candidate solutions).
- Employed bees exploit: they refine these sources and dance to share information about the quality of the nectar (fitness function).
- Onlooker bees exploit further: guided by the dances, they reinforce the colony's focus on the best food sources.
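For readers who have not seen the classic numeric form of ABC, the three phases can be sketched on a toy objective. This is a minimal illustration under stated assumptions, not the Gemini-driven implementation used in this project: the names `sphere`, `abc_minimize`, `n_sources`, and `limit` are hypothetical, and the toy cost simply stands in for the 1 - ARI fitness described later.

```python
import random


def sphere(x):
    """Toy cost to minimize (lower is better); stands in for 1 - ARI."""
    return sum(v * v for v in x)


def abc_minimize(objective, dim=2, n_sources=6, limit=5, cycles=50,
                 bounds=(-5.0, 5.0), seed=0):
    rng = random.Random(seed)
    lo, hi = bounds

    def new_source():
        return [rng.uniform(lo, hi) for _ in range(dim)]

    # Initial exploration: random food sources (candidate solutions).
    sources = [new_source() for _ in range(n_sources)]
    costs = [objective(s) for s in sources]
    trials = [0] * n_sources  # failed refinement attempts per source
    best_cost = min(costs)
    best = list(sources[costs.index(best_cost)])

    def try_neighbor(i):
        # Perturb one coordinate of source i relative to another random source.
        k = rng.choice([j for j in range(n_sources) if j != i])
        d = rng.randrange(dim)
        cand = list(sources[i])
        cand[d] += rng.uniform(-1.0, 1.0) * (sources[i][d] - sources[k][d])
        cand[d] = min(hi, max(lo, cand[d]))
        cost = objective(cand)
        if cost < costs[i]:  # greedy selection keeps the better source
            sources[i], costs[i], trials[i] = cand, cost, 0
        else:
            trials[i] += 1

    for _ in range(cycles):
        # Employed phase: every source gets one local refinement attempt.
        for i in range(n_sources):
            try_neighbor(i)
        # Onlooker phase: better sources are chosen more often.
        weights = [1.0 / (1.0 + c) for c in costs]
        for i in rng.choices(range(n_sources), weights=weights, k=n_sources):
            try_neighbor(i)
        # Memorize the best source found so far.
        i_best = min(range(n_sources), key=lambda j: costs[j])
        if costs[i_best] < best_cost:
            best_cost, best = costs[i_best], list(sources[i_best])
        # Scout phase: abandon sources that stopped improving.
        for i in range(n_sources):
            if trials[i] > limit:
                sources[i] = new_source()
                costs[i] = objective(sources[i])
                trials[i] = 0
    return best, best_cost


best, cost = abc_minimize(sphere)
print(best, cost)
```

In the project described here, the perturbation and selection steps are delegated to LLM prompts instead of the arithmetic above, but the explore, refine, and reinforce rhythm is the same.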
Example ABC Agent Search Progress
Agent Lifecycle in Swarm Optimization
The ABC algorithm was first proposed by Derviş Karaboğa in 2005. In my modernized meta-heuristic adaptation, I focused on the goal of improving clustering performance for an unsupervised dataset.
Below are the Python classes I implemented:
- WebResearcher: Responsible for researching and summarizing scikit-learn clustering algorithms and their key hyperparameters. The information gathered is crucial for generating accurate and effective prompts for the bee agents, and this class is implemented as an LLM-based agent.
- ScoutBeeAgent: Generates diverse initial candidate clustering solutions for the Iris dataset, leveraging the parameter summaries provided by the WebResearcher.
- EmployedBeeAgent: Refines current candidate solutions by exploring local parameter neighborhoods, using the WebResearcher's insights to make informed adjustments.
- OnlookerBeeAgent: Evaluates the generated and refined candidates, selecting the most promising ones to carry forward to the next iteration.
- Runner: Orchestrates the overall ABC optimization loop, organizing and coordinating the Gemini AI agent flow. It manages sequencing between the different bee agents and tracks global progress. While the Runner ensures structure and oversight, each bee agent operates in a fully distributed and autonomous manner, independently performing its specialized tasks without centralized control.
- FitnessModel: Evaluates the quality of each candidate solution using the Adjusted Rand Index (ARI), with the objective of minimizing 1 - ARI to achieve better clustering solutions.
- Reporter: Visualizes the convergence of the best ARI values over iterations and compares the top-performing solutions against baseline clustering models.
The 3 Bee Agent Roles
The agents determine parameter values and ranges through natural language prompts provided to the Gemini generative AI model. All three agents inherit from the BeeAgent base class, which handles shared setup and candidate tracking. Part of each prompt is informed by the WebResearcher, which summarizes scikit-learn clustering algorithms and their key hyperparameters to ensure accuracy and relevance. Here's how each agent works:
- ScoutBeeAgent (Initial Parameter Generation): Constructs prompts that allow the LLM some creativity within defined constraints. The allowed_algorithms parameter guides which models to consider from the popular clustering algorithms in scikit-learn. The Gemini model interprets these instructions and generates diverse candidate solutions, ensuring no duplicates and a balanced distribution across algorithms.
- EmployedBeeAgent (Parameter Refinement): Generates prompts with refining instructions, directing the LLM to adjust parameters by roughly ±10-20%, remain within valid ranges, and avoid inventing unsupported parameters. It takes the current solutions and applies these rules to create slightly varied (refined) candidates within the local neighborhood of the current parameter space.
- OnlookerBeeAgent (Evaluation and Selection): Produces prompts that evaluate the candidates generated and refined by the other agents. Using a fitness score based on the Adjusted Rand Index (ARI), it selects the top-k promising solutions, maintains algorithm diversity, and avoids duplicates. This reinforces the colony's focus on the strongest candidates.
In essence, the Python code defines the task goal, parameters, constraints, and return values as text within the prompts. The generative AI model (Gemini) then "reads" and "understands" these instructions to produce or modify the actual numerical and categorical parameter values for the clustering algorithms. Different LLMs may respond differently to subtle changes in the input text, so it is important to experiment with the wording of prompts for the three agent classes. To refine the wording further, you can always consult your preferred LLM.
Iris Dataset
A natural choice for this study is Sir Ronald Fisher's classic Iris flower dataset, introduced in his 1936 paper. In the following sections, this dataset is used as a small, well-defined demonstration case to illustrate how the proposed ABC optimization method can be applied in the context of a clustering problem.
The Iris dataset (License: CC0 1.0) comprises 150 labeled samples, each belonging to one of 3 Iris classes: Iris Setosa, Iris Versicolor, Iris Virginica. Each flower sample is associated with 4 numeric features: Sepal length, Sepal width, Petal length, Petal width.



As shown in both the pairwise relationship plots and the mutual information feature-importance plots, petal length and petal width are by far the most informative features when measured against the target labels of the Iris dataset.
Mutual Information (MI) is computed feature-wise with respect to the labels, whereas the Adjusted Rand Index (ARI), used in this project for fitness evaluation, measures the agreement between two partitions (predicted cluster labels versus true labels). Note that even when feature selection is applied, since Iris Versicolor and Iris Virginica share similar petal lengths and widths, their clusters overlap in feature space. As a result, the ARI can be strong but cannot reach a perfect score of 1.0.
Clustering - No labels? No problem!
Clustering algorithms are a cornerstone of unsupervised learning, so I chose to focus on the goal of blindly identifying the flower classes based solely on their features. In other words, the model was not trained on the flower labels; these labels were used only to validate performance metrics. Traditional clustering algorithms such as KMeans or DBSCAN often struggle with parameter sensitivity and dataset variability. Therefore, a meta-heuristic like ABC, which balances exploration and exploitation, seems promising.
Note that in clustering algorithms, parameters should technically be called hyperparameters, because they are not learned from the data during training (as weights in a neural network or regression coefficients are) but are set externally. However, for brevity, they are often called parameters.
Here's a concise visual comparison of different clustering algorithms applied to several toy datasets; different colors represent the different clusters that each algorithm found for 2D representations:

In the classic Iris dataset, the two most similar species, versicolor and virginica, often pose a challenge for clustering algorithms. Many methods mistakenly group them into a single cluster, treating them as one continuous dense region. In contrast, the more distinct setosa species is consistently identified as a separate cluster.
Table comparing several popular clustering algorithms available in the scikit-learn library:
| Algorithm | Summary | Key Hyperparameters | Efficiency | Accuracy |
| KMeans | Centroid-based, partitions data into k spherical clusters; simple and fast. | n_clusters, init, n_init, max_iter, random_state, tol | Fast on medium-large datasets; scales well; benefits from multiple restarts. | Strong for well-separated, convex clusters; poor on non-convex or varying-density shapes. |
| DBSCAN | Density-based, finds arbitrarily shaped clusters and marks noise without needing k. | eps, min_samples, metric, leaf_size | Moderate; slower in high dimensions; efficient with spatial indexing. | Excellent for irregular shapes and noise; sensitive to eps and density variations. |
| Agglomerative (Hierarchical) | Builds a dendrogram by iteratively merging clusters; no fixed k until cut. | n_clusters, metric (named affinity in older scikit-learn releases), linkage, distance_threshold | Slower (often O(n²)); memory-heavy for large n. | Good structural discovery; linkage choice affects results; handles non-spherical clusters. |
| Gaussian Mixture Models (GMM) | Probabilistic mixture of Gaussians using EM (Expectation Maximization); soft assignments. | n_components, covariance_type, tol, max_iter, n_init, random_state | Moderate; EM can be costly with full covariance. | High when data is near-Gaussian; flexible shapes; risk of overfitting without constraints. |
| Spectral clustering | Graph-based; embeds data via eigenvectors before clustering (often KMeans). | n_clusters, assign_labels, n_neighbors, random_state, affinity | Slow on large n due to eigen-decomposition; best for small-medium sets. | Strong for manifold/complex structures; quality hinges on graph construction and affinity. |
| MeanShift | Mode-seeking via kernel density; no need to predefine k. | bandwidth, cluster_all, max_iter, n_jobs | Slow; expensive with many points/features. | Good for finding cluster modes; performance highly dependent on bandwidth choice. |
K-Means as a Basic Clustering Example
K-Means is among the most widely used clustering algorithms, valued for its simplicity and efficiency. Because of its prevalence, I will outline it here in more detail as a representative example of how clustering is typically performed. It does have limitations, however: a key drawback is that the number of clusters k must be specified in advance.
How K-Means Works
- Initialize Centroids:
  Select k starting centroids, either randomly or with smarter strategies like K-Means++, which spreads them out to improve clustering quality.
- Assign Points to Clusters:
  Represent each data point as an n-dimensional vector, where each component corresponds to one feature. Assign points to the nearest centroid using a distance metric (commonly Euclidean). In high-dimensional spaces, this step is complicated by the Curse of Dimensionality, where distances lose discriminative power.
- Update Centroids & Repeat:
  Recompute each centroid as the mean of all points in its cluster, then reassign points to the nearest centroid. Repeat until assignments stabilize; this is convergence.
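The three steps can be sketched in a few lines of plain Python. This is a teaching sketch under simplifying assumptions (random initialization rather than K-Means++, squared Euclidean distance, a hypothetical `kmeans` helper), not the scikit-learn implementation used in the notebook.

```python
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Minimal K-Means: returns (labels, centroids) for a list of tuples."""
    rng = random.Random(seed)
    # Step 1: initialize centroids from k distinct random data points.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [None] * len(points)
    for _ in range(max_iter):
        # Step 2: assign each point to its nearest centroid.
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        if new_labels == labels:  # assignments stabilized: convergence
            break
        labels = new_labels
        # Step 3: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centroids = kmeans(points, k=2)
print(labels)
```

On two well-separated blobs like these, the first three points end up sharing one label and the last three the other; which blob receives label 0 depends on the random initialization.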
Practical Considerations
- Curse of Dimensionality: In very high dimensions, distance metrics become less effective, reducing clustering reliability.
- Dimensionality Reduction: Techniques like PCA or t-SNE are often applied before K-Means to simplify the feature space and improve results.
- Choosing K: Methods such as the Elbow Method, Silhouette Score, or meta-heuristics (e.g., ABC optimization) help estimate the optimal number of clusters.
Fitness Model for Clustering
The FitnessModel evaluates clustering candidate solutions on a dataset. The goal of a clustering algorithm is to produce clusters that ideally map closely to the true classes, but usually it is not a perfect match. ARI (Adjusted Rand Index) measures the similarity between two clusterings (predicted vs. ground truth). It is a widely used metric for evaluating clustering performance because it corrects for chance agreement, works across different clustering algorithms, and provides a scale that is easy to interpret: +1 for a perfect match, about 0 for chance-level agreement, and negative values (bounded below by -0.5) for agreement worse than chance.
| ARI Range | Meaning | Typical Edge Case Scenario |
| +1.0 | Perfect agreement | Predicted clustering exactly matches ground truth labels |
| ≈ 0.0 | Random clustering (chance level) | Assignments are random; all points forced into one cluster (unless ground truth is also one cluster); each point in its own cluster when ground truth is different |
| < 0.0 | Worse than random | Systematic disagreement (clusters consistently mismatched) |
| Strongly negative (approaching the lower bound of -0.5) | Strong disagreement | Extreme imbalance or mislabeling across clusters |
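The edge cases in the table can be checked with a small pair-counting sketch of the ARI formula. The helper names `adjusted_rand_index` and `fitness` are hypothetical; in practice scikit-learn's `adjusted_rand_score` computes the same quantity, and this version assumes at least two samples.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Pair-counting ARI; assumes len(labels_true) >= 2."""
    n = len(labels_true)
    # Pairs that fall in the same cell of the contingency table.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    # Pairs that share a true class / share a predicted cluster.
    sum_rows = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_cols = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = sum_rows * sum_cols / comb(n, 2)  # agreement expected by chance
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:  # degenerate case, e.g. both partitions are one cluster
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


def fitness(labels_true, labels_pred):
    """Lower is better: 0.0 means a perfect match."""
    return 1.0 - adjusted_rand_index(labels_true, labels_pred)


truth = [0, 0, 1, 1, 2, 2]
print(adjusted_rand_index(truth, [1, 1, 0, 0, 2, 2]))  # relabeled but identical partition → 1.0
print(adjusted_rand_index(truth, [0, 0, 0, 0, 0, 0]))  # everything in one cluster → 0.0
print(adjusted_rand_index(truth, [0, 1, 2, 3, 4, 5]))  # every point its own cluster → 0.0
```

Because the score is built from pairs of samples rather than label values, renaming the clusters leaves it unchanged, which is why it suits clustering output whose IDs are arbitrary.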
Fitness = 1 - ARI, so lower fitness is better. This allows ABC to directly optimize clustering quality. Shown below is an example run for the initial iterations of an ABC with Gemini Agents that I developed, along with a preview of the LLM raw response texts. Note how the GMM (Gaussian Mixture Models) steadily improves as new candidates are selected on each iteration by the different bee agents. Refer to the Google Colab notebook for the logs of further iterations.
Starting ABC run with Fitness Model for dataset: Iris
  Features: 4, Classes: 3
  Baseline Models (ARI): {'DBSCAN': 0.6309344087637648, 'KMeans': 0.6201351808870379, 'Agglomerative': 0.6153229932145449, 'GMM': 0.5164585360868599, 'Spectral': 0.6451422031981431, 'MeanShift': 0.5681159420289855}
Runner: Initiating Scout Agent for initial solutions...
  Scout Generating initial candidate solutions...
  Scout         : Sending prompt to Gemini model... n_candidates=12
  Scout         : Received response from Gemini model.
  Scout         : Raw response text: ```json[{"model":"KMeans","params":{"n_clusters":3,"init":"k-means++","n_init":10,"random_state":42}},{"model":"KMeans","params":{"n_clusters":4,"init":"random","n_init":10,"random_state":42}},{"model":"KMeans","params":{"n_clusters":5,"init":"k-mean...
  Scout         : Initial candidates generated.
Runner: Scout Agent returned 12 initial solutions.
Runner: Starting iteration 1/8...
Runner: Agents completed actions for iteration 1.
--- Iteration 1 Details ---
  GMM Candidate 1 (Origin: Scout-10010)  : Best previous ARI=0.820, Current ARI=0.820, Params: {'n_components': 4, 'covariance_type': 'tied', 'max_iter': 100, 'random_state': 42}
  KMeans Candidate 2 (Origin: Scout-10000): Best previous ARI=0.620, Current ARI=0.620, Params: {'n_clusters': 3, 'init': 'k-means++', 'n_init': 10, 'random_state': 42}
  DBSCAN Candidate 3 (Origin: Scout-10004): Best previous ARI=0.550, Current ARI=0.550, Params: {'eps': 0.7, 'min_samples': 4}
  GMM Candidate 4 (Origin: Scout-10009)  : Best previous ARI=0.820, Current ARI=0.516, Params: {'n_components': 3, 'covariance_type': 'full', 'max_iter': 100, 'random_state': 42}
  KMeans Candidate 5 (Origin: Scout-10001): Best previous ARI=0.620, Current ARI=0.462, Params: {'n_clusters': 4, 'init': 'random', 'n_init': 10, 'random_state': 42}
  DBSCAN Candidate 6 (Origin: Scout-10003): Best previous ARI=0.550, Current ARI=0.442, Params: {'eps': 0.5, 'min_samples': 5}
  KMeans Candidate 7 (Origin: Scout-10002): Best previous ARI=0.620, Current ARI=0.435, Params: {'n_clusters': 5, 'init': 'k-means++', 'n_init': 5, 'random_state': 42}
  DBSCAN Candidate 8 (Origin: Scout-10005): Best previous ARI=0.550, Current ARI=0.234, Params: {'eps': 0.4, 'min_samples': 6}
*** Global Best so far: ARI=0.820, Candidate={'model': 'GMM', 'params': {'n_components': 4, 'covariance_type': 'tied', 'max_iter': 100, 'random_state': 42}, 'origin_agent': 'Scout-10010', 'current_ari_for_display': 0.8202989638185834}
-----------------------------
Runner: Starting iteration 2/8...
  Scout Generating initial candidate solutions...
  Scout         : Sending prompt to Gemini model... n_candidates=12
  Employed Refining current solutions...
  Employed       : Sending prompt to Gemini model... n_variants=12
  Onlooker Evaluating candidates and selecting promising ones...
  Onlooker       : Sending prompt to Gemini model... top_k=5
  Scout         : Received response from Gemini model.
  Scout         : Raw response text: ```json[{"model":"KMeans","params":{"n_clusters":3,"init":"k-means++","n_init":10,"random_state":42}},{"model":"KMeans","params":{"n_clusters":4,"init":"random","n_init":10,"random_state":42}},{"model":"KMeans","params":{"n_clusters":5,"init":"k-mean...
  Scout         : Initial candidates generated.
  Employed       : Received response from Gemini model.
  Employed       : Raw response text: ```json[{"model":"GMM","params":{"n_components":5,"covariance_type":"tied","max_iter":100,"random_state":42}},{"model":"GMM","params":{"n_components":3,"covariance_type":"full","max_iter":100,"random_state":42}},{"model":"KMeans","params":{"n_cluster...
  Employed       : Solutions refined.
  Onlooker       : Received response from Gemini model.
  Onlooker       : Raw response text: ```json[{"model":"GMM","params":{"n_components":4,"covariance_type":"tied","max_iter":100,"random_state":42}},{"model":"KMeans","params":{"n_clusters":3,"init":"k-means++","n_init":10,"random_state":42}},{"model":"DBSCAN","params":{"eps":0.7,"min_sam...
  Onlooker       : Promising candidates selected.
Runner: Agents completed actions for iteration 2.
--- Iteration 2 Details ---
  GMM Candidate 1 (Origin: Scout-10022)  : Best previous ARI=0.820, Current ARI=0.820, Params: {'n_components': 4, 'covariance_type': 'tied', 'max_iter': 100, 'random_state': 42}
  GMM Candidate 2 (Origin: Scout-10010)  : Best previous ARI=0.820, Current ARI=0.820, Params: {'n_components': 4, 'covariance_type': 'tied', 'max_iter': 100, 'random_state': 42}
  GMM Candidate 3 (Origin: Onlooker-30000): Best previous ARI=0.820, Current ARI=0.820, Params: {'n_components': 4, 'covariance_type': 'tied', 'max_iter': 100, 'random_state': 42}
  GMM Candidate 4 (Origin: Employed-20007): Best previous ARI=0.820, Current ARI=0.820, Params: {'n_components': 4, 'covariance_type': 'tied', 'max_iter': 80, 'random_state': 42}
  GMM Candidate 5 (Origin: Employed-20006): Best previous ARI=0.820, Current ARI=0.820, Params: {'n_components': 4, 'covariance_type': 'tied', 'max_iter': 120, 'random_state': 42}
  GMM Candidate 6 (Origin: Employed-20000): Best previous ARI=0.820, Current ARI=0.693, Params: {'n_components': 5, 'covariance_type': 'tied', 'max_iter': 100, 'random_state': 42}
  KMeans Candidate 7 (Origin: Scout-10012): Best previous ARI=0.620, Current ARI=0.620, Params: {'n_clusters': 3, 'init': 'k-means++', 'n_init': 10, 'random_state': 42}
  KMeans Candidate 8 (Origin: Scout-10000): Best previous ARI=0.620, Current ARI=0.620, Params: {'n_clusters': 3, 'init': 'k-means++', 'n_init': 10, 'random_state': 42}
*** Global Best so far: ARI=0.820, Candidate={'model': 'GMM', 'params': {'n_components': 4, 'covariance_type': 'tied', 'max_iter': 100, 'random_state': 42}, 'origin_agent': 'Scout-10010', 'current_ari_for_display': 0.8202989638185834}

Confusion Matrix as a Diagnostic Tool
While the Adjusted Rand Index (ARI) provides a single score for clustering quality, the Confusion Matrix reveals where misclassifications occur by showing how true classes are distributed across predicted clusters.
In the Iris dataset, scikit-learn encodes the species in a fixed order:
0 = Setosa, 1 = Versicolor, 2 = Virginica.
Even though there are only three true species, the algorithm below mistakenly produced four clusters. The matrix illustrates this mismatch:
[[ 0  6 44  0]
 [ 2  0  0 48]
 [49  0  0  1]
 [ 0  0  0  0]]
Note: The order of the columns (clusters) does not necessarily correspond to the order of the rows (true classes). Cluster IDs are arbitrary labels assigned by the algorithm, and they do not carry any inherent meaning.
Row-by-row Interpretation (row and column IDs start from 0)
- Row 0: [ 0 6 44 0]
  Setosa class: Its samples fall only into columns 1 and 2, which contain no Versicolor or Virginica samples. These two columns should really have been recognized as a single cluster corresponding to Setosa.
- Row 1: [ 2 0 0 48]
  Versicolor class: Split between columns 0 and 3, with 48 samples in column 3 and 2 samples leaking into column 0, which is dominated by Virginica.
- Row 2: [49 0 0 1]
  Virginica class: Also split between columns 0 and 3, with 49 samples in column 0 and 1 sample leaking into column 3, which is dominated by Versicolor.
- Row 3: [ 0 0 0 0]
  Extra row with no true class: No true samples here; the row exists only because the algorithm produced 4 clusters for a dataset with only 3 classes.
The confusion matrix shows that Setosa is distinct (its clusters do not overlap with the other species) but is split across two clusters, while Versicolor and Virginica are not separated perfectly: both appear in the same two clusters (columns 0 and 3), with 3 samples landing in the cluster dominated by the other species. The confusion matrix makes these misclassifications visible in a way that a single ARI score cannot.
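Reading such a matrix by eye gets harder as the number of clusters grows, so a small helper can report which true class dominates each cluster. The sketch below uses only the standard library; `contingency_matrix` and `dominant_class_per_cluster` are hypothetical helper names, and the counts are the three class rows of the matrix shown above.

```python
from collections import Counter


def contingency_matrix(labels_true, labels_pred, n_classes, n_clusters):
    """Rows are true classes, columns are predicted cluster IDs."""
    counts = Counter(zip(labels_true, labels_pred))
    return [[counts[(t, p)] for p in range(n_clusters)] for t in range(n_classes)]


def dominant_class_per_cluster(matrix):
    """For each column, the row index holding most samples (None if the column is empty)."""
    dominant = []
    for column in zip(*matrix):
        if sum(column) == 0:
            dominant.append(None)
        else:
            dominant.append(max(range(len(column)), key=lambda row: column[row]))
    return dominant


# Class rows from the matrix above: Setosa, Versicolor, Virginica.
iris_matrix = [[0, 6, 44, 0], [2, 0, 0, 48], [49, 0, 0, 1]]
print(dominant_class_per_cluster(iris_matrix))  # → [2, 0, 0, 1]
```

The result says that cluster 0 is mostly Virginica, clusters 1 and 2 are both Setosa, and cluster 3 is mostly Versicolor, which matches the row-by-row reading.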
Running the Agentic AI Loop
The Runner orchestrates iterations:
- Scout bees propose diverse solutions.
- Employed bees refine them.
- Onlooker bees select promising ones.
- The solution pool is updated.
- The best ARI per iteration is tracked.
In the Runner class and throughout the Artificial Bee Colony (ABC) algorithm, a candidate refers to a specific clustering model together with its defined parameters. In the example solution pool shown below, two candidates are returned.
The agents are orchestrated using Python's concurrent.futures.ThreadPoolExecutor, which allows concurrent execution. As a result, the ScoutBeeAgent, EmployedBeeAgent, and OnlookerBeeAgent run asynchronously in separate threads during each iteration of the algorithm.
The runner.run() method returns two objects:
solution_pool: This is a list of the pool_size most promising candidates (each being a dictionary containing a model and its parameters) found across all iterations. This list is sorted by fitness (ARI), so the very first element, solution_pool[0], represents the best-fitting model and its specific parameters that the ABC algorithm discovered.
best_history: This is a list that tracks only the best Adjusted Rand Index, together with the model and parameters that achieved it.
For instance:
solution_pool = [
    {
        "model": "KMeans",
        "params": {"n_clusters": 3, "init": "k-means++"},
        "origin_agent": "Employed",
        "current_ari_for_display": 0.742
    },
    {
        "model": "AgglomerativeClustering",
        "params": {"n_clusters": 3, "linkage": "ward"},
        "origin_agent": "Onlooker",
        "current_ari_for_display": 0.715
    }
]
best_history = [
    {"ari": 0.642, "model": "KMeans", "params": {"n_clusters": 3, "init": "random"}},
    {"ari": 0.742, "model": "KMeans", "params": {"n_clusters": 3, "init": "k-means++"}}
]
Solution Pool Setup with ThreadPoolExecutor
ThreadPoolExecutor(): Initializes a pool of worker threads that can execute tasks concurrently.
ex.submit(...): Submits each agent's act method as a separate task to the thread pool.
from concurrent.futures import ThreadPoolExecutor
import copy
# ... inside Runner.run() ...
for it in range(iterations):
    print(f"Runner: Starting iteration {it+1}/{iterations}...")
    if it == 0:
        # The first iteration has no agent results yet
        results = []
    else:
        # Use threads instead of processes
        with ThreadPoolExecutor() as ex:
            # Submit one task per agent; each call returns a Future immediately
            futures = [
                ex.submit(self.scout.act),
                ex.submit(self.employed.act, solution_pool),
                ex.submit(self.onlooker.act, solution_pool)
            ]
            # Block until every agent has finished, keeping submission order
            results = [f.result() for f in futures]
    print(f"Runner: Agents completed actions for iteration {it+1}.")
    # ... rest of the loop unchanged ...
Each agent's act method is dispatched to the thread pool, allowing the agents to run concurrently. The call to f.result() ensures that the Runner waits for all tasks to finish before moving forward.
This design achieves two things:
- Concurrent execution within an iteration: agents act at the same time, mimicking real bee colony behavior.
- Sequential iteration control: the Runner only advances once all agents have completed their work, keeping the overall loop orderly and deterministic.
From the Runner's perspective, iterations still appear sequential, but internally each iteration benefits from concurrent execution of agent tasks.
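The submit-then-wait pattern can be tried in isolation with stub agents. In this sketch, `fake_llm_call` is a hypothetical stand-in for a network-bound Gemini request (it only sleeps), and `run_iteration` is an illustrative helper rather than part of the Runner class.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_llm_call(agent_name, delay=0.2):
    """Hypothetical stand-in for an I/O-bound LLM request."""
    time.sleep(delay)  # the thread waits here, as it would on a network reply
    return [{"model": "KMeans", "params": {"n_clusters": 3}, "origin_agent": agent_name}]


def run_iteration(agent_names):
    """Run one stub call per agent concurrently and collect results in submission order."""
    with ThreadPoolExecutor(max_workers=len(agent_names)) as ex:
        futures = [ex.submit(fake_llm_call, name) for name in agent_names]
        return [f.result() for f in futures]


start = time.perf_counter()
results = run_iteration(["Scout", "Employed", "Onlooker"])
elapsed = time.perf_counter() - start
print([r[0]["origin_agent"] for r in results], f"{elapsed:.2f}s")
```

Because the three calls wait at the same time, the elapsed time should be close to one delay rather than three, and iterating over the futures list returns results in the order the tasks were submitted, regardless of which thread finished first.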
Solution Pool Setup with ProcessPoolExecutor
While ThreadPoolExecutor provides concurrency through threads, it can be replaced with ProcessPoolExecutor to achieve true parallel CPU execution, provided that the submitted callables and their arguments can be pickled.
With ProcessPoolExecutor, each agent runs in its own separate process, which bypasses Python's GIL (Global Interpreter Lock). The GIL is a mutex (mutual exclusion lock) that ensures only one thread executes Python bytecode at a time, even on multi-core systems. By using processes instead of threads, heavy numerical workloads can fully leverage multiple CPU cores, enabling genuine parallelism and improved performance for compute-intensive tasks.
from concurrent.futures import ProcessPoolExecutor
import copy
# ... inside Runner.run() ...
for it in range(iterations):
    print(f"Runner: Starting iteration {it+1}/{iterations}...")
    if it == 0:
        results = []
    else:
        # Use processes instead of threads
        with ProcessPoolExecutor() as ex:
            futures = [
                ex.submit(self.scout.act),
                ex.submit(self.employed.act, solution_pool),
                ex.submit(self.onlooker.act, solution_pool)
            ]
            results = [f.result() for f in futures]
    print(f"Runner: Agents completed actions for iteration {it+1}.")
    # ... rest of the loop unchanged ...
Key Differences between ProcessPoolExecutor and ThreadPoolExecutor
- ProcessPoolExecutor launches separate Python processes, not threads.
- Each agent runs independently and can be scheduled on a different CPU core.
- This avoids the GIL, so CPU-bound tasks (like clustering, fitness evaluation, numerical optimization) truly run in parallel. A CPU-bound task is any computation where the limiting factor is the processor's speed rather than waiting for input/output (I/O).
- Since processes run in separate memory spaces, they cannot directly share objects. Instead, anything passed between them must be serialized (pickled). Simple Python objects like dictionaries, lists, strings, and numbers are picklable, so candidate dictionaries can be exchanged safely. Submitting a bound method such as self.scout.act also pickles the agent object itself, so the agent must be picklable too.
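Whether an object survives the trip to another process can be checked up front with the standard pickle module. The sketch below is illustrative: `is_picklable` and `AgentWithLock` are hypothetical names, and the lock stands in for any resource (client handles, open connections) that an agent might hold.

```python
import pickle
import threading


def is_picklable(obj):
    """Return True if obj can be serialized for transfer to another process."""
    try:
        pickle.dumps(obj)
        return True
    except Exception:
        return False


class AgentWithLock:
    """Hypothetical agent that holds a resource pickle cannot serialize."""

    def __init__(self):
        self.lock = threading.Lock()


candidate = {"model": "KMeans", "params": {"n_clusters": 3, "init": "k-means++"}}
restored = pickle.loads(pickle.dumps(candidate))  # round trip through bytes

print(restored == candidate)          # → True
print(is_picklable(AgentWithLock()))  # → False
```

A check like this is a cheap way to find out, before switching executors, whether an agent object needs to be restructured so that only plain data crosses the process boundary.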
Key Takeaway:
- Use ProcessPoolExecutor if your agents do heavy computation (matrix ops, clustering, ML training).
- Stick with ThreadPoolExecutor if your agents are mostly I/O-bound (waiting for data, network, disk).
Why are some of the candidate parameter values repeated in different iterations?
The repetition of candidate parameter values across iterations is a natural outcome of how the Artificial Bee Colony algorithm works and how the agents interact:
Scout Bee Agent's Exploration: The ScoutBeeAgent is tasked with generating new and diverse candidate solutions. While it aims for diversity, given a limited parameter space or if the generative model finds certain parameter combinations consistently effective, it might suggest similar solutions in different iterations.
Employed Bee Agent's Exploitation: The EmployedBeeAgent refines existing promising solutions. If a solution is already very good or close to an optimal configuration, the "local neighborhood" exploration (e.g., adjusting parameters by ±10-20%) might lead back to the same or very similar parameter values, especially after rounding or if the parameter adjustments are small.
Onlooker Bee Agent's Selection: The OnlookerBeeAgent selects the top_k most promising solutions from a larger set of candidates (which includes newly scouted, refined by employed, and previously promising solutions). If the algorithm is converging, or if multiple distinct solutions yield very similar high-fitness scores, the OnlookerBeeAgent might repeatedly select parameter sets that are effectively identical from one iteration to the next.
Solution Pool Management: The Runner maintains a solution_pool of a fixed pool_size. It sorts this pool by fitness and keeps the best ones. If the top solutions remain consistently the same, or if new good solutions are identical to previous ones, these parameter sets will persist and thus be "repeated" in the iteration details.
Convergence: As the ABC algorithm progresses, it is expected to converge towards optimal or near-optimal solutions. This convergence often means that the search space narrows, and agents repeatedly find the same high-performing parameter configurations unless some form of pruning method (like deduplication) is applied.
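The deduplication mentioned above could look like the following sketch, which is an assumption about one possible pruning step rather than what the Runner currently does. `candidate_key` and `prune_pool` are hypothetical helpers; the candidate dictionaries follow the solution_pool layout shown earlier.

```python
import json


def candidate_key(candidate):
    """Identify a candidate by model and parameters, ignoring key order and origin."""
    return json.dumps([candidate["model"], candidate["params"]], sort_keys=True)


def prune_pool(candidates, pool_size):
    """Keep one candidate per (model, params) pair, then the pool_size best by ARI."""
    best_by_key = {}
    for cand in candidates:
        key = candidate_key(cand)
        kept = best_by_key.get(key)
        if kept is None or cand["current_ari_for_display"] > kept["current_ari_for_display"]:
            best_by_key[key] = cand
    ranked = sorted(best_by_key.values(),
                    key=lambda c: c["current_ari_for_display"], reverse=True)
    return ranked[:pool_size]


gmm = {"n_components": 4, "covariance_type": "tied", "max_iter": 100, "random_state": 42}
pool = [
    {"model": "GMM", "params": dict(gmm), "origin_agent": "Scout-10022", "current_ari_for_display": 0.820},
    {"model": "GMM", "params": dict(gmm), "origin_agent": "Scout-10010", "current_ari_for_display": 0.820},
    {"model": "GMM", "params": dict(gmm), "origin_agent": "Onlooker-30000", "current_ari_for_display": 0.820},
    {"model": "KMeans", "params": {"n_clusters": 3, "init": "k-means++"},
     "origin_agent": "Scout-10012", "current_ari_for_display": 0.620},
]
pruned = prune_pool(pool, pool_size=5)
print([(c["model"], c["origin_agent"]) for c in pruned])
```

With the three identical GMM entries collapsed into one, the freed pool slots can go to genuinely different candidates, which keeps exploration alive for longer.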
đReporting Outcomes
Benchmarking Customary Clustering Algorithms
Earlier than making use of ABC, it’s helpful to determine a baseline by evaluating the efficiency of ordinary clustering strategies. I ran a comparability benchmark utilizing default configurations for the next algorithms:
- KMeans
- DBSCAN
- Agglomerative Clustering
- Gaussian Combination Fashions (GMM)
- Spectral Clustering
- MeanShift
As proven within the Google Colab pocket book, the ABC brokers found parameter units that considerably improved the Adjusted Rand Index (ARI), lowering misclassifications between the carefully associated lessons Versicolor and Virginica.
Reporter Outputs
The Reporter class is responsible for producing final evaluation outputs after running the Artificial Bee Colony (ABC) optimization. It provides three main functions:
- Comparison Table
  - Compares each candidate solution's Adjusted Rand Index (ARI) against baseline clustering models.
  - Reports the improvement (candidate_ari - baseline_ari).
- Confusion Matrix Display
  - Prints the confusion matrix of the best candidate solution to show class-level performance and misclassifications.
- Convergence Visualization
  - Plots the progression of the best ARI across iterations.
  - Annotates the plot with model names and parameters for each iteration.
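The comparison table described above can be sketched as follows. This is not the notebook's Reporter class: the function name, the dict fields, and the assumption that each candidate is compared with the baseline of the same algorithm are all mine, and the ARI values are placeholders rather than measured results.

```python
def comparison_table(candidates, baseline_ari):
    """Return one row per candidate with its improvement over the matching baseline."""
    rows = []
    for cand in candidates:
        base = baseline_ari.get(cand["model"])
        if base is None:
            continue  # no baseline available for this algorithm
        rows.append({
            "model": cand["model"],
            "baseline_ari": base,
            "candidate_ari": cand["ari"],
            "improvement": round(cand["ari"] - base, 4),
        })
    # Largest improvement first.
    return sorted(rows, key=lambda r: r["improvement"], reverse=True)

# Placeholder numbers for illustration only.
baseline_ari = {"KMeans": 0.5, "GMM": 0.75}
candidates = [
    {"model": "KMeans", "ari": 0.625},
    {"model": "GMM", "ari": 1.0},
    {"model": "Unknown", "ari": 0.9},
]
rows = comparison_table(candidates, baseline_ari)
for row in rows:
    print(f'{row["model"]:<8} {row["baseline_ari"]:.3f} -> {row["candidate_ari"]:.3f} ({row["improvement"]:+.3f})')
```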
Designing Agent Prompts for Gemini
I decided to design each agent's prompt with the following template for a structured approach:
• Task Goal: What the agent must achieve.
• Parameters: Inputs such as the dataset name, the number of candidates for the agent type, the allowed algorithms, and the hyperparameter input dictionary returned by the WebResearcher via its LLM prompt.
• Constraints: Ensure each candidate is unique, maintain a balanced distribution across algorithms, and require hyperparameters to stay within valid ranges.
• Return Values: JSON list of candidate solutions.
To make the LLM behavior as deterministic as possible, I used this generation_config. In particular, note that a temperature of zero leaves the model no room for creativity between calls, so the same prompt simply repeats the previous response.
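# Greedy decoding settings: temperature 0 with top_k=1 always picks the most likely token.
generation_config = {
    "temperature": 0.0,
    "top_p": 1.0,
    "top_k": 1,
    "max_output_tokens": 4096,
}
# genai_model and prompt are assumed to be defined earlier in the notebook.
res = genai_model.generate_content(prompt, generation_config=generation_config)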
While developing new code, as in this project, it is important to ensure that the same input produces the same output.
Gemini Agentic AI Issues
Gemini AI Model Types
- Lite (Flash-Lite): Prioritizes speed and cost efficiency. Ideal for bulk tasks like translation or classification.
- Flash: Well-suited to production workloads requiring scale and moderate reasoning.
- Pro: The flagship tier, best for complex reasoning, multimodal comprehension (text, images, audio, video), and agentic AI use cases.
Why Prompts Alone Fail in Lite Models
I ran into a common limitation with the "Lite" models:
LLMs don't reliably obey instructions like "always include these parameters" just because you put them in the prompt. As of today, models often revert to defaults or minimal sets unless structure is enforced after generation. Why the explicit prompt still failed:
- Natural language instructions are weak constraints. Even "always include exactly these parameters" is interpreted probabilistically.
- No schema enforcement. When parsing JSON, you need to validate that required keys exist.
- Deduplication addresses duplicates, not gaps. It eliminates identical candidates but does not restore missing parameters.
Key Takeaway: Prompts alone won't guarantee compliance. You need prompt + schema enforcement to ensure outputs consistently include required parameters.
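Schema enforcement of the kind described above can be as simple as checking required keys after parsing. The sketch below assumes the model returns a JSON list of objects with hypothetical `model` and `params` fields, and the `REQUIRED_PARAMS` table is an illustrative assumption, not the notebook's schema.

```python
import json

# Hypothetical schema: parameters that must be present for each algorithm.
REQUIRED_PARAMS = {
    "KMeans": {"n_clusters", "n_init"},
    "DBSCAN": {"eps", "min_samples"},
}

def validate_candidates(raw_text, required=REQUIRED_PARAMS):
    """Parse the model output and split it into (valid_candidates, error_messages)."""
    try:
        data = json.loads(raw_text)
    except json.JSONDecodeError as exc:
        return [], [f"invalid JSON: {exc.msg}"]
    if not isinstance(data, list):
        return [], ["expected a JSON list of candidates"]
    valid, errors = [], []
    for i, cand in enumerate(data):
        if not isinstance(cand, dict) or not isinstance(cand.get("params"), dict):
            errors.append(f"candidate {i}: not an object with a 'params' object")
            continue
        model = cand.get("model")
        if model not in required:
            errors.append(f"candidate {i}: unknown model {model!r}")
            continue
        missing = required[model] - set(cand["params"])
        if missing:
            errors.append(f"candidate {i}: missing {sorted(missing)}")
        else:
            valid.append(cand)
    return valid, errors

raw = '[{"model": "KMeans", "params": {"n_clusters": 3}},' \
      ' {"model": "DBSCAN", "params": {"eps": 0.5, "min_samples": 5}}]'
valid, errors = validate_candidates(raw)
```

Rejected candidates can then be dropped or sent back to the model in a follow-up prompt, instead of silently entering the solution pool with missing parameters.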
Prompt Compliance Issues and Schema Solutions
Models can prioritize other parts of the prompt or simplify outputs despite emphasis on required items.
- Example instruction: "Return Values: ONLY output a JSON-style dictionary. Return string must be no longer than 1024 characters."
- Observed outcome: len(res_text) = 1036, so the response exceeded the limit.
- Missing fields: Required items sometimes did not appear, even when stated clearly. Providing concrete output examples improved adherence.
- Practical fix: Pair prompts with schema enforcement (e.g., validating required keys, length checks) and post-generation normalization to guarantee structure.
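A length check plus post-generation normalization might look like the sketch below. It assumes the response is expected to contain a single JSON dictionary, as in the example instruction above; the function name, the default values, and the choice to fall back to defaults are illustrative assumptions.

```python
import json

MAX_CHARS = 1024  # limit stated in the example instruction

def normalize_response(res_text, defaults, max_chars=MAX_CHARS):
    """Return (params, warnings): extract the outermost JSON object, flag
    over-length responses, and fill any missing keys from defaults."""
    warnings = []
    if len(res_text) > max_chars:
        warnings.append(f"response length {len(res_text)} exceeds {max_chars}")
    start, end = res_text.find("{"), res_text.rfind("}")
    if start == -1 or end <= start:
        return dict(defaults), warnings + ["no JSON object found; using defaults"]
    try:
        parsed = json.loads(res_text[start:end + 1])
    except json.JSONDecodeError:
        return dict(defaults), warnings + ["unparseable JSON; using defaults"]
    missing = sorted(k for k in defaults if k not in parsed)
    if missing:
        warnings.append(f"filled missing keys from defaults: {missing}")
    # Values returned by the model take precedence over defaults.
    return {**defaults, **parsed}, warnings

defaults = {"n_clusters": 2, "n_init": 10}
params, warnings = normalize_response('Here you go: {"n_clusters": 3}', defaults)
```

Logging the warnings rather than discarding them keeps a record of how often the model ignores the stated constraints.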
Empty Candidate Errors in Gemini API
Occasionally, I got this response:
>> ScoutAgent: Error during API call (Attempt 1/3): Invalid operation: The response.text quick accessor requires the response to contain a valid Part, but none were returned. The candidate's finish_reason is 2.
That error message means the model didn't actually return any usable content in its response, so when my code tried to access response.text, there was no valid "Part" to read. The key clue is finish_reason = 2, which in Google's FinishReason enum corresponds to MAX_TOKENS (STOP is 1): generation ended at the max_output_tokens limit before any text part was produced.
Why it happens:
- Empty candidate: The API call succeeded, but the model produced no output.
- FinishReason = 2: Indicates the generation stopped before yielding a valid part.
- Quick accessor failure: Since response.text expects at least one valid text part, it throws an error when none exist.
How to handle it:
- Check finish_reason before accessing response.text. Only read text if the candidate includes a valid part.
- Add fallback logic: If no text is returned, log the finish reason and retry or handle gracefully.
- Schema enforcement: Validate that required fields exist in the response before parsing.
Key Takeaway: This isn't a network error; it's the model signaling that it stopped without producing text. You can find the full list of FinishReason values and guidance on interpreting them in Google's documentation: Generate Content API - FinishReason.
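The handling steps above can be sketched as a small guard that inspects the candidate before reading any text. It assumes the response object exposes `candidates`, each with `finish_reason` and `content.parts`, as the responses from the Gemini Python SDK do; the helper name `safe_text` is my own, and the demo uses stand-in objects rather than a real API call.

```python
from types import SimpleNamespace

def safe_text(response):
    """Return (text, finish_reason); text is None when no text part is present."""
    candidates = getattr(response, "candidates", None) or []
    if not candidates:
        return None, None
    candidate = candidates[0]
    finish_reason = getattr(candidate, "finish_reason", None)
    content = getattr(candidate, "content", None)
    parts = getattr(content, "parts", None) or []
    texts = [part.text for part in parts if getattr(part, "text", None)]
    if not texts:
        return None, finish_reason
    return "".join(texts), finish_reason

# Stand-in objects that mimic an empty candidate and a normal one.
empty = SimpleNamespace(candidates=[
    SimpleNamespace(finish_reason=2, content=SimpleNamespace(parts=[]))
])
normal = SimpleNamespace(candidates=[
    SimpleNamespace(finish_reason=1, content=SimpleNamespace(parts=[SimpleNamespace(text="[]")]))
])

text, reason = safe_text(empty)
if text is None:
    print(f"No text returned (finish_reason={reason}); retrying or falling back")
```

Returning the finish reason alongside the text lets the caller decide whether a retry is worthwhile or whether the request itself needs to change.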
Intermittent API Connection Errors
Occasionally, the Gemini API call failed with:
- Error: ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
Key Takeaway: This is a network error, and it occurred without any code changes, indicating transient network or service issues. Add retries with exponential backoff, timeouts, and robust logging to capture context (request size, rate limits, finish_reason) and recover gracefully.
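A retry wrapper with exponential backoff and a little jitter could look like the sketch below. The helper name and defaults are assumptions; note also that the exception class raised by your HTTP client may not be Python's built-in `ConnectionError`, so pass the classes you actually observe via `retry_on`.

```python
import random
import time

def call_with_retries(func, max_attempts=3, base_delay=1.0,
                      retry_on=(ConnectionError,), sleep=time.sleep):
    """Call func(); on a retryable error wait base_delay * 2**(attempt-1) plus jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on as exc:
            if attempt == max_attempts:
                raise  # out of attempts: surface the original error
            delay = base_delay * 2 ** (attempt - 1) + random.uniform(0, 0.1 * base_delay)
            print(f"Attempt {attempt}/{max_attempts} failed: {exc!r}; retrying in {delay:.2f}s")
            sleep(delay)

# Demo with a stand-in function that fails twice, then succeeds.
attempts = {"count": 0}

def flaky_call():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("Remote end closed connection without response")
    return "ok"

# The sleep stub avoids real waiting in this demo.
result = call_with_retries(flaky_call, sleep=lambda seconds: None)
```

In real use, `func` would wrap the API call (for example via `functools.partial` or a lambda), and the default `time.sleep` would be left in place.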
Agent Security Considerations
One more thing to pay attention to, especially if you are using agents in a corporate setting: security is mission-critical!
Provide strict guardrails between agents and the LLM. Actively prevent agents from deleting critical files, taking off-topic actions, making unauthorized external API calls, etc.
Key Takeaway: Apply the Principle of Least Privilege
- Scope: Restrict each agent's permissions strictly to its assigned task.
- Isolation: Block filesystem writes, external calls, or off-topic actions unless explicitly authorized.
- Audit: Record all actions and require approvals for sensitive operations.
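As a rough illustration of scope and audit, each agent can be given an explicit allowlist of actions, with every request recorded. The action names and the `EmployedBeeAgent` name below are hypothetical, and a production system would need much stronger isolation (sandboxing, credentials, approvals) than this in-process check.

```python
# Hypothetical allowlist: each agent may only perform actions tied to its role.
ALLOWED_ACTIONS = {
    "ScoutAgent": {"propose_candidates"},
    "EmployedBeeAgent": {"refine_candidate"},
    "OnlookerBeeAgent": {"select_top_k"},
}

audit_log = []  # every request is recorded, whether allowed or denied

def authorize(agent_name, action):
    """Record the request and raise PermissionError if it is outside the agent's scope."""
    allowed = action in ALLOWED_ACTIONS.get(agent_name, set())
    audit_log.append({"agent": agent_name, "action": action, "allowed": allowed})
    if not allowed:
        raise PermissionError(f"{agent_name} is not permitted to perform {action!r}")
    return True

authorize("ScoutAgent", "propose_candidates")  # within scope
try:
    authorize("ScoutAgent", "delete_file")  # outside scope: denied and logged
except PermissionError as exc:
    print(f"Blocked: {exc}")
```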
Agentic AI Competitive Landscape towards 2026
Model Providers
This table outlines how the agentic AI market is expected to develop in the near future. It highlights the main companies, emerging competitors, and the trends that will shape the space as we move towards 2026. Presented here as a non-exhaustive list of direct competitors to Gemini, it aims to give readers a clear picture of the strategic environment in which agentic AI is evolving.
| Provider | Core Focus | Strengths | Notes |
| --- | --- | --- | --- |
| Google Gemini API | Multimodal LLM service (text, vision, code, etc.) | High-quality generative outputs; Google Cloud integration; strong multimodal capabilities | Primarily a model API; Gemini 3 is explicitly designed to support orchestration of agentic workflows |
| OpenAI GPT APIs | Text + code generation | Widely adopted; strong ecosystem; fine-tuning options | Limited multimodal support compared to Gemini |
| Anthropic Claude | Safety-focused text LLMs | Strong alignment and safety features; long context handling | Less multimodal capability |
| Mistral AI | Open and enterprise models | Flexible deployment; community driven; customizable | Requires infrastructure setup |
| Meta LLaMA | Open-weight research models | Open source; strong research backing; customizable | Needs infra and ops for production |
| Cohere | Enterprise NLP and embeddings | Enterprise features; embeddings; privacy options | Narrower scope than general LLMs |
Agent Orchestration Frameworks
This table examines the management and orchestration aspects of agentic AI. It highlights how different frameworks handle coordination, reliability, and integration to enable scalable agent systems.
| Framework | Core Focus | Strengths | Notes |
| --- | --- | --- | --- |
| LangGraph | Graph-based orchestration | Models workflows as nodes/edges; strong memory; multi-agent collaboration | Requires developer setup; orchestration only |
| LangChain | Agent/workflow orchestration | Rich ecosystem; tool integration; memory/state handling | Can increase token usage and complexity |
| CrewAI | Role-based crew orchestration | Role specialization; collaboration patterns; good for teamwork scenarios | Depends on external LLMs |
| OpenAI Swarm | Lightweight multi-agent orchestration | Simple handoffs; ergonomic routines | Good for running experiments |
| AutoGen (Microsoft) | Multi-agent framework | Research + production focus; extensible | Still evolving; requires Microsoft ecosystem |
| AutoGPT | Autonomous agent prototype | Fast prototyping; community driven | Varying production readiness |
Conclusion and Future Work
This project was my first experiment with Gemini's agentic AI, adapting the Artificial Bee Colony algorithm to an optimization task. Even on a small dataset, it demonstrated how LLMs can take on bee-like roles in a meta-heuristic process, while also revealing both the promise and the practical challenges of this approach. Feel free to copy and adapt the Google Colab notebook for your own projects.
Future Work
- Applying the ABC meta-heuristic to larger and more diverse datasets.
- Extending the WebResearcher agent to automatically assemble datasets from domain-specific sources (e.g. Royal Botanic Gardens Kew - POWO), inspired by Sir Ronald Fisher's pioneering work in statistical botany.
- Running experiments with expanded pools of worker threads and adjusting the number of candidates per bee agent type.
- Exploring semi-supervised clustering, where a small labeled dataset complements a larger unlabeled one.
- Comparing results from Google's Gemini API with outputs from other providers' APIs.

