
In Part I and Part II, we described a proof of concept on using AI multi-agents to automate the management of a type of service procedure for citizens of a public sector body.
In Part III, we described some software architecture solutions that apply to intelligent agents and agentic systems, especially when it is time to take the next step of bringing them to production environments.
In this post, we want to document how we are addressing the problem of “The Labyrinth of Context” in high-load systems. Using Akka Platform and an agentic approach, we converted manual investigations that took hours into highly efficient automated processes, reducing resolution time by more than 90% and ensuring unprecedented scalability. 🚀
When a user initiates a process requesting a service or a refund, they expect it to be resolved “now”. But the reality behind the scenes is very different.
The human analyst must manually navigate legacy mainframes, systems with technical debt and disconnected platforms to gather evidence about the user, the procedure and its status. We call this “context-switching” cost The Labyrinth of Context.
Eliminating The Labyrinth of Context is not just about putting a chatbot on top of an API; it requires orchestrating complex workflows with traceability that lets us know exactly what happened at each step.
The choice to use Akka Platform for this kind of product was deliberate, but not only for technical reasons: it was a strategic decision in terms of speed of development.
In less than one month, we managed to build a production-ready system with complex orchestration, full traceability and distributed resilience. This wasn't magic: it was the direct result of a mature ecosystem, a coherent architecture, and tools designed for these types of problems from day one.
In a world of traditional (stateless) microservices, managing long and complex flows quickly becomes a mess of shared databases, polling, manual compensations, and duplicate logic.
Akka allowed us to model each open procedure as its own long-lived workflow instance (stable id = the external id of the case), encapsulating state, logic and failure recovery in a single abstraction.
This means that the state of the procedure lives in memory next to the process, ready to handle the next message, without having to fetch it from the DB every time. If the system scales to millions of procedures, Akka distributes them automatically across the cluster (sharding).
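To make this concrete, here is a minimal sketch of how a caller reaches a specific procedure, assuming the component client from the Akka SDK; ProcedureWorkflow, StartCaseRequest, caseId and payload are illustrative names, not our exact types:

// Sketch: addressing a workflow instance by its stable id (the external case id).
// componentClient, ProcedureWorkflow and StartCaseRequest are illustrative names.
var response = componentClient
    .forWorkflow(caseId)                           // stable id = external case id
    .method(ProcedureWorkflow::start)              // entry point of the chain shown below
    .invoke(new StartCaseRequest(caseId, payload));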
Speed of Development as a Competitive Advantage
Akka Platform didn't just solve the technical problem: it dramatically accelerated time-to-market. The combination of the capabilities described above
allowed us to go from concept to production in weeks, not quarters. In mission-critical systems, that difference is not minor: it's a competitive advantage.
Event Sourcing: Full Traceability
Instead of saving only “the latest state”, we use Event Sourcing. We persist every intention (Command) and every fact (Event) that happens.
This allows us not only to maintain consistency if the server restarts, but also to reconstruct the history of any procedure: “What information did the system use to recommend this resolution 3 months ago?”. We can audit the events and understand exactly what evidence the process had at the time.
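To make the idea concrete, here is a minimal sketch of what such an event-sourced entity can look like in the Akka SDK. The entity, event and state types are assumptions for illustration, not our production code:

import java.util.List;
import java.util.stream.Stream;
import akka.Done;
import akka.javasdk.annotations.ComponentId;
import akka.javasdk.eventsourcedentity.EventSourcedEntity;

// Sketch: an event-sourced audit entity (all names here are illustrative).
@ComponentId("procedure-audit")
public class ProcedureAuditEntity
    extends EventSourcedEntity<List<String>, ProcedureAuditEntity.Event> {

  public sealed interface Event {
    record EvidenceAttached(String summary) implements Event {}
  }

  @Override
  public List<String> emptyState() {
    return List.of();
  }

  // Command handler: persist the fact (Event), then reply once it is durable.
  public Effect<Done> attachEvidence(String summary) {
    return effects()
        .persist(new Event.EvidenceAttached(summary))
        .thenReply(state -> Done.getInstance());
  }

  // State is rebuilt by replaying events, which is what lets us audit any past decision.
  @Override
  public List<String> applyEvent(Event event) {
    return switch (event) {
      case Event.EvidenceAttached e ->
          Stream.concat(currentState().stream(), Stream.of(e.summary())).toList();
    };
  }
}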
To integrate AI, we avoided the chaos of autonomous agents talking to each other without control. We designed a strict Orchestrator-Workers architecture.
The main orchestrator is a single procedure workflow (think of it as the control plane). Progress is modeled as an explicit chain of workflow steps (each step is a method reference passed to transitionTo / thenTransitionTo, with an optional @StepName for logging and tooling). That's what makes control flow deterministic: the runtime always knows what step comes next. In practice, the main chain looks like this:
// Pseudocode — shape of the Akka Platform workflow
public Effect<StartResponse> start(StartCaseRequest req) {
  return effects()
      .updateState(ProcedureState.initial(req.caseId()))
      .transitionTo(ProcedureWorkflow::loadCase)
      .thenReply(StartResponse.accepted(req.caseId()));
}

@StepName("loadCase")
private StepEffect loadCase() {
  return stepEffects()
      .updateState(currentState().withDetails(caseApi.fetchCase(...)))
      .thenTransitionTo(ProcedureWorkflow::classifyProcedureKind);
}

// loadCase → classify → validate → loadRules → gatherEvidence (pause)
// child pipeline completes → onEvidenceReady → decide → publish
LoadCase: obtains the case data from the business API and materializes the workflow state.
ClassifyProcedureKind: selects the “shape” of the procedure (product + subtype) so that the subsequent rules and tools are the right ones.
Validate: executes the preconditions for this type of procedure (identifiers, required fields, supported combinations).
LoadRules: resolves the applicable version of the policy/regulatory text and attaches it to the context (often backed by versioned prompt content).
GatherEvidence: builds a staged tool plan, starts a child research workflow, sets a wait timer and calls thenPause() until asynchronous completion.
OnEvidenceReady: after the child workflow publishes the aggregated evidence, a deduplicated callback unpauses the parent so it can move on to decision-making.
Decide: runs the decision/recommendation stack (with guardrails).
Publish: writes the result back to the case registration system.
Most of the work happens in a secondary workflow dedicated to evidence. Instead of firing all the integrations at once, it walks through an explicit phased plan (ordered tool calls with conditions). This gives us dependency boundaries (“skip the credit lookup if the interested party is unknown”) and per-tool failure semantics.
Each tool runs on a bounded ThreadPoolExecutor behind a small executor-pool provider (submitted futures + per-tool timeouts), so that a blocked external call never stalls the workflow actor itself (a bulkhead between orchestration and I/O).
// Pseudocode — iterative tool runner inside the child workflow
@StepName("executeNextTool")
private StepEffect executeNextTool() {
  ResearchState st = currentState();
  Plan.Stage stage = plan.stages().get(st.stageIdx());
  ToolCall call = stage.tools().get(st.toolIdx());
  ResearchContext ctx = /* case id + procedure kind + prior outputs */;

  if (!shouldRun(call, ctx)) {
    return stepEffects()
        .updateState(st.advanceCursor())
        .thenTransitionTo(ResearchWorkflow::executeNextTool);
  }

  try {
    Map<String, Object> fragment = runToolWithRetries(
        registry, call, ctx, stage.timeoutSec(), retry.max(), retry.backoffMs());
    return stepEffects()
        .updateState(st.merge(fragment).advanceCursor())
        .thenTransitionTo(ResearchWorkflow::executeNextTool);
  } catch (Exception ex) {
    return onToolFailure(new ToolFailure(ex, call, st.stageIdx(), st.toolIdx()));
  }
}
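The runToolWithRetries helper used above is deliberately plain plumbing. A minimal sketch, assuming a ToolRegistry that resolves a ToolCall into a blocking call and a bounded pool that isolates that call from the workflow; the names and signatures are illustrative:

// Sketch: bounded executor bulkhead + per-tool timeout + retries with backoff (illustrative).
// Uses java.util.concurrent only.
private static final ExecutorService TOOL_POOL = new ThreadPoolExecutor(
    4, 4, 0L, TimeUnit.MILLISECONDS,
    new ArrayBlockingQueue<>(32),                  // bounded queue: back-pressure instead of pile-up
    new ThreadPoolExecutor.CallerRunsPolicy());

Map<String, Object> runToolWithRetries(ToolRegistry registry, ToolCall call, ResearchContext ctx,
                                       long timeoutSec, int maxRetries, long backoffMs) throws Exception {
  Exception last = null;
  for (int attempt = 0; attempt <= maxRetries; attempt++) {
    Future<Map<String, Object>> future = TOOL_POOL.submit(() -> registry.execute(call, ctx));
    try {
      // The workflow thread waits with a deadline; it never blocks on external I/O directly.
      return future.get(timeoutSec, TimeUnit.SECONDS);
    } catch (TimeoutException | ExecutionException e) {
      future.cancel(true);
      last = e;
      Thread.sleep(backoffMs * (attempt + 1));     // simple linear backoff between attempts
    }
  }
  throw last;
}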
A huge challenge was that the evidence almost never arrives as plain text. Users upload photos of invoices, PDFs, Excel spreadsheets, Word documents and XMLs. To handle this, we built a family of 5 specialized intake agents, each one optimized for its format.
For PDFs, we use Apache PDFBox to extract native text when possible, with size limits and a fallback path that rasterizes pages and delegates to the image pipeline when the PDF is actually a scan. The extraction prompts are tuned for fiscal and administrative documentation (identifiers, amounts, periods, line items). The call to the LLM goes through a thin HTTP client; the model ids live in environment-specific configuration so that we can change providers or model sizes without redeploying business logic.
// Strategy sketch
PDFExtractionResult textResult = extractTextFromPdf(pdfBytes);
String rawJson = textResult.hasEnoughText
    ? extractWithLlm(textResult)
    : renderPagesToImagesAndRunVisionPipeline(...);
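For reference, the native-text branch is close to a textbook PDFBox extraction. A sketch assuming the PDFBox 2.x API, with MAX_PAGES, MIN_CHARS and the PDFExtractionResult constructor as illustrative placeholders:

// Sketch: native text extraction with Apache PDFBox (2.x style API).
PDFExtractionResult extractTextFromPdf(byte[] pdfBytes) throws IOException {
  try (PDDocument doc = PDDocument.load(pdfBytes)) {
    PDFTextStripper stripper = new PDFTextStripper();
    stripper.setEndPage(Math.min(doc.getNumberOfPages(), MAX_PAGES));   // size limit
    String text = stripper.getText(doc);
    // Heuristic: a scanned PDF yields almost no extractable text, so we fall back to vision.
    boolean hasEnoughText = text.strip().length() >= MIN_CHARS;
    return new PDFExtractionResult(text, hasEnoughText, doc.getNumberOfPages());
  }
}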
For photos of invoices, receipts and scans, we use a multimodal vision model (selected via config). The pipeline sends the bytes of the image to the inference endpoint and expects strict JSON back: readable text, quality indicators, orientation, and structured fields (totals, tax lines, dates, rows).
// Pseudocode — same HTTP stack as the text routes, multimodal payload
String rawJson = llmClient.completeVisionJson(modelId, OCR_PROMPT, base64Image);
.xlsx files are processed with Apache POI. We convert spreadsheets into Markdown tables, preserving the spatial relationships (headers, totals, dates). This lets the LLM understand the structure just as a human would, but as structured text; the same configurable LLM stack then interprets the table.
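A minimal sketch of that conversion with Apache POI; the real agent also handles merged cells, multiple sheets and formatting hints, which are omitted here:

// Sketch: flattening an .xlsx sheet into a Markdown table with Apache POI.
String sheetToMarkdown(byte[] xlsxBytes) throws IOException {
  try (Workbook wb = WorkbookFactory.create(new ByteArrayInputStream(xlsxBytes))) {
    Sheet sheet = wb.getSheetAt(0);
    DataFormatter fmt = new DataFormatter();        // renders cells the way Excel displays them
    StringBuilder md = new StringBuilder();
    for (Row row : sheet) {
      md.append("|");
      for (Cell cell : row) {
        md.append(" ").append(fmt.formatCellValue(cell)).append(" |");
      }
      md.append("\n");
      if (row.getRowNum() == 0) {                   // header separator after the first row
        md.append("|---".repeat(row.getLastCellNum())).append("|\n");
      }
    }
    return md.toString();
  }
}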
For .doc and .docx files, we also use Apache POI to extract paragraphs, detect headings and preserve tables. The agent identifies the type of document (report, letter, contract) and extracts key data such as dates, parties, tax identifiers and legal references.
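The .docx branch follows the same spirit. A sketch using POI's XWPF classes, where heading detection by paragraph style is a simplification of what the agent actually does:

// Sketch: paragraph and heading extraction from a .docx with Apache POI (XWPF).
List<String> extractDocxBlocks(byte[] docxBytes) throws IOException {
  try (XWPFDocument doc = new XWPFDocument(new ByteArrayInputStream(docxBytes))) {
    List<String> blocks = new ArrayList<>();
    for (XWPFParagraph p : doc.getParagraphs()) {
      String style = p.getStyle();                  // e.g. "Heading1" in many templates
      String prefix = style != null && style.startsWith("Heading") ? "# " : "";
      if (!p.getText().isBlank()) {
        blocks.add(prefix + p.getText());
      }
    }
    return blocks;
  }
}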
For XMLs, we implemented a pragmatic strategy: small files (<50KB) are sent in full to the LLM, while large XMLs are parsed structurally to extract only the hierarchy and the key fields. The agent is specialized in electronic invoices, point-of-sale data, voucher types and schema validation.
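The routing itself is trivial; the value is in the structural summary for large files. A sketch, with summarizeXmlStructure standing in for the real parser:

// Sketch: size-based routing for XML evidence (50KB threshold as described above).
String prepareXmlForLlm(byte[] xmlBytes) {
  if (xmlBytes.length < 50 * 1024) {
    return new String(xmlBytes, StandardCharsets.UTF_8);   // small enough: send verbatim
  }
  // Large XML: keep only the element hierarchy and key fields (issuer, totals, voucher type, ...)
  return summarizeXmlStructure(xmlBytes);
}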
Before we spend compute on extracting data, we have a “Doorman”. We implemented a universal validator that answers questions like “Is this a valid tax document or a selfie of the user?”. If the upload is not relevant, it is rejected, saving processing and cleanup time. It works for both PDFs and images. A small document gate applies multiple layers before trusting a parse: an explicit in-scope flag from the model (“is it in scope?”), type/subtype blocklists for artifacts that are obviously out of scope, and a final heuristic check for real business signals (identifiers, amounts, dates, structured tables) in the returned JSON.
// Pseudocode — the same idea for every intake channel
public ValidationResult validate(String jsonResponse) {
  JsonNode root = objectMapper.readTree(jsonResponse);

  if (root.has("document_in_scope")) {
    if (!root.get("document_in_scope").asBoolean()) {
      return rejected(root.path("rejection_reason").asText("Out of scope"));
    }
    return accepted();
  }

  if (root.has("error") && root.get("error").asBoolean()) {
    return accepted(); // extraction failure, not a business rejection
  }

  String docType = root.path("document_type").asText("").toLowerCase();
  if (isObviousNonWorkType(docType)) {
    return rejected("Unsupported document_type: " + docType);
  }

  if (!hasAnyBusinessSignals(root)) {
    return rejected("No business signals detected");
  }

  return accepted();
}
We chose Ollama from day one with a clear vision: the ability to run powerful models on our own infrastructure.
Currently, due to customer infrastructure requirements, we consume these models via HTTP against an external, proprietary Ollama server (“Ollama cloud”). However, the agents' design remains strictly stateless to ease the inevitable shift to local models in the future.
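Because the agents are stateless, a model call is just an HTTP round-trip. A minimal sketch of that thin client against Ollama's /api/chat endpoint; OLLAMA_BASE_URL, objectMapper and httpClient are assumed to come from configuration and wiring:

// Sketch: stateless call to an Ollama server over HTTP (model id injected from config).
String completeJson(String modelId, String systemPrompt, String userContent) throws Exception {
  String body = objectMapper.writeValueAsString(Map.of(
      "model", modelId,
      "stream", false,
      "format", "json",                             // ask Ollama for strict JSON output
      "messages", List.of(
          Map.of("role", "system", "content", systemPrompt),
          Map.of("role", "user", "content", userContent))));

  HttpRequest request = HttpRequest.newBuilder()
      .uri(URI.create(OLLAMA_BASE_URL + "/api/chat"))
      .header("Content-Type", "application/json")
      .POST(HttpRequest.BodyPublishers.ofString(body))
      .build();

  HttpResponse<String> response = httpClient.send(request, HttpResponse.BodyHandlers.ofString());
  return objectMapper.readTree(response.body()).path("message").path("content").asText();
}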
Our “North Star” 🌟
Although we currently operate via HTTP, the ultimate goal is to move towards proprietary servers with local models under our full control.
This will allow us to guarantee absolute data sovereignty (nothing comes out to the outside world) and eliminate dependence on external providers, ensuring that system intelligence is its own asset.
Here's a detail that was easy to implement thanks to Akka Platform and turned out to be worth its weight in gold: prompts are not static configuration files.
Each prompt is literally a persistent entity (PromptTemplate).
This allows us to do hot-reloading in production. If we detect that the agent is “hallucinating” with a new version, we don't redeploy the backend. We simply do a PUT against the prompt entity. The entity updates its state and the next recommendation generated (milliseconds later) is already using the new version. This is Zero-Downtime Config.
// 1. Resolve the candidate prompt keys (procedure type + variant + version)
List<String> candidates = resolvePromptKeyCandidates(procedureKind, variant);

// 2. Ask the persisted prompt entity for the latest text (just-in-time)
String template = client.forEventSourcedEntity(candidates.get(0))
    .method(PromptTemplate::get)
    .invoke();
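The write side is symmetric: a small HTTP endpoint forwards the new text to the same persisted entity. A sketch where the endpoint path and the PromptTemplate::update method name are assumptions for illustration:

import akka.javasdk.annotations.http.HttpEndpoint;
import akka.javasdk.annotations.http.Put;
import akka.javasdk.client.ComponentClient;

// Sketch: the PUT that hot-reloads a prompt without redeploying (illustrative names).
@HttpEndpoint("/prompts")
public class PromptAdminEndpoint {

  private final ComponentClient client;

  public PromptAdminEndpoint(ComponentClient client) {
    this.client = client;
  }

  @Put("/{promptKey}")
  public String update(String promptKey, String newTemplateText) {
    // The entity persists the new version; the next read picks it up milliseconds later.
    return client.forEventSourcedEntity(promptKey)
        .method(PromptTemplate::update)
        .invoke(newTemplateText);
  }
}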
It is Infrastructure as Code brought to the level of AI content.
Transparency in AI is often an illusion, but here we force it architecturally. Before issuing a recommendation, we apply Output Guardrails.
We validate against an allowlist of resolution codes permitted by the regulations. If the agent tries to propose something invalid (due to an LLM hallucination), the system intercepts it at the process's validation layer and forces a manual review, preventing the error from ever reaching the end user. 🛑
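The guardrail itself is deliberately simple: a deterministic check that sits outside the model's control. A sketch of the idea, with ALLOWED_RESOLUTION_CODES, Recommendation and GuardrailResult as illustrative names:

// Sketch: output guardrail, the LLM's proposal must match a regulatory allowlist.
private static final Set<String> ALLOWED_RESOLUTION_CODES =
    Set.of("APPROVE_REFUND", "PARTIAL_REFUND", "REJECT", "REQUEST_MORE_EVIDENCE");

GuardrailResult checkRecommendation(Recommendation rec) {
  if (!ALLOWED_RESOLUTION_CODES.contains(rec.resolutionCode())) {
    // Hallucinated or unsupported code: never auto-publish, route to a human analyst instead.
    return GuardrailResult.manualReview("Resolution code not in allowlist: " + rec.resolutionCode());
  }
  return GuardrailResult.ok();
}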
Resilience is also configured at the workflow level: every step gets a default timeout and a recovery policy, so a failing step is retried and, if it keeps failing, handed to an interrupt step instead of crashing the procedure.

@Override
public WorkflowSettings settings() {
  return WorkflowSettings.builder()
      .defaultStepTimeout(ofSeconds(300))
      // If a step fails, retry it 3 times before giving up
      .defaultStepRecovery(maxRetries(3).failoverTo(ProcedureWorkflow::interruptStep))
      .build();
}
This graceful-degradation approach is what keeps the cluster stable even when external services fail.
Eliminating The Labyrinth of Context requires solid engineering behind the magic of AI.
The solution isn't to take people out of the equation. AI is a powerful tool, but it needs to be managed, controlled and validated. Our architecture puts the technology in its place: eliminating repetitive work so that expert analysts can focus on what really matters: solving complex cases.