May 7, 2026

Agentic AI to modernize application business logic, Part IV: Research AI — Optimizing Procedure Resolution

In Part I and Part II, we described a proof of concept that uses AI multi-agents to automate the management of a type of service procedure for citizens of a public-sector body.

In Part III, we described software architecture solutions that apply to intelligent agents and agentic systems, especially when the time comes to take the next step and bring them into production environments.

In this post, we want to document how we are addressing the problem of “The Labyrinth of Context” in high-load systems. Using Akka Platform and an agentic approach, we converted manual investigations that took hours into highly efficient automated processes, reducing resolution time by more than 90% and achieving unprecedented scalability. 🚀

When a user initiates a process requesting a service or a refund, they expect it to be resolved “now”. But the reality behind the scenes is very different.

The human analyst must manually navigate legacy mainframes, systems burdened with technical debt, and disconnected platforms to gather evidence about the user, the procedure, and its status. We call this context-switching cost the Labyrinth of Context.

Eliminating the Labyrinth of Context is not just a matter of putting a chatbot on top of an API; it requires orchestrating complex workflows with traceability that lets us know exactly what happened at each step.

Why Akka Platform? 🏗️

The choice of Akka Platform for this kind of product was deliberate, and not only for technical reasons: it was a strategic decision in terms of development speed.

In less than one month, we built a production-ready system with complex orchestration, full traceability and distributed resilience. This wasn't magic: it was the direct result of a mature ecosystem, a coherent architecture, and tools designed for these kinds of problems from day one.

In a world of traditional (stateless) microservices, managing long and complex flows quickly becomes a mess of shared databases, polling, manual compensations, and duplicate logic.

Akka allowed us to model each open procedure as its own long-lived workflow instance (stable id = the case's external id), encapsulating state, logic and failure recovery in a single abstraction.

This means the procedure's state lives in memory next to the process, ready for the next message, without a database lookup every time. If the system scales to millions of procedures, Akka distributes them across the cluster automatically (sharding).
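Conceptually, the routing works like a stable hash of the entity id; the following plain-Java sketch (hypothetical names, not the Akka Platform API) shows why the same case always lands on the same shard and therefore finds its in-memory state without a lookup:

```java
public class ShardRouter {
    private final int numberOfShards;

    public ShardRouter(int numberOfShards) {
        this.numberOfShards = numberOfShards;
    }

    // Deterministic: the same case id always maps to the same shard,
    // so its in-memory state is found without hitting the DB.
    public int shardFor(String caseId) {
        return Math.floorMod(caseId.hashCode(), numberOfShards);
    }
}
```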


Speed of Development as a Competitive Advantage

Akka Platform didn't just solve the technical problem: it dramatically accelerated time-to-market. The combination of:
  • Coherent mental model (Actor + Workflow + Event Sourcing)
  • “Batteries-included” distributed infrastructure
  • Clear semantics for failures, retries and recovery
allowed us to go from concept to production in weeks, not quarters. In mission-critical systems, that difference is not a detail: it's a competitive advantage.

Event Sourcing: Full Traceability

Unlike saving only “the last state”, we use Event Sourcing: we persist every intention (Command) and every fact (Event) that occurs.
This not only keeps the system consistent across server restarts, but also lets us reconstruct the history of any procedure: “What information did the system use to recommend this resolution 3 months ago?” We can audit the events and know exactly what evidence the process had at the time.
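The idea can be shown as a minimal fold over an event log; this is an illustrative sketch with invented event types, not our actual schema:

```java
import java.util.List;

public class ProcedureHistory {
    public record Event(String type, String detail) {}

    public record State(String status, List<String> evidence) {
        static State initial() { return new State("OPEN", List.of()); }

        // Pure fold: current state is derived from the event log,
        // so any past state can be reconstructed for auditing.
        State apply(Event e) {
            return switch (e.type()) {
                case "EvidenceGathered" -> new State(status, concat(evidence, e.detail()));
                case "Resolved" -> new State("RESOLVED:" + e.detail(), evidence);
                default -> this;
            };
        }

        private static List<String> concat(List<String> xs, String x) {
            var copy = new java.util.ArrayList<>(xs);
            copy.add(x);
            return List.copyOf(copy);
        }
    }

    public static State replay(List<Event> log) {
        State s = State.initial();
        for (Event e : log) s = s.apply(e);
        return s;
    }
}
```

Replaying the log up to any point in time answers the audit question above.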

Deep Dive: Orchestrator-Workers and Flows 🎼

To integrate AI, we avoided the chaos of autonomous agents talking to each other without control, and designed a strict Orchestrator-Workers architecture.

1. ProcedureWorkflow: The Deterministic Brain

The main orchestrator is a single procedure workflow (think of it as the control plane). Progress is modeled as an explicit chain of Workflow steps (each step is a method reference passed to transitionTo / thenTransitionTo, with an optional @StepName for logging and tooling). That is what makes control flow deterministic: the runtime always knows which step comes next. In practice, the main chain looks like this:

// Pseudocode: shape of the Akka Platform workflow
public Effect<StartResponse> start(StartCaseRequest req) {
    return effects()
            .updateState(ProcedureState.initial(req.caseId()))
            .transitionTo(ProcedureWorkflow::loadCase)
            .thenReply(StartResponse.accepted(req.caseId()));
}

@StepName("loadCase")
private StepEffect loadCase() {
    return stepEffects()
            .updateState(currentState().withDetails(caseApi.fetchCase(...)))
            .thenTransitionTo(ProcedureWorkflow::classifyProcedureKind);
}

// loadCase → classify → validate → loadRules → gatherEvidence (pause)
// child pipeline completes → onEvidenceReady → decide → publish

  • loadCase step: fetches the case data from the business API and materializes the workflow state.
  • classifyProcedureKind step: selects the “shape” of the procedure (product + subtype) so that the subsequent rules and tools are the right ones.
  • validate step: runs the preconditions for this procedure type (identifiers, required fields, supported combinations).
  • loadRules step: resolves the applicable version of the policy/regulatory text and attaches it to the context (often backed by versioned prompt content).
  • gatherEvidence step: builds a staged tool plan, starts a child research workflow, sets a wait timer and calls thenPause() until asynchronous completion.
  • onEvidenceReady command: after the child workflow publishes the aggregated evidence, a deduplicated callback unpauses the parent into decision-making.
  • decide step: runs the decision/recommendation stack (with guardrails).
  • publish step: writes the result back to the case registration system.
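The determinism of this chain can be pictured as a plain transition table; a hypothetical sketch in ordinary Java, not the platform's step API:

```java
public enum ProcedureStep {
    LOAD_CASE, CLASSIFY_PROCEDURE_KIND, VALIDATE, LOAD_RULES,
    GATHER_EVIDENCE, DECIDE, PUBLISH, DONE;

    // Control flow is data, not chance: the next step is always known.
    public ProcedureStep next() {
        return switch (this) {
            case LOAD_CASE -> CLASSIFY_PROCEDURE_KIND;
            case CLASSIFY_PROCEDURE_KIND -> VALIDATE;
            case VALIDATE -> LOAD_RULES;
            case LOAD_RULES -> GATHER_EVIDENCE;
            case GATHER_EVIDENCE -> DECIDE; // resumed by onEvidenceReady
            case DECIDE -> PUBLISH;
            case PUBLISH, DONE -> DONE;
        };
    }
}
```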

2. Research Pipeline: Resilient Execution

Most of the work happens in a secondary workflow dedicated to evidence. Instead of firing all the integrations at once, it walks through an explicit phased plan (ordered tool calls with conditions). This allows for dependency gates (“skip the credit search if the interested party is unknown”) and per-tool failure semantics.

Each tool runs on a bounded ThreadPoolExecutor behind a small executor provider (submitted futures + per-tool timeouts), so that a blocked external call cannot paralyze the workflow actor itself (a bulkhead between orchestration and I/O).
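A self-contained sketch of that bulkhead pattern (names, pool sizes and fallbacks are illustrative):

```java
import java.util.concurrent.*;

public class ToolBulkhead {
    // Bounded pool: a stuck external call can exhaust at most these threads,
    // never the workflow actor's own thread.
    private final ExecutorService pool;

    public ToolBulkhead(int maxThreads) {
        this.pool = new ThreadPoolExecutor(maxThreads, maxThreads,
                0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>(100));
    }

    public <T> T run(Callable<T> tool, long timeoutMs, T fallback) {
        Future<T> f = pool.submit(tool);
        try {
            return f.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            f.cancel(true); // free the thread; degrade instead of blocking
            return fallback;
        } catch (Exception e) {
            return fallback;
        }
    }

    public void shutdown() { pool.shutdownNow(); }
}
```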

// Pseudocode: iterative tool runner inside the child workflow
@StepName("executeNextTool")
private StepEffect executeNextTool() {
    ResearchState st = currentState();
    Plan.Stage stage = plan.stages().get(st.stageIdx());
    ToolCall call = stage.tools().get(st.toolIdx());
    ResearchContext ctx = /* case id + procedure kind + prior outputs */;

    if (!shouldRun(call, ctx)) {
        return stepEffects()
                .updateState(st.advanceCursor())
                .thenTransitionTo(ResearchWorkflow::executeNextTool);
    }

    try {
        Map<String, Object> fragment = runToolWithRetries(
                registry, call, ctx, stage.timeoutSec(), retry.max(), retry.backoffMs());
        return stepEffects()
                .updateState(st.merge(fragment).advanceCursor())
                .thenTransitionTo(ResearchWorkflow::executeNextTool);
    } catch (Exception ex) {
        return onToolFailure(new ToolFailure(ex, call, st.stageIdx(), st.toolIdx()));
    }
}
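A runToolWithRetries helper of the kind referenced above might look like this (an assumed signature, with linear backoff for simplicity):

```java
public class Retries {
    @FunctionalInterface
    public interface Attempt<T> { T call() throws Exception; }

    // Retry up to maxRetries times with linear backoff; the last failure
    // is rethrown so the workflow's recovery policy can take over.
    public static <T> T withRetries(Attempt<T> attempt, int maxRetries, long backoffMs) {
        Exception last = null;
        for (int i = 0; i <= maxRetries; i++) {
            try {
                return attempt.call();
            } catch (Exception e) {
                last = e;
                if (i < maxRetries) sleepQuietly(backoffMs * (i + 1));
            }
        }
        throw new RuntimeException("tool failed after retries", last);
    }

    private static void sleepQuietly(long ms) {
        try { Thread.sleep(ms); }
        catch (InterruptedException e) { Thread.currentThread().interrupt(); }
    }
}
```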

Eyes for the Agent: Multimodal Document Intake 👁️

A huge challenge was that the evidence almost never arrives as plain text. Users upload photos of invoices, PDFs, Excel spreadsheets, Word documents and XMLs. To handle this, we built a family of 5 specialized intake agents, each one optimized for its format.

PDF ingestion: first text, view when needed 📄

For PDFs, we use Apache PDFBox to extract native text when possible, with size limits and a fallback path that rasterizes pages and delegates to the image pipeline when the PDF is actually a scan. The extraction prompts are tuned for fiscal and administrative documentation (identifiers, amounts, periods, line items). The LLM call goes through a thin HTTP client; model ids live in environment-specific configuration so we can change providers or model sizes without redeploying business logic.

// Strategy sketch
PDFExtractionResult textResult = extractTextFromPdf(pdfBytes);
String rawJson = textResult.hasEnoughText
    ? extractWithLlm(textResult)
    : renderPagesToImagesAndRunVisionPipeline(...);
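The hasEnoughText decision can be a simple heuristic; a sketch with assumed thresholds (the real cutoffs are tuned against the document corpus):

```java
public class PdfTextGate {
    // A scan usually yields almost no native text; a digital PDF yields plenty.
    // The thresholds below are illustrative, not production values.
    public static boolean hasEnoughText(String extracted, int pageCount) {
        if (extracted == null || pageCount <= 0) return false;
        String trimmed = extracted.strip();
        if (trimmed.isEmpty()) return false;
        int perPage = trimmed.length() / pageCount;
        long letters = trimmed.chars().filter(Character::isLetterOrDigit).count();
        double density = (double) letters / trimmed.length();
        return perPage >= 200 && density >= 0.5;
    }
}
```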

Image ingestion: Multimodal OCR 🖼️

For photos of invoices, receipts and scans, we use a multimodal vision model (selected via config). The pipeline sends the image bytes to the inference endpoint and expects strict JSON back: readable text, quality indicators, orientation, and structured fields (totals, tax lines, dates, rows).

// Pseudocode: same HTTP stack as the text routes, multimodal payload
String rawJson = llmClient.completeVisionJson(modelId, OCR_PROMPT, base64Image);

Ingesting spreadsheets 📊

.xlsx files are processed with Apache POI. We convert spreadsheets into Markdown tables, preserving spatial relationships (headers, totals, dates). This lets the LLM understand the structure just as a human would, but as structured text; the same configurable LLM stack then interprets the table.
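Once POI has produced a grid of cell values, the rendering step is straightforward; a minimal sketch (the input grid type is an assumption):

```java
import java.util.List;
import java.util.stream.Collectors;

public class SheetToMarkdown {
    // Render a grid of cells as a Markdown table, first row as header,
    // so the LLM sees the same row/column structure a human would.
    public static String toMarkdown(List<List<String>> rows) {
        if (rows.isEmpty()) return "";
        StringBuilder sb = new StringBuilder();
        sb.append(row(rows.get(0)));
        sb.append("|").append(" --- |".repeat(rows.get(0).size())).append("\n");
        for (int i = 1; i < rows.size(); i++) sb.append(row(rows.get(i)));
        return sb.toString();
    }

    private static String row(List<String> cells) {
        return cells.stream().collect(Collectors.joining(" | ", "| ", " |\n"));
    }
}
```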

Word documents 📝

For .doc and .docx files, we use Apache POI to extract paragraphs, detect headings and preserve tables. The agent identifies the document type (report, letter, contract) and extracts key data such as dates, parties, tax identifiers and legal references.

XML and structured payloads 🏷️

For XMLs, we implemented a smart strategy: small files (<50KB) are sent whole to the LLM, while large XMLs are structurally parsed to extract only the hierarchy and key fields. The parser specializes in electronic invoices: point of sale, voucher type and schema validation.
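The size cutoff reduces to a small dispatcher; in this sketch the “structural parse” merely collects element names as an outline, where the real one extracts the invoice hierarchy and key fields:

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class XmlIntake {
    static final int FULL_SEND_LIMIT = 50 * 1024; // <50KB goes to the LLM whole

    public static String prepareForLlm(String xml) {
        if (xml.getBytes(StandardCharsets.UTF_8).length < FULL_SEND_LIMIT) {
            return xml; // small enough: let the model see everything
        }
        // Large payload: keep only the element names as a cheap outline.
        Set<String> tags = new LinkedHashSet<>();
        Matcher m = Pattern.compile("<([A-Za-z][\\w.-]*)").matcher(xml);
        while (m.find()) tags.add(m.group(1));
        return "XML outline: " + String.join(" > ", tags);
    }
}
```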

Tax Validator Guardrail 🛡️

Before spending computation on extracting data, we have a “Doorman”. We implemented a universal validator that answers, for example: “Is this a valid tax document, or a selfie of the user?” If the document is not relevant, it is rejected, saving processing and cleanup time. It works for both PDFs and images. This small document gate applies multiple layers before trusting a parse: an explicit flag from the model (“is it in scope?”), type/subtype blocklists for artifacts that are obviously out of scope, and a final heuristic pass that looks for real business signals (identifiers, amounts, dates, structured tables) in the JSON.

// Pseudocode: same idea for every intake channel
public ValidationResult validate(String jsonResponse) {
    JsonNode root = objectMapper.readTree(jsonResponse);

    if (root.has("document_in_scope")) {
        if (!root.get("document_in_scope").asBoolean()) {
            return rejected(root.path("rejection_reason").asText("Out of scope"));
        }
        return accepted();
    }

    if (root.has("error") && root.get("error").asBoolean()) {
        return accepted(); // extraction failure, not a business rejection
    }

    String docType = root.path("document_type").asText("").toLowerCase();
    if (isObviousNonWorkType(docType)) {
        return rejected("Unsupported document_type: " + docType);
    }

    if (!hasAnyBusinessSignals(root)) {
        return rejected("No business signals detected");
    }

    return accepted();
}
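The final heuristic layer, hasAnyBusinessSignals, can be as simple as a few pattern checks, here simplified to operate on flattened text rather than the JSON tree, and with an illustrative (not exhaustive) signal set:

```java
import java.util.regex.Pattern;

public class BusinessSignals {
    private static final Pattern MONEY  = Pattern.compile("\\d+[.,]\\d{2}");
    private static final Pattern DATE   = Pattern.compile("\\d{2}[/-]\\d{2}[/-]\\d{4}");
    private static final Pattern TAX_ID = Pattern.compile("\\b\\d{11}\\b"); // e.g. an 11-digit fiscal id

    // Accept a parse only if the text carries at least one concrete business signal.
    public static boolean hasAny(String text) {
        if (text == null || text.isBlank()) return false;
        return MONEY.matcher(text).find()
                || DATE.matcher(text).find()
                || TAX_ID.matcher(text).find();
    }
}
```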

The Cognitive Layer: Why Ollama? 🦙

We chose Ollama from day one with a clear vision: the ability to run powerful models on our own infrastructure.

Currently, due to customer infrastructure requirements, we consume these models over HTTP against an external hosted Ollama server (“Ollama cloud”). However, the agent design remains strictly stateless to ease the inevitable shift to local models in the future.

Our “North Star” 🌟
Although we currently operate via HTTP, the ultimate goal is to move towards proprietary servers with local models under our full control.
This will allow us to guarantee absolute data sovereignty (nothing comes out to the outside world) and eliminate dependence on external providers, ensuring that system intelligence is its own asset.

Prompt Hot-Loading: Tuning the “Brain” on the Fly 🧠

Here's a detail that was easy to implement thanks to Akka Platform and was worth its weight in gold: Prompts are not static configuration files.

Each prompt is literally a persisted entity (@PromptTemplate).

This allows us to do hot-reloading in production. If we detect that the agent is “hallucinating” with a new version, we don't redeploy the backend. We simply do a PUT against the prompt entity. The entity updates its state, and the next recommendation generated (milliseconds later) is already using the new version. This is zero-downtime config.

// 1. Resolve candidate prompt keys (procedure kind + variant + version)
List<String> candidates = resolvePromptKeyCandidates(procedureKind, variant);

// 2. Ask the persisted prompt entity for the latest text (just-in-time)
String template = client.forEventSourcedEntity(candidates.get(0))
        .method(PromptTemplate::get)
        .invoke();

It is Infrastructure as Code brought to the level of AI content.
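resolvePromptKeyCandidates follows an ordered fallback, most specific first; a sketch (the key naming convention here is an assumption):

```java
import java.util.ArrayList;
import java.util.List;

public class PromptKeys {
    // Most specific key first; the caller takes the first entity that exists.
    public static List<String> resolvePromptKeyCandidates(String procedureKind,
                                                          String variant) {
        List<String> keys = new ArrayList<>();
        if (variant != null && !variant.isBlank()) {
            keys.add(procedureKind + ":" + variant);
        }
        keys.add(procedureKind);
        keys.add("default");
        return keys;
    }
}
```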

Auditability and Guardrails 👮‍♂️

Transparency in AI is often an illusion, but here we force it architecturally. Before issuing a recommendation, we apply Output Guardrails.

We validate against an allowlist of resolution codes permitted by regulation. If the agent tries to propose something invalid (an LLM hallucination), the system intercepts it at the process's validation layer and forces a manual review, preventing the error from ever reaching the end user. 🛑
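The guardrail itself reduces to a strict membership check with a safe fallback; a sketch with invented resolution codes:

```java
import java.util.Set;

public class ResolutionGuardrail {
    public record Outcome(String code, boolean needsManualReview) {}

    // Codes permitted by regulation; anything else is treated as a hallucination.
    private static final Set<String> ALLOWED =
            Set.of("APPROVE", "REJECT", "REQUEST_MORE_INFO");

    public static Outcome check(String proposedCode) {
        if (proposedCode != null && ALLOWED.contains(proposedCode)) {
            return new Outcome(proposedCode, false);
        }
        // Invalid proposal: it never reaches the citizen, a human reviews it instead.
        return new Outcome("MANUAL_REVIEW", true);
    }
}
```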

Resilience 💥

  • Per-step recovery: we configure maxRetries(3) policies with backoff for each step. If an external system's API fails temporarily, the workflow retries just that step without losing context.
  • Research wait timer: while the parent workflow is paused in a WAITING_FOR_RESEARCH lifecycle state, a single timer guards the handoff to the child pipeline. If it fires, the parent records a timeout in the aggregate but can still move forward to decide, so operators receive a recommendation based on whatever evidence arrived (possibly none). Separately, optional tools in the child workflow are downgraded to warnings instead of failing the entire research run.

@Override
public WorkflowSettings settings() {
    return WorkflowSettings.builder()
            .defaultStepTimeout(ofSeconds(300))
            // If a step fails, retry it 3 times before giving up
            .defaultStepRecovery(maxRetries(3).failoverTo(ProcedureWorkflow::interruptStep))
            .build();
}

This “elegant degradation” approach is what maintains cluster stability even when external services fail.

Conclusions 🤹🏼

Eliminating The Labyrinth of Context requires solid engineering behind the magic of AI.

  • Akka Platform provides the semantics needed to reliably handle distributed state, and it also allowed us to build a complete system in weeks.
  • Event Sourcing transforms the database from a simple store into an auditable and reproducible source of truth.
  • Separating the flow into orchestrators (control) and workers (execution) lets us scale the heavy parts (I/O, LLM calls) independently of the core business logic.

The solution isn't to take people out of the equation. AI is a powerful tool, but it needs to be managed, controlled and validated. Our architecture puts technology in its place: eliminating repetitive work so that expert analysts can focus on what they do best: solving complex cases.

© 2026 Peperina.io — Excellence in Engineering.

Francisco Perrotta
Software Engineer and Backend Architect at Peperina Software with experience in distributed systems