Where the human belongs: AI-augmented cyber targeting and legal judgement

This is post #18 in my IHL series. It picks up where my previous post on the targeting cycle in cyber operations left off.

Estimated reading time: 26 minutes

I. The question post #17 left open

In my previous post, I argued that the targeting cycle and the reasonable-commander standard hold under cyber-domain conditions, and that what strains is the foreseeability evidence the targeting staff — the multidisciplinary staff element, drawn from intelligence, operations, fire support, plans and legal staff, that develops, validates and recommends targets to the commander within the joint targeting cycle (hereinafter the “targeting staff”) — has to put on the table at Phases 2 and 3. That post closed by flagging an unaddressed question: how does AI-augmented cyber targeting change the picture? Specifically, where in the targeting cycle is a human in the loop needed, where is one not, and is “human in the loop” even the right concept once AI decision support systems (AI DSS) sit between the commander and the foreseeability evidence?

This post takes that question up directly. It does so against the background of two earlier posts in this series: the data-as-a-military-objective post on the definitional question targeting decisions are downstream of, and the AWS Part 3 post on lifecycle distribution of human control over autonomous weapons. AWS Part 3 covered the conceptual ground on “meaningful human control”, the GGE rolling text shift to “context-appropriate human judgement and control”, and why a real-time human-in-the-loop confirmation requirement does not work as a universal standard for autonomous weapons. I will not re-state that analysis here. I will treat its conclusions as the premise the present post builds on, and I will cross-reference where the analytical ground genuinely overlaps.

The cyber AI-DSS question is a different question, and it deserves its own treatment. AWS are the systems that find and engage a target the human commander has already selected through the targeting cycle. AI DSS do something else: they structure the analytical work the targeting staff does at every stage of that cycle. That structuring runs before the commander decides — by surfacing options, prioritising alternatives and assembling the foreseeability evidence base. It runs around the decision — by validating parameters and flagging where evidence is thin. And it runs after the decision — by tracking actual effects against predicted ones at Phase 6 assessment, and by structuring the record of what the system surfaced and what the commander acted on, which is the record the reasonable-commander standard’s ex ante test relies on when it is later reconstructed. Operational AI DSS that perform such assessment functions are increasingly fielded by NATO and U.S. forces, though cyber-specific implementations vary in maturity by function, a point Section V returns to. There is nothing in the AI-DSS output that the human needs to “override” because the human is making the decision throughout. The question is what the human decides on the basis of, how the AI DSS shapes the evidence and the option space, and where in the targeting cycle the human’s judgement does its real work.

In AI-augmented cyber targeting, the place where human judgement is exercised does not concentrate at the moment of execution. It distributes across the targeting cycle — into planning-stage governance, parameter-setting for the AI DSS, validation of AI-DSS outputs at Phase 2 target development and Phase 3 capabilities analysis, and operational monitoring across Phases 4 and 5. This is not a retreat from human judgement. It is the relocation of human judgement to the points in the cycle where it can actually do the work the reasonable-commander standard requires. In my view, the cyber-DSS context is where this relocation matters most and is least understood in current practitioner discourse — and that is what makes the question worth a treatment of its own.

This post provides that treatment as follows. Section II marks where AWS Part 3’s analysis stops and the cyber-DSS question begins. Section III maps the four nested control loops the December 2025 NATO TR-HFM-330 report identified for AWS onto the cyber targeting cycle, and shows where the mapping holds and where it breaks. Section IV surveys what NATO, the United States, France and Germany have said about AI in military decision-making, filtered for what they say about decision support and the targeting cycle specifically. Section V works through the six phases of the NATO targeting cycle and asks, at each phase, where human judgement anchors and what the AI DSS contributes. Section VI engages the critics — Dorsey, Bo, Geairon, the SIPRI bias report — on the strongest version of their argument before providing a considered response. Section VII describes what the targeting staff, equipped with AI DSS and operating under the architectural reframing, actually looks like in practice across the perspectives of three audiences. The Conclusion ties the reframing back to the reasonable-commander standard and the ex ante test that gives it legal weight, and flags where this series goes next.

II. From AWS lifecycle to the cyber targeting cycle — what changes

The conceptual ground on human control over AI in the military domain has shifted in the last twelve months, and AWS Part 3 covered the substance of that shift. Three points from that post matter here. I will recap them below.

First, the GGE rolling text of 12 May 2025 replaced “meaningful human control” with “context-appropriate human judgement and control” (§ II.5). The shift acknowledges that what counts as adequate human control depends on the system, the operational environment, the type of target, and the phase of the engagement cycle. Second, the 2023 update to U.S. DoD Directive 3000.09 requires designers to ensure that autonomous and semi-autonomous weapons allow commanders and operators to exercise “appropriate levels of human judgment over the use of force” (§ 1.2.a) — language that converges with the GGE formulation on context-sensitivity and lifecycle distribution. Third, NATO’s December 2025 TR-HFM-330 report on Human Systems Integration for Meaningful Human Control treats human control as a property that emerges through interactions across the full lifecycle of a weapon system, not as a binary present-or-absent variable at the moment of engagement (TR-HFM-330 RTG, at PDF pp. 39, 40 and 54, 55).

For AI-augmented cyber targeting, that conceptual ground travels only partway. Two of the three findings transpose directly: the context-sensitivity move and the rejection of the binary present-or-absent variable both apply with full force to the cyber-DSS context. The third — the lifecycle architecture — does not transpose cleanly, because the cyber AI-DSS question is not primarily a weapon-system lifecycle question. It is a targeting-cycle question. AWS Part 3 analyses how human control distributes across the design, acquisition, training, deployment and disposal arc of an autonomous weapon system. The present post analyses how human judgement distributes across the planning, execution and assessment arc of an operation — a different institutional architecture, governed by different doctrinal sources (AJP-3.9(B) and AJP-3.20 rather than weapon-system-development policy), and asking different questions of the human in each phase.

The four nested control loops the NATO TR-HFM-330 report identifies for AWS (operator, mission, design, governance) give us a useful starting framework for that cyber-targeting-cycle analysis, but the mapping is non-trivial. Section III of this post works through the mapping. The remainder of this post then asks, phase by phase, what the cyber targeting cycle requires of human judgement when AI DSS sit between the commander and the foreseeability evidence.

III. The four control loops, mapped onto the cyber targeting cycle

The NATO TR-HFM-330 report is the most carefully developed practitioner treatment of human control over AI-based military systems published to date. Its Socio-Technical Feedback Loop model (SOTEF) identifies four nested loops through which human control is exercised: an operator loop in which an operator governs the AI system through a control unit, a mission loop overseen by mission commanders who determine deployment strategy and locations, a design loop managed by the engineers who design the system and train its machine learning models, and a governance loop led by policymakers who develop laws, protocols, and rules of engagement (TR-HFM-330, at PDF pp. 39, 40 and 54, 55). The model was developed for autonomous weapon systems. The question is whether and how it travels into the cyber-DSS context.

The transposition is partial. Three of the four loops translate; one breaks down because it has no clean cyber analogue.

The governance loop

The governance loop transposes directly. Policymakers, doctrine writers, and rules-of-engagement authorities continue to govern AI DSS in cyber operations through the same instruments they apply to other military AI capabilities — NATO’s six Principles of Responsible Use (PRUs), the U.S. Political Declaration on Responsible Military Use of AI and Autonomy, JSP 936 in the United Kingdom, and equivalent national frameworks. In the cyber targeting cycle, the governance loop intersects most directly with Phase 1, where the commander’s intent and targeting guidance set the parameters within which the AI DSS will operate.

The design loop

The design loop transposes, but with a twist. In TR-HFM-330’s framing, the design loop covers the engineers who design the AWS itself and train its models. For AI DSS in cyber targeting, the design loop has two distinct strata. The first is the design of the AI DSS as a tool: the model, the training data, the explainability features, the user interface. The second is the design of the targeting decision: the foreseeability evidence base assembled at Phase 2, the capabilities analysis at Phase 3, the parameter set the AI DSS is asked to evaluate against. The first stratum sits with the engineering team. The second sits with the targeting staff — and is where the architectural reframing in this post does its real work. Santoro’s argument that high-stakes decisions are made at the moment of design before they are made at the moment of execution applies to both strata, and applies to the targeting-staff stratum even when the AI DSS itself was designed by an external vendor years earlier. CSET’s three-pillar framework (Scope, Data, Human-machine interaction) gives the targeting-staff-stratum design work a structured form: the targeting staff must define the scope of the question the AI DSS is being asked, validate the data it draws on, and calibrate the human-machine interaction at each phase.

The mission loop

The mission loop transposes, fragmented across phases. TR-HFM-330’s mission loop has the mission commander determining where and when to deploy the AI system. In the cyber targeting cycle, the mission loop fragments. Phase 4 (force planning and assignment) and Phase 5 (mission planning and force execution) are mission-loop work in TR-HFM-330’s sense, but they are also where the cycle’s documentation discipline takes hold — the decisions, parameters, and AI-DSS outputs that the commander acted on are recorded for later reconstruction. Adler’s analysis of AI integration into the U.S. Army’s Military Decision-Making Process (MDMP) shows what this looks like in practice for staff work: AI augments the analytical capacity of the staff at specific MDMP phases — mission analysis, course-of-action development, course-of-action analysis — while human judgement is preserved at the core of decision-making (Adler, Modernizing Military Decision-Making, Military Review Online Exclusive, August 2025, at pp. 3–4 and at the article’s conclusion). For the documentation discipline specifically, the cyber-targeting analogue runs through Tallinn 2.0 Rule 115, which places the obligation to verify that the target is a lawful military objective on those who plan or decide upon the cyber attack and which expressly contemplates that planners “should, where feasible, have technical experts available” to assist them in determining whether appropriate precautionary measures have been taken (Tallinn 2.0, Rule 115 paras 2–5 and Rule 114 para 6).

The operator loop

The operator loop does not transpose cleanly. This is where the cyber-DSS context departs from the AWS context most sharply. TR-HFM-330’s operator loop has a human operator governing the AI system at the moment of engagement — the AWS finds and engages a target the human commander has already selected through the targeting cycle, and the operator’s control unit constrains, permits, or interrupts that find-and-engage function. AI DSS do not engage; they recommend, structure, and document. There is no real-time operator-loop function in cyber AI-DSS use because there is no engagement decision the AI DSS is making for the human to govern. In my view, the absence of a clean operator-loop analogue is not a gap in the cyber-DSS architecture — it is the signature of a different kind of system. The work the operator loop does for AWS (real-time human governance of an autonomous find-and-engage function) is done in cyber AI-DSS use by a different mechanism: continuous validation by the targeting staff of the evidence base the AI DSS produces, across the cycle’s planning phases, rather than at the single moment of engagement.

The practical consequence is that three of the four TR-HFM-330 loops give us a usable framework for the cyber targeting cycle, but the fourth, the operator loop, is replaced by something the AWS framework does not have a category for: a distributed validation function exercised by the targeting staff across Phases 2, 3, 4 and 5. Section V of this post will work through what that validation function looks like in each phase. Before turning to that, Section IV briefly surveys the state positions on AI in military decision-making, filtered for what they say about decision support and the targeting cycle specifically.

IV. State positions on AI in military decision-making

State practice on AI in military decision-making is more developed than its public visibility suggests. Five sources matter for this post: NATO’s Revised AI Strategy (10 July 2024), the U.S. Political Declaration on Responsible Military Use of AI and Autonomy (November 2023), the French Defence AI Strategy and AMIAD initiative (March–May 2024), the BMVg’s published position on responsible AI use in the Bundeswehr (November 2025), and the Paris Declaration on Maintaining Human Control in AI-enabled Weapon Systems (February 2025). Read together, these sources show a converged direction of travel on the design-stage and lifecycle commitments — and a much patchier picture on operational application to specific friction points.

NATO

The Revised AI Strategy reaffirms the six Principles of Responsible Use (PRUs) endorsed in 2021 — Lawfulness, Responsibility and Accountability, Explainability and Traceability, Reliability, Governability, and Bias Mitigation (at § 2). Three of these — Explainability and Traceability, Reliability, and Governability — are design-stage commitments by their structure: a system cannot be made explainable, traceable, reliable, and governable at the moment of execution if it was not designed for these properties from the outset. The Revised Strategy expressly flags “accountability in human-machine teaming” as an issue meriting NATO’s attention (at § 4). For AI-augmented cyber targeting, the PRU framework therefore commits NATO Allies to design-stage governance of AI DSS and to traceability across the targeting cycle — even though the strategy does not say so in cycle-specific terms.

The United States

The Political Declaration is the most operationally focused state instrument on AI in military decision-making to date, and as of the March 2024 plenary it is endorsed by 54 states. Its scope is expressly broad: in her remarks at the launch event on 9 November 2023, Under Secretary Jenkins described the Declaration as covering “the full range of uses of AI and autonomy in the military domain”, running from logistics and personnel management through intelligence collection and decision-making processes.

AI DSS sit squarely within that scope. The Political Declaration’s substantive measures matter for the architectural reframing this post argues for. The preamble locates military AI use within a responsible human chain of command and control. Measure E requires personnel to exercise “appropriate care” in development, deployment and use; Measure G requires that personnel be trained to make “appropriate context-informed judgments on the use of those systems”; Measure H requires that AI capabilities have “explicit, well-defined uses.” These are not real-time-override commitments. They are commitments to the design, training, and lifecycle architecture within which human judgement is exercised.

France

Defence Minister Sébastien Lecornu presented the French Defence AI Strategy on 8 March 2024 at the École Polytechnique site in Palaiseau and announced the creation of the Agence ministérielle pour l’intelligence artificielle de défense (AMIAD) on the same day. AMIAD’s mission, as described by the Ministry, is to enable France to master defence AI sovereignly so as not to depend on other powers, and to professionalise the use of AI within the armed forces. AMIAD’s director Bertrand Rondepierre frames the agency’s mission bluntly: it is “about equipping ourselves to win the war.” A three-year framework agreement between the Ministry of the Armed Forces and Mistral AI was announced in January 2026 and is overseen by AMIAD; the agreement extends across the armed services and affiliated public bodies, with AI tools to be hosted on French infrastructure. The reported use cases at the framework’s announcement remain at a general level: intelligence analysis, document processing, and other staff-support functions. France has not published a doctrine document on how AI DSS interact with Article 57 AP I obligations, but the operational direction of travel is clear.

The Paris Declaration

Endorsed by 27 states in February 2025, the Paris Declaration on Maintaining Human Control in AI-enabled Weapon Systems is an AWS-focused instrument rather than a DSS-focused one, and the absence of the United States, the United Kingdom, China, Australia, and Israel from the signatories limits its weight as a convergence point. For our purposes it matters mainly because Germany, France, and most other major European Allies signed it, confirming that the European position on responsibility-as-non-transferable, which the BMVg piece anchors, has multilateral backing.

Read together, these five sources support the architectural reframing this post is making. Three commitments converge across the named state positions: that AI DSS use is bounded by lifecycle governance (NATO PRUs, Political Declaration Measure I on testing across the entire life-cycle, BMVg internal Konzeption); that responsibility for AI-enabled decisions remains with humans across the chain of command (Political Declaration preamble, Paris Declaration, BMVg position); and that personnel training and design-stage explainability are load-bearing for legal compliance (Political Declaration Measures G and H, NATO PRUs on Explainability and Traceability). What the state record does not yet say is how these commitments translate into operational practice at specific points in the cyber targeting cycle — which is the gap Section V begins to close.

V. Phase by phase — where human judgement anchors the cyber targeting cycle

This section asks, at each phase of the NATO targeting cycle, three questions: where does human judgement sit, what does the AI DSS contribute, and what is the residual risk?

Phase 1 — Commander’s intent

Phase 1 of the NATO targeting cycle is parameter-setting work. AJP-3.9(B) describes it as the phase in which the commander’s intent, objectives, and targeting guidance identify NAC-approved target sets and desired effects (AJP-3.9(B), at p. 1-15). The targeting staff does not yet engage targets at this phase; it translates strategic guidance into operational targeting parameters.

AI DSS contribute nothing here directly. The phase is judgement-defining, not data-processing. In operational terms, however, Phase 1 is where the governance loop bites hardest on AI-DSS use. The parameters the commander sets here are the same parameters within which AI DSS deployed later will be authorised to operate. Get Phase 1 wrong, and every downstream AI-DSS output inherits the error.

Phase 2 — Target development

Phase 2 builds the foreseeability evidence base. AJP-3.9(B) anchors the targeting staff in a target-system approach and requires consultation between targeteers and legal advisers at validation, with the applicable legal framework (especially IHL/LOAC) as the test (AJP-3.9(B), at pp. 1-15 to 1-16). The NATO ACO Handbook reinforces the point: indirect civilian effects that cascade beyond the immediate strike come to light during target development and must inform both nomination and prioritisation (NATO ACO Protection of Civilians Handbook, at p. 31).

For cyber targeting, AI DSS can do substantial analytical work at this phase. They are well-positioned to generate the three artefacts post #17 identified: the network-topology map, the dependency graph, and the cascading-effects model. In operational terms, the human anchor sits at validation. The targeting staff decides what counts as a sufficiently foreseeable indirect effect — a judgement an AI DSS structures but cannot make.
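
To make the division of labour concrete, here is a minimal sketch of the enumeration step, under loud assumptions: the graph, the node names and the max_hops cut-off are all hypothetical, and no fielded AI DSS is being described. The code only lists what depends on the target. Where to draw the cut-off, and which of the listed effects count as sufficiently foreseeable, is the judgement that stays with the targeting staff.

```python
from collections import deque


def downstream_effects(dependents, start, max_hops):
    """Breadth-first walk returning {system: hops from start} within max_hops."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not expand beyond the staff-chosen horizon
        for nxt in dependents.get(node, []):
            if nxt not in hops:
                hops[nxt] = hops[node] + 1
                queue.append(nxt)
    del hops[start]  # the target itself is not an indirect effect
    return hops


# Hypothetical dependency graph: key -> systems that rely on it.
DEPENDENTS = {
    "target_router": ["base_logistics_db", "regional_isp_gateway"],
    "regional_isp_gateway": ["hospital_records", "water_scada"],
    "water_scada": ["municipal_water_supply"],
}

effects = downstream_effects(DEPENDENTS, "target_router", max_hops=2)
for system, distance in sorted(effects.items(), key=lambda kv: kv[1]):
    print(f"{distance} hop(s): {system}")
```

With a two-hop horizon the hypothetical municipal water supply, three hops out, never reaches the staff's screen. That is the point of the sketch: the parameter is a human choice with legal consequences, not a technical default.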

Phase 3 — Capabilities analysis

Phase 3 asks whether the joint force can lawfully engage the target and, if so, how to weaponeer and mitigate undesirable effects identified at Phase 2 (AJP-3.9(B), at p. 1-17). For cyber, this is the technical-prediction phase: will an exploit chain reach the target, what behaviour will it produce, what cascading consequences follow?

CSET’s Scope pillar treats prediction tools as warranting added scrutiny unless their predictions rest on “well-understood physical laws” and are anchored in directly applicable data, with human-behaviour predictions singled out as the hardest case (CSET, AI for Military Decision-Making, at pp. 17, 19). The practitioner reading is that cyber capability prediction is not a human-behaviour prediction. It is a prediction about how code behaves against a specific software configuration — closer to the physical-law-based predictions CSET treats favourably than to the human-behaviour case it warns against. CSET’s Data pillar flags a different concern: that adversaries actively conceal valuable and often rare capabilities, leaving the AI DSS to work from sparse training data on the very systems it is meant to evaluate (CSET, at p. 20). The targeting staff’s role is to calibrate the uncertainty bands the AI DSS reports.
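
One way to picture what calibrating those bands could involve is a simple coverage check: how often did past outcomes fall inside the range the tool reported? The sketch below is illustrative only. The record format, the figures and the idea that the tool reports a low and a high estimate are my assumptions, not a description of any existing system.

```python
def band_coverage(history):
    """Fraction of past outcomes that fell inside the band the tool reported."""
    inside = sum(1 for low, high, observed in history if low <= observed <= high)
    return inside / len(history)


# Hypothetical record: (reported low, reported high, observed outage in hours).
HISTORY = [
    (1.0, 3.0, 2.5),
    (0.5, 2.0, 4.0),
    (2.0, 6.0, 5.0),
    (1.0, 2.0, 2.5),
]

coverage = band_coverage(HISTORY)
print(f"Observed outcome fell inside the reported band in {coverage:.0%} of cases")
```

If a tool presents its bands as covering nine outcomes in ten and the record shows something closer to one in two, the bands are too narrow for the weight being placed on them. With sparse data of the kind CSET describes, even this check will rest on few cases, which is itself something the staff has to weigh.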

Phase 4 — Force planning and assignment

Phase 4 is the assignment decision: the targeting staff prioritises validated targets against available capabilities, and the commander selects among AI-DSS-supported options. Pouw and Pijpers note that human oversight is most robust during planning and weakens at target selection, in execution, and in monitoring, precisely where Article 57(2) AP I locates the precautionary obligations on “those who plan or decide upon an attack.”

In my view, Phase 4 is where AI DSS earn their keep most clearly. The human selects among options the system has prioritised against criteria the commander set at Phase 1. The Article 57(2) precautions discipline travels into the selection because those criteria encode the precautionary obligations.

Phase 5 — Mission planning and force execution

Phase 5 is where dynamic targeting happens and where, at machine speed, the conventional human-in-the-loop intuition breaks down. Three sources, read together, support the same observation from different angles. Santoro argues that high-stakes oversight is most fragile at the point of execution and most consequential at the point of design. Operators, in this framing, shift from real-time overriders to monitors of systemic drift. Adler frames brittleness, unexplainability, and bias as failure modes that must be designed out, validated against, and continuously monitored — upstream of execution. And Biller’s third prediction for 2026 forecasts that AI-tempo cyber operations will displace legal review at the employment stage. In its place comes ex ante governance — system design choices, model training, and pre-deployment testing.

In my view, the question worth asking at Phase 5 is whether the system was designed, parametrised, and monitored such that the commander’s ex ante judgement travels with the operation through to its effects. Whether a human is in the loop at the moment of execution is the wrong test.

Phase 6 — Assessment and the feedback loop

Phase 6 closes the cycle. The targeting staff assesses actual effects against predicted effects, and feeds what it learns back into Phase 2 of the next iteration. AI DSS contribute by observing reverberating effects at scale — across systems, networks, and time. They structure the comparison against the cascading-effects model the staff used at Phase 2. The human role is to interpret what the AI DSS surfaces and to decide what it means for the next targeting iteration.
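
The comparison itself is structurally simple, which is part of why it suits machine support. A minimal sketch, assuming (hypothetically) that both the Phase 2 model and the Phase 6 observations can be reduced to lists of affected systems:

```python
def compare_effects(predicted, observed):
    """Split effects into confirmed, unforeseen, and predicted-but-not-observed."""
    predicted, observed = set(predicted), set(observed)
    return {
        "confirmed": sorted(predicted & observed),
        "unforeseen": sorted(observed - predicted),
        "not_observed": sorted(predicted - observed),
    }


# Hypothetical system names, for illustration only.
PREDICTED = ["base_logistics_db", "regional_isp_gateway", "hospital_records"]
OBSERVED = ["base_logistics_db", "regional_isp_gateway", "water_scada"]

report = compare_effects(PREDICTED, OBSERVED)
for category, systems in report.items():
    print(f"{category}: {', '.join(systems) or 'none'}")
```

The "unforeseen" bucket is where the human work begins. Whether an unforeseen effect was unforeseeable, or should have been caught at Phase 2, is not something the comparison can answer.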

CHMR-AP Objective 6 is the doctrinal anchor here. It requires standardised reporting and a unified platform for managing civilian-harm data, so that lessons travel across operations and inform future ones (CHMR-AP, Objective 6). In my view, the AI DSS sharpens precisely the learning loop the reasonable-commander standard depends on.

VI. Engaging the critics

The critical position

The strongest critical position on AI DSS in the targeting cycle deserves a careful hearing. Dorsey and Bo argue that AI DSS are not benign augmentation. They shape core targeting functions in ways that change what proportionality and precautions assessments look like in practice, and AI DSS integration into the targeting cycle drives a shift toward quantification at the moment of decision (Dorsey and Bo, 107 International Law Studies (2025), at p. 27). Geairon extends this concern, arguing that AI DSS recalibrate the factual basis of legal judgement by structuring what decision-makers can anticipate, compare and justify ex ante, while introducing new risks linked to data gaps, opacity and over-reliance on technical outputs. The SIPRI bias report synthesises the bias-and-compliance critique: demographic bias in AI DSS can lead to misidentification of civilians as targets and failure to identify protected persons and objects, undermining compliance with the principles of distinction and proportionality (SIPRI, Bias in Military Artificial Intelligence, August 2025, by Laura Bruun and Marta Bo at pp. v–vi).

Two moves, two responses

Two distinct moves sit inside this critical position, and they deserve different responses. The empirical move — that AI DSS already operate as systems where human personnel serve as “rubber stamps” for machine output — rests, when traced through Dorsey and Bo at p. 28, principally on +972 Magazine reporting on Israel’s Lavender system that relies on unnamed IDF sources. That reporting cannot be independently verified, and the practitioner reading is that one set of contested journalistic claims about one operational context should not be generalised into a structural diagnosis of AI DSS use across NATO and U.S. doctrine, which is what this post addresses.

The analytical move is stronger and deserves an honest answer. In my view, Dorsey and Bo are right that AI DSS can promote a reliance on quantification at the moment of decision. The architectural reframing is precisely the response: the answer is not to remove the AI DSS but to relocate the judgement to design and operational monitoring, where quantification cannot substitute for the work of setting parameters, calibrating uncertainty bands, and validating outputs across the cycle.

Empirical evidence on operational populations

The empirical literature on operational populations is also more encouraging than the critical literature acknowledges. Horowitz and Kahn note that “much of that research is limited to the healthcare and aviation sectors”, and that research on algorithm aversion shows humans grow more cautious about trusting algorithms as decision stakes rise (Horowitz and Kahn, Bending the Automation Bias Curve: A Study of Human and AI-Based Decision Making in National Security Contexts (2024), at p. 11). Three findings, read together, sharpen this point. First, Kahn, Horowitz and Samotin provide the first experimental evidence on automation bias in a military population, finding that U.S. Military Academy cadets are better-calibrated and less susceptible to automation bias than a demographically matched general public (Kahn, Horowitz, Samotin, What is Human in Judgment? Testing Automation Bias and Algorithm Aversion Among United States Military Academy Cadets, at p. 30). Second, Lopez and colleagues find qualitatively that U.S. Air Force pilots bring lifecycle-aware mental models to AI teammates, withholding trust until a teammate has matured (Lopez et al., The complex relationship of AI ethics and trust in human–AI teaming: insights from advanced real-world subject matter experts (2023)). Third, military education and operational exposure to AI shape how trust is calibrated, and the calibration tends in the direction the reasonable-commander standard requires. The picture that emerges is not the passive rubber-stamping the critical literature warns about.

VII. The targeting staff, equipped — what this looks like in practice

The architectural reframing translates into concrete operational practice for three audiences in the targeting staff. The targeting staff itself does the analytical and validation work. The legal adviser tests the foreseeability evidence and the accountability structure. The commander makes the targeting decision the reasonable-commander standard reconstructs after the fact. AI DSS structure the work each of them does, but the judgement remains with the human at every stage. The UK Ministry of Defence’s JSP 936 Part 1 (Dependable AI in Defence, November 2024, V1.1) is the most operationally prescriptive framework currently published by any NATO ally on what this actually requires, and I draw on it throughout.

The targeting staff

The targeting staff’s work, equipped with AI DSS, is no longer concentrated at the moment of execution. It is distributed across the targeting cycle in five operational tasks. First, the targeting staff parametrises the AI DSS at Phase 1, encoding the commander’s intent and the operational design domain into the system’s permitted behaviour. JSP 936 requires the operating context for AI components to be clearly defined and communicated to risk owners and operators (JSP 936, at para. 75). Second, the targeting staff validates AI DSS outputs at Phase 2 target development. Third, the targeting staff calibrates uncertainty bands at Phase 3 capabilities analysis, which is the work CSET’s Pillar 2 identifies as essential when adversaries conceal valuable capabilities and training data is sparse. Fourth, the targeting staff monitors operational drift at Phase 5 execution. Fifth, the targeting staff observes reverberating effects against predicted ones at Phase 6 assessment, and feeds what it learns back into Phase 2 of the next iteration.

JSP 936 adds a sixth task that practitioner discourse has under-emphasised. Paragraphs 129 and 130 require training that prepares users for edge cases stressing the human-machine team, that supports reversionary workflows — meaning the re-evaluation of decisions or parameters set earlier in the cycle when the system enters conditions it was not designed for — and that enables users to calibrate their trust in the system across different use cases. The targeting staff, in other words, is not just operating the AI DSS. It is training itself, continuously, to calibrate trust at the level the lifecycle requires. The Lopez et al. finding on lifecycle-aware mental models, and the Kahn et al. finding on calibrated trust in cadet populations, are not just empirical encouragement. They are the cognitive disposition JSP 936 expects operational users to develop.

The legal adviser

For the legal adviser embedded with the targeting staff, the architectural reframing changes what the legal review is testing. The review is not asking whether a human was in the loop at the moment of execution. It is asking whether the system was designed, parametrised, validated, and monitored such that the commander’s ex ante judgement travels with the operation through to its effects. JSP 936 makes this explicit. Paragraph 56 records that humans, as unique moral agents, retain responsibility and accountability for the lawful and ethical use of AI in Defence, and paragraph 57 adds that responsibilities are distributed across the AI design and use lifecycle.

The accountability-gap argument that the critical literature raises against AI DSS does not survive contact with this framework. Paragraph 125 of JSP 936 requires the functional allocation analysis between human and AI agents to ensure “no accountability gap”: humans remain accountable for and in control of the effects of the system (JSP 936, at para. 125b). In my view, the legal adviser’s role is therefore not to certify human-in-the-loop confirmation at execution but to test, at each phase, that the chain of human accountability is intact, documented, and traceable to the commander who authorised the operation. That is the foreseeability evidence the reasonable-commander standard’s ex ante test ultimately reconstructs.

The commander

The operational commander making cyber targeting decisions in an AI-augmented decision space is doing something different from what the conventional human-in-the-loop intuition imagines. The commander is not adjudicating AI DSS recommendations in real time. The commander is exercising “context-appropriate human involvement” (JSP 936, at para. 49 and recurring) across the cycle: setting parameters at Phase 1, validating outputs at Phases 2 and 3, selecting among prioritised options at Phase 4, monitoring drift at Phase 5, and learning from effects at Phase 6.

This is where the human judgement that post #17 identified — and that this post has now mapped to the targeting cycle — actually sits. The commander’s accountability under the reasonable-commander standard is not diminished by the AI DSS. It is relocated to the points in the cycle where the commander’s judgement can do the work the standard requires. CSET’s framework supports this directly: its recommendation that organisations document harms, build feedback mechanisms, and assign senior responsible officers maps cleanly onto JSP 936’s RAISO architecture at Section 4. The institutional infrastructure for accountable AI-augmented targeting is not hypothetical — it is being constructed, in directive form, in current NATO-aligned doctrine.

The targeting staff, equipped with AI DSS and operating under this architectural reframing, is therefore not a body where human judgement has been displaced by machine output. It is a body where human judgement has been distributed across the cycle to where it can do its work most accountably. The reasonable-commander standard holds. The targeting cycle holds. What changes is the evidence base on which the commander’s foreseeability judgement is reconstructed after the fact — and the architecture of human accountability that produces that evidence in the first place.

Conclusion

This post took up a question post #17 left unaddressed: how does AI-augmented cyber targeting change the picture of human judgement in the targeting cycle, and is “human in the loop” still the right concept when AI decision support systems sit between the commander and the foreseeability evidence base?

The answer this post has developed is that the place where human judgement is exercised does not concentrate at the moment of execution. It distributes across the targeting cycle — into the parameter-setting work the targeting staff does at Phase 1, the validation work at Phase 2 target development and Phase 3 capabilities analysis, the selection-among-options work at Phase 4, the operational monitoring at Phase 5, and the assessment-and-feedback work at Phase 6. The architectural reframing is not a retreat from human judgement. It is the relocation of that judgement to the points in the cycle where it can do the work the reasonable-commander standard requires.

What does not change is the legal standard against which the commander’s judgement is tested. The reasonable-commander standard’s ex ante test holds. What changes is the evidence base on which the standard is reconstructed: the documentation, the parameters, the validated outputs, the calibration records, the assessment data, and the architecture of human accountability that JSP 936’s Ethical Principles and Human/AI Teams material now spells out in directive form. The standard is unchanged. The mechanics of how a commander demonstrates compliance with it, and how that demonstration is later reconstructed, are not.

This series will return to those mechanics directly. Post #19 goes back to foundations. It works through the reasonable-commander standard as Additional Protocol I to the Geneva Conventions builds it: through the precautionary obligations distributed across Articles 57 and 58, the prohibitions on indiscriminate attacks in Article 51, and the proportionality test as it operates in practice. It examines the evidentiary mechanics of the ex ante test: what evidence actually constitutes “what a reasonable commander could have foreseen,” how operational records, legal-adviser memoranda, and documentation infrastructure feed the later reconstruction, and where the test’s weight actually falls. Only after that foundational work does post #19 turn to what the AI DSS context changes about how the standard is reconstructed in practice. The reasonable-commander standard is invoked constantly in IHL discourse. The next post in this series unpacks it.

About the author

With more than 25 years of experience, Andreas Leupold is a lawyer trusted by German, European, US and UK clients.

He specializes in intellectual property (IP) and IT law and the law of armed conflict (LOAC). Andreas advises clients in the industrial and defense sectors on how to address the unique legal challenges posed by artificial intelligence and emerging technologies.

A recognized thought leader, he has edited and co-authored several handbooks on IT law and the legal dimensions of 3D printing/Additive Manufacturing, which he also examined in a landmark study for NATO/NSPA.
