Noah Syrkis
noah[at]syrkis.com
Parabellum
January 6, 2026
Thun, Switzerland
Parabellum is a large-scale, vectorized, faster-than-real-time wargame developed in collaboration with the Swiss Military
1 | Introduction
2 | Vectorization
3 | Acceleration
4 | Differentiation
5 | Simulation
These slides present a broad overview of the Parabellum environment shown in [1] T. Anne et al., “Harnessing Language for Coordination: A Framework and Benchmark for LLM-Driven Multi-Agent Control,” IEEE Transactions on Games, pp. 1–25, 2025, doi: 10.1109/TG.2025.3564042.

This research is funded by Armasuisse.
1 of 9
1 | Introduction
▶ Sandbox for large-scale, team-based wargames on real terrain
▶ Differentiable JAX environment grounded in OpenStreetMap
▶ Built for fast iteration on autonomous tactics and analysis
1 | Introduction
▶ Traditional wargaming is not ready for ML, real geography, or beyond-real-time simulation
▶ Manual setup slows sensitivity analysis
▶ Gradient-free simulators block learning-based planning and seamless integration with deep learning pipelines (including LLMs and RL agents)
2 | Vectorization
▶ Procedurally load maps + buildings for any geocoded area into a JAX [2] array
▶ Units, teams, and sensors as YAML config files specifying game rules and team compositions
▶ Entirely in JAX: batching, autodiff, vectorization
▶ Parallel rollouts across seeds + scenarios on accelerators
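The rollout pattern above can be sketched in plain JAX. This is a minimal, hypothetical example (the `step` dynamics and unit features are placeholders, not Parabellum's actual API): a single rollout is written once, then `jax.vmap` batches it across seeds and `jax.jit` compiles the batch for the accelerator.

```python
import jax
import jax.numpy as jnp

# Toy step: every unit drifts toward the origin and loses one health per tick.
def step(state, _):
    pos, hp = state
    return (pos - 0.1 * jnp.sign(pos), hp - 1.0), None

def rollout(key, n_steps=32, n_units=100):
    pos = jax.random.normal(key, (n_units, 2))   # random unit positions
    hp = jnp.full((n_units,), 100.0)             # full health
    final, _ = jax.lax.scan(step, (pos, hp), None, length=n_steps)
    return final

# One vmap call turns the single rollout into parallel rollouts across seeds;
# jit compiles the whole batch once for the accelerator.
keys = jax.random.split(jax.random.PRNGKey(0), 8)
pos_batch, hp_batch = jax.jit(jax.vmap(rollout))(keys)
```

The point of the design is that batching is free: nothing in `rollout` mentions a batch dimension, yet `vmap` runs eight seeds in one fused computation.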
3 | Acceleration
▶ Real war: high fidelity but slow, costly, unparallelizable
▶ Parabellum: an RTS-like simulator where:
  ▶ Arbitrary numbers of sims can run in parallel
  ▶ Faster than real time
  ▶ Tens of thousands of units per scenario
4 | Differentiation
▶ Fully written in JAX [2]
▶ Vectorized via vmap, parallelized with pmap
▶ Direct integration into deep learning pipelines
▶ Boosts model capacity for long-horizon strategy
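What "differentiable" buys is worth making concrete. In this hedged sketch (toy dynamics, not Parabellum's), a heading angle steers one unit for ten steps toward an objective; because every step is a JAX op, `jax.grad` differentiates the loss through the entire rollout, which is what lets gradient-based planners and RL pipelines train against the simulator directly.

```python
import jax
import jax.numpy as jnp

# Toy differentiable rollout: heading angle theta steers a unit for ten
# steps; the loss is squared distance to an objective at (1, 0).
def rollout_loss(theta):
    vel = jnp.array([jnp.cos(theta), jnp.sin(theta)])
    pos = jnp.zeros(2)
    for _ in range(10):              # unrolled, fully differentiable dynamics
        pos = pos + 0.1 * vel
    return jnp.sum((pos - jnp.array([1.0, 0.0])) ** 2)

# Gradients flow through the whole simulated trajectory.
g = jax.jit(jax.grad(rollout_loss))(jnp.array(0.3))
```

A gradient-free simulator would force black-box search here; the differentiable one yields the steering gradient in a single backward pass.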
5 | Simulation
▶ Trajectories as (s_t, a_t) tuples
▶ No rewards — only flows of state and action

Figure 1: Rewardless partially observable MDP diagram (state s_t → observation o_t → action a_t → state s_{t+1} at each step t)
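The rewardless loop of Figure 1 can be sketched as code. Everything here is a hypothetical placeholder (the `State` fields, `observe`, and `policy` are illustrative, not Parabellum's API); the point is the shape of the loop: state yields observation, observation yields action, and the trajectory is a stream of (s_t, a_t) tuples with no reward term anywhere.

```python
import jax
import jax.numpy as jnp
from typing import NamedTuple

# Hypothetical containers mirroring the rewardless loop in Figure 1.
class State(NamedTuple):
    pos: jnp.ndarray
    hp: jnp.ndarray

def observe(state):                  # s_t -> o_t
    return state.pos

def policy(obs):                     # o_t -> a_t (placeholder: drift outward)
    return 0.1 * jnp.sign(obs)

def step(state, _):
    act = policy(observe(state))
    nxt = State(state.pos + act, state.hp)   # (s_t, a_t) -> s_{t+1}
    return nxt, (state, act)         # trajectory entry: the (s_t, a_t) tuple

s0 = State(pos=jnp.ones((4, 2)), hp=jnp.full((4,), 100.0))
_, (states, actions) = jax.lax.scan(step, s0, None, length=16)
```

`jax.lax.scan` stacks the per-step tuples, so a whole trajectory comes back as arrays with a leading time axis, ready for batching with `vmap`.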
5 | Simulation
▶ State = (position, health, cooldown)
▶ Scene encodes terrain, ranges, unit types
▶ Any Earth location loadable via OSM¹
▶ Observation = visible units’ location, health, type, team

¹ OpenStreetMap data
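A minimal sketch of the partial-observability rule above, under stated assumptions: the arrays, feature layout, and fixed sight radius are all hypothetical stand-ins, not Parabellum's data model. Each unit's observation is the per-unit feature table (location, health, type, team) with rows outside its sight range masked to zero.

```python
import jax.numpy as jnp

# Hypothetical unit arrays: locations, health, type ids, team ids.
pos = jnp.array([[0.0, 0.0], [1.0, 1.0], [9.0, 9.0]])
hp = jnp.array([100.0, 80.0, 50.0])
utype = jnp.array([0.0, 1.0, 1.0])
team = jnp.array([0.0, 1.0, 1.0])

def observe(me, sight=5.0):
    dist = jnp.linalg.norm(pos - pos[me], axis=-1)
    seen = dist <= sight                         # visibility mask
    feats = jnp.concatenate(
        [pos, hp[:, None], utype[:, None], team[:, None]], axis=-1)
    # Zero out location, health, type, and team of units beyond sight range.
    return jnp.where(seen[:, None], feats, 0.0)

obs = observe(0)   # unit 0's view: unit 2 at (9, 9) is out of range
```

Keeping the observation a fixed-shape masked array (rather than a variable-length list of visible units) is what keeps the whole thing `vmap`- and `jit`-friendly.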
References
[1] T. Anne et al., “Harnessing Language for Coordination: A Framework and Benchmark for LLM-Driven Multi-Agent Control,” IEEE Transactions on Games, pp. 1–25, 2025, doi: 10.1109/TG.2025.3564042.
[2] J. Bradbury et al., “JAX: Composable Transformations of Python+NumPy Programs,” 2018.