Aggregation and Cluster Search

Use aggregation when you want PyAR to build and screen low-energy structures from one or more molecular fragments. It is the main workflow for noncovalent complexes, molecular clusters, weakly bound assemblies, and formula-based structure generation.

Typical chemistry questions

Aggregation helps answer questions such as:

What are plausible low-energy structures of a molecular dimer or cluster?
How can two or more fragments pack through hydrogen bonding, dispersion, or ion-pairing interactions?
Which structures should be selected for a later xTB, ORCA, Gaussian, or ML refinement?
Can I generate candidate structures from a formula before doing expensive calculations?

Basic commands

Aggregate two one-atom fragments, written with the aggregate keyword or with the short -a flag:

pyar-cli aggregate C H -as 1 4 -N 8
pyar-cli -a C H -as 1 4 -N 8

Generate trial structures directly from a formula:

pyar-cli --aggregate --formula C5H4 -N 8

Run a fragment-cluster search with a backend optimizer:

pyar-cli -s water.xyz water.xyz --software xtb -ss 10 -N 16 -c 0 0 -m 1 1

How to think about the output

For most chemistry users, the important outputs are the selected structures and their energies. PyAR removes near-duplicates and keeps a smaller set of candidate geometries for inspection or higher-level refinement.

A typical aggregation run creates a directory structure like:

aggregates/
  state.json
  ag_.../
    selected/
  selected/
    stoichiometry_.../

Useful files to inspect:

aggregates/state.json records the request, restart state, and provenance.
selected/ contains the selected candidate structures.
Energy-table output helps rank structures by relative energy.
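
To look at these files from a script, a minimal Python sketch follows. It assumes only the directory layout shown above. The function name summarize_aggregation is hypothetical and not part of PyAR, and because the state.json schema is not described here, the sketch reports only the top-level keys rather than guessing field names.

```python
import json
from pathlib import Path


def summarize_aggregation(root="aggregates"):
    """Return (top-level keys of state.json, files found under any selected/ directory)."""
    root = Path(root)
    state_keys = []
    selected_files = []
    if not root.is_dir():
        return state_keys, selected_files

    state_path = root / "state.json"
    if state_path.is_file():
        with state_path.open() as handle:
            state = json.load(handle)
        # The schema is not assumed; only the top-level keys are reported.
        if isinstance(state, dict):
            state_keys = sorted(state)

    # Collect every file that sits below a directory named "selected",
    # both at the top level and inside the per-run ag_... directories.
    selected_files = sorted(
        path.relative_to(root).as_posix()
        for path in root.rglob("*")
        if path.is_file() and "selected" in path.relative_to(root).parts[:-1]
    )
    return state_keys, selected_files


if __name__ == "__main__":
    keys, files = summarize_aggregation("aggregates")
    print("state.json top-level keys:", keys)
    for name in files:
        print(name)
```

Run from the directory that contains aggregates/, this prints the keys recorded in the state file and one relative path per selected structure file.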

Restart behaviour

The aggregation restart state is stored as human-readable JSON in aggregates/state.json. Re-running an interrupted aggregation with the same request resumes unfinished pathways and reuses existing step outputs. When a legacy aggregates/ calculation is resumed, older pyar.log pathway markers are imported once into the JSON state.

Next steps

After an aggregation run, common follow-up steps are:

run pyar-energy-table on selected structures
cluster or deduplicate the final candidates
refine selected structures with a higher-level backend
use selected aggregates as input for reaction, solvation, or external DFT calculations
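
Ranking candidates by relative energy is a small calculation once the energies are in hand. The sketch below is an illustration, not PyAR's own code and not the pyar-energy-table output format: it assumes you have already collected total energies in Hartree into a plain dictionary, and the function name rank_by_relative_energy and the candidate names are hypothetical.

```python
# Conversion factor: 1 Hartree is about 627.5095 kcal/mol.
HARTREE_TO_KCAL_PER_MOL = 627.5095


def rank_by_relative_energy(energies_hartree):
    """Return (name, relative energy in kcal/mol) pairs, lowest energy first."""
    if not energies_hartree:
        return []
    lowest = min(energies_hartree.values())
    ranked = sorted(energies_hartree.items(), key=lambda item: item[1])
    return [
        (name, (energy - lowest) * HARTREE_TO_KCAL_PER_MOL)
        for name, energy in ranked
    ]


if __name__ == "__main__":
    # Placeholder names and energies for illustration only; not PyAR output.
    example = {
        "candidate_a": -10.000,
        "candidate_b": -10.002,
        "candidate_c": -9.999,
    }
    for name, relative in rank_by_relative_energy(example):
        print(f"{name}: {relative:.2f} kcal/mol")
```

The lowest-energy structure is the zero of the scale, so every other candidate is reported as a non-negative offset from it. Energies are only comparable when they come from the same backend and level of theory.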