Reference

This page collects the main modules that make up PyAR.

Core types

The canonical core, I/O, and sampling APIs are documented in the API Reference.

Main workflows

Legacy aggregator module.

This module is retained as a compatibility shim. New code should import the workflow entrypoints from pyar.workflows.aggregate and pyar.workflows.solvation.

The shim keeps older imports working while the actual aggregation and solvation implementations live in the workflow package.

pyar.aggregator.aggregate(*args, **kwargs): Deprecated alias for pyar.workflows.aggregate.aggregate().

pyar.aggregator.aggregate_from_formulas(*args, **kwargs): Deprecated alias for pyar.workflows.aggregate.aggregate_from_formulas().

pyar.aggregator.solvate(*args, **kwargs): Deprecated alias for pyar.workflows.solvation.solvate().
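
The alias pattern used by these three functions can be sketched as below. This is an illustration, not the PyAR source: deprecated_alias and new_aggregate are hypothetical names, and emitting a DeprecationWarning is an assumption about how the shim behaves.

```python
import functools
import warnings


def deprecated_alias(target, old_name):
    """Wrap `target` so that calls through the old name warn, then delegate."""

    @functools.wraps(target)
    def wrapper(*args, **kwargs):
        warnings.warn(
            f"{old_name} is deprecated; call "
            f"{target.__module__}.{target.__name__} instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return target(*args, **kwargs)

    return wrapper


def new_aggregate(*args, **kwargs):
    """Hypothetical stand-in for the real workflow entry point."""
    return ("aggregate", args, kwargs)


# The legacy name forwards every argument unchanged.
aggregate = deprecated_alias(new_aggregate, "pyar.aggregator.aggregate")
```

Forwarding *args and **kwargs untouched means the shim never has to track signature changes in the workflow package.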

pyar.aggregator.add_one(aggregate_id, seeds, monomer, hm_orientations, qc_params, maximum_number_of_seeds, site)

pyar.aggregator.check_for_the_finished_jobs_on_restart(list_of_optimized_molecules, cwd)

pyar.aggregator.check_stop_signal()

pyar.aggregator.expand_formula_to_aggregate_inputs(formula): Convert a formula into aggregate input symbols and multiplicities.

pyar.aggregator.generate_molecule_from_formula(formula, box_size=None, rng=None): Generate a molecule object from a chemical formula.

pyar.aggregator.generate_orientations(num_orientations, mol_id, monomer, seed_counter, seeds, site)

pyar.aggregator.old_path_to_new_path(monomers_to_be_added, old_path)

pyar.aggregator.read_old_path(): Read restart pathway markers from a prior log, if one exists.

pyar.aggregator.read_orientations(molecule_id, noo)

pyar.aggregator.select_pathways(monomers_to_be_added, number_of_pathways)

pyar.aggregator.update_id(aid, the_monomer): Increment one monomer count inside an aggregate identifier.

Importable pyar-explore entrypoint.

This script generates candidate composite geometries by combining a seed molecule, a monomer, and a formula-derived pathway. It is a convenience driver around the trial-generation and molecule-merging helpers used by the aggregation workflow.

pyar.scripts.explore.parse_arguments(): Parse the pyar-explore command-line arguments.

pyar.scripts.explore.parse_formula(formula): Parse a chemical formula into an ordered atom-count mapping.

pyar.scripts.explore.generate_chemical_pathway(atom_counts, seed): Return the list of atoms that still need to be added to the seed.
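
A minimal sketch of the two helpers above, under stated assumptions: formulas are flat (no parentheses or charges), and the seed is passed as a plain list of element symbols. The real functions may accept molecule objects and richer formulas.

```python
import re
from collections import Counter


def parse_formula(formula):
    """Parse a flat formula such as 'C2H6O' into an ordered atom-count dict."""
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        # A missing count means one atom; repeated symbols accumulate.
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


def generate_chemical_pathway(atom_counts, seed_symbols):
    """List the atoms still missing from the seed, in formula order."""
    present = Counter(seed_symbols)
    pathway = []
    for symbol, target in atom_counts.items():
        missing = target - present.get(symbol, 0)
        pathway.extend([symbol] * max(missing, 0))
    return pathway
```

For example, parse_formula("C2H6O") gives {"C": 2, "H": 6, "O": 1}, and a seed holding one carbon and two hydrogens leaves one carbon, four hydrogens, and one oxygen to add.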

pyar.scripts.explore.create_composite_molecule_wrapper(seed, monomer, pathway, sequence_offset=0)

Create one candidate composite geometry for a pathway.

The wrapper seeds the orientation generator with sequence_offset so repeated populations are reproducible while still producing distinct geometry sets.
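
The reproducibility idea can be sketched as follows. The function name, the base_seed parameter, and the uniform-sphere sampler are assumptions made for illustration; the actual orientation generator may differ.

```python
import math
import random


def sample_directions(count, sequence_offset=0, base_seed=0):
    """Return `count` unit vectors drawn from an offset-specific RNG stream."""
    # A private Random instance keeps the result independent of global state.
    rng = random.Random(base_seed + sequence_offset)
    directions = []
    for _ in range(count):
        # Uniform z and uniform azimuth give a uniform point on the sphere.
        z = rng.uniform(-1.0, 1.0)
        phi = rng.uniform(0.0, 2.0 * math.pi)
        r = math.sqrt(1.0 - z * z)
        directions.append((r * math.cos(phi), r * math.sin(phi), z))
    return directions
```

Calling the function twice with the same offset returns the same directions, while a different offset selects a different stream.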

pyar.scripts.explore.main(): Generate one or more composite geometries from a formula.

Importable pyar-optimiser entrypoint.

This utility runs the general-purpose geometry optimizer over one or more input structures without going through the aggregation, reaction, or solvation workflow state machines.

pyar.scripts.optimiser.main(): Parse optimizer options, load input molecules, and run bulk optimization.

Importable pyar-react entrypoint.

This script provides the historical reaction-search command-line interface that wraps pyar.workflows.reaction.react(). It converts CLI arguments into the backend parameter dictionary expected by the workflow and applies the same geometry-optimizer guardrails as the main pyar-cli entrypoint.

pyar.scripts.react.argument_parse(): Parse the reaction-search command-line arguments.

pyar.scripts.react.calculate_index_from_xyz(filename): Return the zero-based index separating the two reactants in filename.

pyar.scripts.react.main(): Run the legacy reaction-search command-line workflow.

Standalone reaction-trace analysis and plotting entrypoint.

pyar.scripts.reaction_trace.argument_parse(argv=None): Parse reaction-trace CLI arguments.

pyar.scripts.reaction_trace.main(argv=None): Run trace analysis and optionally emit plot artifacts.

Importable pyar-energy-table entrypoint.

This utility prints a relative-energy table for one or more XYZ files. It is used as a lightweight inspection tool for comparing geometries without running a full workflow.

pyar.scripts.energy_table.main(): Read XYZ files, attach energies, and print a ranked energy table.
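
A rough sketch of the ranking step, assuming energies are already available in Hartree and that relative values are reported in kcal/mol. The unit, the helper name, and the row layout are assumptions, not the utility's documented output.

```python
HARTREE_TO_KCAL_PER_MOL = 627.509474


def relative_energy_table(energies_hartree):
    """Return (name, relative energy in kcal/mol) rows, lowest energy first."""
    lowest = min(energies_hartree.values())
    rows = [
        (name, (energy - lowest) * HARTREE_TO_KCAL_PER_MOL)
        for name, energy in energies_hartree.items()
    ]
    return sorted(rows, key=lambda row: row[1])
```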

Command-line interface for trial geometry generation.

This utility exposes the orientation-sampling and fragment-merging helpers used by the aggregation and reaction workflows. It can generate approach directions, create candidate configurations, and export simple trial spheres.

pyar.scripts.trial_generation.main(): Generate trial placements or composite configurations.

Importable pyar-clustering entrypoint.

This utility clusters or filters XYZ pools using the selection algorithms implemented in pyar.data_analysis.clustering. It prints the energy table for the input pool and then emits the selected geometries.

pyar.scripts.clustering.main(): Cluster or filter the provided XYZ pool and print the selected files.

Importable pyar-similarity entrypoint.

This utility computes Grigoryan-Springborg similarity across a pool of XYZ files, writes duplicate and unique structure collections, and records a summary file for the rejected pairs.

pyar.scripts.similarity.read_file(input_file)

pyar.scripts.similarity.Euclidean_distance(p1, p2, p3, axis_x, axis_y, axis_z)

pyar.scripts.similarity.average(data)

pyar.scripts.similarity.Grigoryan_Springborg(numb_atoms, array_coord_x_1, array_coord_y_1, array_coord_z_1, array_coord_x_2, array_coord_y_2, array_coord_z_2)

pyar.scripts.similarity.uniq(lst)

pyar.scripts.similarity.index_elements(duplicate_name, files_name)

pyar.scripts.similarity.process_files(x, array_keys, Info_Coords, threshold_duplicate, file_tmp, file_log, files_xyz)

pyar.scripts.similarity.main(): Scan XYZ files for near-duplicate structures and write summary files.

Importable pyar-descriptor entrypoint.

This utility computes a compact cluster-shape descriptor for each XYZ file and writes per-structure descriptor files alongside summary CSV output.

pyar.scripts.descriptor.calculate_properties(atoms): Return the basic shape descriptors used to characterize one cluster.

pyar.scripts.descriptor.create_combined_descriptor(properties): Combine the basic descriptors into a single normalized score.

pyar.scripts.descriptor.main(args=None): Compute and write descriptors for one or more XYZ files.

Benchmark clustering/selection algorithms on one or more XYZ pools.

This utility compares the geometry-selection algorithms exposed by pyar.data_analysis.clustering and reports selection quality, diversity, and runtime for one or more pools of XYZ files.

pyar.scripts.benchmark_clustering.main()

Benchmark direction samplers used for trial geometry placement.

The benchmark compares the available placement-direction generators used by the trial-geometry code paths and reports sphere-coverage statistics plus runtime.

pyar.scripts.benchmark_orientations.main(): Run reproducible unit-sphere coverage comparisons.

Structured result objects returned by public workflow entrypoints.

class pyar.workflow_results.WorkflowResult(workflow, status, run_directory, state_path=None, selected_paths=(), metadata=<factory>)

Bases: object

Common metadata for workflow outcomes.

Parameters:

workflow (str)
status (str)
run_directory (str)
state_path (str | None)
selected_paths (tuple[str, ...])
metadata (Mapping[str, Any])

workflow: str

status: str

run_directory: str

state_path: str | None = None

selected_paths: tuple[str, ...] = ()

metadata: Mapping[str, Any]

to_dict(): Return a plain dictionary for logging or serialization.

class pyar.workflow_results.AggregateResult(workflow, status, run_directory, state_path=None, selected_paths=(), metadata=<factory>)

Bases: WorkflowResult

Structured result for aggregation runs.

Parameters:

workflow (str)
status (str)
run_directory (str)
state_path (str | None)
selected_paths (tuple[str, ...])
metadata (Mapping[str, Any])

class pyar.workflow_results.SolvationResult(workflow, status, run_directory, state_path=None, selected_paths=(), metadata=<factory>)

Bases: WorkflowResult

Structured result for solvation runs.

Parameters:

workflow (str)
status (str)
run_directory (str)
state_path (str | None)
selected_paths (tuple[str, ...])
metadata (Mapping[str, Any])

class pyar.workflow_results.ReactionResult(workflow, status, run_directory, state_path=None, selected_paths=(), metadata=<factory>)

Bases: WorkflowResult

Structured result for reaction runs.

Parameters:

workflow (str)
status (str)
run_directory (str)
state_path (str | None)
selected_paths (tuple[str, ...])
metadata (Mapping[str, Any])

Shared backend helpers and public re-exports for PyAR.

class pyar.backends.SF(molecule)

Bases: object

Base state for PyAR interface workflows that write XYZ files.

pyar.backends.require_executable(program, friendly_name=None): Return an executable path or raise a clear FileNotFoundError.

pyar.backends.which(program): Return the absolute path to an executable if it exists on PATH.
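
One way to get the behaviour described for these two helpers is to build on shutil.which from the standard library. This is a sketch under that assumption; the error message wording is invented for illustration.

```python
import os
import shutil


def which(program):
    """Return the absolute path of `program` on PATH, or None if absent."""
    path = shutil.which(program)
    return os.path.abspath(path) if path else None


def require_executable(program, friendly_name=None):
    """Return the executable path or raise FileNotFoundError with context."""
    path = which(program)
    if path is None:
        label = friendly_name or program
        raise FileNotFoundError(
            f"{label} executable '{program}' was not found on PATH"
        )
    return path
```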

pyar.backends.write_xyz(atoms_list, coordinates, filename, job_name='no_name', energy=0.0): Write a simple XYZ file with an optional energy label.
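
A sketch of a writer with the same signature. The standard XYZ layout (atom count, comment line, one atom per line) is followed; the exact comment-line format and column widths are assumptions.

```python
def write_xyz(atoms_list, coordinates, filename, job_name="no_name", energy=0.0):
    """Write atom symbols and Cartesian coordinates in XYZ format."""
    with open(filename, "w") as handle:
        handle.write(f"{len(atoms_list)}\n")
        # The comment line carries the job name and energy label.
        handle.write(f"{job_name}: {energy:.8f}\n")
        for symbol, (x, y, z) in zip(atoms_list, coordinates):
            handle.write(f"{symbol:<3s} {x:12.6f} {y:12.6f} {z:12.6f}\n")
```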

pyar.backends.run_command(command, stdout_path=None, stderr_path=None, stdin_path=None)

Run a command and return its exit status.

When stdout and stderr should go to the same file, pass the same path for both arguments.

pyar.backends.run_output(command, stderr_path=None)

Run a command and return its stdout bytes.

Raises:: subprocess.CalledProcessError – If the command exits with a non-zero status.
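
Both helpers can be approximated with the standard subprocess module. The sketch below assumes that passing the same path for stdout and stderr should merge the two streams into one file, as described above; the internals are otherwise a guess.

```python
import subprocess
from contextlib import ExitStack


def run_command(command, stdout_path=None, stderr_path=None, stdin_path=None):
    """Run `command` with optional file redirection; return the exit status."""
    with ExitStack() as stack:
        stdout = stack.enter_context(open(stdout_path, "w")) if stdout_path else None
        if stderr_path and stderr_path == stdout_path:
            # Same file for both streams: let the OS interleave them.
            stderr = subprocess.STDOUT
        elif stderr_path:
            stderr = stack.enter_context(open(stderr_path, "w"))
        else:
            stderr = None
        stdin = stack.enter_context(open(stdin_path)) if stdin_path else None
        return subprocess.call(command, stdout=stdout, stderr=stderr, stdin=stdin)


def run_output(command, stderr_path=None):
    """Return stdout bytes; raises CalledProcessError on non-zero exit."""
    with ExitStack() as stack:
        stderr = stack.enter_context(open(stderr_path, "w")) if stderr_path else None
        return subprocess.check_output(command, stderr=stderr)
```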

orca.py - backend for the ORCA program

This file is part of the PyAR project.

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2 of the License.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

class pyar.backends.orca.Orca(molecule, qc_params)

Bases: SF

ORCA single-point and optimization wrapper for PyAR.

prepare_input(): Write the ORCA input file for the current molecule.

optimize()

Optimize the current structure with ORCA.

Returns:: True when ORCA finishes normally, otherwise False.
Return type:: bool

get_energy(): Return the final single-point energy in Hartree.

pyar.backends.orca.main(): Module entrypoint retained for parity with the other interfaces.

ORCA followed by AIQM1 backend implementation.

class pyar.backends.orca_aiqm1.OrcaAIQM1(molecule, qc_params)

Bases: SF

prepare_keyword(qc_params)

set_gamma(gamma)

prepare_input()

optimize()

get_energy()

pyar.backends.orca_aiqm1.main()

psi4.py - backend for the Psi4 program

class pyar.backends.psi4.Psi4(molecule, method, custom_keyword=None)

Bases: SF

Psi4 optimization wrapper for PyAR.

prepare_input(keyword=''): Write the Psi4 input file for the current molecule.

optimize(max_cycles=350, gamma=0.0, restart=False, convergence='normal')

Optimize the current structure with Psi4.

Returns:: True when Psi4 finishes normally, otherwise False.
Return type:: bool

get_energy(): Return the final Psi4 energy in Hartree.

get_coordinates(): Return the optimized Cartesian coordinates from the Psi4 log.

pyar.backends.psi4.main(): Run the Psi4 workflow from the command line.

gaussian.py - backend for the Gaussian program

class pyar.backends.gaussian.Gaussian(molecule, qc_params)

Bases: SF

Gaussian single-point and optimization wrapper for PyAR.

prepare_input(): Write the Gaussian input file for the current molecule.

optimize()

Optimize the current structure with Gaussian.

Returns:: True when Gaussian finishes normally, otherwise False.
Return type:: bool

get_coords(): Return the optimized Cartesian coordinates from the Gaussian log.

get_energy(): Return the final SCF energy in Hartree.

pyar.backends.gaussian.main(): Module entrypoint retained for parity with the other interfaces.

babel.py - OpenBabel backend helpers for PyAR

class pyar.backends.babel.OBabel(molecule, forcefield=None)

Bases: SF

OpenBabel-backed XYZ optimization helper for PyAR.

optimize(max_cycles=350, gamma=0.0, restart=False, convergence='normal'): Optimize the current structure with OpenBabel.

get_coords(): Return the optimized Cartesian coordinates from the XYZ file.

get_energy(): Return the OpenBabel energy in Hartree if the calculation completed.

pyar.backends.babel.xyz_to_mopac_input(xyzfile, mopac_input_file, keyword=None): Convert an XYZ file to a Mopac input file with OpenBabel.

pyar.backends.babel.xyz_to_sdf_file(xyz_input_files, sdf_output_file): Convert one or more XYZ files to a single SDF file.

pyar.backends.babel.make_inchi_string_from_xyz(xyzfile): Return an InChI string for a molecule stored in an XYZ file.

pyar.backends.babel.make_smile_string_from_xyz(xyzfile): Return a SMILES string for a molecule stored in an XYZ file.

pyar.backends.babel.main(input_files): Run the OpenBabel optimization workflow for one or more XYZ files.

mopac.py - backend for the MOPAC program

class pyar.backends.mopac.Mopac(molecule, qc_params)

Bases: SF

MOPAC optimization wrapper for PyAR.

prepare_input(keyword=''): Create the MOPAC input file for the current structure.

optimize(max_cycles=350, gamma=0.0, restart=False, convergence='normal')

Optimize the current structure with MOPAC.

Returns:: True when MOPAC finishes normally, otherwise False.
Return type:: bool

get_energy(): Return the MOPAC heat of formation converted to Hartree.

get_coords(): Return the optimized Cartesian coordinates from the ARC file.

pyar.backends.mopac.main(): Module entrypoint retained for parity with the other interfaces.

TorchANI-based ANI backend and CLI wrapper.

exception pyar.backends.ani.ANICalculationFailed

Bases: Exception

Raised when an ANI model lookup, evaluation, or workflow step fails.

class pyar.backends.ani.ANI(*args, **kwargs)

Bases: Calculator

ASE calculator backed by a TorchANI potential.

implemented_properties = ['energy']

calculate(atoms=None, properties=('energy',), system_changes=ase.calculators.calculator.all_changes): Compute the potential energy for an ASE atoms object.

class pyar.backends.ani.ANIInterface(xyzfile)

Bases: object

Small convenience wrapper for single-point and optimization jobs.

optimize(model='ANI-1x', fmax=0.001, trajectory='optimization.traj'): Run an ASE geometry optimization and return the final energy.

singlepoint(model='ANI-1x'): Return a single-point ANI energy for the loaded structure.

pyar.backends.ani.main(): Run the ANI helper CLI.

AIMNet2 backend implementation.

class pyar.backends.aimnet_2.Aimnet2(molecule, qc_params)

Bases: SF

prepare_input()

optimize()

Returns:: One of True, 'SCFFailed', 'GradFailed', 'UpdateFailed', 'CycleExceeded', or False.

property optimized_coordinates

property energy

pyar.backends.aimnet_2.main()

AIQM1/MLatom backend implementation.

class pyar.backends.aiqm1_mlatom.AIQM1(molecule, qc_params)

Bases: SF

prepare_input()

optimize()

Returns:: One of True, 'SCFFailed', 'GradFailed', 'UpdateFailed', 'CycleExceeded', or False.

property optimized_coordinates

property energy

pyar.backends.aiqm1_mlatom.main()

MLatom AIQM1 backend implementation.

class pyar.backends.mlatom_aiqm1.MlatomAiqm1(molecule, qc_params=None)

Bases: SF

prepare_input()

read_molecule_from_file(filename)

optimize()

get_coords()

Returns:: coords – The Cartesian coordinates.

get_energy()

Returns:: The energy from the calculation, in Hartree.

pyar.backends.mlatom_aiqm1.main()

MLatom optimization backend command-line entrypoint.

pyar.backends.mlopt.main()

Versioned, inspectable restart state for aggregation workflows.

exception pyar.state.aggregate.AggregateStateError

Bases: RuntimeError

Raised when an aggregation run cannot be safely resumed.

class pyar.state.aggregate.AggregateRunState(root_directory, data)

Bases: object

Persist completed aggregation pathways independently of optimizer state.

classmethod create(root_directory, request, pathway_labels, legacy_import=None): Create and persist state for a new or imported aggregation run.

classmethod load(root_directory, expected_request): Load a running aggregation state file and validate its request.

validate_request(expected_request): Reject resume attempts that change the aggregation calculation.

validate_progress(): Reject incomplete or inconsistent persisted pathway progress.

completed_pathway_count(): Return the number of pathway builds already completed.

remaining_pathway_labels(): Return persisted pathways that still need to be calculated.

complete_pathway(label, selected_results): Persist completion and selected result paths for the next pathway.

finish(final_selected_results): Persist terminal workflow state and final selected output paths.

save(): Atomically write aggregates/state.json.

Versioned, inspectable restart state for reaction workflows.

exception pyar.state.reaction.ReactionStateError

Bases: RuntimeError

Raised when a reaction run cannot be created or safely resumed.

class pyar.state.reaction.ReactionRunState(root_directory, data)

Bases: object

Persist reaction workflow progress independently of optimizer objects.

classmethod create(root_directory, request, orientations, reactants): Create and persist state for a fresh reaction calculation.

classmethod load(root_directory, expected_request): Load a current-format reaction state file and validate its request.

classmethod migrate_legacy(root_directory, checkpoint, request): Convert an unambiguous legacy jobs.pkl checkpoint to JSON state.

validate_request(expected_request): Fail clearly when a resume invocation changes scientific settings.

remaining_gamma_schedule(): Return the numeric gamma values still to be processed.

pending_molecules(): Load the geometries still pending in the current gamma cycle.

current_survivor_molecules(): Load successful current-cycle candidates retained before interruption.

saved_product_identities(): Return saved product identifiers for restart-time deduplication.

record_job(job_name, gamma, status, remaining_orientations, current_survivors): Record processed work plus pending and retained current-cycle candidates.

record_product(job_name, gamma, inchi, smiles, path, trace_summary=None): Record and immediately persist one newly discovered product.

complete_cycle(gamma, next_orientations): Mark a gamma cycle complete and snapshot candidates for the next cycle.

finish(status='completed'): Persist terminal workflow state while retaining the run record.

save(): Atomically write reaction/state.json.

pyar.state.reaction.read_legacy_checkpoint(root_directory): Read a legacy jobs.pkl only for one-time migration to JSON state.

Versioned, inspectable restart state for solvation workflows.

The solvation workflow persists a compact JSON state file plus geometry snapshots so interrupted runs can resume without redoing completed cycles or losing the selected seed geometries.

exception pyar.state.solvation.SolvationStateError

Bases: RuntimeError

Raised when a solvation run cannot be safely resumed.

class pyar.state.solvation.SolvationRunState(root_directory, data)

Bases: object

Persist completed solvation cycles independently of optimizer objects.

The state tracks the requested solvation calculation, the current seed geometries, the completed cycle history, and the terminal output paths. It is designed to be human-inspectable and safe to validate before a resumed run mutates the working directory.

classmethod create(root_directory, request, current_seeds)

Create and persist state for a new solvation run.

The initial state stores the request payload, snapshots of the input seed geometries, and the cycle counter used to drive the growth loop.

classmethod load(root_directory, expected_request)

Load a running solvation state file and validate its request.

A matching request is required for restart. If the state file belongs to a different calculation, the load is rejected rather than silently mutating an unrelated run directory.
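
The request check can be pictured as a field-by-field comparison of the stored and incoming request dictionaries. This is a sketch only: StateMismatchError stands in for SolvationStateError, and the request keys in the usage note are invented.

```python
class StateMismatchError(RuntimeError):
    """Hypothetical stand-in for SolvationStateError."""


def validate_request(stored_request, expected_request):
    """Raise if any request field differs between the two dictionaries."""
    changed = sorted(
        key
        for key in set(stored_request) | set(expected_request)
        if stored_request.get(key) != expected_request.get(key)
    )
    if changed:
        raise StateMismatchError(
            "cannot resume: request fields changed: " + ", ".join(changed)
        )
```

For instance, resuming with {"aggregate_size": 12} against a stored {"aggregate_size": 10} would be rejected, naming aggregate_size as the changed field.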

validate_request(expected_request): Reject resume attempts that change the solvation calculation.

validate_progress(): Reject incomplete or inconsistent persisted cycle progress.

current_seed_molecules(): Load seed geometries to be used for the next solvation cycle.

complete_cycle(cycle_number, selected_molecules)

Persist a completed cycle and update seeds for the next cycle.

The selected molecules are snapshotted to disk so the next cycle can be resumed from an inspectable geometry set rather than from in-memory optimizer objects.

finish(final_status='completed'): Persist terminal workflow state and final selected output paths.

save(): Atomically write solvation/state.json.

Aggregate workflow orchestration for PyAR.

This module owns the aggregation and cluster-generation workflow used by pyar-cli -a and the legacy aggregator compatibility layer. It is responsible for:

validating the aggregate request and restart state
selecting pathways and orientation sets
creating the working directory tree under aggregates/
running the backend optimization steps for each pathway
collecting the selected geometries and workflow result metadata

The public entry points are aggregate() and aggregate_from_formulas().

pyar.workflows.aggregate.aggregate(molecules, aggregate_sizes, hm_orientations, qc_params, maximum_number_of_seeds, first_pathway, number_of_pathways, site)

Run an aggregate or cluster-generation workflow.

The function creates or resumes the aggregation state, iterates over the selected pathways, runs the per-orientation optimization for each path, and returns an AggregateResult describing the final run state.

pyar.workflows.aggregate.aggregate_from_formulas(formulas, aggregate_sizes, hm_orientations, qc_params, maximum_number_of_seeds, first_pathway, number_of_pathways, site)

Generate initial molecules from formulas and run the aggregate workflow.

This is the convenience entry point used by --formula aggregation runs. It converts the provided formulas into molecule objects and then delegates to aggregate().

pyar.workflows.aggregate.expand_formula_to_aggregate_inputs(formula): Convert a formula into aggregate input symbols and multiplicities.

pyar.workflows.aggregate.generate_molecule_from_formula(formula, box_size=None, rng=None): Generate a molecule object from a chemical formula.

AFIR reaction-path analysis for recorded geomeTRIC traces.

The analysis layer reads the JSONL/XYZ trace emitted by the reaction workflow, writes a CSV summary, exports candidate geometries for the most interesting path points, and can generate standalone plots from the same recorded trace.

pyar.reaction_analysis.analyse_reaction_trace(job_directory)

Write a compact summary and candidate geometries for one reaction path.

job_directory may point either to the workflow root or directly to a reaction_trace/ directory. The function writes path_summary.csv in the job root, exports the candidate XYZ files under candidate_ts/, and returns a metadata dictionary describing the selected points.

pyar.reaction_analysis.plot_reaction_trace(job_directory, output_directory=None)

Create PNG plots from a recorded reaction trace.

The plots summarize the relative energies, bond-change count, interfragment distance, and force norms across the recorded steps. When no output directory is supplied, the figures are written to trace_plots/ in the job root.

Reaction-path trace recording for geomeTRIC-backed AFIR optimizations.

The recorder writes a JSONL trace plus per-step XYZ snapshots for each geomeTRIC-backed reaction evaluation. The resulting trace is the input for the analysis and plotting helpers in pyar.reaction_analysis.

pyar.reaction_trace.infer_bonds(symbols, coordinates_angstrom, previous_bonds=None)

Infer a conservative bond set from coordinates and covalent radii.

The heuristic is intentionally cautious: new bonds are only introduced when the interatomic distance is comfortably below the covalent-radius threshold, while previously observed bonds are allowed a wider hysteresis window so small numerical changes do not flicker the bond list.
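
The hysteresis idea can be sketched with two distance thresholds: a tight one for forming a bond and a looser one for keeping a bond seen in the previous step. The scale factors (1.15 and 1.35) and the small radii table are illustrative assumptions, not the values used by PyAR.

```python
import math
from itertools import combinations

# Illustrative covalent radii in angstrom for a few elements only.
COVALENT_RADII = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66}


def infer_bonds(symbols, coordinates_angstrom, previous_bonds=None,
                form_scale=1.15, keep_scale=1.35):
    """Return bonded index pairs (i, j) with i < j."""
    previous_bonds = previous_bonds or set()
    bonds = set()
    for i, j in combinations(range(len(symbols)), 2):
        reference = COVALENT_RADII[symbols[i]] + COVALENT_RADII[symbols[j]]
        distance = math.dist(coordinates_angstrom[i], coordinates_angstrom[j])
        # Existing bonds get the wider window so they do not flicker.
        scale = keep_scale if (i, j) in previous_bonds else form_scale
        if distance <= scale * reference:
            bonds.add((i, j))
    return bonds
```

With these numbers a C-H pair at 1.35 angstrom is not reported as a new bond, but it is retained if the pair was already bonded in the previous step.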

pyar.reaction_trace.bond_changes(previous_bonds, current_bonds): Return formed and broken bonds between consecutive bond sets.
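
With bonds stored as sets of index pairs, the comparison reduces to two set differences. The return shape (a pair of sorted lists) is an assumption.

```python
def bond_changes(previous_bonds, current_bonds):
    """Return (formed, broken) bond lists between two bond sets."""
    previous, current = set(previous_bonds), set(current_bonds)
    formed = sorted(current - previous)
    broken = sorted(previous - current)
    return formed, broken
```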

pyar.reaction_trace.min_interfragment_distance(coordinates_angstrom, fragment_indices)

Return the minimum distance between atoms in distinct fragments.

The value is only computed when at least two non-empty fragments are available. Otherwise the trace record stores None for this field.
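
A direct sketch of that rule, assuming fragment_indices is a sequence of atom-index lists and coordinates are plain (x, y, z) tuples:

```python
import math
from itertools import combinations


def min_interfragment_distance(coordinates_angstrom, fragment_indices):
    """Return the closest atom-atom distance between different fragments."""
    fragments = [list(f) for f in (fragment_indices or []) if len(f) > 0]
    if len(fragments) < 2:
        # Fewer than two non-empty fragments: nothing to measure.
        return None
    return min(
        math.dist(coordinates_angstrom[i], coordinates_angstrom[j])
        for frag_a, frag_b in combinations(fragments, 2)
        for i in frag_a
        for j in frag_b
    )
```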

pyar.reaction_trace.validate_trace_record(record, record_index)

Validate the structure of one trace record.

The validator enforces the schema used by the analysis and plotting code and returns a normalized record with numeric fields converted to native Python types.

class pyar.reaction_trace.ReactionTraceRecorder(job_directory, trace_name='reaction_trace', mode='write')

Bases: object

Append geomeTRIC evaluation data to JSONL and XYZ trace files.

The recorder is stateful: it tracks the next step index, reconstructs the previous bond set when appending to an existing trace, and keeps the JSONL trace and XYZ snapshots synchronized on disk.

record(*, symbols, coordinates_angstrom, backend_energy_hartree, afir_energy_hartree, total_energy_hartree, backend_forces_hartree_per_bohr, afir_forces_hartree_per_bohr, total_forces_hartree_per_bohr, backend_force_norm, afir_force_norm, total_force_norm, max_force, fragment_indices=None)

Write one trace record and its corresponding XYZ snapshot.

The recorded JSON object includes the raw energies/forces, bond-change counts, and interfragment distance used by the downstream analysis tools. The companion XYZ snapshot preserves the optimized geometry for the same step index.

pyar.reaction_trace.load_trace_records(trace_file)

Load and validate JSONL trace records from trace_file.

trace_file may be either the JSONL file itself or the containing reaction_trace/ directory. Each line is validated and normalized before being returned, so callers can rely on the trace schema.
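
The load-and-validate loop might look like the sketch below. The file name trace.jsonl, the step field, and the reduced REQUIRED_FIELDS tuple are assumptions; the real schema is enforced by validate_trace_record().

```python
import json
import os

# Assumed subset of the schema, for illustration only.
REQUIRED_FIELDS = ("step", "symbols", "total_energy_hartree")


def load_trace_records(trace_file, jsonl_name="trace.jsonl"):
    """Read a JSONL trace, checking and normalizing each record."""
    if os.path.isdir(trace_file):
        trace_file = os.path.join(trace_file, jsonl_name)
    records = []
    with open(trace_file) as handle:
        for index, line in enumerate(handle):
            if not line.strip():
                continue  # tolerate blank lines
            record = json.loads(line)
            missing = [name for name in REQUIRED_FIELDS if name not in record]
            if missing:
                raise ValueError(f"record {index} is missing fields: {missing}")
            # Normalize numeric fields to native Python types.
            record["step"] = int(record["step"])
            record["total_energy_hartree"] = float(record["total_energy_hartree"])
            records.append(record)
    return records
```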

Reaction workflow orchestration for PyAR.

This module owns the AFIR reaction-search pipeline:

build the numeric gamma schedule
prepare and persist restartable reaction state
generate trial orientations for each gamma cycle
optimize each orientation with the selected backend
perform unbiased relaxation when a bonded candidate is found
deduplicate and persist unique products
emit trace-analysis summaries for successful paths

The module is the implementation behind the legacy pyar.reactor import path and the pyar-react command-line entry point.

pyar.workflows.reaction.print_header(gamma_max, gamma_min, hm_orientations, software)

Log the fixed header used at the start of a reaction run.

The header mirrors the historical reactor logging style and gives the caller a single place to confirm the gamma range, orientation count, and backend before the workflow begins mutating the working directory.

pyar.workflows.reaction.with_gamma(qc_params, gamma)

Return a copy of qc_params with a specific AFIR gamma applied.

The helper also enables trace recording only for geomeTRIC-backed runs with a non-zero gamma, because those are the only configurations that produce the path trace used by the analysis tooling.

pyar.workflows.reaction.without_afir_bias(qc_params)

Return parameters for unbiased physical relaxation.

This is the relaxation step applied after a bonded AFIR candidate has been identified. It preserves the physical backend configuration while forcing gamma=0.0 so the candidate can be re-optimized without the AFIR bias.
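
Both helpers amount to copying the parameter dictionary and overriding a few keys. In the sketch below the key names gamma, opt_driver, and record_trace, and the value "geometric", are hypothetical; only the copy-then-override pattern is the point.

```python
def with_gamma(qc_params, gamma):
    """Return a copy of qc_params with the AFIR gamma set."""
    params = dict(qc_params)  # shallow copy; the caller's dict is untouched
    params["gamma"] = float(gamma)
    # Trace recording only makes sense for a biased geomeTRIC run.
    params["record_trace"] = (
        params.get("opt_driver") == "geometric" and params["gamma"] != 0.0
    )
    return params


def without_afir_bias(qc_params):
    """Return a copy of qc_params configured for unbiased relaxation."""
    params = dict(qc_params)
    params["gamma"] = 0.0
    params["record_trace"] = False
    return params
```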

pyar.workflows.reaction.build_gamma_schedule(gamma_min, gamma_max, steps=10)

Build the numeric AFIR gamma schedule used by the reaction workflow.

The schedule is inclusive and monotonic. A single-valued schedule is returned when the limits are equal; otherwise the workflow uses evenly spaced values between the endpoints.
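
A sketch of such a schedule, assuming steps counts the number of points (not intervals) and that values ascend from gamma_min to gamma_max. Both are assumptions; the description above only fixes inclusiveness, monotonicity, and even spacing.

```python
def build_gamma_schedule(gamma_min, gamma_max, steps=10):
    """Return evenly spaced gamma values including both endpoints."""
    if gamma_min > gamma_max:
        raise ValueError("gamma_min must not exceed gamma_max")
    if gamma_min == gamma_max:
        return [float(gamma_min)]
    if steps < 2:
        raise ValueError("at least two steps are needed for distinct limits")
    width = (gamma_max - gamma_min) / (steps - 1)
    schedule = [gamma_min + index * width for index in range(steps - 1)]
    # Append the exact endpoint to avoid floating-point drift.
    schedule.append(float(gamma_max))
    return schedule
```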

pyar.workflows.reaction.format_gamma_id(gamma)

Format a gamma value for directory names and readable job labels.

The formatter keeps directory names stable and lexicographically useful by zero-padding the integral part and encoding the decimal separator as p.
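
One way to get that behaviour is a fixed-width format followed by a character swap. The width of four integral digits and two decimals is an assumption, and the sketch only handles non-negative gamma values.

```python
def format_gamma_id(gamma, int_width=4, decimals=2):
    """Format gamma as a zero-padded label with 'p' as decimal separator."""
    total_width = int_width + 1 + decimals  # digits + separator + decimals
    return f"{gamma:0{total_width}.{decimals}f}".replace(".", "p")
```

Because every label has the same length, sorting the resulting directory names as strings matches sorting the gamma values numerically.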

pyar.workflows.reaction.build_reaction_request(reactant_a, reactant_b, gamma_list, hm_orientations, qc_params, site, proximity_factor)

Build the scientific request persisted with reaction restart state.

The request captures the scientifically relevant inputs that must remain fixed across restarts: the gamma schedule, the backend configuration, the selected site constraint, the proximity factor, and a signature of both reactants.

pyar.workflows.reaction.relax_without_afir_bias(molecule, qc_params)

Relax a bonded AFIR candidate on the unbiased physical objective.

The molecule is written to temporary XYZ snapshots before and after the relaxation so callers can inspect the pre- and post-relaxation geometries if the optimization succeeds.

pyar.workflows.reaction.initialize_reaction_run(reactant_a, reactant_b, gamma_min, gamma_max, hm_orientations, qc_params, site, proximity_factor)

Prepare a reaction run and return the mutable workflow state.

This function owns the restart contract for the reaction workflow. It either resumes an existing reaction state, migrates a legacy checkpoint, or creates a new reaction/ tree with all trial geometries staged for the first gamma cycle.

pyar.workflows.reaction.react(reactant_a, reactant_b, gamma_min, gamma_max, hm_orientations, qc_params, site, proximity_factor)

Run the reaction-search workflow for two reactants.

The workflow iterates over the gamma schedule, optimizes each orientation, records products and trace summaries, and returns a structured result that summarizes the final reaction state.

pyar.workflows.reaction.optimize_all(gamma_id, orientations, run_state, product_dir, qc_param)

Optimize all trial geometries for one gamma cycle.

Each orientation is written to its own job directory, optimized with the current gamma, and then either retained for the next gamma value or promoted to a unique product if it survives the unbiased relaxation step.

pyar.workflows.reaction.main(): Compatibility entry point retained for older imports and scripts.

Solvation workflow orchestration for PyAR.

This module owns the growth workflow behind pyar-cli --solvate. It persists a restartable solvation state, generates trial orientations for each cycle, and returns a structured result that describes the final selected seed geometries.

pyar.workflows.solvation.solvate(seeds, monomer, aggregate_size, hm_orientations, qc_params, maximum_number_of_seeds, site=None)

Run the solvation workflow for the provided seeds and solvent.

The workflow either resumes an existing solvation/ state tree or creates a new one, then iterates over the requested growth cycles while recording the seed geometries selected for the next cycle. The returned SolvationResult describes the final run state and the output directory.

Compatibility wrapper for pyar.state.solvation.

Legacy code imported solvation restart-state classes from pyar.solvation_state. The real implementation now lives in pyar.state.solvation, and this module re-exports it so the old import path remains valid.

Optimiser Module

Functions

optimise(molecule, qc_params)
write_csv_file(csv_filename, energy_dict)
bulk_optimize(input_molecules, qc_params)

pyar.optimiser.is_success(status)

pyar.optimiser.is_cycle_exceeded(status)

pyar.optimiser.is_usable(status)

pyar.optimiser.apply_geometry_result(molecule, geometry)

pyar.optimiser.build_geometry(molecule, qc_params): Create the interface wrapper for a configured software backend.

pyar.optimiser.optimise(molecule, qc_params): Run one optimization job and return the backend status object.

pyar.optimiser.write_csv_file(csv_filename, energy_dict)

pyar.optimiser.bulk_optimize(input_molecules, qc_params): Optimize molecules and keep only usable results.

pyar.optimiser.main()

pyar.scan.generate_guess_for_bonding(molecule_id, seed, monomer, a, b, number_of_orientations, d_scale)

pyar.scan.ab_dist(a, b, monomer, seed, pts)

pyar.scan.generate_guess_for_bonding_brute_force(molecule_id, seed, monomer, a, b, number_of_orientations, d_scale)

pyar.scan.scan_distance(input_molecules, site_atoms, number_of_orientations, quantum_chemistry_parameters)

Supporting utilities

Calculate or retrieve properties of the input molecules.

pyar.property.distance(coords_a, coords_b)

Calculates distance between atoms a and b.

Parameters:

coords_a (ndarray(float)) – Coordinates of atom a
coords_b (ndarray(float)) – Coordinates of atom b

Returns:

The distance between atoms a and b

Return type:

float
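For orientation, the quantity described above is the Euclidean norm of the difference between the two coordinate vectors. The helper below is a standalone NumPy sketch with a hypothetical name, not the library's own code.

```python
import numpy as np


def sketch_distance(coords_a, coords_b):
    """Euclidean distance between two Cartesian points."""
    a = np.asarray(coords_a, dtype=float)
    b = np.asarray(coords_b, dtype=float)
    return float(np.linalg.norm(a - b))
```

The result is in the same length unit as the input coordinates.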

pyar.property.get_centroid(coordinates)

pyar.property.get_centre_of_mass(coordinates, atomic_mass)
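The centroid and the centre of mass differ only in the weighting: the centroid is the unweighted mean of the coordinates, and the centre of mass weights each atom by its mass. The sketch below uses hypothetical function names and assumes an (N, 3) coordinate array with a length-N mass array.

```python
import numpy as np


def sketch_centroid(coordinates):
    """Unweighted mean position of all atoms."""
    return np.asarray(coordinates, dtype=float).mean(axis=0)


def sketch_centre_of_mass(coordinates, atomic_mass):
    """Mass-weighted mean position of all atoms."""
    coords = np.asarray(coordinates, dtype=float)
    masses = np.asarray(atomic_mass, dtype=float)
    return (masses[:, None] * coords).sum(axis=0) / masses.sum()
```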

pyar.property.get_average_radius(coordinates, centroid)

pyar.property.get_std_of_radius(coordinates, centroid)

pyar.property.get_principal_axes(moments_of_inertia_tensor)

pyar.property.get_distance_list(coordinates)

pyar.property.get_distance_matrix(coordinates)

pyar.property.get_bond_matrix(coordinates, covalent_radius): Return the bond matrix.

pyar.property.get_connectivity(coordinates, covalent_radius): Return the connectivity graph.
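A common way to derive a bond matrix from coordinates and covalent radii is to call two atoms bonded when their distance is below a scaled sum of their radii. The sketch below illustrates that general idea only. The function names, the tolerance factor of 1.3, and the adjacency-dictionary form of the graph are assumptions, and the library's actual criterion and return types may differ.

```python
import numpy as np


def sketch_distance_matrix(coordinates):
    """All pairwise Euclidean distances as an (N, N) array."""
    coords = np.asarray(coordinates, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def sketch_bond_matrix(coordinates, covalent_radius, tolerance=1.3):
    """Boolean (N, N) matrix; True where two atoms are close enough to bond.

    `tolerance` is a hypothetical scaling factor chosen for this sketch.
    """
    radii = np.asarray(covalent_radius, dtype=float)
    cutoff = tolerance * (radii[:, None] + radii[None, :])
    bonded = sketch_distance_matrix(coordinates) < cutoff
    np.fill_diagonal(bonded, False)  # an atom is not bonded to itself
    return bonded


def sketch_connectivity(coordinates, covalent_radius, tolerance=1.3):
    """Adjacency dictionary {atom index: [bonded atom indices]}."""
    bonded = sketch_bond_matrix(coordinates, covalent_radius, tolerance)
    return {i: np.flatnonzero(row).tolist() for i, row in enumerate(bonded)}
```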

pyar.property.calculate_angle(a1, b1, c1)
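The signature suggests a three-point angle. The sketch below assumes that b1 is the vertex and that the result is reported in degrees. Neither is confirmed here, and the function name is hypothetical.

```python
import numpy as np


def sketch_angle(a1, b1, c1):
    """Angle a1-b1-c1 in degrees, with b1 assumed to be the vertex."""
    a1, b1, c1 = (np.asarray(p, dtype=float) for p in (a1, b1, c1))
    ba = a1 - b1
    bc = c1 - b1
    cosine = np.dot(ba, bc) / (np.linalg.norm(ba) * np.linalg.norm(bc))
    # Clip guards against values slightly outside [-1, 1] from rounding.
    return float(np.degrees(np.arccos(np.clip(cosine, -1.0, 1.0))))
```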

pyar.property.hydrogen_bond_analysis(coordinates, covalent_radius, atomic_number, atoms_list): Return the bond matrix.

pyar.property.calculate_dihedral(p0, p1, p2, p3)
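A dihedral over four points is usually computed by projecting the two outer bond vectors onto the plane perpendicular to the central bond and taking the signed angle between the projections. The sketch below follows that standard construction and returns degrees. The function name, the unit, and the sign convention are assumptions and may not match the library.

```python
import numpy as np


def sketch_dihedral(p0, p1, p2, p3):
    """Signed dihedral angle p0-p1-p2-p3 in degrees."""
    p0, p1, p2, p3 = (np.asarray(p, dtype=float) for p in (p0, p1, p2, p3))
    b0 = p0 - p1
    b1 = p2 - p1
    b2 = p3 - p2
    b1 = b1 / np.linalg.norm(b1)  # unit vector along the central bond

    # Components of the outer bonds perpendicular to the central bond.
    v = b0 - np.dot(b0, b1) * b1
    w = b2 - np.dot(b2, b1) * b1

    x = np.dot(v, w)
    y = np.dot(np.cross(b1, v), w)
    return float(np.degrees(np.arctan2(y, x)))
```

Using arctan2 rather than arccos keeps the sign of the angle and avoids loss of precision near 0 and 180 degrees.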

Selection services

Persistence helpers for basin selection memory.

pyar.selection.basin_memory.record_selected_basins(selected_molecules, output_root='selected'): Persist final selected geometries in the per-stoichiometry basin registry.

Deduplication and connectivity filtering for selected geometries.

pyar.selection.deduplication.calc_fingerprint_distance(a, b): Calculate the distance between two fingerprints.

pyar.selection.deduplication.remove_similar(list_of_molecules): Remove near-duplicate geometries from a candidate pool.
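Near-duplicate removal of this kind is often a greedy filter: visit candidates in energy order and keep one only if its fingerprint is far enough from every candidate already kept. The sketch below illustrates that general pattern. The dictionary layout, the Euclidean fingerprint distance, and the threshold value are all hypothetical, and the library's own criteria are not documented here.

```python
import math


def sketch_fingerprint_distance(a, b):
    """Euclidean distance between two equal-length fingerprint vectors."""
    return math.dist(a, b)


def sketch_remove_similar(candidates, threshold=0.1):
    """Keep the lowest-energy member of each group of near-duplicates.

    Each candidate is a hypothetical dict with "energy" and "fingerprint".
    """
    kept = []
    for candidate in sorted(candidates, key=lambda c: c["energy"]):
        is_new = all(
            sketch_fingerprint_distance(
                candidate["fingerprint"], other["fingerprint"]
            ) > threshold
            for other in kept
        )
        if is_new:
            kept.append(candidate)
    return kept
```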

Diversity selection helpers.

pyar.selection.diversity.print_energy_table(molecules, stream=None, title=None): Report energies with relative values against the global minimum.
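The report described above amounts to subtracting the lowest energy from every entry. The sketch below shows that calculation with a hypothetical name-to-energy mapping instead of molecule objects. The column layout is an assumption, and no unit conversion is applied.

```python
import sys


def sketch_energy_table(energies, stream=None, title=None):
    """Print absolute and relative energies, lowest first.

    `energies` is a hypothetical {name: energy} mapping; the relative
    column is in the same unit as the input.
    """
    stream = stream or sys.stdout
    lowest = min(energies.values())
    rows = [
        (name, energy, energy - lowest)
        for name, energy in sorted(energies.items(), key=lambda item: item[1])
    ]
    if title:
        print(title, file=stream)
    for name, energy, relative in rows:
        print(f"{name:<12s} {energy:16.6f} {relative:12.6f}", file=stream)
    return rows
```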

Clustering algorithms and cluster representatives for selection.

pyar.selection.clustering.determine_dbscan_params(dt)

pyar.selection.clustering.generate_labels(dt, algorithm='hdbscan', maximum_number_of_seeds=8)

pyar.selection.clustering.kmeans_clustering(dt, n_clusters)

pyar.selection.clustering.dbscan_clustering(dt)

pyar.selection.clustering.optics_clustering(dt)

pyar.selection.clustering.hdbscan_clustering(dt)

pyar.selection.clustering.affinity_propagation_clustering(dt)

pyar.selection.clustering.mean_shift_clustering(dt)

pyar.selection.clustering.spectral_clustering(dt, n_clusters)

pyar.selection.clustering.agglomerative_clustering(dt, n_clusters)

pyar.selection.clustering.gaussian_mixture_clustering(dt, n_components)

pyar.selection.clustering.rbf_kernel_clustering(dt, threshold=0.99)

pyar.selection.clustering.get_the_best_molecule(list_of_molecules)

pyar.selection.clustering.print_energy_table(molecules, stream=None, title=None): Report energies with relative values against the global minimum.

pyar.selection.clustering.select_best_from_each_cluster(labels, list_of_molecules)
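Choosing one representative per cluster can be sketched as a single pass over the labels. The version below assumes that "best" means lowest energy and that energies are passed as a plain list aligned with the labels. Both are assumptions, and the function name is hypothetical. It also treats the noise label -1, which some clustering algorithms emit, as an ordinary group, and the library may handle it differently.

```python
def sketch_select_best(labels, energies):
    """Return the index of the lowest-energy member of each cluster."""
    best = {}
    for index, (label, energy) in enumerate(zip(labels, energies)):
        if label not in best or energy < energies[best[label]]:
            best[label] = index
    # Report representatives in ascending label order.
    return [best[label] for label in sorted(best)]
```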

For xTB-specific APIs, see API Reference.