mspasspy.history

class mspasspy.history.HistoryLogger(db, job=0)[source]

Bases: object

Base class for generic, global history/provenance preservation in MsPASS. The main concept of this object is that a pymongo script that runs a processing job creates this object, or one of its children, to preserve the global run parameters for a processing sequence. We limit that to mean a sequence of processing algorithms that have a set of predefined parameters that control their behaviour. The global parameters are preserved in a special MongoDB collection with the fixed name “history”.

register(alg, partype, params)[source]

Register an algorithm’s signature to preserve processing history. Each algorithm in a processing chain should be registered by this mechanism before the chain is started. The register method should be called in the order in which the algorithms are applied.

Parameters:
  • alg – is the name of the algorithm that will be run. Assumed to be a string.

  • partype – defines the format of the data defining input parameters to this algorithm (must be either ‘dict’ or ‘AntelopePf’)

  • params – is the actual input data. The actual type of the data this argument references depends on partype, which defines the type of object expected (‘dict’ means a python dict object)

Raise:

Throws a RuntimeError with a message if partype is not on the list of supported parameter types
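
The registration contract described above (ordered calls, a restricted set of partype values, a RuntimeError on anything else) can be illustrated with a small stand-in. This is a minimal sketch and not the MsPASS implementation: `SketchLogger`, its `entries` list, and the algorithm names are hypothetical, and no MongoDB connection is involved.

```python
SUPPORTED_PARTYPES = ("dict", "AntelopePf")


class SketchLogger:
    """Hypothetical stand-in that mimics the documented register() contract."""

    def __init__(self, job=0):
        self.job = job
        self.entries = []  # kept in call order, i.e. the order of application

    def register(self, alg, partype, params):
        # Reject unsupported parameter types, as the documentation describes.
        if partype not in SUPPORTED_PARTYPES:
            raise RuntimeError(
                "register: unsupported partype '{}' for algorithm '{}'".format(
                    partype, alg
                )
            )
        self.entries.append(
            {"algorithm": alg, "partype": partype, "params": params}
        )


logger = SketchLogger(job=1)
# Register algorithms in the order they will be applied.
logger.register("detrend", "dict", {"type": "linear"})
logger.register("filter", "dict", {"freqmin": 0.5, "freqmax": 2.0})

# An unsupported partype is rejected with a RuntimeError.
try:
    logger.register("decimate", "yaml", {"factor": 2})
    rejected = False
except RuntimeError:
    rejected = True
```

With the real class, the same sequence of register calls would be followed by a call to save() to write the accumulated record to the history collection.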

save()[source]

Save the contents to the history collection.

The document created in a save is more or less an image of the structure of this object translated to a python dict.

class mspasspy.history.basic_history_data(job)[source]

Bases: object

This is a pure data object that if it were written in C could be defined as a struct. It holds the data used to define the parameters for a given algorithm.

load_algorithm_args(alg, argdict)[source]

Loads parameters defined as a set of function arguments.

Simple algorithms without a lot of parameters often need only a set of argument values. Here we require these to be defined by a set of key:value pairs that map to a dict. We also consider this the lowest common denominator for a parameter definition, so we make it a part of the base class.

Parameters:
  • alg – This should be a string defining the algorithm being registered.

  • argdict – This should be a dict of key:value pairs defining input parameters. For algorithms defined at the top level by a python function, the keys should match the names of parameters in the arg list. For C++ functions wrapped with pybind11, they should match the arg keys defined in the wrappers. Each key string is used as the key of the corresponding key:value pair in the BSON written to MongoDB.
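
One way to confirm that the keys of an argdict match the parameter names of a python function is to compare them against the function's signature. The following is a sketch only: `bandpass` and `check_argdict` are hypothetical names introduced for illustration and are not part of mspasspy.

```python
import inspect


def bandpass(data, freqmin=0.5, freqmax=2.0, corners=4):
    """Hypothetical processing function; the body is irrelevant here."""
    return data


def check_argdict(func, argdict):
    """Return the sorted keys of argdict that are not parameter names of func."""
    valid = set(inspect.signature(func).parameters)
    return sorted(k for k in argdict if k not in valid)


good = {"freqmin": 1.0, "freqmax": 5.0, "corners": 2}
bad = {"freqmin": 1.0, "fmax": 5.0}  # "fmax" is not a parameter of bandpass

unknown_good = check_argdict(bandpass, good)
unknown_bad = check_argdict(bandpass, bad)
```

A check of this kind before calling load_algorithm_args would catch misspelled keys before they are written to the history record.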

mspasspy.history.get_jobid(db)[source]

Ask MongoDB for a valid jobid.

All processing jobs should have a call to this function at the beginning of the job script. It simply queries MongoDB for the largest current value of the key “jobid” in the history collection. If the history collection is empty it returns 1, on the premise that a jobid of 0 is illogical.

Parameters:

db (top level database handle returned by a call to MongoClient.database) – database handle
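
The lookup described above can be sketched without a database by treating the history collection as a list of dicts. `next_jobid` is a hypothetical name, and returning the largest existing jobid plus one for a non-empty collection is an assumption: the description only states that the largest current value is queried and that an empty collection yields 1.

```python
def next_jobid(history_docs):
    """Sketch of the documented logic with a list of dicts in place of MongoDB."""
    jobids = [doc["jobid"] for doc in history_docs if "jobid" in doc]
    if not jobids:
        # Empty history: start at 1, since a jobid of 0 is treated as illogical.
        return 1
    # ASSUMPTION: a new job takes the largest existing jobid plus one.
    return max(jobids) + 1


first = next_jobid([])
later = next_jobid([{"jobid": 3}, {"jobid": 7}])
```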

class mspasspy.history.pf_history_data(job, alg, pf)[source]

Bases: basic_history_data

Loads history data container with data from an AntelopePf object.

mspasspy.ccore.utility defines the AntelopePf object, which is an option for parameter inputs. The file structure is identical to the Antelope Pf file syntax. The API to an AntelopePf is not, however, the same as the python bindings in Antelope: it handles Tbl and Arr sections completely differently, more in line with alternatives like YAML. This method converts the data in an AntelopePf to a python dict that can be dumped directly to MongoDB with pymongo’s insert methods. Converting the MongoDB document back to a pf structure requires an inverse operator that does not exist, but one should eventually be created if this approach sees extensive use.

mspasspy.history.pfbranch_to_dict(pf, key)[source]

Recursive function to convert a single branch in an AntelopePf to a python dict.

This function uses recursion to follow a chain of branches of arbitrary length defined in an AntelopePf object. The result is a dict with a chain of dicts of the same length; i.e., if the AntelopePf has 3 levels of branches, the dict will have 3 levels of associative arrays keyed by the same branch names as the Arr items in the original Pf file. Note this should be called from the top level one branch at a time; i.e., for the parent AntelopePf this function should be called once for each key returned by pf.arr_keys().

Note that at each level the &Tbl sections of the original pf are converted to lists of strings, with each line of the Tbl section being one string in the list.

Parameters:
  • pf – is an AntelopePf. Recursive calls use get_branch outputs that return one of these.

  • key (string) – key used to access the branch requested

Returns:

python dict translation of AntelopePf branch structure

Raise:

Runtime errors are possible from the ccore methods that are called.
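
The recursion described for this function can be sketched with a small stand-in for AntelopePf. This is an illustration under stated assumptions, not the MsPASS code: `FakePf` and `branch_to_dict` are hypothetical, and only the method names arr_keys and get_branch come from the description above. The way simple values and Tbl lists are stored in `FakePf` is invented for the example.

```python
class FakePf:
    """Hypothetical stand-in for AntelopePf: simple values, Tbl lists, Arr branches."""

    def __init__(self, simple=None, tbls=None, arrs=None):
        self.simple = simple or {}  # plain key:value parameters
        self.tbls = tbls or {}      # Tbl sections, each a list of line strings
        self.arrs = arrs or {}      # Arr sections, each another FakePf

    def arr_keys(self):
        return list(self.arrs)

    def get_branch(self, key):
        return self.arrs[key]


def branch_to_dict(pf, key):
    """Recursively convert one branch of a FakePf to nested dicts."""
    branch = pf.get_branch(key)
    result = dict(branch.simple)
    # Each Tbl section becomes a list of strings, one per line.
    for tbl_key, lines in branch.tbls.items():
        result[tbl_key] = list(lines)
    # Each nested Arr section becomes a nested dict via recursion.
    for sub_key in branch.arr_keys():
        result[sub_key] = branch_to_dict(branch, sub_key)
    return result


inner = FakePf(simple={"taper": "cosine"}, tbls={"bands": ["0.5 2.0", "2.0 8.0"]})
middle = FakePf(simple={"order": "4"}, arrs={"window": inner})
top = FakePf(arrs={"filter": middle})

# Called once per top-level branch key, as the description recommends.
converted = {key: branch_to_dict(top, key) for key in top.arr_keys()}
```

Because the result contains only dicts, lists, and strings, a structure of this shape can be passed to pymongo's insert methods without further conversion.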