mspasspy.global_history

manager

class mspasspy.global_history.manager.GlobalHistoryManager(database_instance, job_name, collection=None)[source]

Bases: object

A Global History Manager handler.

This is a handler used in the mspass_client. Normally, users should not create a Global History Manager directly. Instead, they should obtain the Global History Manager through the mspass client’s methods.

get_alg_id(alg_name, parameters)[source]

Get the alg_id recorded for the given combination of alg_name and parameters

Parameters:
  • alg_name (str) – the name of the algorithm

  • parameters (str) – the parameters of the algorithm

Returns:

alg_id (bson.objectid.ObjectId) – the UUID of the combination of algorithm_name and parameters

get_alg_list(job_name, job_id=None)[source]

Get a list of history records by job name (and, optionally, job_id)

logging(alg_id, alg_name, parameters)[source]

Save the usage of the algorithm in the map/reduce operation

Parameters:
  • alg_id (bson.objectid.ObjectId) – the UUID of the combination of algorithm_name and parameters

  • alg_name (str) – the name of the algorithm

  • parameters (str) – the parameters of the algorithm
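
The relationship between logging and get_alg_id can be pictured with a small in-memory stand-in. This is only a sketch: ToyAlgRegistry is a hypothetical class, a uuid hex string stands in for the bson ObjectId, and returning None for an unknown combination is an assumption rather than documented behavior. The real manager keeps its records in a database.

```python
import uuid


class ToyAlgRegistry:
    """Hypothetical in-memory stand-in for the history records."""

    def __init__(self):
        # Maps (alg_name, parameters) -> alg_id.
        self._ids = {}

    def logging(self, alg_id, alg_name, parameters):
        # Record that this algorithm/parameter combination was used.
        self._ids[(alg_name, parameters)] = alg_id

    def get_alg_id(self, alg_name, parameters):
        # Assumption: an unknown combination yields None.
        return self._ids.get((alg_name, parameters))


registry = ToyAlgRegistry()
new_id = uuid.uuid4().hex  # stand-in for a bson ObjectId
registry.logging(new_id, "filter", "freqmin=1,freqmax=5")

same = registry.get_alg_id("filter", "freqmin=1,freqmax=5")
other = registry.get_alg_id("filter", "freqmin=2,freqmax=5")
```

The point of the sketch is that the id identifies the pair of name and parameters, so the same algorithm run with different parameters is a different entry.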

set_alg_name_and_parameters(alg_id, alg_name, parameters)[source]

Set the alg_name and parameters for a user-specified alg_id

Parameters:
  • alg_id (bson.objectid.ObjectId) – the UUID of the combination of algorithm_name and parameters, used to find the records

  • alg_name (str) – the algorithm name the user would like to set

  • parameters (str) – the algorithm parameters the user would like to set

mspasspy.global_history.manager.mspass_dask_fold(self, func, *args, global_history=None, object_history=False, alg_id=None, alg_name=None, parameters=None, **kwargs)[source]

This decorator method adds functionality to the standard dask fold method and becomes a member function of the dask bag library. Beyond performing the normal fold function, if the user provides a global history manager, with alg_id (optional), alg_name (optional) and parameters (optional) as input, the global history manager logs the usage of the algorithm. Also, if the user sets object_history to True, each mspass object in this fold function saves its object-level history.

Parameters:
  • func – target function

  • global_history (GlobalHistoryManager) – a user-specified global history manager

  • object_history – save each object’s history in the fold when True

  • alg_id (str/bson.objectid.ObjectId) – a user-specified alg_id for the fold operation

  • alg_name (str) – a user-specified alg_name for the fold operation

  • parameters (str) – user-specified parameters for the fold operation

Returns:

a dask bag format of objects.

mspasspy.global_history.manager.mspass_dask_map(self, func, *args, global_history=None, object_history=False, alg_id=None, alg_name=None, parameters=None, **kwargs)[source]

This decorator method adds functionality to the standard dask map method and becomes a member function of the dask bag library. Beyond performing the normal map function, if the user provides a global history manager, with alg_id (optional), alg_name (optional) and parameters (optional) as input, the global history manager logs the usage of the algorithm. Also, if the user sets object_history to True, each mspass object in this map function saves its object-level history.

Parameters:
  • func – target function

  • global_history (GlobalHistoryManager) – a user-specified global history manager

  • object_history – save each object’s history in the map when True

  • alg_id (str/bson.objectid.ObjectId) – a user-specified alg_id for the map operation

  • alg_name (str) – a user-specified alg_name for the map operation

  • parameters (str) – user-specified parameters for the map operation

Returns:

a dask bag format of objects.

mspasspy.global_history.manager.mspass_map(data, func, global_history=None, object_history=False, alg_id=None, alg_name=None, parameters=None)[source]

This decorator method performs the map function in Python. Beyond performing the normal map function, if the user provides a global history manager, with alg_id (optional), alg_name (optional) and parameters (optional) as input, the global history manager logs the usage of the algorithm. Also, if the user sets object_history to True, each mspass object in this map function saves its object-level history.

Parameters:
  • data – an iterable which is to be mapped.

  • func – target function to which map passes each element of the given iterable.

  • global_history (GlobalHistoryManager) – a user-specified global history manager

  • object_history – save each object’s history in the map when True

  • alg_id (str/bson.objectid.ObjectId) – a user-specified alg_id for the map operation

  • alg_name (str) – a user-specified alg_name for the map operation

  • parameters (str) – user-specified parameters for the map operation

Returns:

mapped objects.
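
The pattern described for mspass_map, a map that also records the algorithm usage, can be sketched with the standard library alone. The names ToyHistory and toy_map are hypothetical, falling back to the function's own name when alg_name is omitted is an assumption, and object-level history is left out entirely. This is not the mspasspy implementation.

```python
class ToyHistory:
    """Hypothetical stand-in for a global history manager."""

    def __init__(self):
        self.records = []

    def logging(self, alg_id, alg_name, parameters):
        self.records.append(
            {"alg_id": alg_id, "alg_name": alg_name, "parameters": parameters}
        )


def toy_map(data, func, global_history=None, alg_id=None,
            alg_name=None, parameters=None):
    if global_history is not None:
        # Assumption: use the function's name when no alg_name is given.
        name = alg_name if alg_name is not None else func.__name__
        # The usage is logged once per call, not once per element.
        global_history.logging(alg_id, name, parameters)
    return map(func, data)


def double(x):
    return 2 * x


history = ToyHistory()
result = list(
    toy_map([1, 2, 3], double, global_history=history,
            alg_id="demo-id", parameters="factor=2")
)
```

Without a history manager the sketch reduces to the built-in map, which matches the optional nature of the history arguments described above.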

mspasspy.global_history.manager.mspass_reduce(data, func, global_history=None, object_history=False, alg_id=None, alg_name=None, parameters=None)[source]

This method performs the reduce function using functools. Beyond performing the normal reduce function, if the user provides a global history manager, with alg_id (optional), alg_name (optional) and parameters (optional) as input, the global history manager logs the usage of the algorithm. Also, if the user sets object_history to True, each mspass object in this reduce function saves its object-level history.

Parameters:
  • data – data to be processed; it must be an iterable. func is applied cumulatively to the items of the iterable, from left to right, so as to reduce the iterable to a single value.

  • func – target function of two arguments

  • global_history (GlobalHistoryManager) – a user-specified global history manager

  • object_history – save each object’s history in the reduce when True

  • alg_id (str/bson.objectid.ObjectId) – a user-specified alg_id for the reduce operation

  • alg_name (str) – a user-specified alg_name for the reduce operation

  • parameters (str) – user-specified parameters for the reduce operation

Returns:

reduced objects.
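
The left-to-right, cumulative application of a two-argument function is the behavior of functools.reduce, which the description above names. The short sketch below traces the pairs passed to the function; traced_add is a hypothetical helper used only for illustration.

```python
from functools import reduce

steps = []


def traced_add(accumulated, item):
    # Record each pair the reduction passes to the function.
    steps.append((accumulated, item))
    return accumulated + item


total = reduce(traced_add, [1, 2, 3, 4])
```

Each call receives the running result as its first argument and the next item of the iterable as its second.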

mspasspy.global_history.manager.mspass_spark_map(self, func, *args, global_history=None, object_history=False, alg_id=None, alg_name=None, parameters=None, **kwargs)[source]

This decorator method adds functionality to the standard spark map method and becomes a member function of the spark RDD library. Beyond performing the normal map function, if the user provides a global history manager, with alg_id (optional), alg_name (optional) and parameters (optional) as input, the global history manager logs the usage of the algorithm. Also, if the user sets object_history to True, each mspass object in this map function saves its object-level history.

Parameters:
  • func – target function

  • global_history (GlobalHistoryManager) – a user-specified global history manager

  • object_history – save each object’s history in the map when True

  • alg_id (str/bson.objectid.ObjectId) – a user-specified alg_id for the map operation

  • alg_name (str) – a user-specified alg_name for the map operation

  • parameters (str) – user-specified parameters for the map operation

Returns:

a spark RDD format of objects.

mspasspy.global_history.manager.mspass_spark_reduce(self, func, *args, global_history=None, object_history=False, alg_id=None, alg_name=None, parameters=None, **kwargs)[source]

This decorator method adds functionality to the standard spark reduce method and becomes a member function of the spark RDD library. Beyond performing the normal reduce function, if the user provides a global history manager, with alg_id (optional), alg_name (optional) and parameters (optional) as input, the global history manager logs the usage of the algorithm. Also, if the user sets object_history to True, each mspass object in this reduce function saves its object-level history.

Parameters:
  • func – target function

  • global_history (GlobalHistoryManager) – a user-specified global history manager

  • object_history – save each object’s history in the reduce when True

  • alg_id (str/bson.objectid.ObjectId) – a user-specified alg_id for the reduce operation

  • alg_name (str) – a user-specified alg_name for the reduce operation

  • parameters (str) – user-specified parameters for the reduce operation

Returns:

a spark RDD format of objects.

ParameterGTree

class mspasspy.global_history.ParameterGTree.ParameterGTree(doc=None)[source]

Bases: OrderedDict

Base class for a family of objects used to hold an abstraction of a set of control parameters for a data processing function. The base class abstracts the concept of storing such data in a g-tree structure. In the documentation here the tree should be pictured as upright, as in the biological tree analogy, i.e. up means higher levels in the tree and down means dropping to lower levels in the tree.

This class inherits from collections.OrderedDict, so users can access the data using the common index operation, for example: gTree['phases']['travel_time_calculator']['taup']['model_name']. In addition, users can add, delete, update, and create entries in the same way as with an OrderedDict.

Each node of the g-tree may have leaves and/or a set of branches. Most simple algorithms need only one node with leaves made up of name-value pairs. More complex algorithms often need the tree structure to describe more complicated data control structures.
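
Because the class inherits from OrderedDict, the tree of leaves and branches can be pictured with plain nested OrderedDict objects. The sketch below uses only the standard library and does not construct a ParameterGTree; the key names follow the examples in this documentation, and the 'window' leaf is a hypothetical addition.

```python
from collections import OrderedDict

# A root node with one leaf ('name') and one branch ('phases').
tree = OrderedDict({
    "name": "demo",
    "phases": OrderedDict({
        "name": "P",
        "travel_time_calculator": OrderedDict({
            "taup": OrderedDict({"model_name": "iasp91"}),
        }),
    }),
})

# Chained indexing walks from the root to a leaf.
model = tree["phases"]["travel_time_calculator"]["taup"]["model_name"]

tree["phases"]["name"] = "S"      # update a leaf
tree["phases"]["window"] = 10.0   # add a leaf (hypothetical key)
del tree["name"]                  # delete a leaf at the root
```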

asdict()[source]

Return the dictionary representation of the ParameterGTree instance. This function will first update the internal dictionary control doc. Its return can be a built-in dict or a collections.OrderedDict, according to the input used when building this GTree.

Since ParameterGTree inherits from OrderedDict, an instance can be converted directly into a dict/OrderedDict without calling this function.

get(key, seperator='.')[source]

Fetch a value defined by key. For leaves at the root node the key can be a simple string. For a leaf attached at a higher level node we specify a chain of one or more branch names, with the keys joined by the specified seperator. Examples (all using the default value of seperator):

  1. If we had a leaf node with the key 'name' under the branch name 'phases' we use the compound key 'phases.name'. Such a tag could, for example, contain 'P' to define this set of parameters as related to the seismic P phase.

  2. Suppose the phases branch was linked to a higher level branch with the key 'travel_time_calculator' that had a leaf parameter 'taup' that is itself a branch name with terminal leaf keys under it. A real life example might be 'model_name'. We would refer to that leaf with the string 'phases.travel_time_calculator.taup.model_name'. That key might, for example, have the value 'iasp91', which could be passed to obspy’s taup calculator.

This method should also support a key defined as a python list. The list would need to be a set of keys that define the path to climb the tree to fetch the desired leaf. For this form the two examples above would be represented as follows:

  1. ['phases', 'name']

  2. ['phases', 'travel_time_calculator', 'taup', 'model_name']

Users can also use the built-in index operation to access the child elements, which is more natural. For example:

  1. ['phases']['name']

  2. ['phases']['travel_time_calculator']['taup']['model_name']
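
The compound-key lookup can be sketched as a walk over nested dictionaries. The helper toy_get below is hypothetical and operates on plain OrderedDict objects; it only illustrates how a string such as 'phases.name' or an equivalent list resolves to a leaf, and is not the library's implementation. The parameter is spelled seperator to mirror the signature above.

```python
from collections import OrderedDict


def toy_get(tree, key, seperator="."):
    # Accept either a compound string or an explicit list of keys.
    if isinstance(key, (list, tuple)):
        path = list(key)
    else:
        path = key.split(seperator)
    node = tree
    for name in path:
        node = node[name]  # raises KeyError if any step is missing
    return node


tree = OrderedDict({
    "phases": OrderedDict({
        "name": "P",
        "travel_time_calculator": OrderedDict({
            "taup": OrderedDict({"model_name": "iasp91"}),
        }),
    }),
})

phase = toy_get(tree, "phases.name")
model = toy_get(tree, ["phases", "travel_time_calculator", "taup", "model_name"])
```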

get_branch(key)[source]

Extract the contents of a named branch. Returns a copy of the tree with the associated key from the branch name upward. The tree returned will have the root of the tree set as current.

get_branch_keys()[source]

Return the keys for all branches from this level. Branches are keyed with a keyword string like leaves. Branches can be extracted with prune or we can walk the tree with a set of methods defined below.

get_leaf(key)[source]

Returns a copy of the key-value pair defined by key. This function only searches for the key in this layer, and won’t return a value stored in higher levels. To search the entire tree, use “get”.

get_leaf_keys()[source]

Return the keys for all key-value pairs that are leaves at the current level of the parameter tree. For branches this method can be used to extract leaves at the current level. Return is a dict of key-value pairs that we are calling the leaves of the tree.
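
The distinction between leaves and branches at one level can be illustrated on a plain nested OrderedDict. The sketch assumes that a branch is any entry whose value is itself a dictionary and that everything else is a leaf; leaf_keys and branch_keys are hypothetical helpers that return lists of keys, not the methods documented here.

```python
from collections import OrderedDict


def leaf_keys(node):
    # Entries whose values are not dictionaries are treated as leaves.
    return [k for k, v in node.items() if not isinstance(v, dict)]


def branch_keys(node):
    # Entries whose values are dictionaries are treated as branches.
    return [k for k, v in node.items() if isinstance(v, dict)]


tree = OrderedDict({
    "name": "demo",
    "window": 10.0,
    "phases": OrderedDict({"name": "P"}),
})

leaves = leaf_keys(tree)
branches = branch_keys(tree)
```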

prune(key)[source]

Remove a branch or leaf defined by key from self. Return a copy of the branch/leaf pruned in the process (like get_branch/get_leaf but self is altered)

put(key, value, separator='.')[source]

Putter with the same behavior for compound keys as defined for the get method. A put creates any new branch the key implies if that branch is not already present.

As with the setter function, users can also use indexing to put new data into the GTree. Please note that when putting data using indexes, new branches won’t be created automatically, and users should add the intermediate branches themselves.
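
The difference between a compound-key put and plain indexing can be sketched on an ordinary OrderedDict. The helper toy_put is hypothetical and only mimics the documented behavior of creating missing branches; the KeyError shown for indexing is the behavior of a plain OrderedDict, which the note above suggests also applies here.

```python
from collections import OrderedDict


def toy_put(tree, key, value, separator="."):
    if isinstance(key, (list, tuple)):
        path = list(key)
    else:
        path = key.split(separator)
    node = tree
    for name in path[:-1]:
        # Create each intermediate branch that does not exist yet.
        node = node.setdefault(name, OrderedDict())
    node[path[-1]] = value


tree = OrderedDict()
toy_put(tree, "phases.travel_time_calculator.taup.model_name", "iasp91")

# Plain indexing does not create the missing 'filters' branch.
try:
    tree["filters"]["bandpass"] = 1.0
    index_created_branch = True
except KeyError:
    index_created_branch = False
```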

update_control()[source]

Update the control doc according to the children at this level. Because the hierarchy data may change at deeper levels, every subtree needs to be checked. This is implemented by recursively updating the control doc of each tree.

mspasspy.global_history.ParameterGTree.parameter_to_GTree(*args, parameters_str=None, **kwargs)[source]

A helper function to parse parameters and build a GTree accordingly. This function would be used in GlobalHistoryManager to help record the parameters.

Parameters:
  • args – Non-keyworded arguments

  • kwargs – Keyworded arguments

  • parameters_str – a parameter string defined by user

Returns:

An OrderedDict of parameters and arguments.

mspasspy.global_history.ParameterGTree.params_to_parameters_dict(*args, **kwargs)[source]

Capture a function’s parameters and return a dict that stores the parameters and arguments. Filepath arguments will be parsed into a python object, and then turned into a dict. Currently pf files and yaml files are supported.

Parameters:
  • args – Non-keyworded arguments

  • kwargs – Keyworded arguments

Returns:

An OrderedDict of parameters and arguments.

mspasspy.global_history.ParameterGTree.parse_filepath_in_parameters(parameters_dict)[source]

Parse the filepath parameters in a function’s parameters dict. Filepath arguments will be parsed into a python object, and then turned into a dict. Currently pf files and yaml files are supported.

Parameters:

parameters_dict – parameter dict of a function

Returns:

An OrderedDict of parameters and arguments.

mspasspy.global_history.ParameterGTree.str_to_parameters_dict(parameter_str)[source]

Parse the parameter string defined by the user into an ordered dict. The input str should be in a format like “a, b, c=d, e=f, …”

Parameters:

parameter_str – a parameter string defined by user

Returns:

An OrderedDict of parameters and arguments.
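
Parsing a string of the form "a, b, c=d, e=f" into an ordered dict can be sketched as follows. The function toy_str_to_parameters_dict is hypothetical: naming positional entries arg_0, arg_1, and so on is an assumption made for this illustration, and all values are kept as strings. The library's own key naming and type handling may differ.

```python
from collections import OrderedDict


def toy_str_to_parameters_dict(parameter_str):
    result = OrderedDict()
    position = 0
    for item in parameter_str.split(","):
        item = item.strip()
        if not item:
            continue  # skip empty entries such as a trailing comma
        if "=" in item:
            # Keyword entry: split on the first '=' only.
            name, value = item.split("=", 1)
            result[name.strip()] = value.strip()
        else:
            # Positional entry: assumed key naming, for illustration only.
            result["arg_{}".format(position)] = item
            position += 1
    return result


parsed = toy_str_to_parameters_dict("a, b, c=d, e=f")
```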