datacatalog.managers package

Submodules

datacatalog.managers.common module

class datacatalog.managers.common.ManagerBase(*args, **kwargs)[source]

Bases: object

Base class for non-database, non-Agave functions

classmethod get_uuidtype(uuid)[source]

Identify the named type for a given UUID

Parameters:uuid (str) – UUID to classify by type
Returns:Named type of the UUID
Return type:str
classmethod listify_uuid(uuid, validate_members=True)[source]

Wraps typeduuid.listify_uuid()

classmethod sanitize(mongo_document)[source]

Strips out non-public fields from a JSON document
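
A minimal sketch of the kind of filtering sanitize describes. The rule used here (a field is non-public when its key starts with an underscore) is an assumption made for illustration, not something this page states, and sanitize_sketch is a hypothetical stand-in rather than the library method.

```python
def sanitize_sketch(mongo_document):
    """Return a copy of the document without underscore-prefixed keys.

    Assumption: "non-public" means the key starts with '_'.
    """
    return {key: value for key, value in mongo_document.items()
            if not key.startswith('_')}


# Hypothetical document with two private fields and one public field
doc = {'_id': 'abc123', '_admin': {'owner': 'someone'}, 'name': 'sample-1'}
public = sanitize_sketch(doc)
```

The sketch returns a new dict and leaves the input untouched; whether the real method copies or mutates is not documented here.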

class datacatalog.managers.common.Manager(mongodb, agave=None, *args, **kwargs)[source]

Bases: datacatalog.managers.common.ManagerBase

Manages operations across LinkedStores

RESOLVE_ORDER = ('file', 'reference', 'pipelinejob', 'pipeline', 'sample', 'measurement', 'experiment', 'experiment_design', 'challenge_problem', 'structured_request', 'process', 'fixity', 'association', 'tag_annotation', 'text_annotation')
RESOLVE_RE = re.compile('^(file|reference|pipelinejob|pipeline|sample|measurement|experiment|experiment_design|challenge_problem|structured_request|process|fixity|association|tag_annotation|text_annotation).')
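
The following sketch shows how the pattern above behaves when applied with the standard re module. The sample identifier strings and the leading_type helper are hypothetical; how Manager itself consumes the match is not described on this page. Two properties follow from the pattern as written: the trailing . is unescaped, so it matches any single character after the type name, and alternatives are tried left to right, so a string starting with experiment_design is captured as experiment.

```python
import re

# Tuple copied from the class attribute above
RESOLVE_ORDER = ('file', 'reference', 'pipelinejob', 'pipeline', 'sample',
                 'measurement', 'experiment', 'experiment_design',
                 'challenge_problem', 'structured_request', 'process',
                 'fixity', 'association', 'tag_annotation', 'text_annotation')

# Builds the same pattern string shown for RESOLVE_RE
RESOLVE_RE = re.compile('^(' + '|'.join(RESOLVE_ORDER) + ').')


def leading_type(identifier):
    """Return the type name captured at the start of identifier, or None."""
    match = RESOLVE_RE.match(identifier)
    return match.group(1) if match else None


print(leading_type('sample.abc'))             # sample
print(leading_type('pipelinejob.abc'))        # pipelinejob
print(leading_type('experiment_design.abc'))  # experiment
print(leading_type('sample'))                 # None (no character after the name)
```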
current_tapis_user(permissive=False)[source]

Learns the current TACC username

derivation_from_inputs(inputs=[])[source]

Retrieve derived_from linkages for a set of inputs

Parameters:inputs (list) – Identifier values for one or more inputs (e.g. name, file_id, uri)
Returns:a set of Typed UUIDs
Return type:list
designs_from_challenges(ids, permissive=True)[source]
experiments_from_designs(ids, permissive=True)[source]
experiments_from_experiments(ids, permissive=True)[source]
files_from_designs(ids, permissive=True)[source]
files_from_experiments(ids, permissive=True)[source]
files_from_measurements(ids, permissive=True)[source]
files_from_samples(ids, permissive=True)[source]
generator_from_inputs(inputs=[])[source]

Retrieve generated_by linkages for a set of inputs

Parameters:inputs (list) – Identifier values for one or more inputs (e.g. name, file_id, uri)
Returns:a set of Typed UUIDs
Return type:list
get_by_identifier(identifier_string, permissive=True)[source]

Search LinkedStores for a string identifier

Parameters:identifier_string (str) – An identifier string
Returns:The document that was retrieved
Return type:dict
get_by_uuid(uuid, permissive=True)[source]

Returns a LinkedStore document by UUID

Parameters:uuid (str) – UUID of the document to retrieve
Returns:The document that was retrieved
Return type:dict
get_by_uuids(uuids, permissive=True)[source]

Returns a list of LinkedStore documents by UUID

Parameters:uuids (list) – List of document UUIDs
Returns:The documents that were retrieved
Return type:list

User-friendly method to get UUIDs linked to an identifier

Parameters:
  • identifier (str) – Identifier string for record to be modified
  • linkage_name (str) – A valid Linkage
Raises:
  • ValueError – Invalid or unknown identifiers are encountered
  • ManagerError – Failed to fetch and return linkages
Returns:

A list of TypedUUIDs for the requested linkage

Return type:

list

get_tapis_user(username=None, permissive=False)[source]

Retrieve a username record from the Tapis profile service

get_uuid_from_identifier(identifier)[source]

Resolve an identifier into its corresponding UUID

Parameters:identifier (str) – A known distinct identifier from any collection
Returns:The string UUID for identifier; uuid_type: the TypedUUID type for uuid
Return type:uuid
classmethod init_stores(mongodb, agave=None)[source]
jobs_from_any(ids, permissive=True)[source]
kids_from_parents(ids, parent='experiment', parent_id='experiment_id', kid='sample', kid_id='uuid', permissive=False)[source]
level_from_lineage(lineage, level='experiment', permissive=False)[source]

Traverse a lineage and return value for a specific level

lineage_from_uuid(query_uuid, target='child_of', permissive=True)[source]

Get self-inclusive lineage for a given UUID

Parameters:
  • query_uuid (str) – The UUID to query on
  • target (str, optional) – The kind of linkage to follow
  • permissive (bool, optional) – If False, raise a ValueError when a simple lineage can't be determined
Raises:

ValueError – Raised if permissive==False and complete lineage traversal cannot be achieved

Returns:

Ordered list of tuples (<collection_level>, <uuid5>)

Return type:

list

Note

The lineage will include a reference to the original query at position 0. Access the UUID of immediate parent as follows: my_lineage[1][1].
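
The indexing in the note can be sketched with a plain list in the documented shape. The lineage values below are made up for illustration, and level_from_lineage_sketch is a hypothetical helper that only mimics what the level_from_lineage description suggests; it is not the library implementation.

```python
# Hypothetical lineage: ordered (collection_level, uuid) tuples,
# with the original query at position 0
my_lineage = [
    ('measurement', 'uuid-of-measurement'),
    ('sample', 'uuid-of-sample'),
    ('experiment', 'uuid-of-experiment'),
]

# Position 0 is the query itself
query_level, query_uuid = my_lineage[0]

# Position 1 is the immediate parent; element [1] of the tuple is its UUID
parent_uuid = my_lineage[1][1]


def level_from_lineage_sketch(lineage, level='experiment'):
    """Return the UUID recorded for the given level, or None if absent."""
    for collection_level, uuid in lineage:
        if collection_level == level:
            return uuid
    return None


experiment_uuid = level_from_lineage_sketch(my_lineage)
```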

User-friendly method to link two Data Catalog documents

Parameters:
  • identifier (str) – Identifier string for record to be modified
  • linked_identifier (str) – Identifier string for record to be linked
  • linkage_name (str) – A valid linkage name
  • token – String token authorizing edits to target record
Raises:

ValueError – Raised when invalid or unknown identifiers are encountered

Returns:

The modified record, including its new linkage

Return type:

dict

linkage_from_inputs(inputs=[], target='child_of')[source]

Retrieve linkage targets from a set of inputs

Filepaths will be resolved against the file collection and will return a reference to their immediate parent. URIs will be resolved against the reference collection and will return a reference to themselves.

Parameters:
  • inputs (list) – Identifier values for one or more inputs (e.g. name, file_id, uri)
  • target (str) – Linkage type to retrieve
Returns:

a set of Typed UUIDs

Return type:

list
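
The resolution rule described for linkage_from_inputs can be sketched with in-memory stand-ins. Everything here is an assumption made for illustration: the FILES and REFERENCES dicts, their record layout, the sample paths and UUID strings, and the de-duplication step. The real method queries the file and reference collections.

```python
# Hypothetical stand-ins for the file and reference collections
FILES = {
    '/uploads/a.fastq': {'uuid': 'file-uuid-1',
                         'child_of': ['measurement-uuid-1']},
}
REFERENCES = {
    'https://example.org/genome': {'uuid': 'reference-uuid-1',
                                   'child_of': []},
}


def linkage_from_inputs_sketch(inputs, target='child_of'):
    """Collect linkage targets for a list of file paths and URIs."""
    linked = []
    for value in inputs:
        if value in REFERENCES:
            # A URI resolves to the reference record itself
            linked.append(REFERENCES[value]['uuid'])
        elif value in FILES:
            # A file path resolves to the targets of the requested linkage
            linked.extend(FILES[value].get(target, []))
    # Drop duplicates while keeping first-seen order
    return list(dict.fromkeys(linked))


result = linkage_from_inputs_sketch(
    ['/uploads/a.fastq', 'https://example.org/genome', '/uploads/a.fastq'])
```

Unknown inputs are silently skipped in this sketch; the page does not say how the real method treats them.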

measurements_from_challenges(ids, permissive=True)[source]
measurements_from_designs(ids, permissive=True)[source]
measurements_from_experiments(ids, permissive=True)[source]
measurements_from_measurements(ids, permissive=True)[source]
measurements_from_samples(ids, permissive=True)[source]
parent_from_inputs(inputs=[])[source]

Retrieve child_of linkages for a set of inputs

Parameters:inputs (list) – Identifier values for one or more inputs (e.g. name, file_id, uri)
Returns:a set of Typed UUIDs
Return type:list
samples_from_experiments(ids, permissive=True)[source]
samples_from_samples(ids, permissive=True)[source]
self_from_ids(ids, enforce_type=True, permissive=False)[source]

Resolve UUIDs from one or more identifiers

Parameters:
  • ids (str/list) – String or list of string identifiers
  • enforce_type (bool, optional) – Whether all identifiers must be of same type
  • permissive (bool, optional) – Whether to return None or raise exception when encountering an error
Raises:

ValueError – Raised when identifiers can’t be resolved or type enforcement fails

Returns:

One or more UUID strings

Return type:

list
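
The interaction of enforce_type and permissive described above can be sketched as follows. The lookup table, its (type, uuid) layout, and resolve_ids_sketch are hypothetical; only the documented behavior (accept a string or a list, raise ValueError on failure, return None instead when permissive is set) is taken from this page.

```python
def resolve_ids_sketch(ids, lookup, enforce_type=True, permissive=False):
    """Resolve identifiers to UUID strings using an in-memory lookup.

    lookup maps identifier -> (uuid_type, uuid); this layout is an assumption.
    """
    if isinstance(ids, str):
        ids = [ids]  # accept a single identifier as well as a list
    try:
        resolved, types = [], set()
        for identifier in ids:
            if identifier not in lookup:
                raise ValueError('Unable to resolve {}'.format(identifier))
            uuid_type, uuid = lookup[identifier]
            types.add(uuid_type)
            resolved.append(uuid)
        if enforce_type and len(types) > 1:
            raise ValueError('Identifiers are of mixed types')
        return resolved
    except ValueError:
        if permissive:
            return None  # swallow the error when permissive
        raise


# Hypothetical identifiers and UUID strings
LOOKUP = {
    'sample.a': ('sample', 'sample-uuid-a'),
    'sample.b': ('sample', 'sample-uuid-b'),
    'experiment.x': ('experiment', 'experiment-uuid-x'),
}

same_type = resolve_ids_sketch(['sample.a', 'sample.b'], LOOKUP)
mixed_ok = resolve_ids_sketch(['sample.a', 'experiment.x'], LOOKUP,
                              enforce_type=False)
mixed_none = resolve_ids_sketch(['sample.a', 'experiment.x'], LOOKUP,
                                permissive=True)
```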

self_from_inputs(inputs=[])[source]

Retrieve canonical linkage UUIDs for a list of inputs

Parameters:inputs (list) – Identifier values for one or more inputs (e.g. name, file_id, uri)
Returns:a set of Typed UUIDs
Return type:list
validate_tapis_username(username=None, permissive=False)[source]

Verify the provided username against the Tapis profile service

validate_uuid(uuid, permissive=False)[source]

Verify that a UUID5 exists in the system by retrieving it

exception datacatalog.managers.common.ManagerError[source]

Bases: datacatalog.linkedstores.basestore.exceptions.CatalogError

Error has occurred inside a Manager
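
Because ManagerError derives from CatalogError, a handler written for the base class also catches it. The sketch below uses local stand-in classes so that it runs without the library installed; that CatalogError derives directly from Exception is an assumption, and fetch_links_sketch is a hypothetical operation.

```python
class CatalogError(Exception):
    """Stand-in for datacatalog.linkedstores.basestore.exceptions.CatalogError."""


class ManagerError(CatalogError):
    """Stand-in for datacatalog.managers.common.ManagerError."""


def fetch_links_sketch(fail):
    """Hypothetical operation that raises ManagerError on failure."""
    if fail:
        raise ManagerError('Failed to fetch and return linkages')
    return []


def describe(fail):
    """Run the operation and report the outcome as a string."""
    try:
        fetch_links_sketch(fail)
        return 'ok'
    except CatalogError as exc:
        # Catching the base class also handles ManagerError
        return '{}: {}'.format(type(exc).__name__, exc)
```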