datacatalog.formats package

Submodules

datacatalog.formats.classify module

datacatalog.formats.classify.FORMATS = ['Transcriptic', 'Ginkgo', 'Biofab', 'SampleAttributes', 'Caltech', 'Marshall', 'Duke_Haase', 'Duke_Validation', 'Tulane']

Class names for document types that can be converted to Data Catalog records

exception datacatalog.formats.classify.NoClassifierError[source]

Bases: datacatalog.formats.converter.ConversionError

Unable to classify a document, preventing its conversion
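Because NoClassifierError derives from ConversionError, a handler written for ConversionError also catches classification failures. The sketch below illustrates this with locally declared stand-ins for both exceptions, so it runs without the package installed. The helpers `classify_or_fail` and `describe`, and the idea of a `format` key in the document, are hypothetical and are not part of datacatalog.

```python
class ConversionError(Exception):
    """Local stand-in mirroring datacatalog.formats.converter.ConversionError."""


class NoClassifierError(ConversionError):
    """Local stand-in mirroring datacatalog.formats.classify.NoClassifierError."""


def classify_or_fail(document, known_formats):
    """Hypothetical helper: return the format name declared by a document.

    Assumes, for illustration only, that the document is a dict with a
    'format' key. The real classifier works from JSON schemas.
    """
    name = document.get("format")
    if name not in known_formats:
        raise NoClassifierError("no classifier matched: {!r}".format(name))
    return name


def describe(document, known_formats):
    """Catch the base class so that classification failures are handled too."""
    try:
        return classify_or_fail(document, known_formats)
    except ConversionError as exc:
        return "skipped ({})".format(type(exc).__name__)
```

Catching the base class is convenient in batch jobs where an unclassifiable document should be skipped rather than abort the run; catch NoClassifierError directly when the two failure modes need different handling.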

datacatalog.formats.classify.get_converter(json_filepath, options={}, expect=None)[source]

datacatalog.formats.classify.get_converters(options={})[source]

Discover and return Converters

Returns:One or more Converter objects
Return type:list

datacatalog.formats.common module

datacatalog.formats.converter module

exception datacatalog.formats.converter.ConversionError[source]

Bases: Exception

Something happened that prevented conversion of the target document

class datacatalog.formats.converter.Converter(schemas=[], targetschema=None, options={}, reactor=None)[source]

Bases: object

Base class implementing a document converter

FILENAME = 'baseclass'
PROJECT = 'SD2E-Community'
TENANT = 'sd2e'
VERSION = '0.0.0'

convert(input_fp, output_fp=None, verbose=True, config={}, enforce_validation=True)[source]

Convert between formats

This is a pass-through method that invokes a runner script

Parameters:
  • input_fp (str) – Path to input file
  • output_fp (str) – Path to output file
  • verbose (bool, optional) – Print verbose output while running
  • config (dict, optional) – Generic configuration object
  • enforce_validation (bool, optional) – Whether to force validation of outputs
Returns:Whether the conversion succeeded
Return type:bool

get_classifier_schema()[source]

Get the JSON schema that Converter is using for classification

Raises:ConversionError – Raised on any exception
Returns:JSON schema in dictionary form
Return type:dict

get_schema()[source]

Pass-through to get_classifier_schema()

projects = <datacatalog.tenancy.projects.Projects object>

test(input_fp, output_fp, verbose=True, config={})[source]

Smoketest method to see if Converter discovery is working

Returns:True

validate(output_fp, permissive=False)[source]

Validate a file against schemas known to Converter

Parameters:
  • output_fp (str) – path to the validation target file
  • permissive (bool) – whether to return False on failure to validate

Note

Yes, this is redundant with validate_input()

Raises:ValidationError – Raised when validation fails
Returns:True on success
Return type:bool
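
The permissive flag described above switches between two failure styles: raise ValidationError, or return False. The following is a minimal sketch of that contract only. It assumes a trivial stand-in check (required top-level keys in a JSON file) in place of real JSON schema validation, and both the local ValidationError class and the `required_keys` parameter are hypothetical.

```python
import json


class ValidationError(Exception):
    """Local stand-in for the ValidationError named in the documentation."""


def validate(output_fp, required_keys, permissive=False):
    """Sketch of the permissive contract, not the real implementation.

    Assumption: 'valid' here only means the JSON object contains every key
    in required_keys. The documented method validates against JSON schemas.
    """
    with open(output_fp, "r") as handle:
        document = json.load(handle)
    missing = [key for key in required_keys if key not in document]
    if missing:
        if permissive:
            # Permissive mode reports failure through the return value
            return False
        raise ValidationError("missing keys: {}".format(", ".join(missing)))
    return True
```

Callers that prefer a boolean check pass permissive=True; callers that want failures to propagate leave the default and handle the exception.
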
validate_input(input_fp, encoding, permissive=False)[source]

Validate a generic input file against schemas known to Converter

Parameters:
  • input_fp (str) – path to the validation target file
  • permissive (bool) – whether to return False on failure to validate
Raises:
  • ConversionError – Raised when schema or target can’t be loaded
  • ValidationError – Raised when validation fails
Returns:True on success
Return type:bool

class datacatalog.formats.converter.formatChecker[source]

Bases: jsonschema._format.FormatChecker

A simple JSON format validator

datacatalog.formats.runner module

datacatalog.formats.runner.convert_file(target_schema, input_path, output_path=None, verbose=False, config={}, enforce_validation=True)[source]

Implements a simple converter that copies input_path to output_path
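
A copy-style converter of the kind described can be sketched with the standard library alone. The function name `copy_convert` is hypothetical, the `target_schema`, `config`, and `enforce_validation` arguments are left out, and the fallback used when `output_path` is omitted (appending `.out` to the input path) is an assumption, since the documentation does not state the default behavior.

```python
import os
import shutil


def copy_convert(input_path, output_path=None, verbose=False):
    """Copy input_path to output_path and report success as a bool.

    Assumption: when output_path is omitted, write next to the input with
    an '.out' suffix. The documented function does not state its default.
    """
    if output_path is None:
        output_path = input_path + ".out"
    shutil.copyfile(input_path, output_path)
    if verbose:
        print("copied {} -> {}".format(input_path, output_path))
    # Report success based on the destination file being present
    return os.path.exists(output_path)
```

A pass-through like this is useful as a baseline: it lets the surrounding discovery and validation machinery be exercised before any format-specific transformation exists.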

datacatalog.formats.schemas module

datacatalog.formats.schemas.get_schemas()[source]

Get JSON schemas for Classifiers

Returns:Object and document JSON schemas that define the store
Return type:JSONSchemaCollection