birl.utilities.experiments module

General experiment methods.

Copyright (C) 2016-2019 Jiri Borovec <jiri.borovec@fel.cvut.cz>

class birl.utilities.experiments.Experiment(exp_params, stamp_unique=True)[source]

Bases: object

The basic template for running an experiment with specific initialisation. Note that all required parameters used later have to be provided in the init phase.

The workflow is as follows:

  1. prepare the experiment folder, copy configurations

  2. ._prepare() prepares the experiment according to its specification

  3. ._load_data() loads the required input data and annotations

  4. ._run() performs the experimental body, running the method on the input data

  5. ._summarise() evaluates results against annotations and summarises them

  6. terminate the experiment

Particular specifics:

  • each experiment is created in its own folder (with a timestamp if requested)

  • at the beginning, the experiment configuration is copied to the folder

  • logging: the INFO level is used for the console and DEBUG for the file

  • if several sources can be processed independently, the processing may be parallelised

Example

>>> import birl.utilities.data_io as tl_io
>>> path_out = tl_io.create_folder('output')
>>> params = {'path_out': path_out, 'name': 'my_Experiment'}
>>> expt = Experiment(params, False)
>>> 'path_exp' in expt.params
True
>>> expt.run()
True
>>> del expt
>>> import shutil
>>> shutil.rmtree(path_out, ignore_errors=True)

Initialise the experiment, create the experiment folder and set up the logger.

Parameters
  • exp_params (dict) – experiment configuration {str: value}

  • stamp_unique (bool) – append a unique time stamp (current date and time) to the experiment folder name

__check_exist_path()[source]

Check existence of all paths in parameters.

check the existence of every entry in the parameter dictionary whose key contains one of the words ‘path’, ‘dir’, ‘file’
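Such a key-pattern check can be sketched as follows; the helper name and the exact error type are hypothetical, chosen only to illustrate the behaviour described above:

```python
import os

def check_exist_paths(params):
    """Hypothetical sketch: verify every parameter whose key mentions
    'path', 'dir' or 'file' points to an existing location."""
    for key, value in params.items():
        if any(word in key.lower() for word in ('path', 'dir', 'file')):
            if not os.path.exists(str(value)):
                raise RuntimeError('missing: (%s) "%s"' % (key, value))

check_exist_paths({'path_out': '.', 'name': 'demo'})  # passes silently
```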

__create_folder(stamp_unique=True)[source]

Create the experiment folder (iterate if necessary).

  • create unique folder if timestamp is requested

  • export experiment configuration to the folder

_check_required_params()[source]

Check some extra required parameters for this experiment.

classmethod _evaluate()[source]

Evaluate experiment - prediction & annotation.

classmethod _load_data()[source]

Loading data - source and annotations.

classmethod _prepare()[source]

Prepare the experiment folder.

classmethod _run()[source]

Perform experiment itself with given method and source data.

classmethod _summarise()[source]

Summarise experiment result against annotation.

run()[source]

Run the complete experiment.

This main method consists of the following steps:

  1. _prepare() prepares the experiment, with extra procedures if needed

  2. _load_data() loads the required input data (and annotations)

  3. _run() performs the experimental method on the input data

  4. _summarise() evaluates results against annotations and summarises them

Note

all of these particular procedures are empty and have to be completed according to the specification of the experiment (perform extra preparation such as copying extra configs, define how to load the data, run the custom method, summarise results against ground truth / annotations)
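The template-method pattern behind run() can be illustrated with a schematic stand-in (this is a toy sketch, not the actual BIRL class; hook signatures in the real code may differ):

```python
class DemoExperiment:
    """Schematic stand-in for the Experiment template: run() chains
    the empty hooks that a subclass fills in."""
    def _prepare(self): ...
    def _load_data(self): ...
    def _run(self): ...
    def _summarise(self): ...

    def run(self):
        self._prepare()
        self._load_data()
        self._run()
        self._summarise()
        return True

class MeanExperiment(DemoExperiment):
    """Toy subclass: 'loads' a list of numbers and computes their mean."""
    def _load_data(self):
        self.data = [1, 2, 3, 4]  # toy input instead of real images
    def _run(self):
        self.result = sum(self.data) / len(self.data)

expt = MeanExperiment()
expt.run()  # True; expt.result == 2.5
```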

NAME_CONFIG_YAML = 'config.yml'[source]

default output file for exporting experiment configuration

NAME_RESULTS_CSV = 'results.csv'[source]

default file for exporting results in table format

NAME_RESULTS_TXT = 'results.txt'[source]

default file for exporting results in formatted text format

REQUIRED_PARAMS = ['path_out'][source]

required experiment parameters

birl.utilities.experiments._get_ram()[source]

get the total RAM of the computer

Return int

RAM value in GB
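One way such a helper might be written on a POSIX system is via os.sysconf; this is an illustrative sketch assuming SC_PAGE_SIZE and SC_PHYS_PAGES are available (the real helper may use a different mechanism, e.g. psutil):

```python
import os

def get_ram_gb():
    """Sketch of a RAM query in GB; assumes a POSIX system exposing
    SC_PAGE_SIZE / SC_PHYS_PAGES."""
    total_bytes = os.sysconf('SC_PAGE_SIZE') * os.sysconf('SC_PHYS_PAGES')
    return total_bytes / 1024. ** 3
```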

birl.utilities.experiments.computer_info()[source]

get basic computer information.

Return dict

>>> len(computer_info())
9
birl.utilities.experiments.create_basic_parser(name='')[source]

create the basic argument parser

Parameters

name (str) – name of the method

Return object

>>> parser = create_basic_parser()
>>> type(parser)
<class 'argparse.ArgumentParser'>
>>> parse_arg_params(parser)  
birl.utilities.experiments.create_experiment_folder(path_out, dir_name, name='', stamp_unique=True)[source]

create the experiment folder, iterating until an available name is found

Parameters
  • path_out (str) – path to the base experiment directory

  • name (str) – special experiment name

  • dir_name (str) – special folder name

  • stamp_unique (bool) – whether to append a unique tag to the new folder name

>>> p_dir = create_experiment_folder('.', 'my_test', stamp_unique=False)
>>> os.rmdir(p_dir)
>>> p_dir = create_experiment_folder('.', 'my_test', stamp_unique=True)
>>> p_dir  
'...my_test_...-...'
>>> os.rmdir(p_dir)
birl.utilities.experiments.dict_deep_update(dict_base, dict_update)[source]

update a dictionary recursively

Parameters
  • dict_base (dict) –

  • dict_update (dict) –

Return dict

>>> d = {'level1': {'level2': {'levelA': 0, 'levelB': 1}}}
>>> u = {'level1': {'level2': {'levelB': 10}}}
>>> import json
>>> d = json.dumps(dict_deep_update(d, u), sort_keys=True, indent=2)
>>> print(d)  
{
  "level1": {
    "level2": {
      "levelA": 0,
      "levelB": 10
    }
  }
}
birl.utilities.experiments.exec_commands(commands, path_logger=None, timeout=None)[source]

run the given commands in the system command line

Note

Timeout in check_output is not supported by Python 2.x

Parameters
  • commands (list(str)) – commands to be executed

  • path_logger (str) – path to the logger

  • timeout (int) – maximal time allowed for command execution

Return bool

whether the commands passed

>>> exec_commands(('ls', 'ls -l'), path_logger='./sample-output.log')
True
>>> exec_commands('mv sample-output.log moved-output.log', timeout=10)
True
>>> os.remove('./moved-output.log')
>>> exec_commands('cp sample-output.log moved-output.log')
False
birl.utilities.experiments.get_nb_workers(ratio)[source]

Given a usage ratio, return the number of CPUs to use.
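A minimal sketch of such a mapping, assuming the ratio simply scales the machine's CPU count with a floor of one worker (the actual implementation may round or clamp differently):

```python
import multiprocessing

CPU_COUNT = multiprocessing.cpu_count()

def get_nb_workers(ratio):
    """Sketch: scale the CPU count by the usage ratio,
    keeping at least one worker."""
    return max(1, int(CPU_COUNT * ratio))
```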

birl.utilities.experiments.is_iterable(var, iterable_types=(<class 'list'>, <class 'tuple'>, <class 'generator'>))[source]

check if the variable is iterable

Parameters

var – tested variable

Return bool

iterable

>>> is_iterable('abc')
False
>>> is_iterable([0])
True
>>> is_iterable((1, ))
True
birl.utilities.experiments.iterate_mproc_map(wrap_func, iterate_vals, nb_workers=2, desc='', ordered=True)[source]

create a multiprocessing pool and execute a wrapped function in separate processes

Parameters
  • wrap_func (func) – function which will be executed in the iterations

  • iterate_vals (list) – list or iterator to be used in the iterations

  • nb_workers (int) – number of jobs running in parallel; if -1, use all available threads

  • desc (str|None) – description for the progress bar; if set to None, the bar is suppressed

  • ordered (bool) – whether to enforce ordering in the parallelism

>>> list(iterate_mproc_map(np.sqrt, range(5), nb_workers=1, desc=None))  
[0.0, 1.0, 1.41..., 1.73..., 2.0]
>>> list(iterate_mproc_map(sum, [[0, 1]] * 5, nb_workers=2, ordered=False))
[1, 1, 1, 1, 1]
>>> list(iterate_mproc_map(max, [(2, 1)] * 5, nb_workers=2, desc=''))
[2, 2, 2, 2, 2]
birl.utilities.experiments.parse_arg_params(parser, upper_dirs=None)[source]

parse all parameters

Parameters
  • parser – object of parser

  • upper_dirs (list(str)) – list of parameter keys for which only the parent folder must exist

Return dict

parameters

birl.utilities.experiments.release_logger_files()[source]

close all file handlers on the root logger

>>> release_logger_files()
>>> len([1 for lh in logging.getLogger().handlers
...      if type(lh) is logging.FileHandler])
0
birl.utilities.experiments.set_experiment_logger(path_out, file_name='logging.txt', reset=True)[source]

set up the logger to also write to a file

Parameters
  • path_out (str) – path to the output folder

  • file_name (str) – log file name

  • reset (bool) – remove all previous file handlers first

>>> set_experiment_logger('.')
>>> len([1 for lh in logging.getLogger().handlers
...      if type(lh) is logging.FileHandler])
1
>>> release_logger_files()
>>> os.remove(FILE_LOGS)
birl.utilities.experiments.string_dict(ds, headline='DICTIONARY:', offset=25)[source]

format the dictionary into a string

Parameters
  • ds (dict) – {str: val} dictionary with parameters

  • headline (str) – headline before the printed dictionary

  • offset (int) – maximal length reserved for the key name

Return str

formatted string

>>> string_dict({'a': 1, 'b': 2}, 'TEST:', 5)
'TEST:\n"a":  1\n"b":  2'
birl.utilities.experiments.try_decorator(func)[source]

custom decorator to wrap a function in try/except

Parameters

func – decorated function

Return func

output of the decorated function
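The idea can be sketched as follows; this is an illustrative version (the real decorator may log or return differently) that swallows exceptions, logs them, and returns None on failure:

```python
import functools
import logging

def try_decorator(func):
    """Sketch of a try/except wrapper: log any exception instead of
    propagating it, and return None on failure."""
    @functools.wraps(func)
    def wrap(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            logging.exception('%s failed', func.__name__)
    return wrap

@try_decorator
def fragile_div(a, b):
    return a / b

fragile_div(4, 2)  # 2.0
fragile_div(1, 0)  # logs the ZeroDivisionError, returns None
```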

birl.utilities.experiments.update_paths(args, upper_dirs=None, pattern='path')[source]

find parameters with non-existing paths

Parameters
  • args (dict) – dictionary with all parameters

  • upper_dirs (list(str)) – list of parameter keys for which only the parent folder must exist

  • pattern (str) – pattern specifying a key containing a path

Return list(str)

keys of missing paths

>>> update_paths({'sample': 123})[1]
[]
>>> update_paths({'path_': '.'})[1]
[]
>>> params = {'path_out': './nothing'}
>>> update_paths(params)[1]
['path_out']
>>> update_paths(params, upper_dirs=['path_out'])[1]
[]
birl.utilities.experiments.CPU_COUNT = 2[source]

number of available CPUs on this computer

birl.utilities.experiments.FILE_LOGS = 'logging.txt'[source]

default logging file

birl.utilities.experiments.FORMAT_DATE_TIME = '%Y%m%d-%H%M%S'[source]

default date-time format
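This format string produces the stamp appended to experiment folders when stamp_unique is enabled; a quick illustration:

```python
import datetime

FORMAT_DATE_TIME = '%Y%m%d-%H%M%S'
# current time rendered as e.g. '20190615-142359'
stamp = datetime.datetime.now().strftime(FORMAT_DATE_TIME)
```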

birl.utilities.experiments.ITERABLE_TYPES = (<class 'list'>, <class 'tuple'>, <class 'generator'>)[source]

all types assumed to be list-like

birl.utilities.experiments.LOG_FILE_FORMAT = <logging.Formatter object>[source]

default logging template - date-time for logging to file

birl.utilities.experiments.STR_LOG_FORMAT = '%(asctime)s:%(levelname)s@%(filename)s:%(processName)s - %(message)s'[source]

default logging template - log location/source for logging to file
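To see what a record formatted with this template looks like, a LogRecord can be built and rendered by hand (a demonstration only; the module builds its Formatter internally):

```python
import logging

STR_LOG_FORMAT = '%(asctime)s:%(levelname)s@%(filename)s:%(processName)s - %(message)s'
formatter = logging.Formatter(STR_LOG_FORMAT)

# construct a record manually instead of going through a logger
record = logging.LogRecord('demo', logging.INFO, 'demo.py', 1,
                           'hello', None, None)
line = formatter.format(record)
# e.g. '2019-06-15 14:23:59,123:INFO@demo.py:MainProcess - hello'
```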