birl.utilities.evaluate module

Evaluate experiments

Copyright (C) 2016-2019 Jiri Borovec <jiri.borovec@fel.cvut.cz>

birl.utilities.evaluate.aggregate_user_score_timeline(df, col_aggreg, col_user, col_score, lower_better=True, top_down=True, interp=False)[source]

Compute a cumulative statistic over the given table, assuming col_aggreg is continuous. First the table is grouped by col_aggreg and the min/max of col_score (according to lower_better) is chosen; then, since col_aggreg is assumed to be sortable like a timeline, the min/max is propagated from past values, depending on top_down (which reverses the order).

Parameters
  • df (DataFrame) – rich table containing col_aggreg, col_user and col_score

  • col_aggreg (str) – column used for grouping, assumed to behave like a timeline

  • col_user (str) – column along which the scores are assumed to be independent

  • col_score (str) – column with the scoring value used for selecting the best

  • lower_better (bool) – if True take the min of the scoring value, otherwise the max

  • top_down (bool) – reverse the ordering over col_aggreg

  • interp (bool) – if some scores for col_aggreg values are missing, interpolate them from past values

Return DF

aggregated table

>>> np.random.seed(0)
>>> df = pd.DataFrame()
>>> df['day'] = np.random.randint(0, 5, 50)
>>> df['user'] = np.array(list('abc'))[np.random.randint(0, 3, 50)]
>>> df['score'] = np.random.random(50)
>>> df_agg = aggregate_user_score_timeline(df, 'day', 'user', 'score')
>>> df_agg.round(3)  
       b      c      a
4  0.447  0.132  0.567
0  0.223  0.005  0.094
3  0.119  0.005  0.094
1  0.119  0.005  0.094
2  0.119  0.005  0.020
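
The propagation step can be sketched with plain pandas: group to the best score per day and user, then take a cumulative minimum down the timeline. This is only an illustration of the idea under lower_better=True, not the library routine itself:

>>> best = df.groupby(['day', 'user'])['score'].min().unstack('user')
>>> best_cum = best.sort_index().cummin()  # carry the best past score forward
>>> best_cum.shape
(5, 3)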
birl.utilities.evaluate.compute_affine_transf_diff(points_ref, points_init, points_est)[source]

compute the differences in affine components (rotation, scale, shear, translation) between the initial state and the estimated result

Parameters
  • points_ref (ndarray) – reference landmarks, np.array<nb_points, dim>

  • points_init (ndarray) – initial landmarks, np.array<nb_points, dim>

  • points_est (ndarray) – estimated landmarks, np.array<nb_points, dim>

Return dict

dictionary of differences per affine component

>>> points_ref = np.array([[1, 2], [3, 4], [2, 1]])
>>> points_init = np.array([[3, 4], [1, 2], [2, 1]])
>>> points_est = np.array([[3, 4], [2, 1], [1, 2]])
>>> diff = compute_affine_transf_diff(points_ref, points_init, points_est)
>>> import pandas as pd
>>> pd.Series(diff).sort_index()  
Affine rotation Diff        -8.97...
Affine scale X Diff         -0.08...
Affine scale Y Diff         -0.20...
Affine shear Diff           -1.09...
Affine translation X Diff   -1.25...
Affine translation Y Diff    1.25...
dtype: float64
>>> # Wrong input:
>>> compute_affine_transf_diff(None, np.array([[1, 2], [3, 4], [2, 1]]), None)
{}
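
The affine components behind these differences can be recovered by fitting an affine matrix between two point sets and decomposing it. Below is a minimal sketch of one common approach (a least-squares fit followed by a scikit-image-style decomposition); fit_affine and decompose_affine are hypothetical helpers, and the library's own estimation may differ:

>>> def fit_affine(pts_src, pts_dst):
...     # hypothetical helper: least-squares fit of pts_dst ~ affine(pts_src)
...     src = np.hstack([pts_src, np.ones((len(pts_src), 1))])
...     coef, _, _, _ = np.linalg.lstsq(src, pts_dst, rcond=None)
...     return coef.T  # 2x3 matrix [A | t]
>>> def decompose_affine(m):
...     # hypothetical helper: rotation, scales, shear, translation from [A | t]
...     rotation = np.arctan2(m[1, 0], m[0, 0])
...     scales = (np.hypot(m[0, 0], m[1, 0]), np.hypot(m[0, 1], m[1, 1]))
...     shear = np.arctan2(-m[0, 1], m[1, 1]) - rotation
...     return rotation, scales, shear, m[:, 2]
>>> mat = fit_affine(points_init, points_est)
>>> len(decompose_affine(mat))
4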
birl.utilities.evaluate.compute_matrix_user_ranking(df_stat, higher_better=False)[source]

compute a ranking matrix over the features in columns, sorting each column by score; the matrix holds user (row) indices, so each user keeps a unique identity (usable e.g. as a unique colour per user)

Parameters
  • df_stat (DF) – table where the index holds users and the columns hold scores

  • higher_better (bool) – rank such that a larger value is better

Return ndarray

ranking with features in columns

>>> np.random.seed(0)
>>> df = pd.DataFrame(np.random.random((5, 3)), columns=list('abc'))
>>> compute_matrix_user_ranking(df)  
array([[ 3.,  1.,  4.],
       [ 2.,  0.,  3.],
       [ 1.,  3.,  0.],
       [ 0.,  2.,  1.],
       [ 4.,  4.,  2.]])
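
Note that the matrix values are user (row) indices listed per column in rank order, so for a table without NaNs the example above is reproduced by a plain column-wise argsort (a sketch of the idea only; the actual implementation may also handle missing values and the higher_better flag):

>>> np.argsort(df.values, axis=0)
array([[3, 1, 4],
       [2, 0, 3],
       [1, 3, 0],
       [0, 2, 1],
       [4, 4, 2]])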
birl.utilities.evaluate.compute_ranking(user_cases, field, reverse=False)[source]

compute a ranking over the selected field, storing it as a new '<field>_rank' entry per case

Parameters
  • user_cases (dict(dict)) – nested dictionary of measures per user and case

  • field (str) – name of field to be ranked

  • reverse (bool) – use reverse ordering

Return dict(dict(dict))

the input dictionary extended with '<field>_rank' entries

>>> user_cases = {
...     'karel': {1: {'rTRE': 0.04}, 2: {'rTRE': 0.25}, 3: {'rTRE': 0.1}},
...     'pepa': {1: {'rTRE': 0.33}, 3: {'rTRE': 0.05}},
...     'franta': {2: {'rTRE': 0.01}, 3: {'rTRE': 0.15}}
... }
>>> user_cases = compute_ranking(user_cases, 'rTRE')
>>> import pandas as pd
>>> df = pd.DataFrame({usr: {cs: user_cases[usr][cs]['rTRE_rank']
...                          for cs in user_cases[usr]}
...                    for usr in user_cases})[sorted(user_cases.keys())]
>>> df  
   franta  karel  pepa
1       3      1     2
2       1      2     3
3       3      2     1
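
The rank assignment itself is straightforward: within each case, users are sorted by the field value and given 1-based ranks, while users missing that case apparently receive the worst rank, i.e. the number of users (note franta's rank 3 for case 1 above). A standalone sketch of that logic, not the library routine:

>>> cases = {
...     'karel': {1: {'rTRE': 0.04}},
...     'pepa': {1: {'rTRE': 0.33}},
...     'franta': {2: {'rTRE': 0.01}},
... }
>>> present = sorted((u for u in cases if 1 in cases[u]),
...                  key=lambda u: cases[u][1]['rTRE'])
>>> {u: present.index(u) + 1 if u in present else len(cases) for u in cases}
{'karel': 1, 'pepa': 2, 'franta': 3}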
birl.utilities.evaluate.compute_target_regist_error_statistic(points_ref, points_est)[source]

compute the distances between corresponding points in the two sets and derive statistics over those distances: mean, STD, median, min, max

Parameters
  • points_ref (ndarray) – final landmarks in the target image, as np.array<nb_points, dim>

  • points_est (ndarray) – landmarks warped from source to target, as np.array<nb_points, dim>

Return tuple(ndarray,dict)

(np.array<nb_points, 1>, dict)

>>> points_ref = np.array([[1, 2], [3, 4], [2, 1]])
>>> points_est = np.array([[3, 4], [2, 1], [1, 2]])
>>> dist, stat = compute_target_regist_error_statistic(points_ref, points_ref)
>>> dist
array([ 0.,  0.,  0.])
>>> all(stat[k] == 0 for k in stat if k not in ['overlap points'])
True
>>> dist, stat = compute_target_regist_error_statistic(points_ref, points_est)
>>> dist  
array([ 2.828...,  3.162...,  1.414...])
>>> import pandas as pd
>>> pd.Series(stat).sort_index()  
Max               3.16...
Mean              2.46...
Mean_weighted     2.52...
Median            2.82...
Min               1.41...
STD               0.75...
overlap points    1.00...
dtype: float64
>>> # Wrong input:
>>> compute_target_regist_error_statistic(None, np.array([[1, 2], [3, 4], [2, 1]]))
([], {'overlap points': 0})
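
The distance part of the statistic is simply the per-row Euclidean norm of the coordinate differences; a quick cross-check of that step (the weighted mean is left aside here):

>>> np.allclose(dist, np.linalg.norm(points_ref - points_est, axis=1))
True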
birl.utilities.evaluate.compute_tre(points_1, points_2)[source]

compute the Target Registration Error (TRE) for each landmark pair

Parameters
  • points_1 (ndarray) – set of points

  • points_2 (ndarray) – set of points

Return ndarray

list of errors whose length equals the smaller number of points

>>> np.random.seed(0)
>>> compute_tre(np.random.random((6, 2)),
...             np.random.random((9, 2)))  
array([ 0.21...,  0.70...,  0.44...,  0.34...,  0.41...,  0.41...])
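
The output length follows from cropping both sets to the shorter one before taking the norms; a minimal sketch, assuming the surplus points of the longer set are simply dropped:

>>> pts_1, pts_2 = np.random.random((6, 2)), np.random.random((9, 2))
>>> nb = min(len(pts_1), len(pts_2))
>>> np.linalg.norm(pts_1[:nb] - pts_2[:nb], axis=1).shape
(6,)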
birl.utilities.evaluate.compute_tre_robustness(points_target, points_init, points_warp)[source]

compute robustness as the ratio of landmarks whose TRE improved, i.e. the warped TRE is smaller than the initial one

Parameters
  • points_target (ndarray) – final landmarks in target image

  • points_init (ndarray) – initial landmarks in source image

  • points_warp (ndarray) – warped landmarks from source to target

Return float

ratio of landmarks with improved TRE

>>> np.random.seed(0)
>>> compute_tre_robustness(np.random.random((10, 2)),
...                        np.random.random((9, 2)),
...                        np.random.random((8, 2)))
0.375
>>> compute_tre_robustness(np.random.random((10, 2)),
...                        np.random.random((9, 2)) + 5,
...                        np.random.random((8, 2)) + 2)
1.0
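
Since 0.375 equals 3 of 8 landmarks, the returned value reads as the fraction of landmarks whose warped TRE beats the initial TRE. A minimal sketch of that fraction, assuming truncation to the shortest set (the actual pairing of landmarks may differ):

>>> np.random.seed(0)
>>> tgt, ini, wrp = (np.random.random((10, 2)),
...                  np.random.random((9, 2)),
...                  np.random.random((8, 2)))
>>> nb = min(len(tgt), len(ini), len(wrp))
>>> tre_init = np.linalg.norm(tgt[:nb] - ini[:nb], axis=1)
>>> tre_warp = np.linalg.norm(tgt[:nb] - wrp[:nb], axis=1)
>>> robustness = np.mean(tre_warp < tre_init)  # fraction of improved landmarks
>>> 0.0 <= robustness <= 1.0
True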
birl.utilities.evaluate.grouping_cumulative(df, col_index, col_column)[source]

compute a histogram statistic over the selected column and additionally group these histograms

Parameters
  • df (DataFrame) – rich table

  • col_index (str) – column which will be used as the index in the resulting table

  • col_column (str) – column used for computing the histogram

Return DF

grouped histogram table

>>> np.random.seed(0)
>>> df = pd.DataFrame()
>>> df['result'] = np.random.randint(0, 2, 50)
>>> df['user'] = np.array(list('abc'))[np.random.randint(0, 3, 50)]
>>> grouping_cumulative(df, 'user', 'result').astype(int)  
       0   1
user
a     10  12
b      4   9
c      6   9