birl.utilities.dataset module
Some functionality related to dataset
Copyright (C) 2016-2019 Jiri Borovec <jiri.borovec@fel.cvut.cz>
-
birl.utilities.dataset.args_expand_images(parser, nb_workers=1, overwrite=True)
expand the parser by standard parameters related to images:
- image paths
- allow overwrite (optional)
- number of jobs
Parameters:
Return obj:
>>> import argparse
>>> args_expand_images(argparse.ArgumentParser())  # doctest: +ELLIPSIS
ArgumentParser(...)
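The three listed parameters can be sketched with plain `argparse`. This is a minimal illustration under stated assumptions, not the library's implementation: the helper name `expand_parser_sketch` and the option names `--path_images`, `--overwrite` and `--nb_workers` are hypothetical and chosen only for this example.

```python
import argparse


def expand_parser_sketch(parser, nb_workers=1, overwrite=True):
    """Hypothetical stand-in: add image-related options to a parser."""
    # assumed option name for the image paths
    parser.add_argument('-i', '--path_images', type=str, required=True,
                        help='path (pattern) to the input images')
    if overwrite:
        # the overwrite flag is optional, as in the list above
        parser.add_argument('--overwrite', action='store_true',
                            help='allow overwriting existing results')
    # assumed option name for the number of jobs
    parser.add_argument('--nb_workers', type=int, default=nb_workers,
                        help='number of parallel jobs')
    return parser


parser = expand_parser_sketch(argparse.ArgumentParser())
# parsing to a plain dictionary, similar in spirit to a "parse" variant
args = vars(parser.parse_args(['-i', 'images/*.jpg', '--nb_workers', '2']))
```

Returning the parser keeps the helper chainable, so several such helpers could extend one shared parser.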
-
birl.utilities.dataset.args_expand_parse_images(parser, nb_workers=1, overwrite=True)
expand the parser by standard parameters related to images:
- image paths
- allow overwrite (optional)
- number of jobs
Parameters:
Return dict:
-
birl.utilities.dataset.common_landmarks(points1, points2, threshold=1.5)
find common landmarks in two sets
Parameters:
Return list(bool): flags
>>> np.random.seed(0)
>>> common = np.random.random((5, 2))
>>> pts1 = np.vstack([common, np.random.random((10, 2))])
>>> pts2 = np.vstack([common, np.random.random((15, 2))])
>>> common_landmarks(pts1, pts2, threshold=1e-3)
array([[0, 0],
       [1, 1],
       [2, 2],
       [3, 3],
       [4, 4]])
>>> np.random.shuffle(pts2)
>>> common_landmarks(pts1, pts2, threshold=1e-3)
array([[ 0, 13],
       [ 1, 10],
       [ 2,  9],
       [ 3, 14],
       [ 4,  8]])
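The doctest shows the result as pairs of indexes into the two point sets. A rough sketch of such a matching, assuming a simple greedy nearest-neighbour strategy (the library may well use a different assignment method), could look as follows. The name `match_common_points` is hypothetical.

```python
from math import dist


def match_common_points(points1, points2, threshold=1.5):
    """Greedy sketch: pair each point with its nearest unused counterpart."""
    pairs, used = [], set()
    for i, p1 in enumerate(points1):
        # distances to all points of the second set that are still free
        candidates = [(dist(p1, p2), j)
                      for j, p2 in enumerate(points2) if j not in used]
        if not candidates:
            break
        distance, j = min(candidates)
        # accept the pair only if the points are close enough
        if distance < threshold:
            pairs.append((i, j))
            used.add(j)
    return pairs


pts1 = [(0.0, 0.0), (5.0, 5.0), (9.0, 1.0)]
pts2 = [(5.1, 5.0), (0.0, 0.2), (20.0, 20.0)]
pairs = match_common_points(pts1, pts2, threshold=0.5)
```

A greedy pass is order dependent; an optimal assignment would be preferable when landmarks are densely packed.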
-
birl.utilities.dataset.compute_bounding_polygon(landmarks)
get the polygon that contains all the points
Parameters: landmarks (ndarray) – set of points
Return ndarray: points of polygon
>>> np.random.seed(0)
>>> points = np.random.randint(1, 9, (45, 2))
>>> compute_bounding_polygon(points)  # doctest: +NORMALIZE_WHITESPACE
[[1, 2], [2, 4], [1, 5], [2, 8], [7, 8], [8, 7], [8, 1], [3, 1], [3, 2]]
-
birl.utilities.dataset.compute_convex_hull(landmarks)
compute convex hull around landmarks
- http://lagrange.univ-lyon1.fr/docs/scipy/0.17.1/generated/scipy.spatial.ConvexHull.html
- https://stackoverflow.com/questions/21727199
Parameters: landmarks (ndarray) – set of points
Return ndarray: points of polygon
>>> np.random.seed(0)
>>> pts = np.random.randint(15, 30, (10, 2))
>>> compute_convex_hull(pts)
array([[27, 20],
       [27, 25],
       [22, 24],
       [16, 21],
       [15, 18],
       [26, 18]])
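To illustrate what a convex hull computation does, here is a dependency-free sketch using Andrew's monotone chain algorithm. It is a swapped-in technique for illustration only: the links above point to `scipy.spatial.ConvexHull`, and the helper name `convex_hull_sketch` as well as the ordering of the returned points are assumptions of this example.

```python
def cross(origin, a, b):
    """2D cross product of the vectors origin->a and origin->b."""
    return ((a[0] - origin[0]) * (b[1] - origin[1])
            - (a[1] - origin[1]) * (b[0] - origin[0]))


def convex_hull_sketch(points):
    """Monotone chain: hull vertices in counter-clockwise order."""
    pts = sorted(set(map(tuple, points)))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        # drop the last point while it does not make a left turn
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # the last point of each chain is the first point of the other one
    return lower[:-1] + upper[:-1]


hull = convex_hull_sketch([(0, 0), (2, 0), (2, 2), (0, 2), (1, 1), (1, 0)])
```

Interior points and points lying on a hull edge are dropped, so only the corner points remain.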
-
birl.utilities.dataset.compute_half_polygon(landmarks, idx_start=0, idx_end=-1)
compute half polygon path
Parameters:
Return ndarray: set of points
>>> pts = [(-1, 1), (0, 0), (0, 2), (1, 1), (1, -0.5), (2, 0)]
>>> compute_half_polygon(pts, idx_start=0, idx_end=-1)
[[-1.0, 1.0], [0.0, 2.0], [1.0, 1.0], [2.0, 0.0]]
>>> compute_half_polygon(pts[:2], idx_start=-1, idx_end=0)
[[-1, 1], [0, 0]]
>>> pts = [[0, 2], [1, 5], [2, 4], [2, 5], [4, 4], [4, 6], [4, 8], [5, 8], [5, 8]]
>>> compute_half_polygon(pts)
[[0, 2], [1, 5], [2, 5], [4, 6], [4, 8], [5, 8]]
-
birl.utilities.dataset.convert_landmarks_from_itk(lnds, image_size)
convert landmarks from ITK format to the format used in ImageJ
Parameters:
Return ndarray: landmarks
>>> convert_landmarks_from_itk([[ 20, 145], [150, 50], [100, 150]], (150, 200))
array([[  5,  20],
       [100, 150],
       [  0, 100]])
>>> lnds = [[ 20, 145], [150, 50], [100, 150], [0, 0], [150, 200]]
>>> img_size = (150, 200)
>>> lnds2 = convert_landmarks_from_itk(convert_landmarks_to_itk(lnds, img_size), img_size)
>>> np.array_equal(lnds, lnds2)
True
-
birl.utilities.dataset.convert_landmarks_to_itk(lnds, image_size)
convert landmarks from the used format to ITK format
Parameters:
Return ndarray: landmarks
>>> convert_landmarks_to_itk([[5, 20], [100, 150], [0, 100]], (150, 200))
array([[ 20, 145],
       [150,  50],
       [100, 150]])
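Judging only from the doctest values, the conversion appears to swap the two coordinates and flip one of them against the first entry of `image_size`. The following sketch reproduces those numbers with plain lists. The helper names `to_itk_sketch` and `from_itk_sketch` are hypothetical, and the reading of the mapping is an inference from the examples, not a statement about the library's code.

```python
def to_itk_sketch(landmarks, image_size):
    """Swap the coordinates and flip the first one against the image height."""
    height = image_size[0]
    return [[second, height - first] for first, second in landmarks]


def from_itk_sketch(landmarks, image_size):
    """Inverse mapping of `to_itk_sketch`."""
    height = image_size[0]
    return [[height - second, first] for first, second in landmarks]


img_size = (150, 200)
lnds = [[5, 20], [100, 150], [0, 100]]
lnds_itk = to_itk_sketch(lnds, img_size)
lnds_back = from_itk_sketch(lnds_itk, img_size)
```

Applying both functions in sequence returns the original landmarks, which mirrors the round-trip check in the doctest of `convert_landmarks_from_itk`.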
-
birl.utilities.dataset.detect_binary_blocks(vec_bin)
detect the binary objects by beginning, end and length in a 1D signal
Parameters: vec_bin (list(bool)) – binary vector with 1 for an object
Return tuple(list(int),list(int),list(int)):
>>> vec = np.array([1] * 15 + [0] * 5 + [1] * 20)
>>> detect_binary_blocks(vec)
([0, 20], [15, 39], [14, 19])
-
birl.utilities.dataset.estimate_scaling(images, max_size=5000)
find scaling for given set of images and maximal image size
Parameters:
Return float: scaling in range (0, 1]
>>> estimate_scaling([np.zeros((12000, 300, 3))])  # doctest: +ELLIPSIS
0.4...
>>> estimate_scaling([np.zeros((1200, 800, 3))])
1.0
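The idea behind the scaling can be sketched as the ratio between the maximal allowed size and the largest image side, capped at 1. This is a simplified illustration working on shape tuples; the name `scaling_for_shapes` is hypothetical and any rounding the library may apply is not reproduced here.

```python
def scaling_for_shapes(shapes, max_size=5000):
    """Return a factor <= 1 that shrinks the largest side to `max_size`."""
    # consider only the two spatial dimensions of each image shape
    max_dim = max(max(shape[:2]) for shape in shapes)
    # never upscale: images below the limit keep the factor 1.0
    return min(1.0, max_size / max_dim)


scale_large = scaling_for_shapes([(12000, 300, 3)])
scale_small = scaling_for_shapes([(1200, 800, 3)])
```

Using one common factor for the whole set keeps the images mutually comparable after resizing.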
-
birl.utilities.dataset.find_largest_object(hist, threshold=0.01)
find the largest object and give its beginning and end
Parameters:
Return list(int):
>>> vec = np.array([1] * 15 + [0] * 5 + [1] * 20)
>>> find_largest_object(vec)
(20, 39)
-
birl.utilities.dataset.find_split_objects(hist, nb_objects=2, threshold=0.01)
find the N largest objects and set the split at the middle distance between them
Parameters:
Return list(int):
>>> vec = np.array([1] * 15 + [0] * 5 + [1] * 20)
>>> find_split_objects(vec)
[17]
-
birl.utilities.dataset.generate_pairing(count, step_hide=None)
generate registration pairs with an option of hidden landmarks
Parameters: - count (int) – total number of samples
- step_hide (int|None) – hide every N-th sample
Return list((int, int)), list(bool): registration pairs
>>> generate_pairing(4, None)  # doctest: +NORMALIZE_WHITESPACE
([(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)],
 [True, True, True, True, True, True])
>>> generate_pairing(4, step_hide=3)  # doctest: +NORMALIZE_WHITESPACE
([(0, 1), (0, 2), (1, 2), (3, 1), (3, 2)],
 [False, False, True, False, False])
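One way to read the doctest output is sketched below: every N-th sample is hidden, a hidden sample never appears in the second position of a pair, symmetric duplicates are skipped, and a pair is flagged public only if neither sample is hidden. The name `make_pairs_sketch` is hypothetical and the rules are inferred from the two examples, so treat them as assumptions.

```python
def make_pairs_sketch(count, step_hide=None):
    """Generate index pairs and flags telling which pairs are fully public."""
    hidden = set(range(0, count, step_hide)) if step_hide else set()
    pairs = []
    for i in range(count):
        for j in range(count):
            # skip the diagonal and keep hidden samples out of the second slot
            if i == j or j in hidden:
                continue
            # skip the pair if its mirrored version was already added
            if (j, i) in pairs:
                continue
            pairs.append((i, j))
    public = [i not in hidden and j not in hidden for i, j in pairs]
    return pairs, public


pairs_all, public_all = make_pairs_sketch(4, None)
pairs_hide, public_hide = make_pairs_sketch(4, step_hide=3)
```

With four samples and `step_hide=3` the samples 0 and 3 are hidden, so only the pair (1, 2) stays public.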
-
birl.utilities.dataset.get_close_diag_corners(points)
find the points closest to the top-left and bottom-right corners
Parameters: points (ndarray) – set of points
Return tuple(ndarray,ndarray): begin and end of imaginary diagonal
>>> np.random.seed(0)
>>> points = np.random.randint(1, 9, (20, 2))
>>> get_close_diag_corners(points)
(array([1, 2]), array([7, 8]), (12, 10))
-
birl.utilities.dataset.histogram_match_cumulative_cdf(source, reference, norm_img_size=1024)
Adjust the pixel values of a gray-scale image such that its histogram matches that of a target image
Parameters: - source (ndarray) – 2D image to be transformed, np.array<height1, width1>
- reference (ndarray) – reference 2D image, np.array<height2, width2>
Return ndarray: transformed image, np.array<height1, width1>
>>> np.random.seed(0)
>>> img = histogram_match_cumulative_cdf(np.random.randint(128, 145, (150, 200)),
...                                      np.random.randint(0, 18, (200, 180)))
>>> img.astype(int)  # doctest: +NORMALIZE_WHITESPACE +ELLIPSIS
array([[13, 16,  0, ..., 12,  2,  5],
       [17,  9,  1, ..., 16,  9,  0],
       [11, 12, 14, ...,  8,  5,  4],
       ...,
       [12,  6,  3, ..., 15,  0,  3],
       [11, 17,  2, ..., 12, 12,  5],
       [ 6, 12,  3, ...,  8,  0,  1]])
>>> np.bincount(img.ravel()).astype(int)  # doctest: +NORMALIZE_WHITESPACE
array([1705, 1706, 1728, 1842, 1794, 1866, 1771,    0, 1717, 1752, 1757,
       1723, 1823, 1833, 1749, 1718, 1769, 1747])
>>> img_source = np.random.randint(50, 245, (2500, 3000)).astype(float)
>>> img_source[-1, -1] = 255
>>> img = histogram_match_cumulative_cdf(img_source / 255., img)
>>> np.array(img.shape, dtype=int)
array([2500, 3000])
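The principle of matching via cumulative distribution functions can be sketched on flattened lists of pixel values. Each source level is mapped to the first reference level whose cumulative frequency reaches the cumulative frequency of the source level. This is a simplified, dependency-free illustration: the names `cumulative_cdf` and `match_histogram_sketch` are hypothetical, and the sketch uses a nearest-level lookup instead of the interpolation that array-based implementations commonly use.

```python
from bisect import bisect_left
from collections import Counter
from itertools import accumulate


def cumulative_cdf(values):
    """Return the sorted unique levels and their cumulative frequencies."""
    counts = Counter(values)
    levels = sorted(counts)
    total = len(values)
    cdf = [c / total for c in accumulate(counts[lv] for lv in levels)]
    return levels, cdf


def match_histogram_sketch(source, reference):
    """Map source levels to reference levels with a matching CDF value."""
    src_levels, src_cdf = cumulative_cdf(source)
    ref_levels, ref_cdf = cumulative_cdf(reference)
    lookup = {}
    for level, quantile in zip(src_levels, src_cdf):
        # first reference level whose CDF is not below the source quantile
        idx = min(bisect_left(ref_cdf, quantile), len(ref_levels) - 1)
        lookup[level] = ref_levels[idx]
    return [lookup[value] for value in source]


source = [128, 129, 130, 131, 128, 129, 130, 131]
reference = [0, 5, 10, 15, 0, 5, 10, 15]
matched = match_histogram_sketch(source, reference)
```

The source and reference lists do not need the same length, since only the relative frequencies enter the mapping.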
-
birl.utilities.dataset.image_histogram_matching(source, reference, use_color='hsv', norm_img_size=4096)
adjust image histogram between two images
Optionally transform the image to a more continuous color space. The source and target images do not need to be the same size, but both have to be RGB or gray-scale.
See related information:
- https://www.researchgate.net/post/Histogram_matching_for_color_images
- https://github.com/scikit-image/scikit-image/blob/master/skimage/transform/histogram_matching.py
- https://stackoverflow.com/questions/32655686/histogram-matching-of-two-images-in-python-2-x
- https://github.com/mapbox/rio-hist/issues/3
Parameters:
Return ndarray: transformed image
>>> from birl.utilities.data_io import update_path, load_image
>>> path_imgs = os.path.join(update_path('data_images'), 'rat-kidney_', 'scale-5pc')
>>> img1 = load_image(os.path.join(path_imgs, 'Rat-Kidney_HE.jpg'))
>>> img2 = load_image(os.path.join(path_imgs, 'Rat-Kidney_PanCytokeratin.jpg'))
>>> image_histogram_matching(img1, img2).shape == img1.shape
True
>>> img = image_histogram_matching(img1[..., 0], np.expand_dims(img2[..., 0], 2))
>>> img.shape == img1.shape[:2]
True
>>> # this should return unchanged source image
>>> image_histogram_matching(np.random.random((10, 20, 30, 5)),
...                          np.random.random((30, 10, 20, 5))).ndim
4
-
birl.utilities.dataset.inside_polygon(polygon, point)
check if a point is strictly inside the polygon
Parameters: - polygon (ndarray|list) – polygon contour
- point (tuple|list) – sample point
Return bool: inside
>>> poly = [[1, 1], [1, 3], [3, 3], [3, 1]]
>>> inside_polygon(poly, [0, 0])
False
>>> inside_polygon(poly, [1, 1])
False
>>> inside_polygon(poly, [2, 2])
True
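A strict point-in-polygon test can be sketched with the classic ray casting approach, combined with an explicit check that rejects points lying on the contour. This is an illustrative stand-in and not necessarily how the library decides; the names `on_segment` and `strictly_inside_sketch` are hypothetical.

```python
def on_segment(a, b, p):
    """True if the point p lies on the segment between a and b."""
    cross = (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])
    if cross != 0:
        return False
    return (min(a[0], b[0]) <= p[0] <= max(a[0], b[0])
            and min(a[1], b[1]) <= p[1] <= max(a[1], b[1]))


def strictly_inside_sketch(polygon, point):
    """Ray casting: count crossings of a horizontal ray going to the right."""
    polygon = [tuple(vertex) for vertex in polygon]
    x, y = point
    inside = False
    for a, b in zip(polygon, polygon[1:] + polygon[:1]):
        # points on the contour are not strictly inside
        if on_segment(a, b, point):
            return False
        # does this edge cross the horizontal line at height y?
        if (a[1] > y) != (b[1] > y):
            x_cross = a[0] + (y - a[1]) * (b[0] - a[0]) / (b[1] - a[1])
            if x < x_cross:
                inside = not inside
    return inside


poly = [[1, 1], [1, 3], [3, 3], [3, 1]]
results = [strictly_inside_sketch(poly, p) for p in ([0, 0], [1, 1], [2, 2])]
```

The three sample points are the same as in the doctest above: one outside, one on a corner and one in the interior.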
-
birl.utilities.dataset.is_point_above_line(point_begin, point_end, point_test)
If point is to the left of the line
Parameters:
Return bool: left of the line
>>> is_point_above_line([1, 1], [2, 2], [3, 4])
True
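A common way to decide on which side of a directed line a point lies is the sign of the 2D cross product. The sketch below assumes a mathematical orientation with the y axis pointing upward; the name `is_left_of_line_sketch` is hypothetical and the library may compute the answer differently, for example via line angles.

```python
def is_left_of_line_sketch(point_begin, point_end, point_test):
    """Positive cross product means the test point is left of the line."""
    dx = point_end[0] - point_begin[0]
    dy = point_end[1] - point_begin[1]
    tx = point_test[0] - point_begin[0]
    ty = point_test[1] - point_begin[1]
    return dx * ty - dy * tx > 0


left = is_left_of_line_sketch([1, 1], [2, 2], [3, 4])
right = is_left_of_line_sketch([1, 1], [2, 2], [3, 2])
```

Points lying exactly on the line give a zero cross product and are reported as not left in this sketch.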
-
birl.utilities.dataset.is_point_in_quadrant_left(point_begin, point_end, point_test)
If point is in the left quadrant from the line end point
Note that a negative response does not mean that the point is on the right side
Parameters:
Return int: gives +1 if it is above, -1 if below and 0 elsewhere
>>> is_point_in_quadrant_left([1, 1], [3, 1], [2, 2])
1
>>> is_point_in_quadrant_left([3, 1], [1, 1], [2, 0])
1
>>> is_point_in_quadrant_left([1, 1], [3, 1], [2, 0])
-1
>>> is_point_in_quadrant_left([1, 1], [3, 1], [4, 2])
0
-
birl.utilities.dataset.is_point_inside_perpendicular(point_begin, point_end, point_test)
If point is left of the line and perpendicularly in between the line segment end points
Note that a negative response does not mean that the point is on the right side
Parameters:
Return int: gives +1 if it is above, -1 if below and 0 elsewhere
>>> is_point_inside_perpendicular([1, 1], [3, 1], [2, 2])
1
>>> is_point_inside_perpendicular([1, 1], [3, 1], [2, 0])
-1
>>> is_point_inside_perpendicular([1, 1], [3, 1], [4, 2])
0
-
birl.utilities.dataset.line_angle_2d(point_begin, point_end, deg=True)
Compute direction of line with given two points
the zero is horizontal in direction [1, 0]
Parameters:
Return float: orientation
>>> [line_angle_2d([0, 0], p) for p in ((1, 0), (0, 1), (-1, 0), (0, -1))]
[0.0, 90.0, 180.0, -90.0]
>>> line_angle_2d([1, 1], [2, 3])  # doctest: +ELLIPSIS
63.43...
>>> line_angle_2d([1, 2], [-2, -3])  # doctest: +ELLIPSIS
-120.96...
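The angles in the doctest agree with the usual `atan2` convention, where zero points in direction [1, 0] and positive angles turn towards [0, 1]. A minimal sketch with the standard library follows; the name `angle_2d_sketch` is hypothetical.

```python
import math


def angle_2d_sketch(point_begin, point_end, deg=True):
    """Orientation of the vector from point_begin to point_end."""
    diff_x = point_end[0] - point_begin[0]
    diff_y = point_end[1] - point_begin[1]
    angle = math.atan2(diff_y, diff_x)
    # convert to degrees only on request
    return math.degrees(angle) if deg else angle


angles = [angle_2d_sketch([0, 0], p) for p in ((1, 0), (0, 1), (-1, 0), (0, -1))]
angle_up = angle_2d_sketch([1, 1], [2, 3])
angle_down = angle_2d_sketch([1, 2], [-2, -3])
```

The two extra calls use the same points as the doctest and give roughly 63.43 and -120.96 degrees.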
-
birl.utilities.dataset.list_sub_folders(path_folder, name='*')
list all sub folders with particular name pattern
Parameters:
Return list(str): folders
>>> from birl.utilities.data_io import update_path
>>> paths = list_sub_folders(update_path('data_images'))
>>> list(map(os.path.basename, paths))  # doctest: +ELLIPSIS
['images', 'landmarks', 'lesions_', 'rat-kidney_'...]
-
birl.utilities.dataset.load_large_image(img_path)
loading very large images
Note, for the loading we have to use matplotlib, since neither ImageMagick nor the other libraries (OpenCV, skimage, Pillow) are able to load images larger than 64k or 32k.
Parameters: img_path (str) – path to the image
Return ndarray: image
-
birl.utilities.dataset.norm_angle(angle, deg=True)
Normalise to be in range (-180, 180) degrees
Parameters:
Return float: normalised angle
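Wrapping an angle into one full turn centred at zero can be sketched with the modulo operator. The name `wrap_angle_sketch` is hypothetical, and the treatment of the boundary is an assumption: this sketch maps both +180 and -180 to -180, while the text above does not state which end the library keeps.

```python
import math


def wrap_angle_sketch(angle, deg=True):
    """Wrap the angle into the interval [-half_turn, half_turn)."""
    half_turn = 180.0 if deg else math.pi
    # shift, wrap into one full turn, and shift back
    return (angle + half_turn) % (2 * half_turn) - half_turn


wrapped = [wrap_angle_sketch(a) for a in (45, 190, -270, 720)]
```

Python's modulo returns a non-negative result for a positive divisor, so negative inputs need no special handling.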
-
birl.utilities.dataset.parse_path_scale(path_folder)
parse the scale from the given path with annotation
Parameters: path_folder (str) – path to the scale folder
Return int: scale
>>> parse_path_scale('scale-.1pc')
nan
>>> parse_path_scale('user-JB_scale-50pc')
50
>>> parse_path_scale('scale-10pc')
10
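The folder names in the doctest follow the pattern `scale-<number>pc`, which can be parsed with a regular expression. The sketch below is an assumption about one possible approach, with the hypothetical name `scale_from_folder_sketch`; it returns NaN when no integer scale is found, as the first doctest case suggests.

```python
import math
import re


def scale_from_folder_sketch(path_folder):
    """Extract the integer scale in percent from a folder name."""
    match = re.search(r'scale-(\d+)pc', path_folder)
    # a missing or non-integer scale is reported as NaN
    return int(match.group(1)) if match else math.nan


scales = [scale_from_folder_sketch(name)
          for name in ('user-JB_scale-50pc', 'scale-10pc')]
scale_invalid = scale_from_folder_sketch('scale-.1pc')
```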
-
birl.utilities.dataset.project_object_edge(img, dimension)
scale the image, binarise with Otsu and project to one dimension
Parameters: - img (ndarray) –
- dimension (int) – select dimension for projection
Return list(float):
>>> img = np.zeros((20, 10, 3))
>>> img[2:6, 1:7, :] = 1
>>> img[10:17, 4:6, :] = 1
>>> project_object_edge(img, 0).tolist()  # doctest: +NORMALIZE_WHITESPACE
[0.0, 0.0, 0.7, 0.7, 0.7, 0.7, 0.0, 0.0, 0.0, 0.0,
 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.0, 0.0, 0.0]
-
birl.utilities.dataset.save_large_image(img_path, img)
saving large images, more than 50k x 50k
Note, for the saving we have to use OpenCV, since the other libraries (matplotlib, Pillow, ITK) are not able to save images larger than 32k.
Parameters: - img_path (str) – path to the new image
- img (ndarray) – image
>>> img = np.zeros((2500, 3200, 4), dtype=np.uint8)
>>> img[:, :, 0] = 255
>>> img[:, :, 1] = 127
>>> img_path = './sample-image.jpg'
>>> save_large_image(img_path, img)
>>> img2 = load_large_image(img_path)
>>> img2[0, 0].tolist()
[255, 127, 0]
>>> img.shape[:2] == img2.shape[:2]
True
>>> os.remove(img_path)
>>> img_path = './sample-image.png'
>>> save_large_image(img_path, img.astype(np.uint16) * 255)
>>> img3 = load_large_image(img_path)
>>> img.shape[:2] == img3.shape[:2]
True
>>> img3[0, 0].tolist()
[255, 127, 0]
>>> save_large_image(img_path, img2 / 255. * 1.15)  # test overwrite message
>>> os.remove(img_path)
-
birl.utilities.dataset.scale_large_images_landmarks(images, landmarks)
scale images and landmarks up to maximal image size
Parameters:
Return tuple(list(ndarray),list(ndarray)): lists of images and landmarks
>>> scale_large_images_landmarks([np.zeros((8000, 500, 3), dtype=np.uint8)],
...                              [None, None])  # doctest: +ELLIPSIS
([array(...)], [None, None])
-
birl.utilities.dataset.simplify_polygon(points, tol_degree=5)
simplify path, drop points lying on the same line
Parameters: - points (ndarray) – points of the polygon
- tol_degree (float) – tolerance on change in orientation
Return list(list(float)): points of polygon
>>> pts = [[1, 2], [2, 4], [1, 5], [2, 8], [3, 8], [5, 8], [7, 8], [8, 7],
...        [8, 5], [8, 3], [8, 1], [7, 1], [6, 1], [4, 1], [3, 1], [3, 2], [2, 2]]
>>> simplify_polygon(pts)
[[1, 2], [2, 4], [1, 5], [2, 8], [7, 8], [8, 7], [8, 1], [3, 1], [3, 2]]
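The simplification can be sketched by comparing the direction of the incoming and the outgoing segment at every point of the closed path, and keeping only the points where the direction changes by more than the tolerance. The name `drop_collinear_sketch` is hypothetical; the sketch reproduces the doctest result, though the library may process the path differently.

```python
import math


def drop_collinear_sketch(points, tol_degree=5):
    """Keep only the points where the closed path changes direction."""
    kept = []
    count = len(points)
    for i, point in enumerate(points):
        # neighbours on the closed path, wrapping around at both ends
        prev_pt, next_pt = points[i - 1], points[(i + 1) % count]
        angle_in = math.degrees(math.atan2(point[1] - prev_pt[1],
                                           point[0] - prev_pt[0]))
        angle_out = math.degrees(math.atan2(next_pt[1] - point[1],
                                            next_pt[0] - point[0]))
        # absolute change of orientation wrapped into 0..180 degrees
        turn = abs((angle_out - angle_in + 180) % 360 - 180)
        if turn > tol_degree:
            kept.append(point)
    return kept


pts = [[1, 2], [2, 4], [1, 5], [2, 8], [3, 8], [5, 8], [7, 8], [8, 7],
       [8, 5], [8, 3], [8, 1], [7, 1], [6, 1], [4, 1], [3, 1], [3, 2], [2, 2]]
simplified = drop_collinear_sketch(pts)
```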
-
birl.utilities.dataset.CONVERT_RGB
define pair of forward and backward color space conversion for each of the keys 'hed', 'hsv', 'lab', 'lch', 'luv' and 'rgb'
-
birl.utilities.dataset.IMAGE_EXTENSIONS = ('.png', '.jpg', '.jpeg')
supported image extensions
-
birl.utilities.dataset.MAX_IMAGE_SIZE = 5000
maximal image size for visualisations, larger images will be downscaled