birl.utilities.dataset module
Some functionality related to dataset
Copyright (C) 2016-2019 Jiri Borovec <jiri.borovec@fel.cvut.cz>
-
birl.utilities.dataset.args_expand_images(parser, nb_workers=1, overwrite=True)
expand the parser by standard parameters related to images:
- image paths
- allow overwrite (optional)
- number of jobs
Parameters:
Return obj:
>>> import argparse
>>> args_expand_images(argparse.ArgumentParser())  # doctest: +ELLIPSIS
ArgumentParser(...)
-
birl.utilities.dataset.args_expand_parse_images(parser, nb_workers=1, overwrite=True)
expand the parser by standard parameters related to images:
- image paths
- allow overwrite (optional)
- number of jobs
Parameters:
Return dict:
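The parser expansion described above can be pictured with a minimal sketch. The flag names (`--path_images`, `--overwrite`, `--nb_workers`) and the helper name `expand_image_args` are assumptions chosen for illustration; they are not confirmed by this page and may differ from what the module actually registers.

```python
import argparse


def expand_image_args(parser, nb_workers=1, overwrite=True):
    """Add image-related options to an existing parser (hypothetical flag names)."""
    parser.add_argument('-i', '--path_images', type=str, required=True,
                        help='path (pattern) to the input images')
    if overwrite:
        # the overwrite option is only registered when requested
        parser.add_argument('--overwrite', action='store_true',
                            help='allow overwriting existing files')
    parser.add_argument('--nb_workers', type=int, default=nb_workers,
                        help='number of parallel jobs')
    return parser


# Parse an explicit argument list and turn the namespace into a dictionary.
parser = expand_image_args(argparse.ArgumentParser())
args = vars(parser.parse_args(['-i', './images/*.jpg', '--nb_workers', '2']))
```

Converting the namespace with `vars` mirrors the `dict` return type documented for `args_expand_parse_images`.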
-
birl.utilities.dataset.common_landmarks(points1, points2, threshold=1.5)
find common landmarks in two sets
Parameters:
Return list(bool): flags
>>> np.random.seed(0)
>>> common = np.random.random((5, 2))
>>> pts1 = np.vstack([common, np.random.random((10, 2))])
>>> pts2 = np.vstack([common, np.random.random((15, 2))])
>>> common_landmarks(pts1, pts2, threshold=1e-3)
array([[0, 0],
       [1, 1],
       [2, 2],
       [3, 3],
       [4, 4]])
>>> np.random.shuffle(pts2)
>>> common_landmarks(pts1, pts2, threshold=1e-3)
array([[ 0, 13],
       [ 1, 10],
       [ 2,  9],
       [ 3, 14],
       [ 4,  8]])
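The doctest output suggests the function returns pairs of indices of matching points. A minimal pure-Python sketch of that idea follows; the greedy nearest-neighbour strategy and the name `match_landmarks` are assumptions, not the module's implementation.

```python
import math


def match_landmarks(points1, points2, threshold=1.5):
    """Return index pairs (i, j) with points1[i] closer than `threshold` to points2[j]."""
    pairs = []
    used = set()
    for i, p in enumerate(points1):
        best_j, best_dist = None, threshold
        for j, q in enumerate(points2):
            if j in used:
                continue  # each point of the second set is matched at most once
            dist = math.hypot(p[0] - q[0], p[1] - q[1])
            if dist < best_dist:
                best_j, best_dist = j, dist
        if best_j is not None:
            used.add(best_j)
            pairs.append((i, best_j))
    return pairs


pts_a = [(0, 0), (5, 5), (9, 1)]
pts_b = [(9.2, 1.1), (0.1, 0), (20, 20)]
pairs = match_landmarks(pts_a, pts_b)
```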
-
birl.utilities.dataset.compute_bounding_polygon(landmarks)
get the polygon in which all points lie
Parameters: landmarks (ndarray) – set of points
Return ndarray: points of polygon
>>> np.random.seed(0)
>>> points = np.random.randint(1, 9, (45, 2))
>>> compute_bounding_polygon(points)  # doctest: +NORMALIZE_WHITESPACE
[[1, 2], [2, 4], [1, 5], [2, 8], [7, 8], [8, 7], [8, 1], [3, 1], [3, 2]]
-
birl.utilities.dataset.compute_convex_hull(landmarks)
compute convex hull around landmarks
- http://lagrange.univ-lyon1.fr/docs/scipy/0.17.1/generated/scipy.spatial.ConvexHull.html
- https://stackoverflow.com/questions/21727199
Parameters: landmarks (ndarray) – set of points
Return ndarray: points of polygon
>>> np.random.seed(0)
>>> pts = np.random.randint(15, 30, (10, 2))
>>> compute_convex_hull(pts)
array([[27, 20],
       [27, 25],
       [22, 24],
       [16, 21],
       [15, 18],
       [26, 18]])
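For readers who want to see what a convex hull computation involves, here is a dependency-free sketch using Andrew's monotone chain algorithm. It is a swapped-in technique for illustration: the links above point to `scipy.spatial.ConvexHull`, and the order of the returned vertices may differ from the module's output.

```python
def convex_hull(points):
    """Return hull vertices in counter-clockwise order (Andrew's monotone chain)."""
    pts = sorted(set(map(tuple, points)))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # z-component of (a - o) x (b - o); positive for a counter-clockwise turn
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # the last point of each chain is the first point of the other one
    return lower[:-1] + upper[:-1]


hull = convex_hull([(0, 0), (2, 0), (2, 2), (0, 2), (1, 1), (1, 0)])
```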
-
birl.utilities.dataset.compute_half_polygon(landmarks, idx_start=0, idx_end=-1)
compute half polygon path
Parameters:
Return ndarray: set of points
>>> pts = [(-1, 1), (0, 0), (0, 2), (1, 1), (1, -0.5), (2, 0)]
>>> compute_half_polygon(pts, idx_start=0, idx_end=-1)
[[-1.0, 1.0], [0.0, 2.0], [1.0, 1.0], [2.0, 0.0]]
>>> compute_half_polygon(pts[:2], idx_start=-1, idx_end=0)
[[-1, 1], [0, 0]]
>>> pts = [[0, 2], [1, 5], [2, 4], [2, 5], [4, 4], [4, 6], [4, 8], [5, 8], [5, 8]]
>>> compute_half_polygon(pts)
[[0, 2], [1, 5], [2, 5], [4, 6], [4, 8], [5, 8]]
-
birl.utilities.dataset.convert_landmarks_from_itk(lnds, image_size)
convert landmarks from the ITK format to the format used in ImageJ
Parameters:
Return ndarray: landmarks
>>> convert_landmarks_from_itk([[ 20, 145], [150, 50], [100, 150]], (150, 200))
array([[  5,  20],
       [100, 150],
       [  0, 100]])
>>> lnds = [[ 20, 145], [150, 50], [100, 150], [0, 0], [150, 200]]
>>> img_size = (150, 200)
>>> lnds2 = convert_landmarks_from_itk(convert_landmarks_to_itk(lnds, img_size), img_size)
>>> np.array_equal(lnds, lnds2)
True
-
birl.utilities.dataset.convert_landmarks_to_itk(lnds, image_size)
convert the used (ImageJ) landmarks to the ITK format
Parameters:
Return ndarray: landmarks
>>> convert_landmarks_to_itk([[5, 20], [100, 150], [0, 100]], (150, 200))
array([[ 20, 145],
       [150,  50],
       [100, 150]])
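The doctest values of both conversion functions are consistent with swapping the two coordinates and flipping one of them against `image_size[0]`. The sketch below is inferred from those values only; the helper names `to_itk` and `from_itk` are hypothetical and the module's code may handle more cases.

```python
def to_itk(landmarks, image_size):
    """Swap the coordinates and flip the second one against image_size[0]."""
    size0 = image_size[0]
    return [[second, size0 - first] for first, second in landmarks]


def from_itk(landmarks, image_size):
    """Inverse of `to_itk`: undo the flip, then swap the coordinates back."""
    size0 = image_size[0]
    return [[size0 - second, first] for first, second in landmarks]


img_size = (150, 200)
itk_points = to_itk([[5, 20], [100, 150], [0, 100]], img_size)
round_trip = from_itk(itk_points, img_size)
```

Applying the two helpers in sequence returns the original landmarks, which matches the round-trip check in the `convert_landmarks_from_itk` doctest.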
-
birl.utilities.dataset.detect_binary_blocks(vec_bin)
detect the binary object by beginning, end and length in 1D signal
Parameters: vec_bin (list(bool)) – binary vector with 1 for an object
Return tuple(list(int),list(int),list(int)):
>>> vec = np.array([1] * 15 + [0] * 5 + [1] * 20)
>>> detect_binary_blocks(vec)
([0, 20], [15, 39], [14, 19])
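Detecting blocks in a binary 1D signal amounts to finding runs of consecutive ones. The sketch below uses half-open intervals (exclusive ends), which is an assumption; the boundary convention visible in the doctest above is different, so the numbers are not expected to match it.

```python
def find_runs(vec):
    """Return (begins, ends, lengths) of truthy runs; ends are exclusive."""
    begins, ends = [], []
    prev = 0
    for i, value in enumerate(vec):
        cur = 1 if value else 0
        if cur and not prev:
            begins.append(i)   # rising edge: a run starts here
        elif prev and not cur:
            ends.append(i)     # falling edge: the run ended before this index
        prev = cur
    if prev:
        ends.append(len(vec))  # close a run that reaches the end of the signal
    lengths = [e - b for b, e in zip(begins, ends)]
    return begins, ends, lengths


runs = find_runs([1] * 15 + [0] * 5 + [1] * 20)
```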
-
birl.utilities.dataset.estimate_scaling(images, max_size=5000)
find scaling for given set of images and maximal image size
Parameters:
Return float: scaling in range (0, 1]
>>> estimate_scaling([np.zeros((12000, 300, 3))])  # doctest: +ELLIPSIS
0.4...
>>> estimate_scaling([np.zeros((1200, 800, 3))])
1.0
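The two doctest results are consistent with dividing the maximal size by the largest image side and never upscaling. A sketch under that assumption, working on image shapes instead of arrays (the name `estimate_scale` and the use of the largest side over all images are assumptions):

```python
def estimate_scale(shapes, max_size=5000):
    """Scale factor that brings the largest image side down to `max_size`."""
    largest_side = max(max(shape[:2]) for shape in shapes)
    # never upscale: images that already fit keep the factor 1.0
    return min(1.0, max_size / float(largest_side))


scale_large = estimate_scale([(12000, 300, 3)])
scale_small = estimate_scale([(1200, 800, 3)])
```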
-
birl.utilities.dataset.find_largest_object(hist, threshold=0.01)
find the largest object and give its beginning and end
Parameters:
Return list(int):
>>> vec = np.array([1] * 15 + [0] * 5 + [1] * 20)
>>> find_largest_object(vec)
(20, 39)
-
birl.utilities.dataset.find_split_objects(hist, nb_objects=2, threshold=0.01)
find the N largest objects and set each split in the middle of the gap between them
Parameters:
Return list(int):
>>> vec = np.array([1] * 15 + [0] * 5 + [1] * 20)
>>> find_split_objects(vec)
[17]
-
birl.utilities.dataset.generate_pairing(count, step_hide=None)
generate registration pairs with an option of hidden landmarks
Parameters:
- count (int) – total number of samples
- step_hide (int|None) – hide every N-th sample
Return list((int, int)), list(bool): registration pairs
>>> generate_pairing(4, None)  # doctest: +NORMALIZE_WHITESPACE
([(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)],
 [True, True, True, True, True, True])
>>> generate_pairing(4, step_hide=3)  # doctest: +NORMALIZE_WHITESPACE
([(0, 1), (0, 2), (1, 2), (3, 1), (3, 2)],
 [False, False, True, False, False])
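The pairing logic can be reconstructed from the two doctest outputs: every N-th sample is treated as hidden, pairs of two hidden samples are dropped, a hidden sample is placed first in its pair, and the flag is true only when neither sample is hidden. The sketch below encodes that reading; it is inferred from the doctest alone and the name `make_pairs` is hypothetical.

```python
from itertools import combinations


def make_pairs(count, step_hide=None):
    """Return (pairs, flags) as inferred from the `generate_pairing` doctest."""
    hidden = set(range(0, count, step_hide)) if step_hide else set()
    pairs, flags = [], []
    for i, j in combinations(range(count), 2):
        if i in hidden and j in hidden:
            continue           # both samples hidden: skip the pair
        if j in hidden:
            i, j = j, i        # put the hidden sample first
        pairs.append((i, j))
        flags.append(i not in hidden)
    return pairs, flags


all_pairs = make_pairs(4, None)
hidden_pairs = make_pairs(4, step_hide=3)
```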
-
birl.utilities.dataset.get_close_diag_corners(points)
find the points closest to the top left and bottom right corner
Parameters: points (ndarray) – set of points
Return tuple(ndarray,ndarray): begin and end of imaginary diagonal
>>> np.random.seed(0)
>>> points = np.random.randint(1, 9, (20, 2))
>>> get_close_diag_corners(points)
(array([1, 2]), array([7, 8]), (12, 10))
-
birl.utilities.dataset.histogram_match_cumulative_cdf(source, reference, norm_img_size=1024)
Adjust the pixel values of a gray-scale image such that its histogram matches that of a reference image
Parameters:
- source (ndarray) – 2D image to be transformed, np.array<height1, width1>
- reference (ndarray) – reference 2D image, np.array<height2, width2>
Return ndarray: transformed image, np.array<height1, width1>
>>> np.random.seed(0)
>>> img = histogram_match_cumulative_cdf(np.random.randint(128, 145, (150, 200)),
...                                      np.random.randint(0, 18, (200, 180)))
>>> img.astype(int)  # doctest: +NORMALIZE_WHITESPACE +ELLIPSIS
array([[13, 16,  0, ..., 12,  2,  5],
       [17,  9,  1, ..., 16,  9,  0],
       [11, 12, 14, ...,  8,  5,  4],
       ...,
       [12,  6,  3, ..., 15,  0,  3],
       [11, 17,  2, ..., 12, 12,  5],
       [ 6, 12,  3, ...,  8,  0,  1]])
>>> np.bincount(img.ravel()).astype(int)  # doctest: +NORMALIZE_WHITESPACE
array([1705, 1706, 1728, 1842, 1794, 1866, 1771,    0, 1717, 1752, 1757,
       1723, 1823, 1833, 1749, 1718, 1769, 1747])
>>> img_source = np.random.randint(50, 245, (2500, 3000)).astype(float)
>>> img_source[-1, -1] = 255
>>> img = histogram_match_cumulative_cdf(img_source / 255., img)
>>> np.array(img.shape, dtype=int)
array([2500, 3000])
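Histogram matching through cumulative distribution functions (CDF) can be sketched without any image library: compute the CDF of both value sets and map each source level to the reference level whose CDF first reaches the same quantile. This is a simplified illustration on flat lists; the name `match_histogram` and the nearest-quantile lookup (instead of interpolation) are assumptions, not the module's method.

```python
from bisect import bisect_left
from collections import Counter


def cumulative_distribution(values):
    """Return sorted distinct levels and their cumulative frequencies."""
    counts = Counter(values)
    levels = sorted(counts)
    total = float(len(values))
    cdf, running = [], 0
    for level in levels:
        running += counts[level]
        cdf.append(running / total)
    return levels, cdf


def match_histogram(source, reference):
    """Map each source value to the reference value at the same quantile."""
    src_levels, src_cdf = cumulative_distribution(source)
    ref_levels, ref_cdf = cumulative_distribution(reference)
    mapping = {}
    for level, quantile in zip(src_levels, src_cdf):
        # first reference level whose CDF reaches the source quantile
        idx = min(bisect_left(ref_cdf, quantile), len(ref_levels) - 1)
        mapping[level] = ref_levels[idx]
    return [mapping[value] for value in source]


matched = match_histogram([128, 129, 130, 131], [0, 5, 10, 15])
```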
-
birl.utilities.dataset.image_histogram_matching(source, reference, use_color='hsv', norm_img_size=4096)
adjust image histogram between two images
Optionally transform the image to a more continuous color space. The source and target images do not need to be the same size, but both have to be RGB or gray-scale.
See related information:
- https://www.researchgate.net/post/Histogram_matching_for_color_images
- https://github.com/scikit-image/scikit-image/blob/master/skimage/transform/histogram_matching.py
- https://stackoverflow.com/questions/32655686/histogram-matching-of-two-images-in-python-2-x
- https://github.com/mapbox/rio-hist/issues/3
Parameters:
Return ndarray: transformed image
>>> from birl.utilities.data_io import update_path, load_image
>>> path_imgs = os.path.join(update_path('data_images'), 'rat-kidney_', 'scale-5pc')
>>> img1 = load_image(os.path.join(path_imgs, 'Rat-Kidney_HE.jpg'))
>>> img2 = load_image(os.path.join(path_imgs, 'Rat-Kidney_PanCytokeratin.jpg'))
>>> image_histogram_matching(img1, img2).shape == img1.shape
True
>>> img = image_histogram_matching(img1[..., 0], np.expand_dims(img2[..., 0], 2))
>>> img.shape == img1.shape[:2]
True
>>> # this should return unchanged source image
>>> image_histogram_matching(np.random.random((10, 20, 30, 5)),
...                          np.random.random((30, 10, 20, 5))).ndim
4
-
birl.utilities.dataset.inside_polygon(polygon, point)
check if a point is strictly inside the polygon
Parameters:
- polygon (ndarray|list) – polygon contour
- point (tuple|list) – sample point
Return bool: inside
>>> poly = [[1, 1], [1, 3], [3, 3], [3, 1]]
>>> inside_polygon(poly, [0, 0])
False
>>> inside_polygon(poly, [1, 1])
False
>>> inside_polygon(poly, [2, 2])
True
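A strict point-in-polygon test can be sketched with ray casting plus an explicit boundary check, so that points on an edge or a vertex count as outside. This is one common technique and not necessarily the one used by the module; the helper names are hypothetical.

```python
def on_segment(a, b, p, eps=1e-9):
    """True if point p lies on the segment a-b (within a small tolerance)."""
    cross = (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])
    if abs(cross) > eps:
        return False
    return (min(a[0], b[0]) - eps <= p[0] <= max(a[0], b[0]) + eps
            and min(a[1], b[1]) - eps <= p[1] <= max(a[1], b[1]) + eps)


def strictly_inside(polygon, point):
    """Ray casting; points on the contour are reported as not inside."""
    inside = False
    nb = len(polygon)
    for k in range(nb):
        a, b = polygon[k], polygon[(k + 1) % nb]
        if on_segment(a, b, point):
            return False
        # does a horizontal ray going right from `point` cross this edge?
        if (a[1] > point[1]) != (b[1] > point[1]):
            x_cross = a[0] + (point[1] - a[1]) * (b[0] - a[0]) / float(b[1] - a[1])
            if point[0] < x_cross:
                inside = not inside
    return inside


poly = [[1, 1], [1, 3], [3, 3], [3, 1]]
results = [strictly_inside(poly, p) for p in ([0, 0], [1, 1], [2, 2])]
```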
-
birl.utilities.dataset.is_point_above_line(point_begin, point_end, point_test)
check whether the point is to the left of the line
Parameters:
Return bool: left from line
>>> is_point_above_line([1, 1], [2, 2], [3, 4])
True
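Whether a point lies to the left of a directed line is commonly decided by the sign of a 2D cross product. The sketch below uses that approach as an assumption; the module may compute the side differently (for example through angles), and the name `is_left_of_line` is hypothetical.

```python
def is_left_of_line(point_begin, point_end, point_test):
    """True if `point_test` is on the left when walking from begin to end."""
    cross = ((point_end[0] - point_begin[0]) * (point_test[1] - point_begin[1])
             - (point_end[1] - point_begin[1]) * (point_test[0] - point_begin[0]))
    return cross > 0


left = is_left_of_line([1, 1], [2, 2], [3, 4])
right = is_left_of_line([1, 1], [2, 2], [3, 2])
```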
-
birl.utilities.dataset.is_point_in_quadrant_left(point_begin, point_end, point_test)
check whether the point is in the left quadrant relative to the line end point
Note
a negative response does not mean that the point is on the right side
Parameters:
Return int: gives +1 if it is above, -1 if below and 0 elsewhere
>>> is_point_in_quadrant_left([1, 1], [3, 1], [2, 2])
1
>>> is_point_in_quadrant_left([3, 1], [1, 1], [2, 0])
1
>>> is_point_in_quadrant_left([1, 1], [3, 1], [2, 0])
-1
>>> is_point_in_quadrant_left([1, 1], [3, 1], [4, 2])
0
-
birl.utilities.dataset.is_point_inside_perpendicular(point_begin, point_end, point_test)
check whether the point is to the left of the line and perpendicularly in between the line segment end points
Note
a negative response does not mean that the point is on the right side
Parameters:
Return int: gives +1 if it is above, -1 if below and 0 elsewhere
>>> is_point_inside_perpendicular([1, 1], [3, 1], [2, 2])
1
>>> is_point_inside_perpendicular([1, 1], [3, 1], [2, 0])
-1
>>> is_point_inside_perpendicular([1, 1], [3, 1], [4, 2])
0
-
birl.utilities.dataset.line_angle_2d(point_begin, point_end, deg=True)
Compute the direction of a line given by two points
the zero is horizontal in direction [1, 0]
Parameters:
Return float: orientation
>>> [line_angle_2d([0, 0], p) for p in ((1, 0), (0, 1), (-1, 0), (0, -1))]
[0.0, 90.0, 180.0, -90.0]
>>> line_angle_2d([1, 1], [2, 3])  # doctest: +ELLIPSIS
63.43...
>>> line_angle_2d([1, 2], [-2, -3])  # doctest: +ELLIPSIS
-120.96...
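The doctest values agree with the two-argument arctangent of the coordinate differences. A minimal sketch under that assumption (the name `angle_2d` is hypothetical):

```python
import math


def angle_2d(point_begin, point_end, deg=True):
    """Direction of the vector begin -> end, zero along [1, 0]."""
    diff_x = point_end[0] - point_begin[0]
    diff_y = point_end[1] - point_begin[1]
    angle = math.atan2(diff_y, diff_x)
    return math.degrees(angle) if deg else angle


axis_angles = [angle_2d([0, 0], p) for p in ((1, 0), (0, 1), (-1, 0), (0, -1))]
steep = angle_2d([1, 1], [2, 3])
backward = angle_2d([1, 2], [-2, -3])
```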
-
birl.utilities.dataset.list_sub_folders(path_folder, name='*')
list all sub folders with particular name pattern
Parameters:
Return list(str): folders
>>> from birl.utilities.data_io import update_path
>>> paths = list_sub_folders(update_path('data_images'))
>>> list(map(os.path.basename, paths))  # doctest: +ELLIPSIS
['images', 'landmarks', 'lesions_', 'rat-kidney_'...]
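Listing sub-folders by a name pattern can be sketched with `glob` and a directory filter. The sorted order and the helper name `list_folders` are assumptions; the demo builds a temporary folder tree so that it does not depend on the project's sample data.

```python
import glob
import os
import tempfile


def list_folders(path_folder, name='*'):
    """List sub-folders of `path_folder` whose names match the glob pattern."""
    candidates = glob.glob(os.path.join(path_folder, name))
    return sorted(p for p in candidates if os.path.isdir(p))


with tempfile.TemporaryDirectory() as root:
    for sub in ('images', 'landmarks'):
        os.mkdir(os.path.join(root, sub))
    # a plain file must not show up among the folders
    open(os.path.join(root, 'notes.txt'), 'w').close()
    names = [os.path.basename(p) for p in list_folders(root)]
    filtered = [os.path.basename(p) for p in list_folders(root, 'land*')]
```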
-
birl.utilities.dataset.load_large_image(img_path)
loading very large images
Note
For the loading we have to use matplotlib since neither ImageMagick nor other libraries (opencv, skimage, Pillow) are able to load images larger than 64k or 32k.
Parameters: img_path (str) – path to the image
Return ndarray: image
-
birl.utilities.dataset.norm_angle(angle, deg=True)
Normalise to be in range (-180, 180) degrees
Parameters:
Return float: normalised angle
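Angle normalisation can be sketched by shifting the value by full turns until it falls into the target range. How the module treats the boundary values of exactly ±180 degrees is not stated here, so this sketch leaves them unchanged as an assumption; the name `normalise_angle` is hypothetical.

```python
import math


def normalise_angle(angle, deg=True):
    """Shift the angle by full turns until it lies within [-half, half]."""
    half_turn = 180.0 if deg else math.pi
    while angle > half_turn:
        angle -= 2 * half_turn
    while angle < -half_turn:
        angle += 2 * half_turn
    return angle


normalised = [normalise_angle(a) for a in (270, -190, 45, 720)]
```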
-
birl.utilities.dataset.parse_path_scale(path_folder)
parse the scale from a given path with annotation
Parameters: path_folder (str) – path to the scale folder
Return int: scale
>>> parse_path_scale('scale-.1pc')
nan
>>> parse_path_scale('user-JB_scale-50pc')
50
>>> parse_path_scale('scale-10pc')
10
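The three doctest cases can be reproduced with a regular expression that accepts only an integer percentage after `scale-`. The pattern, the use of the folder base name, and the helper name `parse_scale` are assumptions made for this sketch.

```python
import math
import os
import re


def parse_scale(path_folder):
    """Extract the integer from a 'scale-<N>pc' folder name, NaN if absent."""
    folder = os.path.basename(path_folder)
    match = re.search(r'scale-(\d+)pc', folder)
    if match is None:
        return float('nan')
    return int(match.group(1))


scales = [parse_scale(p) for p in ('user-JB_scale-50pc', 'scale-10pc')]
missing = parse_scale('scale-.1pc')
```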
-
birl.utilities.dataset.project_object_edge(img, dimension)
scale the image, binarise with Otsu and project to one dimension
Parameters:
- img (ndarray) –
- dimension (int) – select dimension for projection
Return list(float):
>>> img = np.zeros((20, 10, 3))
>>> img[2:6, 1:7, :] = 1
>>> img[10:17, 4:6, :] = 1
>>> project_object_edge(img, 0).tolist()  # doctest: +NORMALIZE_WHITESPACE
[0.0, 0.0, 0.7, 0.7, 0.7, 0.7, 0.0, 0.0, 0.0, 0.0,
 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.0, 0.0, 0.0]
-
birl.utilities.dataset.save_large_image(img_path, img)
saving large images of more than 50k x 50k
Note
For the saving we have to use openCV since other libraries (matplotlib, Pillow, ITK) are not able to save images larger than 32k.
Parameters:
- img_path (str) – path to the new image
- img (ndarray) – image
>>> img = np.zeros((2500, 3200, 4), dtype=np.uint8)
>>> img[:, :, 0] = 255
>>> img[:, :, 1] = 127
>>> img_path = './sample-image.jpg'
>>> save_large_image(img_path, img)
>>> img2 = load_large_image(img_path)
>>> img2[0, 0].tolist()
[255, 127, 0]
>>> img.shape[:2] == img2.shape[:2]
True
>>> os.remove(img_path)
>>> img_path = './sample-image.png'
>>> save_large_image(img_path, img.astype(np.uint16) * 255)
>>> img3 = load_large_image(img_path)
>>> img.shape[:2] == img3.shape[:2]
True
>>> img3[0, 0].tolist()
[255, 127, 0]
>>> save_large_image(img_path, img2 / 255. * 1.15)  # test overwrite message
>>> os.remove(img_path)
-
birl.utilities.dataset.scale_large_images_landmarks(images, landmarks)
scale images and landmarks up to maximal image size
Parameters:
Return tuple(list(ndarray),list(ndarray)): lists of images and landmarks
>>> scale_large_images_landmarks([np.zeros((8000, 500, 3), dtype=np.uint8)],
...                              [None, None])  # doctest: +ELLIPSIS
([array(...)], [None, None])
-
birl.utilities.dataset.simplify_polygon(points, tol_degree=5)
simplify path, drop points lying on the same line
Parameters:
- points (ndarray) – points of the polygon
- tol_degree (float) – tolerance on change in orientation
Return list(list(float)): points of polygon
>>> pts = [[1, 2], [2, 4], [1, 5], [2, 8], [3, 8], [5, 8], [7, 8], [8, 7],
...        [8, 5], [8, 3], [8, 1], [7, 1], [6, 1], [4, 1], [3, 1], [3, 2], [2, 2]]
>>> simplify_polygon(pts)
[[1, 2], [2, 4], [1, 5], [2, 8], [7, 8], [8, 7], [8, 1], [3, 1], [3, 2]]
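Dropping points that lie on a straight line can be sketched by comparing the incoming and outgoing edge direction at every vertex of the closed polygon and keeping only vertices where the direction changes by more than the tolerance. This reading reproduces the doctest result, but it is an inferred sketch; the name `drop_collinear` is hypothetical.

```python
import math


def drop_collinear(points, tol_degree=5):
    """Keep only the vertices where the closed path turns by more than the tolerance."""
    nb = len(points)
    kept = []
    for k in range(nb):
        prev, cur, nxt = points[k - 1], points[k], points[(k + 1) % nb]
        angle_in = math.degrees(math.atan2(cur[1] - prev[1], cur[0] - prev[0]))
        angle_out = math.degrees(math.atan2(nxt[1] - cur[1], nxt[0] - cur[0]))
        # wrap the difference into [-180, 180) before taking its magnitude
        turn = abs((angle_out - angle_in + 180.0) % 360.0 - 180.0)
        if turn > tol_degree:
            kept.append(list(cur))
    return kept


pts = [[1, 2], [2, 4], [1, 5], [2, 8], [3, 8], [5, 8], [7, 8], [8, 7],
       [8, 5], [8, 3], [8, 1], [7, 1], [6, 1], [4, 1], [3, 1], [3, 2], [2, 2]]
simplified = drop_collinear(pts)
```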
-
birl.utilities.dataset.CONVERT_RGB
dictionary with the keys 'hed', 'hsv', 'lab', 'lch', 'luv' and 'rgb'; each value defines a pair of forward and backward color space conversion
-
birl.utilities.dataset.IMAGE_EXTENSIONS = ('.png', '.jpg', '.jpeg')
supported image extensions
-
birl.utilities.dataset.MAX_IMAGE_SIZE = 5000
maximal image size for visualisations, larger images will be downscaled