Annotation¶
-
class pylidc.Annotation
(**kwargs)[source]¶ The Annotation model class holds the information from a single physician’s annotation of a nodule >= 3mm in a particular scan. An annotation has many contours, each of which refers to the contour drawn for the nodule in one scan slice.
-
subtlety
¶ int, range = {1,2,3,4,5} – Difficulty of detection. Higher values indicate easier detection.
1. ‘Extremely Subtle’
2. ‘Moderately Subtle’
3. ‘Fairly Subtle’
4. ‘Moderately Obvious’
5. ‘Obvious’
-
internalStructure
¶ int, range = {1,2,3,4} – Internal composition of the nodule.
1. ‘Soft Tissue’
2. ‘Fluid’
3. ‘Fat’
4. ‘Air’
-
calcification
¶ int, range = {1,2,3,4,6} – Pattern of calcification, if present.
1. ‘Popcorn’
2. ‘Laminated’
3. ‘Solid’
4. ‘Non-central’
5. ‘Central’
6. ‘Absent’
-
sphericity
¶ int, range = {1,2,3,4,5} – The three-dimensional shape of the nodule in terms of its roundness.
1. ‘Linear’
2. ‘Ovoid/Linear’
3. ‘Ovoid’
4. ‘Ovoid/Round’
5. ‘Round’
-
margin
¶ int, range = {1,2,3,4,5} – Description of how well-defined the nodule margin is.
1. ‘Poorly Defined’
2. ‘Near Poorly Defined’
3. ‘Medium Margin’
4. ‘Near Sharp’
5. ‘Sharp’
-
lobulation
¶ int, range = {1,2,3,4,5} – The degree of lobulation, ranging from none to marked.
1. ‘No Lobulation’
2. ‘Nearly No Lobulation’
3. ‘Medium Lobulation’
4. ‘Near Marked Lobulation’
5. ‘Marked Lobulation’
-
spiculation
¶ int, range = {1,2,3,4,5} – The extent of spiculation present.
1. ‘No Spiculation’
2. ‘Nearly No Spiculation’
3. ‘Medium Spiculation’
4. ‘Near Marked Spiculation’
5. ‘Marked Spiculation’
-
texture
¶ int, range = {1,2,3,4,5} – Radiographic solidity: internal texture (solid, ground glass, or mixed).
1. ‘Non-Solid/GGO’
2. ‘Non-Solid/Mixed’
3. ‘Part Solid/Mixed’
4. ‘Solid/Mixed’
5. ‘Solid’
-
malignancy
¶ int, range = {1,2,3,4,5} – Subjective assessment of the likelihood of malignancy, assuming the scan originated from a 60-year-old male smoker.
1. ‘Highly Unlikely’
2. ‘Moderately Unlikely’
3. ‘Indeterminate’
4. ‘Moderately Suspicious’
5. ‘Highly Suspicious’
Example
A short usage example for the Annotation class:
    import pylidc as pl

    # Get the first annotation with spiculation value greater than 3.
    ann = pl.query(pl.Annotation)\
            .filter(pl.Annotation.spiculation > 3).first()

    print(ann.spiculation)
    # => 4

    # Each nodule feature has a corresponding property
    # to print the semantic value.
    print(ann.Spiculation)
    # => Near Marked Spiculation

    print("%.2f, %.2f, %.2f" % (ann.diameter,
                                ann.surface_area,
                                ann.volume))
    # => 17.98, 1221.40, 1033.70
-
Calcification
¶ Semantic interpretation of calcification value as string.
-
InternalStructure
¶ Semantic interpretation of internalStructure value as string.
-
Lobulation
¶ Semantic interpretation of lobulation value as string.
-
Malignancy
¶ Semantic interpretation of malignancy value as string.
-
Margin
¶ Semantic interpretation of margin value as string.
-
Sphericity
¶ Semantic interpretation of sphericity value as string.
-
Spiculation
¶ Semantic interpretation of spiculation value as string.
-
Subtlety
¶ Semantic interpretation of subtlety value as string.
-
Texture
¶ Semantic interpretation of texture value as string.
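The relationship between a lowercase numeric attribute and its capitalized string property can be pictured as a 1-based lookup into the ordered labels listed for that feature. The following is a minimal sketch of that idea, not pylidc's implementation; `SPICULATION_LABELS` and `semantic_label` are hypothetical names introduced only for illustration.

```python
# Hypothetical lookup table, copied from the spiculation labels listed above.
SPICULATION_LABELS = [
    "No Spiculation",
    "Nearly No Spiculation",
    "Medium Spiculation",
    "Near Marked Spiculation",
    "Marked Spiculation",
]


def semantic_label(value, labels):
    """Map a 1-based feature value to its label (hypothetical helper)."""
    if not 1 <= value <= len(labels):
        raise ValueError("feature value %r is out of range" % (value,))
    return labels[value - 1]


print(semantic_label(4, SPICULATION_LABELS))
# => Near Marked Spiculation
```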
-
bbox
(pad=None)[source]¶ Returns a tuple of Python slice objects that can be used to index into the image volume corresponding to the extent of the (padded) bounding box.
Parameters: pad (int, list of ints, or float, default=None) –
- If None (default), then no padding is used.
- If an integer is provided, then the bounding box is padded uniformly by this integer amount.
- If a list of integers is provided, then it is of the form:
[(i1,i2), (j1,j2), (k1,k2)]
and indicates the pad amounts before and after the box along each coordinate axis.
- If a float is provided, then the slices are padded such that the bounding box occupies at least pad physical units (using the corresponding scan pixel_spacing and slice_spacing parameters). This means the returned slice indices will yield a bounding box that is at least pad millimeters along each coordinate axis direction.
Note
In the various pad cases above, borders are handled so that if a pad beyond the image borders is requested, then it is set to the maximum (or minimum, depending on the direction) possible index.
Returns: bb – bb is the corresponding bounding box (with desired padding) in the CT image volume. bb[i] is a slice corresponding to the extent of the bounding box along the coordinate axis i.
Return type: 3-tuple of Python slice objects.
Example
The example below illustrates the various pad argument types:
    import pylidc as pl

    ann = pl.query(pl.Annotation).first()
    vol = ann.scan.to_volume()

    print(ann.bbox())
    # => (slice(151, 185, None), slice(349, 376, None), slice(44, 50, None))

    print(vol[ann.bbox()].shape)
    # => (34, 27, 6)

    print(vol[ann.bbox(pad=2)].shape)
    # => (38, 31, 10)

    print(vol[ann.bbox(pad=[(1,2), (3,0), (2,4)])].shape)
    # => (37, 30, 12)

    print(max(ann.bbox_dims()))
    # => 21.45

    print(vol[ann.bbox(pad=30.0)].shape)
    # => (48, 49, 12)

    print(ann.bbox_dims(pad=30.0))
    # => [30.55, 31.200000000000003, 33.0]
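The integer and list padding rules, together with the border handling described in the note, can be sketched without pylidc. This is an illustration under stated assumptions rather than the library's code: `padded_slices` is a hypothetical helper, the volume shape is made up, and the float (millimeter) case is left out.

```python
def padded_slices(bounds, pad, shape):
    """Hypothetical helper: pad inclusive (start, stop) bounds, clamp to the
    volume borders, and return slices with an exclusive stop."""
    if isinstance(pad, int):
        pad = [(pad, pad)] * len(bounds)
    slices = []
    for (lo, hi), (before, after), size in zip(bounds, pad, shape):
        start = max(lo - before, 0)           # clamp at the lower border
        stop = min(hi + after, size - 1) + 1  # clamp, then make exclusive
        slices.append(slice(start, stop))
    return tuple(slices)


# Inclusive bounds matching the unpadded bbox printed in the example above.
bounds = [(151, 184), (349, 375), (44, 49)]
shape = (512, 512, 133)  # assumed volume shape, for illustration only

print([s.stop - s.start for s in padded_slices(bounds, 2, shape)])
# => [38, 31, 10]
print([s.stop - s.start
       for s in padded_slices(bounds, [(1, 2), (3, 0), (2, 4)], shape)])
# => [37, 30, 12]
```

A pad larger than the distance to a border simply stops at that border, so a very large pad yields slices spanning the whole axis.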
-
bbox_dims
(pad=None)[source]¶ Return the physical dimensions of the nodule bounding box in millimeters along each coordinate axis.
Parameters: pad (int, list, or float, default=None) – See pylidc.Annotation.bbox() for a description of this argument.
Returns: dims – dims[i] is the length in millimeters of the bounding box along the coordinate axis i.
Return type: ndarray, shape=(3,)
Example
An example where we compare the bounding box volume vs the nodule volume:
    import numpy as np
    import pylidc as pl

    ann = pl.query(pl.Annotation).first()

    print("%.2f mm^3, %.2f mm^3" % (ann.volume, np.prod(ann.bbox_dims())))
    # => 2439.30 mm^3, 5437.58 mm^3
-
bbox_matrix
(pad=None)[source]¶ The bbox function returns a tuple of slices to be used to index into an image volume. In contrast, bbox_matrix returns a 3x2 matrix where each row holds the (start, stop) indices along the i, j, and k axes, respectively.
Parameters: pad (int, list, or float) – See pylidc.Annotation.bbox() for a description of this argument.
Note
The indices returned by bbox_matrix are inclusive, whereas the slice objects in the tuple returned by bbox have their “stop” index offset by +1.
Returns: bb_mat – bb_mat[i] holds the start and stop indices (inclusive) of the bounding box along coordinate axis i.
Return type: ndarray, shape=(3,2)
Example
An example of the difference between bbox and bbox_matrix:
    import pylidc as pl

    ann = pl.query(pl.Annotation).first()

    bb = ann.bbox()
    bm = ann.bbox_matrix()

    print(all([bm[i,0] == bb[i].start for i in range(3)]))
    # => True

    print(all([bm[i,1]+1 == bb[i].stop for i in range(3)]))
    # => True
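The inclusive versus exclusive "stop" convention can be made concrete with a pair of converters. Both functions below are hypothetical helpers written for this illustration (they are not part of pylidc), and the index values are taken from the bbox example output shown earlier on this page.

```python
def matrix_to_slices(bb_mat):
    """Inclusive (start, stop) rows -> slices with an exclusive stop."""
    return tuple(slice(int(start), int(stop) + 1) for start, stop in bb_mat)


def slices_to_matrix(bb):
    """Slices with an exclusive stop -> inclusive (start, stop) rows."""
    return [[s.start, s.stop - 1] for s in bb]


bb_mat = [[151, 184], [349, 375], [44, 49]]
bb = matrix_to_slices(bb_mat)

print(bb)
# => (slice(151, 185, None), slice(349, 376, None), slice(44, 50, None))
print(slices_to_matrix(bb) == bb_mat)
# => True
```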
-
boolean_mask
(pad=None, bbox=None)[source]¶ A boolean volume where True indicates nodule and False indicates non-nodule. The mask volume covers the extent of the voxels in the image volume given by the annotation’s bounding box, i.e., the mask volume would be placed in the full image volume according to the slices returned by bbox().
Parameters: - pad (int, list, or float, default=None) – See pylidc.Annotation.bbox() for a description of this argument.
- bbox (3x2 NumPy array, default=None) – If bbox is provided, then pad is ignored. This argument allows for more fine-tuned control of placement of the mask in a volume, or for pre-computation of bbox when working with multiple Annotation objects.
Example
An example:
    import pylidc as pl

    ann = pl.query(pl.Annotation).first()
    vol = ann.scan.to_volume()

    mask = ann.boolean_mask()
    bbox = ann.bbox()

    print("Avg HU inside nodule: %.1f" % vol[bbox][mask].mean())
    # => Avg HU inside nodule: -280.0

    print("Avg HU outside nodule: %.1f" % vol[bbox][~mask].mean())
    # => Avg HU outside nodule: -732.2
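Because the mask only covers the bounding box, placing it in a full-size volume is a matter of assigning it through the bbox slices. The sketch below uses small made-up arrays as stand-ins for the results of ann.bbox() and ann.boolean_mask(); the shapes and values are assumptions chosen only to keep the output easy to check.

```python
import numpy as np

# Stand-ins for ann.bbox() and ann.boolean_mask(); all values are made up.
bbox = (slice(2, 5), slice(1, 4), slice(0, 2))
mask = np.zeros((3, 3, 2), dtype=bool)
mask[1, 1, :] = True  # two "nodule" voxels inside the box

full_shape = (8, 8, 4)  # hypothetical shape of the full image volume
full_mask = np.zeros(full_shape, dtype=bool)
full_mask[bbox] = mask  # place the box-sized mask at its location

print(int(full_mask.sum()))
# => 2
print(np.argwhere(full_mask).tolist())
# => [[3, 2, 0], [3, 2, 1]]
```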
-
centroid
¶ The center of mass of the nodule as determined by its radiologist-drawn contours.
Example
An example of plotting the centroid on a CT image slice:
    import pylidc as pl
    import matplotlib.pyplot as plt

    ann = pl.query(pl.Annotation).first()
    i,j,k = ann.centroid

    vol = ann.scan.to_volume()

    plt.imshow(vol[:,:,int(k)], cmap=plt.cm.gray)
    plt.plot(j, i, '.r', label="Nodule centroid")
    plt.legend()
    plt.show()
Returns: centr – centr[i] is the average index value of all contour index values for coordinate axis i. Return type: ndarray, shape=(3,)
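The "average index value" in the return description amounts to a per-axis mean over the contour points. A minimal sketch with made-up (i, j, k) index values, where `contour_points` stands in for the annotation's stacked contour coordinates:

```python
import numpy as np

# Made-up (i, j, k) contour index values, one row per contour point.
contour_points = np.array([
    [10.0, 20.0, 5.0],
    [12.0, 20.0, 5.0],
    [12.0, 24.0, 6.0],
    [10.0, 24.0, 6.0],
])

# Average each coordinate axis separately.
centr = contour_points.mean(axis=0)

print(centr.tolist())
# => [11.0, 22.0, 5.5]
```

Since the result is generally fractional, it has to be converted with int() before it can index a slice, as in the plotting example above.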
-
contour_slice_indices
¶ Returns an array of indices into the scan where each contour belongs. An example should clarify:
    import pylidc as pl

    ann = pl.query(pl.Annotation).first()

    zvals = ann.contour_slice_zvals
    kvals = ann.contour_slice_indices
    scan_zvals = ann.scan.slice_zvals

    for k,z in zip(kvals, zvals):
        # the two z values should be the same (up to machine precision)
        print(k, z, scan_zvals[k])
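One way to picture how contour z-coordinates relate to slice indices is a nearest-value match against the scan's slice z-values. This is an illustrative sketch with made-up numbers, and it is not necessarily how pylidc computes the property.

```python
import numpy as np

# Made-up z-coordinates (mm) of the scan slices and of the contours.
scan_zvals = np.array([-120.0, -117.5, -115.0, -112.5, -110.0])
contour_zvals = np.array([-117.5, -115.0, -112.5])

# For each contour z, take the index of the closest scan slice z.
diffs = np.abs(scan_zvals[:, None] - contour_zvals[None, :])
kvals = diffs.argmin(axis=0)

print(kvals.tolist())
# => [1, 2, 3]
```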
-
contour_slice_zvals
¶ An array of unique z-coordinates for the contours.
-
contours_matrix
¶ All the contour index values as a 3D numpy array.
-
diameter
¶ Estimate the greatest axial plane diameter using the annotation’s contours. This estimation does not currently account for cases where the diameter passes outside the boundary of the nodule, or through cavities within the nodule.
Returns: diam – The maximal diameter as float, accounting for the axial-plane resolution of the scan. The units are mm. Return type: float
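The idea of a greatest in-plane diameter scaled by the scan resolution can be sketched for a single contour as the largest pairwise distance between its points. This is a rough illustration under assumptions, not pylidc's algorithm: `max_axial_diameter` is a hypothetical helper, the pixel spacing is made up, and, like the estimate described above, it ignores whether the connecting segment leaves the nodule.

```python
import numpy as np


def max_axial_diameter(points_ij, pixel_spacing):
    """Largest pairwise distance (mm) between in-plane contour points."""
    pts = np.asarray(points_ij, dtype=float) * pixel_spacing
    diffs = pts[:, None, :] - pts[None, :, :]
    return float(np.sqrt((diffs ** 2).sum(axis=-1)).max())


# A 10x10 pixel square contour with an assumed 0.7 mm pixel spacing.
square = [(0, 0), (0, 10), (10, 10), (10, 0)]

print("%.2f" % max_axial_diameter(square, 0.7))
# => 9.90
```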
-
feature_vals
(return_str=False)[source]¶ Return all feature values as a numpy array in the order presented in feature_names.
Parameters: return_str (bool, default=False) – If True, a list of strings is also returned, corresponding to the meaning of each numerical feature value. Returns: fvals[, fstrs] – fvals is an array of numerical values corresponding to the numerical feature values for the annotation. fstrs is a list of semantic string interpretations of the numerical values given in fvals. Return type: array[, list of strings]
-
surface_area
¶ Estimate the surface area by summing the areas of a triangulation of the nodule’s surface in 3D. Returned units are mm^2.
Returns: sa – The estimated surface area in squared millimeters. Return type: float
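Summing triangle areas over a surface mesh reduces to a cross product per face. The sketch below assumes the mesh is given as vertex coordinates in millimeters plus triangular faces as index triples; `mesh_area` is a hypothetical helper, and producing the triangulation itself (for example with a marching cubes routine) is outside its scope.

```python
import numpy as np


def mesh_area(verts, faces):
    """Sum of triangle areas for a mesh given as vertices and index triples."""
    verts = np.asarray(verts, dtype=float)
    faces = np.asarray(faces, dtype=int)
    a = verts[faces[:, 0]]
    b = verts[faces[:, 1]]
    c = verts[faces[:, 2]]
    # Each triangle's area is half the norm of the cross product of two edges.
    return float(0.5 * np.linalg.norm(np.cross(b - a, c - a), axis=1).sum())


# A unit square in the z=0 plane, split into two triangles.
verts = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
faces = [(0, 1, 2), (0, 2, 3)]

print(mesh_area(verts, faces))
# => 1.0
```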
-
uniform_cubic_resample
(side_length=None, resample_vol=True, irp_pts=None, return_irp_pts=False, resample_img=True, verbose=True)[source]¶ Get the CT value volume and respective boolean mask volume. The volumes are interpolated and resampled to have uniform spacing of 1mm along each dimension. The resulting volumes are cubic with the specified side_length. The returned volumes therefore have shape (side_length+1,)*3, since a side of side_length millimeters sampled at 1mm spacing contains side_length+1 points.
Todo
It would be nice if this function performed fully general interpolation, i.e., not necessarily uniform spacing and allowing different resample-resolutions along different coordinate axes.
Parameters: - side_length (integer, default=None) –
The physical length of each side of the new cubic volume in millimeters. The default, None, takes the max of the nodule’s bounding box dimensions.
If this parameter is not None, then it should be greater than any bounding box dimension. If the specified side_length requires a padding which results in an out-of-bounds image index, then the image is padded with the minimum CT image value.
- resample_vol (boolean, default=True) – If False, only the segmentation volume is resampled.
- irp_pts (3-tuple from meshgrid) – If provided, the volume(s) will be resampled over these interpolation points, rather than the automatically calculated points. This allows for sampling segmentation volumes over a common coordinate-system.
- return_irp_pts (boolean, default=False) – If True, the interpolation points (ix,iy,iz) at which the volume(s) were resampled are returned. These can potentially be provided as an argument to irp_pts for separate annotations that refer to the same nodule, allowing the segmentation volumes to be resampled in a common coordinate-system.
- verbose (boolean, default=True) – Turn the loading statement on / off.
Returns: [ct_volume,] mask [, irp_pts] – ct_volume and mask are the resampled CT and boolean volumes, respectively. ct_volume and irp_pts are optionally returned, depending on which flags are set (see above).
Return type: ndarray, ndarray, list of ndarrays
Example
An example:
    import numpy as np
    import matplotlib.pyplot as plt
    import pylidc as pl

    ann = pl.query(pl.Annotation).first()

    # resampled volumes will have uniform side length of 70mm and
    # uniform voxel spacing of 1mm.
    n = 70
    vol,mask = ann.uniform_cubic_resample(n)

    # Setup the plot.
    img = plt.imshow(np.zeros((n+1, n+1)),
                     vmin=vol.min(), vmax=vol.max(),
                     cmap=plt.cm.gray)

    # View all the resampled image volume slices.
    for i in range(n+1):
        img.set_data(vol[:,:,i] * (mask[:,:,i]*0.6+0.2))
        plt.title("%02d / %02d" % (i+1, n+1))
        plt.pause(0.1)
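The (side_length+1,)*3 shape and the "3-tuple from meshgrid" form of irp_pts can be illustrated by building a 1mm grid over a cube directly. This is a sketch under assumptions: the cube center and side length are made up, and the use of indexing="ij" is a choice made for this example rather than something the description above specifies.

```python
import numpy as np

side_length = 4             # mm, made-up value
center = (10.0, 12.0, 8.0)  # mm, made-up cube center

# side_length + 1 sample positions per axis gives exactly 1 mm spacing.
axes = [np.linspace(c - side_length / 2.0, c + side_length / 2.0,
                    side_length + 1)
        for c in center]
ix, iy, iz = np.meshgrid(*axes, indexing="ij")

print(ix.shape)
# => (5, 5, 5)
print(axes[0].tolist())
# => [8.0, 9.0, 10.0, 11.0, 12.0]
```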
-
visualize_in_3d
(edgecolor='0.2', cmap='viridis', step=1, figsize=(5, 5), backend='matplotlib')[source]¶ Visualize in 3d a triangulation of the nodule’s surface.
Parameters: - edgecolor (string color or rgb 3-tuple) – Sets edgecolors of triangulation. Ignored if backend != matplotlib.
- cmap (matplotlib colormap string) – Sets the facecolors of the triangulation. See matplotlib.cm.cmap_d.keys() for all available colormaps. Ignored if backend != matplotlib.
- step (int, default=1) – The step_size parameter for the skimage marching_cubes function. Bigger values are quicker, but yield coarser surfaces.
- figsize (tuple, default=(5,5)) – Figure size for the displayed volume.
- backend (string) – The backend for visualization. Default is matplotlib. Execute from pylidc.Annotation import viz3dbackends to see available backends.
Example
A short example:
    import pylidc as pl

    ann = pl.query(pl.Annotation).first()
    ann.visualize_in_3d(edgecolor='green', cmap='autumn')
-
visualize_in_scan
(verbose=True)[source]¶ Engage an interactive visualization of the slices of the scan along with scan and annotation information.
The visualization begins at (but is not limited to) the first slice where the nodule occurs (according to the annotation). Annotation contours are plotted on top of the images for visualization and can be toggled on and off, using an interactive check mark utility.
Parameters: verbose (bool, default=True) – Turn the image loading statement on/off.
-
volume
¶ Estimate the volume of the annotated nodule, using the contour annotations. Green’s theorem (via the shoelace formula) is first used to measure the area in each slice. This area is multiplied by the distance between slices to obtain a volume for each slice, which is then added to or subtracted from the total volume, depending on the inclusion value of the contour.
The distance between slices is taken to be the sum of two half-distances: from the current image_z_position to the midpoint with the slice above, and from the current image_z_position to the midpoint with the slice below. If the image_z_position corresponds to an end piece, we use the distance between the current image_z_position and the image_z_position of one slice below or above for top or bottom, respectively. If the annotation only has one contour, we use the slice_thickness attribute of the scan.
Returns: vol – The estimated 3D volume of the annotated nodule. Units are cubic millimeters. Return type: float
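The procedure described above can be sketched in plain Python. This is an illustration of the stated rules, not the library's code: all three function names are hypothetical, contours are assumed to be given as (z, xs, ys, inclusion) tuples with coordinates already in millimeters, and the sample data is made up.

```python
def shoelace_area(xs, ys):
    """Polygon area from vertex coordinates via the shoelace formula."""
    n = len(xs)
    total = 0.0
    for a in range(n):
        b = (a + 1) % n
        total += xs[a] * ys[b] - xs[b] * ys[a]
    return abs(total) / 2.0


def slice_distances(zvals, slice_thickness):
    """Per-slice distance following the midpoint rule described above."""
    n = len(zvals)
    if n == 1:
        return [slice_thickness]
    out = []
    for i in range(n):
        if i == 0:
            out.append(zvals[1] - zvals[0])       # bottom end piece
        elif i == n - 1:
            out.append(zvals[-1] - zvals[-2])     # top end piece
        else:
            out.append((zvals[i + 1] - zvals[i - 1]) / 2.0)
    return out


def stack_volume(contours, slice_thickness):
    """Add inclusion contours and subtract exclusion contours."""
    zvals = sorted(set(z for z, _, _, _ in contours))
    dz = dict(zip(zvals, slice_distances(zvals, slice_thickness)))
    total = 0.0
    for z, xs, ys, inclusion in contours:
        sign = 1.0 if inclusion else -1.0
        total += sign * shoelace_area(xs, ys) * dz[z]
    return total


big = ([0.0, 10.0, 10.0, 0.0], [0.0, 0.0, 10.0, 10.0])   # 100 mm^2
hole = ([4.0, 6.0, 6.0, 4.0], [4.0, 4.0, 6.0, 6.0])      # 4 mm^2
contours = [
    (0.0, big[0], big[1], True),
    (2.5, big[0], big[1], True),
    (2.5, hole[0], hole[1], False),  # exclusion contour in the middle slice
    (5.0, big[0], big[1], True),
]

print(stack_volume(contours, slice_thickness=2.5))
# => 740.0
```

With three slices spaced 2.5 mm apart, each slice contributes 250 mm^3 and the exclusion contour removes 10 mm^3, which gives the 740 mm^3 shown.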
-