API Reference

This section provides detailed documentation for all modules in the cryoblob package.

Main Package

Module: cryoblob

JAX-based, JIT-compiled, scalable code for detecting amorphous blobs in low-SNR cryo-EM images.

Submodules

  • adapt:

    Adaptive image processing methods that take advantage of JAX’s automatic differentiation capabilities. The functions are:

    • adaptive_wiener:

      Adaptive Wiener filter that optimizes the noise estimate using gradient descent.

    • adaptive_threshold:

      Adaptively optimizes thresholding parameters using gradient descent to produce a differentiably thresholded image.

  • blobs:

    Contains the core blob detection algorithms. The functions are:

    • find_connected_components:

      Pure JAX implementation of 3D connected components labeling.

    • center_of_mass_3d:

      Calculate center of mass for each labeled region in a 3D image.

    • find_particle_coords:

      Find particle coordinates using connected components and center of mass.

    • preprocessing:

      Pre-processes low SNR images to improve contrast of blobs.

    • blob_list_log:

      Detects blobs in an input image using the Laplacian of Gaussian (LoG) method.

  • files:

    Interfacing with data files. The functions are:

    • file_params:

      Get the parameters for the file organization.

    • load_mrc:

      Reads an MRC-format cryo-EM file, extracting image data and metadata.

    • process_single_file:

      Process a single file for blob detection with memory optimization.

    • process_batch_of_files:

      Process a batch of files in parallel with memory optimization.

    • folder_blobs:

      Process a folder of images for blob detection with memory optimization.

    • estimate_batch_size:

      Estimate optimal batch size for processing MRC files based on available memory.

    • estimate_memory_usage:

      Estimate memory usage in GB for processing a single MRC file.

    • get_optimal_batch_size:

      Get optimal batch size by sampling multiple files from the list.

  • image:

    Utility functions for image processing. The functions are:

    • image_resizer:

      Resize an image using a fast resizing algorithm implemented in JAX.

    • resize_x:

      Resize image along y-axis by independently resampling each column.

    • gaussian_kernel:

      Create a normalized 2D Gaussian kernel.

    • apply_gaussian_blur:

      Apply Gaussian blur to an image using convolution in JAX.

    • difference_of_gaussians:

      Applies Difference of Gaussians (DoG) filtering to enhance circular blobs.

    • laplacian_of_gaussian:

      Applies Laplacian of Gaussian (LoG) filtering to an input image.

    • laplacian_kernel:

      Create a Laplacian kernel for edge detection in a JAX-compatible manner.

    • exponential_kernel:

      Create an exponential kernel for image processing.

    • perona_malik:

      Perform edge-preserving denoising using the Perona-Malik anisotropic diffusion.

    • histogram:

      Calculate the histogram of an image.

    • equalize_hist:

      Perform histogram equalization on an image using JAX.

    • equalize_adapthist:

      Perform adaptive histogram equalization on an image using JAX.

    • wiener:

      Perform Wiener filtering on an image using JAX.

  • plots:

    Plotting functions for visualizing MRC images and blob detection results. The functions are:

    • plot_mrc:

      Plot an MRC image using Matplotlib with an optional scaling mode and scalebar.

  • types:

    Type aliases and PyTrees. The types are:

    • scalar_float:

      Zero dimensional floating point number.

    • scalar_int:

      Zero dimensional integer.

    • scalar_num:

      Zero dimensional number that can be either a floating point number or an integer.

    • non_jax_number:

      A number that is not a JAX array. The distinction matters because JAX stores even a single number as a 0D array.

    The PyTrees are:

    • MRC_Image:

      A PyTree structure for MRC images. Contains the image data and metadata.

    The factory functions are:

    • make_MRC_Image:

      Factory function to create an MRC_Image instance.

  • valid:

    Pydantic models for data validation and configuration management. The classes are:

    • PreprocessingConfig:

      Configuration for image preprocessing parameters

    • BlobDetectionConfig:

      Configuration for blob detection parameters

    • FileProcessingConfig:

      Configuration for file processing and batch operations

    • MRCMetadata:

      Validation for MRC file metadata

    • ValidationPipeline:

      Main pipeline class for validating all configurations

class cryoblob.AdaptiveFilterConfig(*args, **kwargs)[source]

Bases: BaseModel

Configuration model for adaptive filtering parameters.

Validates parameters used in adaptive_wiener and adaptive_threshold functions.

validate_kernel_size()

Ensure kernel size is odd for proper centering.

class Config[source]

Bases: object

frozen = True
extra = 'forbid'

class cryoblob.BlobDetectionConfig(*args, **kwargs)[source]

Bases: BaseModel

Configuration model for blob detection parameters.

Validates parameters used in blob_list_log function.

validate_max_blob_size()

Ensure max_blob_size > min_blob_size.

class Config[source]

Bases: object

frozen = True
extra = 'forbid'

class cryoblob.FileProcessingConfig(*args, **kwargs)[source]

Bases: BaseModel

Configuration model for file processing and batch operations.

Validates parameters used in folder_blobs function.

validate_folder_exists()

Ensure the folder exists and is accessible.

class Config[source]

Bases: object

frozen = True
extra = 'forbid'

class cryoblob.MRCMetadata(*args, **kwargs)[source]

Bases: BaseModel

Validation model for MRC file metadata.

Ensures MRC file headers contain valid values.

validate_data_range()

Ensure data_max > data_min.

validate_mean_in_range()

Ensure data_mean is between data_min and data_max.

class Config[source]

Bases: object

frozen = True
extra = 'forbid'
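
The two MRCMetadata checks above can be illustrated with a dependency-free sketch. This is not the cryoblob class: `MRCMetadataSketch` and `is_valid` are hypothetical names, and a frozen standard-library dataclass stands in for the Pydantic model (mirroring `frozen = True`). Only the field names `data_min`, `data_max`, and `data_mean` come from the documentation above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MRCMetadataSketch:
    """Hypothetical stand-in for MRCMetadata; not the cryoblob class."""

    data_min: float
    data_max: float
    data_mean: float

    def __post_init__(self) -> None:
        # Mirrors validate_data_range: data_max must exceed data_min.
        if not self.data_max > self.data_min:
            raise ValueError("data_max must be greater than data_min")
        # Mirrors validate_mean_in_range: the mean must lie inside the range.
        if not self.data_min <= self.data_mean <= self.data_max:
            raise ValueError("data_mean must lie between data_min and data_max")


def is_valid(data_min: float, data_max: float, data_mean: float) -> bool:
    """Return True when the three header statistics pass both checks."""
    try:
        MRCMetadataSketch(data_min, data_max, data_mean)
    except ValueError:
        return False
    return True
```

For instance, `is_valid(0.0, 1.0, 0.5)` passes, while an inverted range or a mean outside the range is rejected.
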
class cryoblob.Path(*args, **kwargs)[source]

Bases: PurePath

PurePath subclass that can make system calls.

Path represents a filesystem path but unlike PurePath, also offers methods to do system calls on path objects. Depending on your system, instantiating a Path will return either a PosixPath or a WindowsPath object. You can also instantiate a PosixPath or WindowsPath directly, but cannot instantiate a WindowsPath on a POSIX system or vice versa.

classmethod cwd()[source]

Return a new path pointing to the current working directory (as returned by os.getcwd()).

classmethod home()[source]

Return a new path pointing to the user’s home directory (as returned by os.path.expanduser(‘~’)).

samefile(other_path)[source]

Return whether other_path is the same or not as this file (as returned by os.path.samefile()).

iterdir()[source]

Iterate over the files in this directory. Does not yield any result for the special paths ‘.’ and ‘..’.

glob(pattern)[source]

Iterate over this subtree and yield all existing files (of any kind, including directories) matching the given relative pattern.

rglob(pattern)[source]

Recursively yield all existing files (of any kind, including directories) matching the given relative pattern, anywhere in this subtree.

absolute()[source]

Return an absolute version of this path by prepending the current working directory. No normalization or symlink resolution is performed.

Use resolve() to get the canonical path to a file.

resolve(strict=False)[source]

Make the path absolute, resolving all symlinks on the way and also normalizing it.

stat(*, follow_symlinks=True)[source]

Return the result of the stat() system call on this path, like os.stat() does.

owner()[source]

Return the login name of the file owner.

group()[source]

Return the group name of the file gid.

open(mode='r', buffering=-1, encoding=None, errors=None, newline=None)[source]

Open the file pointed by this path and return a file object, as the built-in open() function does.

read_bytes()[source]

Open the file in bytes mode, read it, and close the file.

read_text(encoding=None, errors=None)[source]

Open the file in text mode, read it, and close the file.

write_bytes(data)[source]

Open the file in bytes mode, write to it, and close the file.

write_text(data, encoding=None, errors=None, newline=None)[source]

Open the file in text mode, write to it, and close the file.

readlink()

Return the path to which the symbolic link points.

touch(mode=438, exist_ok=True)[source]

Create this file with the given access mode, if it doesn’t exist.

mkdir(mode=511, parents=False, exist_ok=False)[source]

Create a new directory at this given path.

chmod(mode, *, follow_symlinks=True)[source]

Change the permissions of the path, like os.chmod().

lchmod(mode)[source]

Like chmod(), except if the path points to a symlink, the symlink’s permissions are changed, rather than its target’s.

unlink(missing_ok=False)

Remove this file or link. If the path is a directory, use rmdir() instead.

rmdir()[source]

Remove this directory. The directory must be empty.

lstat()[source]

Like stat(), except if the path points to a symlink, the symlink’s status information is returned, rather than its target’s.

rename(target)[source]

Rename this path to the target path.

The target path may be absolute or relative. Relative paths are interpreted relative to the current working directory, not the directory of the Path object.

Returns the new Path instance pointing to the target path.

replace(target)[source]

Rename this path to the target path, overwriting if that path exists.

The target path may be absolute or relative. Relative paths are interpreted relative to the current working directory, not the directory of the Path object.

Returns the new Path instance pointing to the target path.

symlink_to(target, target_is_directory=False)

Make this path a symlink pointing to the target path. Note the order of arguments (link, target) is the reverse of os.symlink.

hardlink_to(target)

Make this path a hard link pointing to the same file as target.

Note the order of arguments (self, target) is the reverse of os.link’s.

link_to(target)

Make the target path a hard link pointing to this path.

Note this function does not make this path a hard link to target, despite the implication of the function and argument names. The order of arguments (target, link) is the reverse of Path.symlink_to, but matches that of os.link.

Deprecated since Python 3.10 and scheduled for removal in Python 3.12. Use hardlink_to() instead.

exists()[source]

Whether this path exists.

is_dir()[source]

Whether this path is a directory.

is_file()[source]

Whether this path is a regular file (also True for symlinks pointing to regular files).

is_mount()[source]

Check if this path is a POSIX mount point

is_symlink()

Whether this path is a symbolic link.

is_block_device()[source]

Whether this path is a block device.

is_char_device()[source]

Whether this path is a character device.

is_fifo()[source]

Whether this path is a FIFO.

is_socket()[source]

Whether this path is a socket.

expanduser()[source]

Return a new path with expanded ~ and ~user constructs (as returned by os.path.expanduser)

class cryoblob.PreprocessingConfig(*args, **kwargs)[source]

Bases: BaseModel

Configuration model for image preprocessing parameters.

This validates all parameters used in the preprocessing function to ensure they are within valid ranges and types before being passed to JAX-compiled functions.

validate_sigma_values()

Ensure sigma values are reasonable for image processing.

validate_conflicting_options()

Ensure conflicting preprocessing options aren’t both enabled.

class Config[source]

Bases: object

frozen = True
extra = 'forbid'

class cryoblob.ValidationPipeline(*args, **kwargs)[source]

Bases: BaseModel

Main validation pipeline that combines all configuration models.

This provides a single entry point for validating complete processing configurations.

validate_for_single_image()[source]

Validate configuration for single image processing.

Returns:

  • preprocessing_config (PreprocessingConfig) – Validated preprocessing parameters.

  • blob_config (BlobDetectionConfig) – Validated blob detection parameters.

Return type:

Tuple[PreprocessingConfig, BlobDetectionConfig]

validate_for_batch_processing()[source]

Validate configuration for batch file processing.

Returns:

  • preprocessing_config (PreprocessingConfig) – Validated preprocessing parameters.

  • blob_config (BlobDetectionConfig) – Validated blob detection parameters.

  • file_config (FileProcessingConfig) – Validated file processing parameters.

Raises:

ValueError – If file_processing configuration is not provided.

Return type:

Tuple[PreprocessingConfig, BlobDetectionConfig, FileProcessingConfig]

validate_for_adaptive_processing()[source]

Validate configuration for adaptive filtering.

Returns:

  • preprocessing_config (PreprocessingConfig) – Validated preprocessing parameters.

  • adaptive_config (AdaptiveFilterConfig) – Validated adaptive filtering parameters.

Raises:

ValueError – If adaptive_filtering configuration is not provided.

Return type:

Tuple[PreprocessingConfig, AdaptiveFilterConfig]

to_preprocessing_kwargs()[source]

Convert preprocessing config to kwargs dict for existing functions.

Returns:

  • kwargs – Dictionary compatible with the existing preprocessing function.

to_blob_kwargs()[source]

Convert blob detection config to kwargs dict for existing functions.

Returns:

  • kwargs – Dictionary compatible with the existing blob_list_log function.

class Config[source]

Bases: object

frozen = True
extra = 'forbid'

cryoblob.adaptive_threshold(img, target, initial_threshold=0.5, initial_slope=10.0, learning_rate=0.01, iterations=100)

Description

Adaptively optimizes thresholding parameters using gradient descent to produce a differentiably thresholded image.

Parameters:

  • img (Float[Array, “h w”]) – The input image to threshold.

  • target (Float[Array, “h w”]) – A reference binary image for supervised parameter optimization.

  • initial_threshold (scalar_float, optional) – Initial guess for the threshold parameter. Default is 0.5.

  • initial_slope (scalar_float, optional) – Initial guess for the slope controlling sigmoid steepness. Default is 10.0.

  • learning_rate (scalar_float, optional) – The learning rate used during gradient optimization. Default is 0.01.

  • iterations (scalar_int, optional) – Number of iterations for gradient optimization. Default is 100.

Returns:

  • thresholded_img (Float[Array, “h w”]) – The image after differentiable thresholding using optimized parameters.

  • optimized_threshold (scalar_float) – The optimized threshold parameter.

  • optimized_slope (scalar_float) – The optimized slope parameter.

Flow

  • sigmoid_threshold – Applies a sigmoid function to the input image.

  • threshold_loss_fn – Computes the loss between the thresholded image and the target.

  • step – Performs a single optimization step.

  • optimized_params – Optimizes threshold and slope parameters.

  • thresholded_img – Applies the optimized thresholding parameters to the input image.
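
The optimization loop behind adaptive_threshold can be sketched in plain Python on a flattened image. This is an illustration under stated assumptions, not the cryoblob implementation: the loss is assumed to be mean squared error, the gradients are derived by hand rather than obtained from JAX's automatic differentiation, and the names `threshold_loss`, `optimize`, and the sample data are hypothetical.

```python
import math


def sigmoid_threshold(img, threshold, slope):
    """Soft threshold: sigmoid(slope * (pixel - threshold)) for each pixel."""
    return [1.0 / (1.0 + math.exp(-slope * (p - threshold))) for p in img]


def threshold_loss(img, target, threshold, slope):
    """Assumed loss: mean squared error against the reference binary image."""
    out = sigmoid_threshold(img, threshold, slope)
    return sum((o - t) ** 2 for o, t in zip(out, target)) / len(img)


def step(img, target, threshold, slope, learning_rate):
    """One gradient-descent update using hand-derived gradients of the MSE."""
    out = sigmoid_threshold(img, threshold, slope)
    grad_threshold = 0.0
    grad_slope = 0.0
    for p, o, t in zip(img, out, target):
        # d(loss)/d(out) times d(sigmoid)/d(argument), averaged over pixels.
        common = 2.0 * (o - t) * o * (1.0 - o) / len(img)
        grad_threshold += common * (-slope)
        grad_slope += common * (p - threshold)
    return (threshold - learning_rate * grad_threshold,
            slope - learning_rate * grad_slope)


def optimize(img, target, threshold=0.5, slope=10.0,
             learning_rate=0.01, iterations=100):
    """Repeat the update step a fixed number of times."""
    for _ in range(iterations):
        threshold, slope = step(img, target, threshold, slope, learning_rate)
    return threshold, slope


# Hypothetical flattened image: two dark pixels, two bright pixels.
img = [0.1, 0.2, 0.8, 0.9]
target = [0.0, 0.0, 1.0, 1.0]
initial_loss = threshold_loss(img, target, 0.3, 10.0)
opt_threshold, opt_slope = optimize(img, target, threshold=0.3)
final_loss = threshold_loss(img, target, opt_threshold, opt_slope)
```

Starting from a threshold that sits too close to the dark pixels, the updates move it upward and the loss decreases.
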

cryoblob.adaptive_wiener(img, target, kernel_size=3, initial_noise=0.1, learning_rate=0.01, iterations=100)

Adaptive Wiener filter that optimizes the noise estimate using gradient descent.

Parameters:

  • img (Float[Array, “h w”]) – Noisy input image.

  • target (Float[Array, “h w”]) – A target image or reference image used for optimization.

  • kernel_size (scalar_int | Tuple[int, int], optional) – Window size for Wiener filter. Default is 3.

  • initial_noise (scalar_float, optional) – Initial guess for noise parameter. Default is 0.1.

  • learning_rate (scalar_float, optional) – Learning rate for optimization. Default is 0.01.

  • iterations (scalar_int, optional) – Number of optimization steps. Default is 100.

Returns:

  • filtered_img (Float[Array, “h w”]) – Wiener filtered image with optimized noise parameter.

  • optimized_noise (scalar_float) – The optimized noise parameter.

Return type:

Tuple[Float[Array, “h w”], Union[float, Float[Array, “”]]]

cryoblob.apply_gaussian_blur(image, sigma=1.0, kernel_size=5, mode='same')

Description

Apply Gaussian blur to an image using convolution in JAX.

Parameters:

  • image (Real[Array, “y x”]) – Input image.

  • sigma (scalar_float, optional) – Standard deviation for Gaussian kernel. Defaults to 1.0.

  • kernel_size (scalar_int, optional) – Size of Gaussian kernel. Must be odd. Defaults to 5.

  • mode (Literal[“full”, “valid”, “same”]) – Convolution mode. Defaults to “same”.

Returns:

  • blurred (Float[Array, “yp xp”]) – Blurred image.

cryoblob.blob_list_log(mrc_image, min_blob_size=5, max_blob_size=20, blob_step=1, downscale=4, std_threshold=6)

Description

Detect blobs of varying sizes in an MRC image using the Laplacian of Gaussian (LoG) method.

Parameters:

  • mrc_image (MRC_Image) – The PyTree containing the image data and metadata.

  • min_blob_size (scalar_num, optional) – Minimum blob size to detect. Defaults to 5.

  • max_blob_size (scalar_num, optional) – Maximum blob size to detect. Defaults to 20.

  • blob_step (scalar_num, optional) – Step size between consecutive blob scales. Defaults to 1.

  • downscale (scalar_num, optional) – Factor by which the image is downscaled before detection. Defaults to 4.

  • std_threshold (scalar_num, optional) – Threshold in standard deviations for blob detection. Defaults to 6.

Returns:

  • scaled_coords (Float[Array, “n 3”]) – Array of blob coordinates and sizes, shape [n, 3]. Columns represent (Y, X, Blob size in pixels).

cryoblob.center_of_mass_3d(image, labels, num_labels)

Description

Calculate center of mass for each labeled region in a 3D image.

Parameters:

  • image (Float[Array, “x y z”]) – 3D image array.

  • labels (Integer[Array, “x y z”]) – Integer array of labels.

  • num_labels (int) – Number of labels (excluding background).

Returns:

  • centroids (Float[Array, “n 3”]) – Array of centroid coordinates for each label.
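
As an illustration of what a per-label center of mass means, here is a plain-Python sketch over nested lists. It assumes the centroid is intensity-weighted and that label 0 is background (consistent with "excluding background" above); `center_of_mass_sketch` and the sample volume are hypothetical and do not reproduce the JAX implementation.

```python
def center_of_mass_sketch(image, labels, num_labels):
    """Intensity-weighted centroid (x, y, z) for each label 1..num_labels."""
    weighted = [[0.0, 0.0, 0.0] for _ in range(num_labels)]
    mass = [0.0] * num_labels
    for x, plane in enumerate(image):
        for y, row in enumerate(plane):
            for z, value in enumerate(row):
                label = labels[x][y][z]
                if label == 0:  # assumed background label
                    continue
                mass[label - 1] += value
                weighted[label - 1][0] += value * x
                weighted[label - 1][1] += value * y
                weighted[label - 1][2] += value * z
    # Regions with zero total intensity have no defined centroid.
    return [
        [w / m if m else float("nan") for w in triple]
        for triple, m in zip(weighted, mass)
    ]


# Hypothetical 2 x 2 x 2 volume with one brighter voxel in region 1.
image = [[[1.0, 3.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]]]
labels = [[[1, 1], [0, 0]], [[0, 0], [0, 2]]]
centroids = center_of_mass_sketch(image, labels, 2)
```

Region 1 spans two voxels along z, and its centroid is pulled toward the brighter one.
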

cryoblob.create_default_pipeline()[source]

Create a validation pipeline with default settings.

cryoblob.create_fast_pipeline()[source]

Create a validation pipeline optimized for speed.

cryoblob.create_high_quality_pipeline()[source]

Create a validation pipeline optimized for high-quality blob detection.

cryoblob.difference_of_gaussians(image, sigma1, sigma2, sampling=1, hist_stretch=True, normalized=True)

Description

Applies Difference of Gaussians (DoG) filtering to enhance circular blobs.

Parameters:

  • image (Real[Array, “y x”]) – Input 2D image.

  • sigma1 (scalar_num) – Standard deviation of the first Gaussian (smaller).

  • sigma2 (scalar_num) – Standard deviation of the second Gaussian (larger).

  • sampling (scalar_num, optional) – Downsampling factor; 1 means no resizing. Default is 1.

  • hist_stretch (bool, optional) – Apply histogram stretching if True. Default is True.

  • normalized (bool, optional) – Normalize filtered output by sigma2 if True. Default is True.

Returns:

  • dog_filtered (Float[Array, “y x”]) – DoG-filtered image.

Flow

  • Downsamples image if sampling ≠ 1 (JIT-safe way).

  • Histogram stretch if requested.

  • Create arithmetic-enforced DoG kernel.

  • Convolve the image with DoG kernel.

  • Normalize output if required.
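
To make the kernel step concrete, the following plain-Python sketch builds a DoG kernel as the difference of two normalized Gaussians. It is only an assumption about the general technique: the kernel size, the helper names `gaussian_2d` and `dog_kernel_sketch`, and the construction itself are hypothetical and may differ from how cryoblob builds its kernel.

```python
import math


def gaussian_2d(size, sigma):
    """Normalized size x size Gaussian kernel (size assumed odd)."""
    half = size // 2
    raw = [
        [math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
         for x in range(-half, half + 1)]
        for y in range(-half, half + 1)
    ]
    total = sum(sum(row) for row in raw)
    return [[v / total for v in row] for row in raw]


def dog_kernel_sketch(size, sigma1, sigma2):
    """Narrow Gaussian (sigma1) minus wide Gaussian (sigma2)."""
    narrow = gaussian_2d(size, sigma1)
    wide = gaussian_2d(size, sigma2)
    return [[a - b for a, b in zip(r1, r2)] for r1, r2 in zip(narrow, wide)]


dog = dog_kernel_sketch(9, 1.0, 2.0)
dog_sum = sum(sum(row) for row in dog)
```

Because both Gaussians are normalized to sum to one, the difference sums to approximately zero, so flat background regions produce no response while blob-sized features do.
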

cryoblob.equalize_adapthist(image, kernel_size=8, clip_limit=0.01, nbins=256)

Description

Perform Contrast Limited Adaptive Histogram Equalization (CLAHE).

Parameters:

  • image (Real[Array, “h w”]) – Input image.

  • kernel_size (scalar_int, optional) – Size of local regions for histogram equalization. Default is 8.

  • clip_limit (scalar_float, optional) – Clipping limit for histogram. Higher values amplify contrast more strongly. Default is 0.01.

  • nbins (scalar_int, optional) – Number of bins for the histogram. Default is 256.

Returns:

  • equalized_final (Float[Array, “h w”]) – Image after applying CLAHE.

Notes

CLAHE performs localized histogram equalization to improve image contrast without amplifying noise excessively. The algorithm:

  • Divides the image into small regions (tiles).

  • Performs local histogram equalization on each tile separately.

  • Clips histograms at the specified limit to prevent noise amplification.

  • Interpolates results to produce a smoothly equalized image.

cryoblob.equalize_hist(image, nbins=256, mask=None)

Description

Perform histogram equalization on an image using JAX.

Parameters:

  • image (Real[Array, “h w”]) – Input image to equalize.

  • nbins (scalar_int, optional) – Number of bins for histogram. Default is 256.

  • mask (Real[Array, “h w”], optional) – Optional mask for selective equalization. Default is None (use all pixels).

Returns:

  • equalized (Float[Array, “h w”]) – Histogram equalized image.
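
Histogram equalization maps each pixel through the cumulative distribution of intensities. The sketch below shows that idea in plain Python on a flattened image; it is a simplified assumption of the technique (no mask support, hypothetical name `equalize_hist_sketch`) rather than the JAX implementation.

```python
def equalize_hist_sketch(pixels, nbins=256):
    """Map each pixel to the cumulative fraction of pixels at or below its bin."""
    lo, hi = min(pixels), max(pixels)
    if hi == lo:
        # A constant image has no contrast to redistribute.
        return [0.0 for _ in pixels]
    width = (hi - lo) / nbins

    def bin_index(value):
        # The maximum value falls into the last bin rather than past it.
        return min(int((value - lo) / width), nbins - 1)

    counts = [0] * nbins
    for value in pixels:
        counts[bin_index(value)] += 1

    cdf = []
    running = 0
    for count in counts:
        running += count
        cdf.append(running / len(pixels))

    return [cdf[bin_index(value)] for value in pixels]


# Hypothetical flattened image: three dark pixels and one bright outlier.
equalized = equalize_hist_sketch([0.0, 0.1, 0.2, 10.0], nbins=4)
```

With four bins, the three dark pixels share the first bin and map to 0.75, while the outlier maps to 1.0.
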

cryoblob.estimate_batch_size(sample_file_path, target_memory_gb=4.0, safety_factor=0.7, processing_overhead=3.0)

Description

Estimate optimal batch size for processing MRC files based on available memory and file characteristics. This function analyzes a sample file to estimate memory requirements and calculates the maximum number of files that can be processed simultaneously without exceeding memory limits.

Parameters:

  • sample_file_path (str) – Path to a representative MRC file for size estimation.

  • target_memory_gb (scalar_float, optional) – Target GPU memory usage in GB. Default is 4.0.

  • safety_factor (scalar_float, optional) – Safety factor to prevent memory overflow (0.0-1.0). Default is 0.7 (use 70% of target memory).

  • processing_overhead (scalar_float, optional) – Memory overhead multiplier for processing operations. Default is 3.0 (processing uses 3x the raw data size).

Returns:

  • batch_size (scalar_int) – Recommended batch size for processing.

Notes

The estimation considers:

  • Raw file size in memory (dtype conversion)

  • Preprocessing operations (filtering, transformations)

  • Blob detection memory requirements

  • JAX compilation overhead

  • Intermediate array storage

Memory estimation formula:

    per_file_memory = file_size * processing_overhead
    available_memory = target_memory_gb * safety_factor * 1e9
    batch_size = max(1, available_memory // per_file_memory)
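
The formula can be checked with a few lines of plain Python. This sketch takes the in-memory file size in bytes directly instead of reading an MRC file, so `batch_size_from_formula` and `file_size_bytes` are hypothetical names used only for illustration.

```python
def batch_size_from_formula(
    file_size_bytes,
    target_memory_gb=4.0,
    safety_factor=0.7,
    processing_overhead=3.0,
):
    """Apply the documented memory formula to a known file size in bytes."""
    per_file_memory = file_size_bytes * processing_overhead
    available_memory = target_memory_gb * safety_factor * 1e9
    # At least one file is always processed, even if it exceeds the budget.
    return max(1, int(available_memory // per_file_memory))


# A 64 MB image under the default 4 GB target.
typical = batch_size_from_formula(64_000_000)
# A 10 GB file exceeds the budget, so the batch size falls back to 1.
oversized = batch_size_from_formula(10_000_000_000)
```

With the defaults, 2.8 GB is available and each 64 MB file is budgeted at 192 MB, giving a batch of 14 files.
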

Examples

>>> batch_size = estimate_batch_size("sample.mrc", target_memory_gb=8.0)
>>> print(f"Recommended batch size: {batch_size}")

cryoblob.estimate_memory_usage(file_path, include_preprocessing=True, include_blob_detection=True)

Description

Estimate memory usage in GB for processing a single MRC file.

Parameters:

  • file_path (str) – Path to MRC file.

  • include_preprocessing (bool, optional) – Include memory for preprocessing operations. Default is True.

  • include_blob_detection (bool, optional) – Include memory for blob detection. Default is True.

Returns:

  • memory_gb (scalar_float) – Estimated memory usage in GB.

cryoblob.exponential_kernel(arr, k)

Description

Create an exponential kernel for image processing.

Parameters:

  • arr (Float[Array, “H W”]) – Input array.

  • k (scalar_float) – Exponential decay constant.

Returns:

  • kernel (Float[Array, “H W”]) – Exponential kernel.

cryoblob.file_params()

Description

Run this at the beginning to generate the dictionary that describes how the files are organized. It gives both the absolute and relative paths.

Returns:

  • main_directory (str) – The main directory where the package is located.

  • folder_structure (dict) – Where the files and data are stored, as read from the organization.json file.

cryoblob.files(package)[source]

Get a Traversable resource from a package

cryoblob.find_connected_components(binary_image, connectivity=6)

Description

Pure JAX implementation of 3D connected components labeling. Uses a two-pass algorithm.

Parameters:

  • binary_image (Bool[Array, “x y z”]) – Binary image where True/1 indicates foreground.

  • connectivity (int, optional) – Either 6 (face-connected) or 26 (fully-connected). Default is 6.

Returns:

  • labels (Integer[Array, “x y z”]) – Array where each connected component has a unique integer label.

  • num_labels (int) – Number of connected components found.
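
The difference between 6- and 26-connectivity is easiest to see on a tiny volume. The sketch below labels components with a breadth-first flood fill over nested lists; note that this is a simpler stand-in technique, not the two-pass algorithm the function is documented to use, and `neighbor_offsets` and `label_components` are hypothetical names.

```python
from collections import deque
from itertools import product


def neighbor_offsets(connectivity):
    """Offsets for 6- (faces only) or 26-connectivity (faces, edges, corners)."""
    if connectivity not in (6, 26):
        raise ValueError("connectivity must be 6 or 26")
    offsets = [o for o in product((-1, 0, 1), repeat=3) if o != (0, 0, 0)]
    if connectivity == 6:
        offsets = [o for o in offsets if sum(abs(c) for c in o) == 1]
    return offsets


def label_components(binary_image, connectivity=6):
    """Label foreground voxels by flood fill; background keeps label 0."""
    nx, ny, nz = len(binary_image), len(binary_image[0]), len(binary_image[0][0])
    labels = [[[0] * nz for _ in range(ny)] for _ in range(nx)]
    offsets = neighbor_offsets(connectivity)
    num_labels = 0
    for x, y, z in product(range(nx), range(ny), range(nz)):
        if not binary_image[x][y][z] or labels[x][y][z]:
            continue
        # Unvisited foreground voxel: start a new component here.
        num_labels += 1
        labels[x][y][z] = num_labels
        queue = deque([(x, y, z)])
        while queue:
            cx, cy, cz = queue.popleft()
            for dx, dy, dz in offsets:
                px, py, pz = cx + dx, cy + dy, cz + dz
                in_bounds = 0 <= px < nx and 0 <= py < ny and 0 <= pz < nz
                if in_bounds and binary_image[px][py][pz] and not labels[px][py][pz]:
                    labels[px][py][pz] = num_labels
                    queue.append((px, py, pz))
    return labels, num_labels


# Two foreground voxels that touch only at a corner.
volume = [[[1, 0], [0, 0]], [[0, 0], [0, 1]]]
labels_6, count_6 = label_components(volume, connectivity=6)
labels_26, count_26 = label_components(volume, connectivity=26)
```

The two corner-touching voxels form separate components under 6-connectivity and a single component under 26-connectivity.
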

cryoblob.find_particle_coords(results_3D, max_filtered, image_thresh)

Description

Find particle coordinates using connected components and center of mass. Pure JAX implementation.

Parameters:

  • results_3D (Float[Array, “x y z”]) – 3D array of filter responses.

  • max_filtered (Float[Array, “x y z”]) – Maximum filtered array.

  • image_thresh (scalar_float) – Threshold for peak detection.

Returns:

  • coords (Float[Array, “n 3”]) – Array of particle coordinates.

cryoblob.folder_blobs(folder_location, file_type='mrc', blob_downscale=7.0, target_memory_gb=4.0, stream_large_files=True, **kwargs)

Description

Process a folder of MRC images for blob detection with memory optimization and validated preprocessing configuration. Automatically manages batch processing and memory usage to prevent GPU memory overflow.

Parameters:

  • folder_location (str) – Path to folder containing MRC images to process.

  • file_type (Literal[“mrc”], optional) – File extension to search for in the folder. Default is “mrc”.

  • blob_downscale (scalar_float, optional) – Downscaling factor applied during blob detection. Default is 7.0.

  • target_memory_gb (scalar_float, optional) – Target GPU memory usage in GB for batch size optimization. Default is 4.0.

  • stream_large_files (bool, optional) – Whether to use memory-mapped file access for large files. Default is True.

  • **kwargs – Additional preprocessing parameters passed to PreprocessingConfig. Valid options: exponential, logarizer, gblur, background, apply_filter.

Returns:

  • blob_dataframe (pd.DataFrame) – DataFrame containing detected blob information with columns: [‘File Location’, ‘Center Y (nm)’, ‘Center X (nm)’, ‘Size (nm)’].

Raises:

ValueError – If preprocessing parameters are invalid according to PreprocessingConfig validation.

Notes

Memory management:

  • Uses batch processing to control memory usage.

  • Automatically adjusts batch size based on available memory.

  • Clears device memory between batches.

  • Streams large files if needed.

  • Efficiently handles intermediate results.

The function processes files in batches to prevent memory overflow and provides a progress bar to track processing status. Empty folders return an empty DataFrame with the expected column structure.
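
The batching described above amounts to slicing the file list into fixed-size chunks. The following is a minimal sketch of that idea only; `iter_batches` is a hypothetical helper, and the real function additionally handles memory clearing, streaming, and progress reporting.

```python
def iter_batches(file_list, batch_size):
    """Yield consecutive slices of file_list with at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(file_list), batch_size):
        yield file_list[start:start + batch_size]


# Hypothetical file names; the last batch is shorter than the others.
files = ["a.mrc", "b.mrc", "c.mrc", "d.mrc", "e.mrc"]
batches = list(iter_batches(files, 2))
```

An empty file list yields no batches, which matches the documented behavior of returning an empty result for empty folders.
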

cryoblob.gaussian_kernel(size, sigma)

Description

Create a normalized 2D Gaussian kernel.

Parameters:

  • size (scalar_int) – Kernel size (size x size). Must be odd.

  • sigma (scalar_float) – Standard deviation of the Gaussian distribution.

Returns:

  • kernel (Float[Array, “size size”]) – Normalized 2D Gaussian kernel.
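
For reference, a normalized 2D Gaussian kernel can be written in a few lines of plain Python. This is a sketch of the standard construction using nested lists, with the hypothetical name `gaussian_kernel_sketch`; the library function returns a JAX array instead.

```python
import math


def gaussian_kernel_sketch(size, sigma):
    """Return a size x size Gaussian kernel whose entries sum to one."""
    if size % 2 == 0:
        raise ValueError("size must be odd so the kernel has a center pixel")
    half = size // 2
    raw = [
        [math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
         for x in range(-half, half + 1)]
        for y in range(-half, half + 1)
    ]
    total = sum(sum(row) for row in raw)
    return [[value / total for value in row] for row in raw]


kernel = gaussian_kernel_sketch(5, 1.0)
kernel_sum = sum(sum(row) for row in kernel)
```

Normalizing by the sum keeps the overall image brightness unchanged when the kernel is used for blurring.
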

cryoblob.get_optimal_batch_size(file_list, target_memory_gb=4.0, sample_fraction=0.1)

Description

Get optimal batch size by sampling multiple files from the list.

Parameters:

  • file_list (list[str]) – List of file paths to process.

  • target_memory_gb (scalar_float, optional) – Target memory usage in GB. Default is 4.0.

  • sample_fraction (scalar_float, optional) – Fraction of files to sample for estimation. Default is 0.1.

Returns:

  • batch_size (scalar_int) – Optimal batch size.

cryoblob.histogram(image, bins=256, range_limits=None)

Calculate histogram from input image data.

Parameters:

  • image (Real[Array, “...”]) – Input array (any shape), flattened internally.

  • bins (scalar_int, optional) – Number of histogram bins.

  • range_limits (Tuple[scalar_float, scalar_float], optional) – Min and max range for bins.

Returns:

  • hist (Num[Array, “bins”]) – Histogram counts per bin.

cryoblob.image_resizer(orig_image, new_sampling)

Description

Resize an image using a fast resizing algorithm implemented in JAX. If a 3D stack is provided, the function will sum along the last dimension.

Parameters:

  • orig_image (Real[Array, “y x”] | Real[Array, “y x c”]) – The original image to be resized. It should be a 2D JAX array or 3D stack.

  • new_sampling (scalar_num | Real[Array, “2”]) – The new sampling rate for resizing the image. It can be a single float value or a tuple of two float values representing the sampling rates for the x and y axes respectively.

    • If a single value is provided, it will be applied to both axes.

    • If new_sampling is greater than 1, the image will be downsampled.

    • If new_sampling is less than 1, the image will be upsampled.

Returns:

  • resampled_image (Float[Array, “a b”]) – The resized image.

cryoblob.load_mrc(filepath)[source]

Description

Reads an MRC-format cryo-EM file from the specified path, extracting image data and relevant metadata. All numeric data are converted into JAX arrays and wrapped into a structured MRC_Image PyTree, compatible with JAX’s functional programming paradigm.

param - filepath (str):

Path to the MRC file to be loaded.

returns:

MRC_Image, a PyTree containing:

  • image_data: Image array (2D or 3D).

  • voxel_size: Array containing voxel dimensions in Å (Z, Y, X).

  • origin: Array indicating the origin coordinates from the header (Z, Y, X).

  • data_min: Minimum pixel value.

  • data_max: Maximum pixel value.

  • data_mean: Mean pixel value.

  • mode: Integer code representing data type (e.g., 0=int8, 1=int16, 2=float32).

rtype:

MRC_Image

Examples

>>> mrc_image = load_mrc("example.mrc")
>>> print(mrc_image.voxel_size)
Array([1.2, 1.2, 1.2], dtype=float32)

Notes

  • This function uses the mrcfile library for parsing MRC files.

  • The resulting PyTree structure (MRC_Image) is explicitly designed for use in JAX-based image processing pipelines.

cryoblob.log_kernel(size, sigma, kernel_min=3)

Description

Create a Laplacian of Gaussian kernel for edge detection.

param - size (int):

Kernel size, enforced positive and odd for ‘gaussian’ mode.

param - sigma (scalar_float):

Gaussian standard deviation for LoG kernel.

param - kernel_min (int, optional):

Minimum kernel size (default is 3). This is used to enforce a lower bound on the kernel size.

returns:

Laplacian kernel.

rtype:
  • kernel (Float[Array, “size size”])
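As an illustration of what such a kernel contains, the following plain-Python sketch samples the standard Laplacian of Gaussian formula on a square grid. It is not the cryoblob code: the name log_kernel_sketch, the rounding of even sizes up to the next odd value, and the mean subtraction are assumptions.

```python
import math


def log_kernel_sketch(size, sigma, kernel_min=3):
    """Sample a Laplacian of Gaussian on a size x size grid."""
    # Assumption: the size is clamped to kernel_min and made odd.
    size = max(int(size), kernel_min)
    if size % 2 == 0:
        size += 1
    half = size // 2
    s2 = 2.0 * sigma * sigma
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            r2 = x * x + y * y
            # LoG(r) = -(1 / (pi * sigma^4)) * (1 - r^2 / (2 sigma^2)) * exp(-r^2 / (2 sigma^2))
            value = -(1.0 - r2 / s2) * math.exp(-r2 / s2) / (math.pi * sigma ** 4)
            row.append(value)
        kernel.append(row)
    # Subtract the mean so the kernel sums to zero and flat regions give no response.
    mean = sum(map(sum, kernel)) / (size * size)
    return [[v - mean for v in row] for row in kernel]


kernel = log_kernel_sketch(size=4, sigma=1.0)
print(len(kernel), len(kernel[0]))  # 5 5
```

With this sign convention the centre of the kernel is negative, so bright blobs produce negative responses; an implementation may flip the sign.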

cryoblob.make_MRC_Image(image_data, voxel_size, origin, data_min, data_max, data_mean, mode)

Description

Factory function to create an MRC_Image instance.

param - image_data (Num[Array, “H W”] | Num[Array, “D H W”]):

The image data array from the MRC file. Can be 2D or 3D.

param - voxel_size (Float[Array, “3”]):

Voxel size in the order (Z, Y, X).

param - origin (Float[Array, “3”]):

Origin coordinates from the MRC file header (Z, Y, X).

param - data_min (scalar_float):

Minimum value of image data (as stored in header).

param - data_max (scalar_float):

Maximum value of image data (as stored in header).

param - data_mean (scalar_float):

Mean value of image data (as stored in header).

param - mode (scalar_int):

Data type mode from MRC header (e.g., 0: int8, 2: float32).

returns:

An instance of the MRC_Image PyTree structure.

rtype:
  • MRC_Image

cryoblob.perona_malik(image, num_iter, kappa, gamma=0.1, conduction_fn=jaxtyping.jaxtyped)

Perform edge-preserving denoising using the Perona-Malik anisotropic diffusion.

Parameters:
  • image (Float[Array, “H W”]) – Input noisy image.

  • num_iter (scalar_int) – Number of diffusion iterations.

  • kappa (scalar_float) – Conductance coefficient controlling sensitivity to edges.

  • gamma (scalar_float, optional) – Diffusion rate (0 < gamma <= 0.25 for stability), default is 0.1.

  • conduction_fn (Callable, optional) – Conductivity function, defaults to exponential.

Returns:

Edge-preserved denoised image.

Return type:

  • denoised_image (Float[Array, “H W”])

Notes

The discrete Perona-Malik update is given by: u_{t+1} = u_t + gamma * div(c * grad(u_t)) where:

  • u_t is the image at time step t (u_0 is the input image)

  • t is time

  • gamma is the diffusion rate

  • c is the conductivity function

  • grad is the gradient operator

  • div is the divergence operator

The conductivity function c is typically an exponential function: c(delta) = exp(-delta^2 / kappa^2) where delta is the difference between neighboring pixels.

Perona, Pietro, Takahiro Shiota, and Jitendra Malik. “Anisotropic diffusion.” Geometry-driven diffusion in computer vision (1994): 73-92.
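The update rule in the notes can be made concrete with a small plain-Python sketch of the explicit 4-neighbour scheme. It is an illustration under stated assumptions, not the cryoblob implementation: the name perona_malik_sketch, the list-of-lists image, and the zero-flux border handling are all hypothetical choices.

```python
import math


def perona_malik_sketch(image, num_iter, kappa, gamma=0.1):
    """Explicit 4-neighbour Perona-Malik diffusion on a list-of-lists image."""
    u = [row[:] for row in image]
    height, width = len(u), len(u[0])

    def conduction(delta):
        # Exponential conductivity: near 1 in flat regions, near 0 across edges.
        return math.exp(-((delta / kappa) ** 2))

    for _ in range(num_iter):
        new_u = [row[:] for row in u]
        for y in range(height):
            for x in range(width):
                flux = 0.0
                for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ny, nx = y + dy, x + dx
                    # Assumption: neighbours outside the image contribute no flux.
                    if 0 <= ny < height and 0 <= nx < width:
                        delta = u[ny][nx] - u[y][x]
                        flux += conduction(delta) * delta
                new_u[y][x] = u[y][x] + gamma * flux
        u = new_u
    return u


# Two noisy plateaus (around 0.5 and 100.5) separated by a sharp edge.
noisy = [[0.0, 1.0, 100.0, 101.0],
         [1.0, 0.0, 101.0, 100.0]]
smoothed = perona_malik_sketch(noisy, num_iter=5, kappa=5.0, gamma=0.2)
```

Because each pairwise flux is antisymmetric, the total intensity is conserved, while the large step between the two plateaus is left almost untouched since its conductivity is close to zero.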

cryoblob.plot_mrc(mrc_image, image_size=(15, 15), cmap='magma', mode='plain')

Description

Plot an MRC image using Matplotlib with an optional scaling mode and scalebar.

param - mrc_image (MRC_Image):

The PyTree structure containing image data and voxel metadata.

param - image_size (Tuple[scalar_int, scalar_int], optional):

Size of the plotted figure (width, height) in inches. Default is (15, 15).

param - cmap (str, optional):

The Matplotlib colormap to use. Default is “magma”.

param - mode (str, optional):

Mode of visualization. Default is “plain”.

  • “plain”: Plot image data without modifications.

  • “log”: Plot logarithmically scaled image data.

  • “exp”: Plot exponentially scaled image data.

returns:

Displays the plot.

rtype:

None

Examples

>>> plot_mrc(mrc_image, image_size=(10, 10), cmap="viridis", mode="log")

cryoblob.preprocessing(image_orig, return_params=False, exponential=True, logarizer=False, gblur=2, background=0, apply_filter=0)

Description

Pre-processing of low SNR images to improve contrast of blobs.

param - image_orig (Float[Array, “y x”]):

An input image represented as a 2D JAX array.

param - return_params (bool, optional):

A boolean indicating whether to return the processing parameters. Default is False.

param - exponential (bool, optional):

A boolean indicating whether to apply an exponential function to the image. Default is True.

param - logarizer (bool, optional):

A boolean indicating whether to apply a log function to the image. Default is False.

param - gblur (int, optional):

The standard deviation of the Gaussian filter. Default is 2.

param - background (int, optional):

The standard deviation of the Gaussian filter for background subtraction. Default is 0.

param - apply_filter (int, optional):

If greater than 1, a Wiener filter is applied to the image.

returns:

The pre-processed image

rtype:
  • image_proc (Float[Array, “y x”])

cryoblob.process_batch_of_files(file_batch, preprocessing_config, blob_downscale)

Process a batch of files in parallel with memory optimization.

Parameters:
  • file_batch (List[str]) – List of file paths to process

  • preprocessing_config (Dict) – Preprocessing parameters

  • blob_downscale (float) – Downscaling factor

Returns:

List of (blobs, file_path) tuples

Return type:

  • results (List[Tuple[Array, str]])

cryoblob.process_single_file(file_path, preprocessing_config, blob_downscale, stream_mode=True)

Description

Process a single MRC file for blob detection with memory optimization and validated preprocessing configuration.

param - file_path (str):

Path to the MRC image file to process

param - preprocessing_config (PreprocessingConfig):

Validated preprocessing configuration containing all processing parameters

param - blob_downscale (scalar_float):

Downscaling factor applied during blob detection to reduce computational load

param - stream_mode (bool, optional):

Whether to use memory-mapped file access for large files to reduce memory usage. Default is True

returns:
    • scaled_blobs (Float[Array, “n 3”]) – Array of detected blob coordinates and sizes where each row contains [Y_position_nm, X_position_nm, Size_nm]

    • file_path (str) – Original file path for tracking processed files

raises Exception:

If processing fails, the error message is printed to the console and an empty array is returned together with the original file path.

Notes

The function uses streaming mode for large files to reduce memory usage and immediately releases file handles after reading. All intermediate arrays are explicitly deleted to manage GPU memory efficiently.

cryoblob.resize_x(x_image, new_x_len)

Description

Resize image along the x-axis by independently resampling each row. Uses lax.scan over the y-dimension, then vmap over x-dimension.

param - x_image (Num[Array, “y x”]):

Image to resize (y by x)

param - new_x_len (scalar_int):

Target number of columns

returns:

Resized image (y by new_x)

rtype:
  • resized (Float[Array, “y new_x”])

cryoblob.validate_mrc_metadata(voxel_size, origin, data_min, data_max, data_mean, mode, image_shape)[source]

Validate MRC metadata and return validated model.

Parameters:
  • voxel_size

  • origin

  • data_min

  • data_max

  • data_mean

  • mode

  • image_shape

Returns:

Validated MRC metadata model

Return type:

  • metadata

Raises:

ValidationError – If any metadata values are invalid.

cryoblob.wiener(img, kernel_size=3, noise=None)

Description

JAX implementation of Wiener filter for noise reduction. This is similar to scipy.signal.wiener.

param - img (Float[Array, “h w”]):

The input image to be filtered

param - kernel_size (int or tuple, optional):

The size of the sliding window for local statistics. If tuple, represents (height, width). Default is 3

param - noise (scalar_float, optional):

The noise power. If None, uses the average of the local variance. Default is None

returns:

The filtered output with the same shape as input

rtype:
  • filtered (Float[Array, “h w”])

Notes

The Wiener filter is optimal in terms of the mean square error. It estimates the local mean and variance around each pixel.
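The local mean and variance estimate described in the notes can be sketched in plain Python. The recipe below follows the same outline as scipy.signal.wiener (zero-padded box averages, then a variance-dependent blend of pixel and local mean); the name wiener_sketch and the restriction to a square window are assumptions, and this is not the cryoblob code.

```python
def wiener_sketch(img, kernel_size=3, noise=None):
    """Wiener-style filter from local statistics on a list-of-lists image."""
    height, width = len(img), len(img[0])
    half = kernel_size // 2
    area = kernel_size * kernel_size

    def local_average(values):
        # Zero-padded box filter, equivalent to a "same"-mode correlation.
        out = [[0.0] * width for _ in range(height)]
        for y in range(height):
            for x in range(width):
                total = 0.0
                for dy in range(-half, half + 1):
                    for dx in range(-half, half + 1):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < height and 0 <= nx < width:
                            total += values[ny][nx]
                out[y][x] = total / area
        return out

    mean = local_average(img)
    mean_sq = local_average([[v * v for v in row] for row in img])
    var = [[mean_sq[y][x] - mean[y][x] ** 2 for x in range(width)]
           for y in range(height)]
    if noise is None:
        # Default noise power: average of the local variance.
        noise = sum(map(sum, var)) / (height * width)

    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if var[y][x] <= noise:
                # Variance explained by noise alone: fall back to the local mean.
                out[y][x] = mean[y][x]
            else:
                gain = 1.0 - noise / var[y][x]
                out[y][x] = mean[y][x] + gain * (img[y][x] - mean[y][x])
    return out


flat = [[5.0] * 5 for _ in range(5)]
filtered = wiener_sketch(flat)
```

Where the local variance is large compared with the noise power, the gain approaches 1 and the pixel is kept; where it is small, the output approaches the local mean.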

Adaptive Processing Module

Module: adapt

Contains adaptive image processing methods that take advantage of JAX’s automatic differentiation capabilities.

Functions

  • adaptive_wiener:

    Adaptive Wiener filter that optimizes the noise estimate using gradient descent.

  • adaptive_threshold:

    Adaptively optimizes thresholding parameters using gradient descent to produce a differentiably thresholded image.

cryoblob.adapt.adaptive_wiener(img, target, kernel_size=3, initial_noise=0.1, learning_rate=0.01, iterations=100)

Adaptive Wiener filter that optimizes the noise estimate using gradient descent.

Parameters:
  • img (Float[Array, “h w”]) – Noisy input image.

  • target (Float[Array, “h w”]) – A target image or reference image used for optimization.

  • kernel_size (scalar_int | Tuple[int, int], optional) – Window size for Wiener filter. Default is 3.

  • initial_noise (scalar_float, optional) – Initial guess for noise parameter. Default is 0.1.

  • learning_rate (scalar_float, optional) – Learning rate for optimization. Default is 0.01.

  • iterations (scalar_int, optional) – Number of optimization steps. Default is 100.

Returns:

    • filtered_img (Float[Array, “h w”]) – Wiener filtered image with optimized noise parameter.

    • optimized_noise (scalar_float) – The optimized noise parameter.

Return type:

Tuple[Float[Array, “h w”], float | Float[Array, “”]]

cryoblob.adapt.adaptive_threshold(img, target, initial_threshold=0.5, initial_slope=10.0, learning_rate=0.01, iterations=100)

Description

Adaptively optimizes thresholding parameters using gradient descent to produce a differentiably thresholded image.

param - img (Float[Array, “h w”]):

The input image to threshold.

param - target (Float[Array, “h w”]):

A reference binary image for supervised parameter optimization.

param - initial_threshold (scalar_float, optional):

Initial guess for the threshold parameter. Default is 0.5.

param - initial_slope (scalar_float, optional):

Initial guess for the slope controlling sigmoid steepness. Default is 10.0.

param - learning_rate (scalar_float, optional):

The learning rate used during gradient optimization. Default is 0.01.

param - iterations (scalar_int, optional):

Number of iterations for gradient optimization. Default is 100.

returns:
    • thresholded_img (Float[Array, “h w”]) – The image after differentiable thresholding using optimized parameters.

    • optimized_threshold (scalar_float) – The optimized threshold parameter.

    • optimized_slope (scalar_float) – The optimized slope parameter.

Flow

    • sigmoid_threshold – Applies a sigmoid function to the input image.

    • threshold_loss_fn – Computes the loss between the thresholded image and the target.

    • step – Performs a single optimization step.

    • optimized_params – Optimizes threshold and slope parameters.

    • thresholded_img – Applies the optimized thresholding parameters to the input image.
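The flow above can be sketched in plain Python on a flat list of pixel values. This is an illustration only: the mean-squared-error loss and the hand-derived gradients are assumptions, and a JAX implementation would more likely obtain the gradients from jax.grad. The helper names mirror the flow, but their signatures here are hypothetical.

```python
import math


def sigmoid_threshold(pixels, threshold, slope):
    # Soft threshold: values well above `threshold` map close to 1, below to 0.
    return [1.0 / (1.0 + math.exp(-slope * (p - threshold))) for p in pixels]


def threshold_loss_fn(pixels, target, threshold, slope):
    # Assumed loss: mean squared error against the reference mask.
    soft = sigmoid_threshold(pixels, threshold, slope)
    return sum((s - t) ** 2 for s, t in zip(soft, target)) / len(pixels)


def step(pixels, target, threshold, slope, learning_rate):
    # One gradient-descent step using the analytic gradient of the MSE loss.
    soft = sigmoid_threshold(pixels, threshold, slope)
    n = len(pixels)
    grad_threshold = 0.0
    grad_slope = 0.0
    for p, s, t in zip(pixels, soft, target):
        common = 2.0 * (s - t) * s * (1.0 - s) / n  # dL/dz for this pixel
        grad_threshold += common * (-slope)          # dz/dthreshold = -slope
        grad_slope += common * (p - threshold)       # dz/dslope = p - threshold
    return (threshold - learning_rate * grad_threshold,
            slope - learning_rate * grad_slope)


pixels = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
target = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
threshold, slope = 0.6, 10.0
initial_loss = threshold_loss_fn(pixels, target, threshold, slope)
for _ in range(100):
    threshold, slope = step(pixels, target, threshold, slope, learning_rate=0.01)
final_loss = threshold_loss_fn(pixels, target, threshold, slope)
thresholded = sigmoid_threshold(pixels, threshold, slope)
```

The sigmoid replaces a hard comparison so that the loss has a usable gradient with respect to both the threshold and the slope.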

Blob Detection Module

Module: blobs

Code for detecting the blobs. The image processing and data I/O code is kept in separate modules; this module only deals with preprocessing data and counting blobs.

Functions

  • find_connected_components:

    Pure JAX implementation of 3D connected components labeling.

  • center_of_mass_3d:

    Calculate center of mass for each labeled region in a 3D image.

  • find_particle_coords:

    Find particle coordinates using connected components and center of mass.

  • preprocessing:

    Pre-processes low SNR images to improve contrast of blobs.

  • blob_list_log:

    Detects blobs in an input image using the Laplacian of Gaussian (LoG) method.

cryoblob.blobs.find_connected_components(binary_image, connectivity=6)

Description

Pure JAX implementation of 3D connected components labeling. Uses a two-pass algorithm.

param - binary_image (Bool[Array, “x y z”]):

Binary image where True/1 indicates foreground

param - connectivity (int, optional):

Either 6 (face-connected) or 26 (fully-connected). Default is 6

returns:
    • labels (Integer[Array, “x y z”]) – Array where each connected component has unique integer label

    • num_labels (int) – Number of connected components found
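The difference between 6- and 26-connectivity is easiest to see on a tiny volume. The sketch below uses a breadth-first flood fill in plain Python rather than the two-pass algorithm mentioned above, so it illustrates the labelling result only; the name label_components_sketch and the nested-list volume are assumptions.

```python
from collections import deque
from itertools import product


def label_components_sketch(binary, connectivity=6):
    """Label connected foreground voxels in a nested-list 3D volume."""
    nx, ny, nz = len(binary), len(binary[0]), len(binary[0][0])
    # All 26 neighbour offsets, excluding the voxel itself.
    offsets = [o for o in product((-1, 0, 1), repeat=3) if any(o)]
    if connectivity == 6:
        # Keep only the face neighbours.
        offsets = [o for o in offsets if sum(map(abs, o)) == 1]
    labels = [[[0] * nz for _ in range(ny)] for _ in range(nx)]
    num_labels = 0
    for start in product(range(nx), range(ny), range(nz)):
        x, y, z = start
        if not binary[x][y][z] or labels[x][y][z]:
            continue
        num_labels += 1
        labels[x][y][z] = num_labels
        queue = deque([start])
        while queue:
            cx, cy, cz = queue.popleft()
            for dx, dy, dz in offsets:
                px, py, pz = cx + dx, cy + dy, cz + dz
                inside = 0 <= px < nx and 0 <= py < ny and 0 <= pz < nz
                if inside and binary[px][py][pz] and not labels[px][py][pz]:
                    labels[px][py][pz] = num_labels
                    queue.append((px, py, pz))
    return labels, num_labels


# Two foreground voxels that touch only at a corner.
volume = [[[True, False], [False, False]],
          [[False, False], [False, True]]]
_, count_6 = label_components_sketch(volume, connectivity=6)
_, count_26 = label_components_sketch(volume, connectivity=26)
print(count_6, count_26)  # 2 1
```

Corner-touching voxels form separate components under 6-connectivity and a single component under 26-connectivity.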

cryoblob.blobs.center_of_mass_3d(image, labels, num_labels)

Description

Calculate center of mass for each labeled region in a 3D image.

param - image (Float[Array, “x y z”]):

3D image array

param - labels (Integer[Array, “x y z”]):

Integer array of labels

param - num_labels (int):

Number of labels (excluding background)

returns:

Array of centroid coordinates for each label

rtype:
  • centroids (Float[Array, “n 3”])
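An intensity-weighted centroid per label can be sketched as follows in plain Python. This is illustrative only: the name center_of_mass_sketch is hypothetical, label 0 is assumed to be background, and every labelled region is assumed to have a non-zero total intensity.

```python
def center_of_mass_sketch(image, labels, num_labels):
    """Intensity-weighted centroid (x, y, z) for labels 1..num_labels."""
    totals = [0.0] * (num_labels + 1)
    moments = [[0.0, 0.0, 0.0] for _ in range(num_labels + 1)]
    for x, plane in enumerate(image):
        for y, row in enumerate(plane):
            for z, value in enumerate(row):
                label = labels[x][y][z]
                if label == 0:
                    continue  # assumption: label 0 marks the background
                totals[label] += value
                for axis, coord in enumerate((x, y, z)):
                    moments[label][axis] += value * coord
    # Assumption: each region has non-zero total intensity.
    return [[m / totals[k] for m in moments[k]]
            for k in range(1, num_labels + 1)]


image = [[[1.0, 0.0, 3.0]]]
labels = [[[1, 1, 1]]]
centroids = center_of_mass_sketch(image, labels, num_labels=1)
print(centroids)  # [[0.0, 0.0, 1.5]]
```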

cryoblob.blobs.find_particle_coords(results_3D, max_filtered, image_thresh)

Description

Find particle coordinates using connected components and center of mass. Pure JAX implementation.

param - results_3D (Float[Array, “x y z”]):

3D array of filter responses

param - max_filtered (Float[Array, “x y z”]):

Maximum filtered array

param - image_thresh (scalar_float):

Threshold for peak detection

returns:

Array of particle coordinates

rtype:
  • coords (Float[Array, “n 3”])

cryoblob.blobs.preprocessing(image_orig, return_params=False, exponential=True, logarizer=False, gblur=2, background=0, apply_filter=0)

Description

Pre-processing of low SNR images to improve contrast of blobs.

param - image_orig (Float[Array, “y x”]):

An input image represented as a 2D JAX array.

param - return_params (bool, optional):

A boolean indicating whether to return the processing parameters. Default is False.

param - exponential (bool, optional):

A boolean indicating whether to apply an exponential function to the image. Default is True.

param - logarizer (bool, optional):

A boolean indicating whether to apply a log function to the image. Default is False.

param - gblur (int, optional):

The standard deviation of the Gaussian filter. Default is 2.

param - background (int, optional):

The standard deviation of the Gaussian filter for background subtraction. Default is 0.

param - apply_filter (int, optional):

If greater than 1, a Wiener filter is applied to the image.

returns:

The pre-processed image

rtype:
  • image_proc (Float[Array, “y x”])

cryoblob.blobs.blob_list_log(mrc_image, min_blob_size=5, max_blob_size=20, blob_step=1, downscale=4, std_threshold=6)

Description

Detect blobs of varying sizes in an MRC image using the Laplacian of Gaussian (LoG) method.

param - mrc_image (MRC_Image):

The PyTree containing the image data and metadata.

param - min_blob_size (scalar_num, optional):

Minimum blob size to detect. Defaults to 5.

param - max_blob_size (scalar_num, optional):

Maximum blob size to detect. Defaults to 20.

param - blob_step (scalar_num, optional):

Step size between consecutive blob scales. Defaults to 1.

param - downscale (scalar_num, optional):

Factor by which the image is downscaled before detection. Defaults to 4.

param - std_threshold (scalar_num, optional):

Threshold in standard deviations for blob detection. Defaults to 6.

returns:

Array of blob coordinates and sizes, shape [n, 3]. Columns represent (Y, X, Blob size in pixels).

rtype:
  • scaled_coords (Float[Array, “n 3”])

File I/O Module

Module: files

Contains the code for interfacing with data files. One goal here is to separate the Python code from the JAX code, so most of the outward-facing code, which is necessarily plain Python, lives here.

Functions

  • file_params:

    Get the parameters for the file organization.

  • load_mrc:

    Reads an MRC-format cryo-EM file, extracting image data and metadata.

  • process_single_file:

    Process a single file for blob detection with memory optimization.

  • process_batch_of_files:

    Process a batch of files in parallel with memory optimization.

  • folder_blobs:

    Process a folder of images for blob detection with memory optimization.

  • estimate_batch_size:

    Estimate optimal batch size for processing MRC files based on available memory.

  • estimate_memory_usage:

    Estimate memory usage in GB for processing a single MRC file.

  • get_optimal_batch_size:

    Get optimal batch size by sampling multiple files from the list.

cryoblob.files.file_params()

Description

Run this at the beginning to generate the dict. This gives both the absolute and relative paths describing how the files are organized.

returns:
    • main_directory (str) – the main directory where the package is located.

    • folder_structure (dict) – where the files and data are stored, as read from the organization.json file.

cryoblob.files.load_mrc(filepath)[source]

Description

Reads an MRC-format cryo-EM file from the specified path, extracting image data and relevant metadata. All numeric data are converted into JAX arrays and wrapped into a structured MRC_Image PyTree, compatible with JAX’s functional programming paradigm.

param - filepath (str):

Path to the MRC file to be loaded.

returns:

MRC_Image, a PyTree containing:

  • image_data: Image array (2D or 3D).

  • voxel_size: Array containing voxel dimensions in Å (Z, Y, X).

  • origin: Array indicating the origin coordinates from the header (Z, Y, X).

  • data_min: Minimum pixel value.

  • data_max: Maximum pixel value.

  • data_mean: Mean pixel value.

  • mode: Integer code representing data type (e.g., 0=int8, 1=int16, 2=float32).

rtype:

MRC_Image

Examples

>>> mrc_image = load_mrc("example.mrc")
>>> print(mrc_image.voxel_size)
Array([1.2, 1.2, 1.2], dtype=float32)

Notes

  • This function uses the mrcfile library for parsing MRC files.

  • The resulting PyTree structure (MRC_Image) is explicitly designed for use in JAX-based image processing pipelines.

cryoblob.files.process_single_file(file_path, preprocessing_config, blob_downscale, stream_mode=True)

Description

Process a single MRC file for blob detection with memory optimization and validated preprocessing configuration.

param - file_path (str):

Path to the MRC image file to process

param - preprocessing_config (PreprocessingConfig):

Validated preprocessing configuration containing all processing parameters

param - blob_downscale (scalar_float):

Downscaling factor applied during blob detection to reduce computational load

param - stream_mode (bool, optional):

Whether to use memory-mapped file access for large files to reduce memory usage. Default is True

returns:
    • scaled_blobs (Float[Array, “n 3”]) – Array of detected blob coordinates and sizes where each row contains [Y_position_nm, X_position_nm, Size_nm]

    • file_path (str) – Original file path for tracking processed files

raises Exception:

If processing fails, the error message is printed to the console and an empty array is returned together with the original file path.

Notes

The function uses streaming mode for large files to reduce memory usage and immediately releases file handles after reading. All intermediate arrays are explicitly deleted to manage GPU memory efficiently.

cryoblob.files.process_batch_of_files(file_batch, preprocessing_config, blob_downscale)

Process a batch of files in parallel with memory optimization.

Parameters:
  • file_batch (List[str]) – List of file paths to process

  • preprocessing_config (Dict) – Preprocessing parameters

  • blob_downscale (float) – Downscaling factor

Returns:

List of (blobs, file_path) tuples

Return type:

  • results (List[Tuple[Array, str]])

cryoblob.files.folder_blobs(folder_location, file_type='mrc', blob_downscale=7.0, target_memory_gb=4.0, stream_large_files=True, **kwargs)

Description

Process a folder of MRC images for blob detection with memory optimization and validated preprocessing configuration. Automatically manages batch processing and memory usage to prevent GPU memory overflow.

param - folder_location (str):

Path to folder containing MRC images to process

param - file_type (Literal[“mrc”], optional):

File extension to search for in the folder. Default is “mrc”

param - blob_downscale (scalar_float, optional):

Downscaling factor applied during blob detection. Default is 7.0

param - target_memory_gb (scalar_float, optional):

Target GPU memory usage in GB for batch size optimization. Default is 4.0

param - stream_large_files (bool, optional):

Whether to use memory-mapped file access for large files. Default is True

param - **kwargs:

Additional preprocessing parameters passed to PreprocessingConfig. Valid options: exponential, logarizer, gblur, background, apply_filter

returns:

DataFrame containing detected blob information with columns: [‘File Location’, ‘Center Y (nm)’, ‘Center X (nm)’, ‘Size (nm)’]

rtype:
  • blob_dataframe (pd.DataFrame)

raises ValueError:

If preprocessing parameters are invalid according to PreprocessingConfig validation

Notes

Memory Management:

  • Uses batch processing to control memory usage

  • Automatically adjusts batch size based on available memory

  • Clears device memory between batches

  • Streams large files if needed

  • Efficiently handles intermediate results

The function processes files in batches to prevent memory overflow and provides a progress bar to track processing status. Empty folders return an empty DataFrame with the expected column structure.
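The batch-wise traversal described in these notes can be pictured with a short plain-Python sketch. The helper iter_batches is hypothetical and only shows how a file list might be split into fixed-size batches; it does not reproduce the memory management or progress reporting of folder_blobs.

```python
def iter_batches(file_list, batch_size):
    """Yield consecutive slices of `file_list` holding at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(file_list), batch_size):
        yield file_list[start:start + batch_size]


files = ["a.mrc", "b.mrc", "c.mrc", "d.mrc", "e.mrc"]
batches = list(iter_batches(files, batch_size=2))
print(batches)  # [['a.mrc', 'b.mrc'], ['c.mrc', 'd.mrc'], ['e.mrc']]
```

An empty file list yields no batches at all, which matches the empty-folder case mentioned above.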

cryoblob.files.estimate_batch_size(sample_file_path, target_memory_gb=4.0, safety_factor=0.7, processing_overhead=3.0)

Description

Estimate optimal batch size for processing MRC files based on available memory and file characteristics. This function analyzes a sample file to estimate memory requirements and calculates the maximum number of files that can be processed simultaneously without exceeding memory limits.

param - sample_file_path (str):

Path to a representative MRC file for size estimation

param - target_memory_gb (scalar_float, optional):

Target GPU memory usage in GB. Default is 4.0

param - safety_factor (scalar_float, optional):

Safety factor to prevent memory overflow (0.0-1.0). Default is 0.7 (use 70% of target memory)

param - processing_overhead (scalar_float, optional):

Memory overhead multiplier for processing operations. Default is 3.0 (processing uses 3x the raw data size)

returns:

Recommended batch size for processing

rtype:
  • batch_size (scalar_int)

Notes

The estimation considers:

  • Raw file size in memory (dtype conversion)

  • Preprocessing operations (filtering, transformations)

  • Blob detection memory requirements

  • JAX compilation overhead

  • Intermediate array storage

Memory estimation formula:

    per_file_memory = file_size * processing_overhead
    available_memory = target_memory_gb * safety_factor * 1e9
    batch_size = max(1, available_memory // per_file_memory)
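As a worked instance of the formula, the sketch below evaluates it for a single 4096 x 4096 float32 image. The function batch_size_sketch is hypothetical and takes the in-memory size in bytes directly, whereas estimate_batch_size reads it from a sample file.

```python
def batch_size_sketch(file_size_bytes, target_memory_gb=4.0,
                      safety_factor=0.7, processing_overhead=3.0):
    """Evaluate the documented memory formula for a known in-memory file size."""
    per_file_memory = file_size_bytes * processing_overhead
    available_memory = target_memory_gb * safety_factor * 1e9
    # Always process at least one file, even if it exceeds the budget.
    return max(1, int(available_memory // per_file_memory))


image_bytes = 4096 * 4096 * 4  # one 4096 x 4096 float32 image
print(batch_size_sketch(image_bytes))  # 13
```

With the defaults, about 2.8 GB is available and each image is budgeted at roughly 0.2 GB, so 13 images fit; a file larger than the whole budget still yields a batch size of 1.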

Examples

>>> batch_size = estimate_batch_size("sample.mrc", target_memory_gb=8.0)
>>> print(f"Recommended batch size: {batch_size}")

cryoblob.files.estimate_memory_usage(file_path, include_preprocessing=True, include_blob_detection=True)

Description

Estimate memory usage in GB for processing a single MRC file.

param - file_path (str):

Path to MRC file

param - include_preprocessing (bool, optional):

Include memory for preprocessing operations. Default is True

param - include_blob_detection (bool, optional):

Include memory for blob detection. Default is True

returns:

Estimated memory usage in GB

rtype:
  • memory_gb (scalar_float)

cryoblob.files.get_optimal_batch_size(file_list, target_memory_gb=4.0, sample_fraction=0.1)

Description

Get optimal batch size by sampling multiple files from the list.

param - file_list (list[str]):

List of file paths to process

param - target_memory_gb (scalar_float, optional):

Target memory usage in GB. Default is 4.0

param - sample_fraction (scalar_float, optional):

Fraction of files to sample for estimation. Default is 0.1

returns:

Optimal batch size

rtype:
  • batch_size (scalar_int)

Image Processing Module

Module: image

Contains the basic functions for image processing, including resizing and filtering. This module is used for data preprocessing.

Functions:

  • image_resizer:

    Resize an image using a fast resizing algorithm implemented in JAX.

  • resize_x:

    Resize image along the x-axis by independently resampling each row.

  • gaussian_kernel:

    Create a normalized 2D Gaussian kernel.

  • apply_gaussian_blur:

    Apply Gaussian blur to an image using convolution in JAX.

  • difference_of_gaussians:

    Applies Difference of Gaussians (DoG) filtering to enhance circular blobs.

  • laplacian_of_gaussian:

    Applies Laplacian of Gaussian (LoG) filtering to an input image.

  • laplacian_kernel:

    Create a Laplacian kernel for edge detection in a JAX-compatible manner.

  • exponential_kernel:

    Create an exponential kernel for image processing.

  • perona_malik:

    Perform edge-preserving denoising using the Perona-Malik anisotropic diffusion.

  • histogram:

    Calculate the histogram of an image.

  • equalize_hist:

    Perform histogram equalization on an image using JAX.

  • equalize_adapthist:

    Perform adaptive histogram equalization on an image using JAX.

  • wiener:

    Perform Wiener filtering on an image using JAX.

cryoblob.image.image_resizer(orig_image, new_sampling)

Description

Resize an image using a fast resizing algorithm implemented in JAX. If a 3D stack is provided, the function will sum along the last dimension.

param - orig_image (Real[Array, “y x”] | Real[Array, “y x c”]):

The original image to be resized. It should be a 2D JAX array or 3D stack.

param - new_sampling (scalar_num | Real[Array, “2”]):

The new sampling rate for resizing the image. It can be a single float value or a tuple of two float values representing the sampling rates for the x and y axes respectively.

  • If a single value is provided, it will be applied to both axes.

  • If new_sampling is greater than 1, the image will be downsampled.

  • If new_sampling is less than 1, the image will be upsampled.

returns:

The resized image.

rtype:
  • resampled_image (Float[Array, “a b”])

cryoblob.image.resize_x(x_image, new_x_len)

Description

Resize image along y-axis by independently resampling each column. Uses lax.scan over the y-dimension, then vmap over x-dimension.

param - x_image (Num[Array, “y x”]):

Image to resize (y by x)

param - new_x_len (scalar_int):

Target number of columns

returns:

Resized image (y by new_x)

rtype:
  • resized (Float[Array, “y new_x”])

cryoblob.image.gaussian_kernel(size, sigma)

Description

Create a normalized 2D Gaussian kernel.

param - size (scalar_int):

Kernel size (size x size). Must be odd.

param - sigma (scalar_float):

Standard deviation of the Gaussian distribution.

returns:

Normalized 2D Gaussian kernel.

rtype:
  • kernel (Float[Array, “size size”])
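
As an illustration of what "normalized" means here, the following is a minimal NumPy sketch, not the cryoblob implementation. The name gaussian_kernel_sketch is hypothetical; the sketch assumes the kernel is sampled on an integer grid centered on the middle pixel and divided by its sum.

```python
import numpy as np

def gaussian_kernel_sketch(size: int, sigma: float) -> np.ndarray:
    """Return a (size, size) Gaussian kernel whose entries sum to 1."""
    if size % 2 == 0:
        raise ValueError("size must be odd so the kernel has a center pixel")
    # Coordinates centered on the middle pixel, e.g. [-2, -1, 0, 1, 2] for size=5.
    ax = np.arange(size) - (size - 1) / 2.0
    xx, yy = np.meshgrid(ax, ax)
    kernel = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    # Dividing by the sum keeps the mean image intensity unchanged after blurring.
    return kernel / kernel.sum()

kernel = gaussian_kernel_sketch(5, 1.0)
```

The odd size requirement guarantees a single center pixel, so the blur does not shift the image by half a pixel.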

cryoblob.image.apply_gaussian_blur(image, sigma=1.0, kernel_size=5, mode='same')

Description

Apply Gaussian blur to an image using convolution in JAX.

param - image (Real[Array, “y x”]):

Input image.

param - sigma (scalar_float, optional):

Standard deviation for Gaussian kernel. Defaults to 1.0.

param - kernel_size (scalar_int, optional):

Size of Gaussian kernel. Must be odd. Defaults to 5.

param - mode (Literal[“full”, “valid”, “same”]):

Convolution mode. Defaults to “same”.

returns:

Blurred image.

rtype:
  • blurred (Float[Array, “yp xp”])
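
The output shape “yp xp” depends on mode. The helper below is a hypothetical sketch that assumes mode follows the usual 2D convolution conventions (as in jax.scipy.signal.convolve2d); it is not taken from the cryoblob source.

```python
def conv_output_shape(image_shape, kernel_size, mode="same"):
    """Output shape of a 2D convolution with a square kernel (assumed conventions)."""
    y, x = image_shape
    if mode == "full":
        # Every position where image and kernel overlap by at least one pixel.
        return (y + kernel_size - 1, x + kernel_size - 1)
    if mode == "valid":
        # Only positions where the kernel lies fully inside the image.
        return (y - kernel_size + 1, x - kernel_size + 1)
    if mode == "same":
        # Output is cropped to the input shape.
        return (y, x)
    raise ValueError(f"unknown mode: {mode!r}")

shapes = {m: conv_output_shape((64, 48), 5, m) for m in ("full", "valid", "same")}
```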

cryoblob.image.difference_of_gaussians(image, sigma1, sigma2, sampling=1, hist_stretch=True, normalized=True)

Description

Applies Difference of Gaussians (DoG) filtering to enhance circular blobs.

param - image (Real[Array, “y x”]):

Input 2D image.

param - sigma1 (scalar_num):

Standard deviation of the first Gaussian (smaller).

param - sigma2 (scalar_num):

Standard deviation of the second Gaussian (larger).

param - sampling (scalar_num, optional):

Downsampling factor; 1 means no resizing. Default is 1.

param - hist_stretch (bool, optional):

Apply histogram stretching if True. Default is True.

param - normalized (bool, optional):

Normalize filtered output by sigma2 if True. Default is True.

returns:

DoG-filtered image.

rtype:
  • dog_filtered (Float[Array, “y x”])

Flow

  • Downsample the image if sampling ≠ 1 (in a JIT-safe way).

  • Apply histogram stretching if requested.

  • Create the arithmetic-enforced DoG kernel.

  • Convolve the image with the DoG kernel.

  • Normalize the output if required.
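
The kernel and convolution steps of this flow can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the cryoblob code: the helper names are hypothetical, the DoG kernel is taken to be a narrow Gaussian minus a wide one, borders are zero padded, and resizing, histogram stretching and sigma2 normalization are left out.

```python
import numpy as np

def gaussian_2d(size, sigma):
    """Normalized (size, size) Gaussian kernel."""
    ax = np.arange(size) - (size - 1) / 2.0
    xx, yy = np.meshgrid(ax, ax)
    g = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return g / g.sum()

def dog_kernel_sketch(size, sigma1, sigma2):
    """Narrow Gaussian minus wide Gaussian; the entries sum to about zero."""
    return gaussian_2d(size, sigma1) - gaussian_2d(size, sigma2)

def convolve_same(image, kernel):
    """Direct 'same'-size convolution with zero padding (clear, not fast)."""
    k = kernel.shape[0] // 2
    padded = np.pad(image, k)
    flipped = kernel[::-1, ::-1]  # convolution flips the kernel
    out = np.zeros(image.shape, dtype=float)
    for dy in range(kernel.shape[0]):
        for dx in range(kernel.shape[1]):
            out += flipped[dy, dx] * padded[dy:dy + image.shape[0],
                                            dx:dx + image.shape[1]]
    return out

# A single bright blob centered at (15, 15) on a dark background.
yy, xx = np.mgrid[0:31, 0:31]
blob = np.exp(-((yy - 15) ** 2 + (xx - 15) ** 2) / (2.0 * 2.0**2))
kernel = dog_kernel_sketch(13, sigma1=1.5, sigma2=3.0)
response = convolve_same(blob, kernel)
peak = np.unravel_index(np.argmax(response), response.shape)
```

Because the kernel sums to zero, flat regions give no response, while a blob whose width lies between sigma1 and sigma2 produces a positive peak at its center.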

cryoblob.image.log_kernel(size, sigma, kernel_min=3)

Description

Create a Laplacian of Gaussian kernel for edge detection.

param - size (int):

Kernel size, enforced positive and odd for ‘gaussian’ mode.

param - sigma (scalar_float):

Gaussian standard deviation for LoG kernel.

param - kernel_min (int, optional):

Minimum kernel size, used as a lower bound on size. Default is 3.

returns:

Laplacian of Gaussian kernel.

rtype:
  • kernel (Float[Array, “size size”])
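
For orientation, a Laplacian of Gaussian kernel can be built from the closed-form expression ((r^2 - 2 sigma^2) / sigma^4) * exp(-r^2 / (2 sigma^2)). The NumPy sketch below is hypothetical: the way size is clamped to kernel_min and rounded up to an odd number, and the zero-mean correction, are assumptions rather than documented cryoblob behavior.

```python
import numpy as np

def log_kernel_sketch(size, sigma, kernel_min=3):
    """Zero-mean Laplacian of Gaussian kernel on an odd, square grid."""
    # Assumed enforcement: at least kernel_min, and odd so there is a center pixel.
    size = max(int(size), kernel_min)
    if size % 2 == 0:
        size += 1
    ax = np.arange(size) - (size - 1) / 2.0
    xx, yy = np.meshgrid(ax, ax)
    r2 = xx**2 + yy**2
    kernel = (r2 - 2.0 * sigma**2) / sigma**4 * np.exp(-r2 / (2.0 * sigma**2))
    # Subtract the mean so that a flat image gives zero response.
    return kernel - kernel.mean()

log_k = log_kernel_sketch(9, 1.5)
```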

cryoblob.image.exponential_kernel(arr, k)

Description

Create an exponential kernel for image processing.

param - arr (Float[Array, “H W”]):

Input array

param - k (scalar_float):

Exponential decay constant

returns:

Exponential kernel

rtype:
  • kernel (Float[Array, “H W”])

cryoblob.image.perona_malik(image, num_iter, kappa, gamma=0.1, conduction_fn=jaxtyping.jaxtyped)

Perform edge-preserving denoising using the Perona-Malik anisotropic diffusion.

Parameters:
  • image (Float[Array, “H W”]) – Input noisy image.

  • num_iter (scalar_int) – Number of diffusion iterations.

  • kappa (scalar_float) – Conductance coefficient controlling sensitivity to edges.

  • gamma (scalar_float, optional) – Diffusion rate (0 < gamma <= 0.25 for stability), default is 0.1.

  • conduction_fn (Callable, optional) – Conductivity function, defaults to exponential.

Returns:

Edge-preserved denoised image.

Return type:

  • denoised_image (Float[Array, “H W”])

Notes

The explicit Perona-Malik update applied at each iteration is: u_(t+1) = u_t + gamma * div(c * grad(u_t)) where:

  • u_t is the image after t iterations (u_0 is the input image)

  • gamma is the diffusion rate

  • c is the conductivity function

  • grad is the gradient operator

  • div is the divergence operator

The conductivity function c is typically an exponential function: c(delta) = exp(-delta^2 / kappa^2) where delta is the difference between neighboring pixels.

Perona, Pietro, Takahiro Shiota, and Jitendra Malik. “Anisotropic diffusion.” Geometry-driven diffusion in computer vision (1994): 73-92.
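
A minimal NumPy sketch of the explicit diffusion scheme described in these notes is given below. It is an illustration, not the cryoblob implementation: the function name is hypothetical, and the 4-neighbor differences and replicated borders are assumptions.

```python
import numpy as np

def perona_malik_sketch(image, num_iter, kappa, gamma=0.1):
    """Explicit Perona-Malik diffusion with exponential conductivity."""
    u = np.asarray(image, dtype=float).copy()
    for _ in range(num_iter):
        # Replicate the border so differences across the image edge are zero.
        p = np.pad(u, 1, mode="edge")
        d_n = p[:-2, 1:-1] - u   # difference to the north neighbor
        d_s = p[2:, 1:-1] - u    # south
        d_w = p[1:-1, :-2] - u   # west
        d_e = p[1:-1, 2:] - u    # east
        # c(delta) = exp(-delta^2 / kappa^2): small across strong edges.
        flux = sum(np.exp(-(d / kappa) ** 2) * d for d in (d_n, d_s, d_w, d_e))
        u = u + gamma * flux
    return u

# Noisy vertical step edge: left half near 0, right half near 1.
rng = np.random.default_rng(0)
clean = np.zeros((32, 32))
clean[:, 16:] = 1.0
noisy = clean + 0.05 * rng.standard_normal(clean.shape)
denoised = perona_malik_sketch(noisy, num_iter=20, kappa=0.3, gamma=0.1)
```

With kappa well above the noise amplitude but below the step height, the flat regions are smoothed while the conductivity across the step stays close to zero, so the edge survives.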

cryoblob.image.histogram(image, bins=256, range_limits=None)

Calculate histogram from input image data.

Parameters:
  • image (Real[Array, “...”]) – Input array (any shape), flattened internally.

  • bins (scalar_int, optional) – Number of histogram bins.

  • range_limits (Tuple[scalar_float, scalar_float], optional) – Min and max range for bins.

Returns:

Histogram counts per bin.

Return type:

  • hist (Num[Array, “bins”])
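
The flatten-then-bin behavior can be illustrated with NumPy. This is a hypothetical sketch; in particular, falling back to the data minimum and maximum when range_limits is None is an assumption, since the documentation above does not state the default range.

```python
import numpy as np

def histogram_sketch(image, bins=256, range_limits=None):
    """Count values per bin after flattening the input."""
    flat = np.asarray(image).ravel()
    if range_limits is None:
        # Assumed default: span the full data range.
        range_limits = (float(flat.min()), float(flat.max()))
    counts, _ = np.histogram(flat, bins=bins, range=range_limits)
    return counts

image = np.arange(12).reshape(3, 4)
counts = histogram_sketch(image, bins=4)  # counts is [3, 3, 3, 3]
```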

cryoblob.image.equalize_hist(image, nbins=256, mask=None)

Description

Perform histogram equalization on an image using JAX.

param - image (Real[Array, “h w”]):

Input image to equalize

param - nbins (scalar_int, optional):

Number of bins for histogram. Default is 256

param - mask (Real[Array, “h w”], optional):

Optional mask for selective equalization. Default is None (use all pixels)

returns:

Histogram equalized image

rtype:
  • equalized (Float[Array, “h w”])
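
Histogram equalization maps each pixel through the cumulative distribution of the image intensities. The NumPy sketch below shows that idea under stated assumptions: the name is hypothetical, the mask argument is omitted, and the output is scaled to the range [0, 1], which may differ from what cryoblob returns.

```python
import numpy as np

def equalize_hist_sketch(image, nbins=256):
    """Map intensities through the normalized cumulative histogram."""
    image = np.asarray(image, dtype=float)
    counts, edges = np.histogram(image.ravel(), bins=nbins)
    cdf = np.cumsum(counts) / image.size          # rises monotonically to 1
    centers = 0.5 * (edges[:-1] + edges[1:])
    # Look up each pixel's value on the CDF by linear interpolation.
    return np.interp(image.ravel(), centers, cdf).reshape(image.shape)

# Strongly skewed test image: most pixels are dark.
rng = np.random.default_rng(0)
image = rng.random((64, 64)) ** 3
equalized = equalize_hist_sketch(image)
```

The mapping is monotone, so the ordering of pixel intensities is preserved while the values are spread more evenly over the output range.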

cryoblob.image.equalize_adapthist(image, kernel_size=8, clip_limit=0.01, nbins=256)

Description

Perform Contrast Limited Adaptive Histogram Equalization (CLAHE).

param - image (Real[Array, “h w”]):

Input image.

param - kernel_size (scalar_int, optional):

Size of local regions for histogram equalization. Default is 8.

param - clip_limit (scalar_float, optional):

Clipping limit for histogram. Higher values amplify contrast more strongly. Default is 0.01.

param - nbins (scalar_int, optional):

Number of bins for the histogram. Default is 256.

returns:

Image after applying CLAHE.

rtype:
  • equalized_final (Float[Array, “h w”])

Notes

CLAHE performs localized histogram equalization to improve image contrast without amplifying noise excessively. The algorithm:

  • Divides the image into small regions (tiles).

  • Performs local histogram equalization on each tile separately.

  • Clips histograms at the specified limit to prevent noise amplification.

  • Interpolates results to produce a smoothly equalized image.
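
The clipping step is what distinguishes CLAHE from plain adaptive equalization. The sketch below shows one common single-pass variant for a single tile histogram. It is hypothetical: interpreting clip_limit as a fraction of the tile's pixel count and redistributing the excess uniformly are assumptions, not confirmed details of cryoblob.

```python
import numpy as np

def clip_histogram_sketch(counts, clip_limit):
    """Clip a tile histogram and spread the clipped excess over all bins."""
    counts = np.asarray(counts, dtype=float)
    limit = clip_limit * counts.sum()          # assumed: fraction of tile pixels
    excess = np.clip(counts - limit, 0.0, None).sum()
    clipped = np.minimum(counts, limit)
    # Uniform redistribution keeps the total pixel count unchanged.
    return clipped + excess / counts.size

tile_counts = np.array([90.0, 5.0, 5.0, 0.0])
clipped = clip_histogram_sketch(tile_counts, clip_limit=0.5)
```

After a single pass a bin can end slightly above the limit again (60 versus a limit of 50 in this example); some implementations repeat the redistribution to tighten this.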

cryoblob.image.wiener(img, kernel_size=3, noise=None)

Description

JAX implementation of Wiener filter for noise reduction. This is similar to scipy.signal.wiener.

param - img (Float[Array, “h w”]):

The input image to be filtered

param - kernel_size (int or tuple, optional):

The size of the sliding window for local statistics. If tuple, represents (height, width). Default is 3

param - noise (scalar_float, optional):

The noise power. If None, uses the average of the local variance. Default is None

returns:

The filtered output with the same shape as input

rtype:
  • filtered (Float[Array, “h w”])

Notes

The Wiener filter is optimal in terms of the mean square error. It estimates the local mean and variance around each pixel.
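
The local-statistics form of the filter, as used by scipy.signal.wiener, can be sketched in NumPy as follows. This is an illustration with hypothetical names, restricted to a square, odd window with zero padding; it is not the cryoblob implementation.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def local_mean(img, k):
    """Mean over a k x k window with zero padding; same shape as img."""
    padded = np.pad(img, k // 2)
    return sliding_window_view(padded, (k, k)).mean(axis=(-2, -1))

def wiener_sketch(img, kernel_size=3, noise=None):
    """Shrink each pixel toward its local mean according to the local variance."""
    img = np.asarray(img, dtype=float)
    mean = local_mean(img, kernel_size)
    var = local_mean(img**2, kernel_size) - mean**2
    if noise is None:
        noise = var.mean()  # average local variance as the noise estimate
    gain = 1.0 - noise / np.maximum(var, 1e-12)
    out = mean + gain * (img - mean)
    # Where the local variance is below the noise power, fall back to the mean.
    return np.where(var < noise, mean, out)

rng = np.random.default_rng(0)
noisy = 1.0 + 0.1 * rng.standard_normal((32, 32))
filtered = wiener_sketch(noisy, kernel_size=5)
```

In regions where the local variance is close to the noise power the output approaches the local mean, while high-variance regions such as edges are left largely unchanged.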

Plotting Module

Module: plots

Contains the code for plotting MRC image data with Matplotlib.

Functions

  • plot_mrc:

    Plot MRC image data using Matplotlib with optional scaling and scalebar.

cryoblob.plots.plot_mrc(mrc_image, image_size=(15, 15), cmap='magma', mode='plain')

Description

Plot an MRC image using Matplotlib with an optional scaling mode and scalebar.

param - mrc_image (MRC_Image):

The PyTree structure containing image data and voxel metadata.

param - image_size (Tuple[scalar_int, scalar_int], optional):

Size of the plotted figure (width, height) in inches. Default is (15, 15).

param - cmap (str, optional):

The Matplotlib colormap to use. Default is “magma”.

param - mode (str, optional):

Mode of visualization. Default is “plain”.

  • “plain”: Plot image data without modifications.

  • “log”: Plot logarithmically scaled image data.

  • “exp”: Plot exponentially scaled image data.

returns:

Displays the plot.

rtype:

None

Examples

>>> plot_mrc(mrc_image, image_size=(10, 10), cmap="viridis", mode="log")

Type Definitions Module

Module: types

A single location for storing commonly used type aliases and PyTrees along with factory functions for creating them.

Types

  • scalar_float:

    Zero dimensional floating point number

  • scalar_int:

    Zero dimensional integer.

  • scalar_num:

    Zero dimensional number, that can either be a floating point number or an integer.

  • non_jax_number:

    A number that is not a JAX array. The distinction matters because JAX stores even single numbers as 0D arrays.

PyTrees

  • MRC_Image:

    A PyTree structure for MRC images. Contains the image data and metadata.

Factory Functions

  • make_MRC_Image:

    Factory function to create an MRC_Image instance.

cryoblob.types.make_MRC_Image(image_data, voxel_size, origin, data_min, data_max, data_mean, mode)

Description

Factory function to create an MRC_Image instance.

param - image_data (Num[Array, “H W”] | Num[Array, “D H W”]):

The image data array from the MRC file. Can be 2D or 3D.

param - voxel_size (Float[Array, “3”]):

Voxel size in the order (Z, Y, X).

param - origin (Float[Array, “3”]):

Origin coordinates from the MRC file header (Z, Y, X).

param - data_min (scalar_float):

Minimum value of image data (as stored in header).

param - data_max (scalar_float):

Maximum value of image data (as stored in header).

param - data_mean (scalar_float):

Mean value of image data (as stored in header).

param - mode (scalar_int):

Data type mode from MRC header (e.g., 0: int8, 2: float32).

returns:

An instance of the MRC_Image PyTree structure.

rtype:
  • MRC_Image
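
To make the shape of such a container concrete, here is a hypothetical stand-in built on typing.NamedTuple, which JAX treats as a PyTree. Whether MRC_Image itself is a NamedTuple is an assumption; the field names simply mirror the factory signature above, and NumPy arrays stand in for JAX arrays.

```python
from typing import NamedTuple
import numpy as np

class MRCImageSketch(NamedTuple):
    """Hypothetical stand-in for MRC_Image; fields follow the factory signature."""
    image_data: np.ndarray
    voxel_size: np.ndarray
    origin: np.ndarray
    data_min: float
    data_max: float
    data_mean: float
    mode: int

def make_mrc_image_sketch(image_data, voxel_size, origin,
                          data_min, data_max, data_mean, mode):
    """Convert and check the inputs once, then build the immutable container."""
    image_data = np.asarray(image_data)
    if image_data.ndim not in (2, 3):
        raise ValueError("image_data must be 2D (H, W) or 3D (D, H, W)")
    voxel_size = np.asarray(voxel_size, dtype=float)
    origin = np.asarray(origin, dtype=float)
    if voxel_size.shape != (3,) or origin.shape != (3,):
        raise ValueError("voxel_size and origin must each hold (Z, Y, X)")
    return MRCImageSketch(image_data, voxel_size, origin,
                          float(data_min), float(data_max),
                          float(data_mean), int(mode))

data = np.zeros((4, 6), dtype=np.float32)
mrc = make_mrc_image_sketch(data, [1.0, 1.2, 1.2], [0.0, 0.0, 0.0],
                            0.0, 0.0, 0.0, 2)
```

Routing construction through a factory keeps type conversion and shape checks in one place, so downstream code can rely on consistent fields.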

Validation Module

Module: valid

Pydantic models for data validation and configuration management in the cryoblob preprocessing pipeline. This module provides type-safe validation for preprocessing parameters, file paths, and blob detection configurations.

Classes

  • PreprocessingConfig:

    Configuration for image preprocessing parameters

  • BlobDetectionConfig:

    Configuration for blob detection parameters

  • FileProcessingConfig:

    Configuration for file processing and batch operations

  • MRCMetadata:

    Validation for MRC file metadata

  • ValidationPipeline:

    Main pipeline class for validating all configurations

class cryoblob.valid.PreprocessingConfig(*args, **kwargs)

Bases: BaseModel

Configuration model for image preprocessing parameters.

This validates all parameters used in the preprocessing function to ensure they are within valid ranges and types before being passed to JAX-compiled functions.

validate_sigma_values()

Ensure sigma values are reasonable for image processing.

validate_conflicting_options()

Ensure conflicting preprocessing options aren’t both enabled.

class Config

Bases: object

frozen = True
extra = 'forbid'

class cryoblob.valid.BlobDetectionConfig(*args, **kwargs)

Bases: BaseModel

Configuration model for blob detection parameters.

Validates parameters used in blob_list_log function.

validate_max_blob_size()

Ensure max_blob_size > min_blob_size.

class Config

Bases: object

frozen = True
extra = 'forbid'

class cryoblob.valid.FileProcessingConfig(*args, **kwargs)

Bases: BaseModel

Configuration model for file processing and batch operations.

Validates parameters used in folder_blobs function.

validate_folder_exists()

Ensure the folder exists and is accessible.

class Config

Bases: object

frozen = True
extra = 'forbid'

class cryoblob.valid.MRCMetadata(*args, **kwargs)

Bases: BaseModel

Validation model for MRC file metadata.

Ensures MRC file headers contain valid values.

validate_data_range()

Ensure data_max > data_min.

validate_mean_in_range()

Ensure data_mean is between data_min and data_max.
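
The two checks above, together with the frozen setting listed for these models, can be illustrated without pydantic. The sketch below swaps in a standard-library frozen dataclass as a stand-in; the class name is hypothetical and only three of the metadata fields are modeled.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MRCMetadataSketch:
    """Stand-in showing the range checks; not the pydantic model itself."""
    data_min: float
    data_max: float
    data_mean: float

    def __post_init__(self):
        # Mirrors validate_data_range: the maximum must exceed the minimum.
        if not self.data_max > self.data_min:
            raise ValueError("data_max must be greater than data_min")
        # Mirrors validate_mean_in_range: the mean must lie inside the range.
        if not self.data_min <= self.data_mean <= self.data_max:
            raise ValueError("data_mean must lie between data_min and data_max")

meta = MRCMetadataSketch(data_min=0.0, data_max=1.0, data_mean=0.5)
```

Constructing the object with inconsistent values raises immediately, and the frozen flag prevents later mutation from invalidating a checked instance.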

class Config

Bases: object

frozen = True
extra = 'forbid'

class cryoblob.valid.AdaptiveFilterConfig(*args, **kwargs)

Bases: BaseModel

Configuration model for adaptive filtering parameters.

Validates parameters used in adaptive_wiener and adaptive_threshold functions.

validate_kernel_size()

Ensure kernel size is odd for proper centering.

class Config

Bases: object

frozen = True
extra = 'forbid'

class cryoblob.valid.ValidationPipeline(*args, **kwargs)

Bases: BaseModel

Main validation pipeline that combines all configuration models.

This provides a single entry point for validating complete processing configurations.

validate_for_single_image()

Validate configuration for single image processing.

Returns:

  • preprocessing_config – Validated preprocessing parameters

  • blob_config – Validated blob detection parameters

Return type:

Tuple[PreprocessingConfig, BlobDetectionConfig]

validate_for_batch_processing()

Validate configuration for batch file processing.

Returns:

  • preprocessing_config – Validated preprocessing parameters

  • blob_config – Validated blob detection parameters

  • file_config – Validated file processing parameters

Raises:

ValueError – If file_processing configuration is not provided.

Return type:

Tuple[PreprocessingConfig, BlobDetectionConfig, FileProcessingConfig]

validate_for_adaptive_processing()

Validate configuration for adaptive filtering.

Returns:

  • preprocessing_config – Validated preprocessing parameters

  • adaptive_config – Validated adaptive filtering parameters

Raises:

ValueError – If adaptive_filtering configuration is not provided.

Return type:

Tuple[PreprocessingConfig, AdaptiveFilterConfig]

to_preprocessing_kwargs()

Convert preprocessing config to kwargs dict for existing functions.

Returns:

kwargs – Dictionary compatible with the existing preprocessing function

Return type:

dict

to_blob_kwargs()

Convert blob detection config to kwargs dict for existing functions.

Returns:

kwargs – Dictionary compatible with the existing blob_list_log function

Return type:

dict

class Config

Bases: object

frozen = True
extra = 'forbid'

cryoblob.valid.create_default_pipeline()

Create a validation pipeline with default settings.

cryoblob.valid.create_high_quality_pipeline()

Create a validation pipeline optimized for high-quality blob detection.

cryoblob.valid.create_fast_pipeline()

Create a validation pipeline optimized for speed.

cryoblob.valid.validate_mrc_metadata(voxel_size, origin, data_min, data_max, data_mean, mode, image_shape)

Validate MRC metadata and return validated model.

Parameters:
  • voxel_size

  • origin

  • data_min

  • data_max

  • data_mean

  • mode

  • image_shape

Returns:

metadata – Validated MRC metadata model

Raises:

ValidationError – If any metadata values are invalid.