Distribution Analysis ===================== The distribution module quantifies the spatial arrangement of organelles within the radially partitioned cytoplasm using Radial Distribution Functions (RDF) and related metrics. This is a key innovation in iPA that enables standardized comparison across cells and imaging modalities. Overview -------- Radial Distribution Function (RDF) is defined as: .. math:: RDF(r) = \frac{\rho(r)}{\bar{\rho}} where :math:`\rho(r)` is the organelle density in shell at radial position r, and :math:`\bar{\rho}` is the mean density across all shells. **Interpretation**: - RDF > 1: Enrichment (organelles cluster in this region) - RDF = 1: Random/uniform distribution - RDF < 1: Depletion (organelles avoid this region) This normalized metric eliminates biases from cell size variations and enables direct comparison of organelle localization patterns across different experimental conditions and imaging modalities. Core Functions -------------- Radial Distribution Function (RDF) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ calculate_rdf_from_xvg ^^^^^^^^^^^^^^^^^^^^^^ .. autofunction:: ipa.analysis.calculate_rdf_from_xvg Calculates RDF from partition coordinates and organelle mask. This is the primary function for quantifying organelle spatial distribution. **Algorithm**: 1. Map organelle voxels to their corresponding radial shells 2. Calculate organelle density in each shell (count / shell volume) 3. Normalize by mean density across all shells 4. Return RDF values and supporting statistics **Parameters**: - ``organelle_mask`` (np.ndarray): Binary mask of the organelle (same shape as partition mask) - ``partition_coords`` (dict): Dictionary mapping partition ID (int) to coordinate arrays (from ``extract_partition_coordinates()``) - ``radial_positions`` (dict): Dictionary mapping voxel coordinates (tuple) to normalized radial positions (0=NE, 1=PM) - ``bins`` (int, optional): Number of radial bins. Default: 8 **Returns**: - ``rdf_results`` (dict): Dictionary containing: - ``radii`` (np.ndarray): Radial positions (0-1, where 0=NE, 1=PM) - ``rdf`` (np.ndarray): RDF values for each bin - ``counts`` (np.ndarray): Raw organelle counts per bin - ``densities`` (np.ndarray): Normalized densities per bin - ``shell_volumes`` (np.ndarray): Volume of each shell **Biological Significance**: RDF analysis reveals how organelles are distributed relative to cellular landmarks (NE and PM). For example, in pancreatic β-cells, ISGs show enrichment near the PM under low glucose (docking) and redistribution toward the NE under high glucose (new biogenesis). **Example**: .. code-block:: python from ipa.analysis import calculate_rdf_from_xvg from ipa.processing.partitioning import Partitioning import numpy as np # Step 1: Create partitions partitioner = Partitioning(root_dir="results/", n_slices=8) center, ne_edge, pm_edge = partitioner.extract_ne_pm_edges(pm_mask, ne_mask) partition_mask = partitioner.create_nepm_radial_partitions( ne_edge, pm_edge, shape=volume.shape, n_slices=8, pm_mask=pm_mask, ne_mask=ne_mask ) # Step 2: Prepare inputs partition_coords = {} radial_positions = {} for i in range(1, 9): coords = np.array(np.where(partition_mask == i)).T partition_coords[i] = coords norm_pos = (i - 0.5) / 8.0 # Center of shell for coord in coords: radial_positions[tuple(coord)] = norm_pos # Step 3: Calculate RDF for ISG rdf_results = calculate_rdf_from_xvg( isg_mask, partition_coords, radial_positions, bins=8 ) # Step 4: Visualize import matplotlib.pyplot as plt plt.plot(rdf_results['radii'], rdf_results['rdf'], 'o-', label='ISG') plt.axhline(y=1.0, color='k', linestyle='--', alpha=0.5, label='Random') plt.xlabel('Radial Position (0=NE, 1=PM)') plt.ylabel('RDF') plt.title('ISG Radial Distribution') plt.legend() plt.savefig('isg_rdf.png') load_partition_coords_from_xvg ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ .. autofunction:: ipa.analysis.load_partition_coords_from_xvg Loads partition coordinates and radial positions from XVG format files generated by the partitioning module. **Parameters**: - ``xvg_file`` (str): Path to XVG coordinate file **Returns**: - ``partition_coords`` (dict): Dictionary mapping partition ID to coordinate arrays - ``radial_positions`` (dict): Dictionary mapping coordinates to radial positions **Example**: .. code-block:: python from ipa.analysis import load_partition_coords_from_xvg partition_coords, radial_positions = load_partition_coords_from_xvg( "results/partitions/sample_01_partition_coords.xvg" ) print(f"Loaded {len(partition_coords)} partitions") for pid, coords in partition_coords.items(): print(f" Partition {pid}: {len(coords)} points") save_radial_rdf_results ^^^^^^^^^^^^^^^^^^^^^^^ .. autofunction:: ipa.analysis.save_radial_rdf_results Saves RDF results to CSV and JSON formats for downstream analysis and visualization. **Parameters**: - ``rdf_results`` (dict): RDF results from ``calculate_rdf_from_xvg()`` - ``output_dir`` (str): Output directory path - ``data_id`` (str): Dataset identifier for filename **Output Files**: - ``{data_id}_rdf.csv``: Tabular data with radii, RDF, counts, densities - ``{data_id}_rdf.json``: Complete results including metadata **Example**: .. code-block:: python from ipa.analysis import save_radial_rdf_results save_radial_rdf_results( rdf_results, output_dir="results/rdf/", data_id="sample_01" ) # Creates: results/rdf/sample_01_rdf.csv and .json Spatial Arrangement Analysis ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Beyond RDF, the distribution module provides additional tools for analyzing organelle spatial organization. analyze_docked_granules ^^^^^^^^^^^^^^^^^^^^^^^ .. autofunction:: ipa.analysis.analyze_docked_granules Classifies organelles (e.g., ISGs) into docked (readily releasable pool, RRP) and reserve pools based on distance to plasma membrane. **Biological Context**: In pancreatic β-cells, ISGs are functionally divided into: - **Docked ISGs (RRP)**: Located within ~300 nm of PM, ready for immediate secretion - **Reserve pool**: Remaining ISGs in cytoplasm This classification is critical for understanding glucose-stimulated insulin secretion (GSIS). **Parameters**: - ``img_isg`` (np.ndarray): ISG mask or labeled instances - ``mem_mask`` (np.ndarray): Plasma membrane binary mask - ``isg_data`` (np.ndarray): ISG feature data (from ``extract_isg_features()``) or center coordinates - ``distance_threshold`` (float, optional): Distance threshold for docking (pixels). Default: 5 (~300 nm for SIM) - ``show_visualization`` (bool, optional): Whether to generate visualization. Default: False **Returns**: - ``results`` (dict): Dictionary containing: - ``total_granules`` (int): Total number of granules - ``docked_granules`` (int): Number of docked granules - ``docked_ratio`` (float): Docking ratio (0-1) - ``distances`` (np.ndarray): Array of distances from each ISG to PM **Example**: .. code-block:: python from ipa.analysis import analyze_docked_granules, extract_isg_features from skimage import measure # Extract ISG centers def extract_isg_centers(isg_mask): labeled = measure.label(isg_mask > 0) regions = measure.regionprops(labeled) centers = np.array([r.centroid for r in regions if r.area > 5]) return centers isg_mask = load_isg_mask() # Your loading function pm_mask = load_pm_mask() isg_centers = extract_isg_centers(isg_mask) # Analyze docking results = analyze_docked_granules( img_isg=isg_mask, mem_mask=pm_mask.astype(np.uint8), isg_data=isg_centers, distance_threshold=5, # ~300 nm for SIM show_visualization=True ) print(f"Total ISGs: {results['total_granules']}") print(f"Docked ISGs: {results['docked_granules']}") print(f"Docking ratio: {results['docked_ratio']:.2%}") analyze_organelle_aggregation ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ .. autofunction:: ipa.analysis.analyze_organelle_aggregation Analyzes organelle aggregation in time-lapse images. Aggregation is defined as the set of organelles that remain within a distance less than a certain threshold across continuous frames. **Biological Context**: Organelle aggregation is important for understanding cellular processes such as: - **Granule clustering**: Multiple vesicles aggregating before secretion - **Mitochondrial networks**: Mitochondria forming interconnected networks - **Endosomal sorting**: Endosomes aggregating during cargo sorting **Algorithm**: 1. For each frame, cluster organelles based on pairwise distances 2. Organelles within ``dthreshold`` are considered part of the same aggregate 3. Track aggregates across frames (simplified matching) 4. Return cluster assignments for each frame **Parameters**: - ``centers_list`` (list of np.ndarray): List of arrays containing organelle center coordinates for each frame - ``dthreshold`` (float): Distance threshold (in micrometers) to define an aggregation. Default: 0.6 **Returns**: - ``dict``: Dictionary mapping aggregation IDs to their tracking history and volume **Example**: .. code-block:: python from ipa.analysis import analyze_organelle_aggregation import numpy as np # Simulate time-lapse organelle positions # Frame 1: 5 organelles frame1_centers = np.array([[10, 20, 30], [11, 21, 31], [50, 60, 70], [80, 90, 100], [81, 91, 101]]) # Frame 2: 5 organelles (some moved) frame2_centers = np.array([[10.5, 20.5, 30.5], [11.5, 21.5, 31.5], [51, 61, 71], [80.5, 90.5, 100.5], [81.5, 91.5, 101.5]]) centers_list = [frame1_centers, frame2_centers] # Analyze aggregation with 0.6 μm threshold results = analyze_organelle_aggregation( centers_list, dthreshold=0.6 ) # Check clusters per frame for frame_idx, clusters in enumerate(results['clusters']): print(f"Frame {frame_idx}: {len(np.unique(clusters[clusters >= 0]))} aggregates") calculate_distributional_similarity ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ .. autofunction:: ipa.analysis.calculate_distributional_similarity Calculates Pearson correlation between two RDF distributions to measure distributional similarity across imaging modalities (e.g., SIM vs SXT). **Biological Context**: Distributional similarity analysis enables comparison of organelle spatial patterns between different imaging techniques. For example, comparing ISG distributions from SIM (which labels both immature and mature ISGs) versus SXT (which only detects dense-core mature ISGs) reveals differences in ISG maturation states. **Algorithm**: 1. Take two RDF arrays from different modalities 2. Calculate Pearson correlation coefficient using ``scipy.stats.pearsonr`` 3. Return correlation coefficient (r), p-value, and significance test **Parameters**: - ``rdf1`` (np.ndarray): RDF values from modality 1 (e.g., SIM) - ``rdf2`` (np.ndarray): RDF values from modality 2 (e.g., SXT) - ``alpha`` (float, optional): Significance level for p-value. Default: 0.05 **Returns**: - ``dict``: Dictionary containing: - ``pearson_r`` (float): Pearson correlation coefficient (-1 to 1) - ``p_value`` (float): Statistical significance - ``significant`` (bool): Whether correlation is significant (p < alpha) **Interpretation**: - **r > 0.8**: Strong positive correlation (similar distributions) - **0.5 < r < 0.8**: Moderate correlation - **r < 0.5**: Weak or no correlation (different distributions) - **p < 0.05**: Statistically significant correlation **Example**: .. code-block:: python from ipa.analysis import calculate_distributional_similarity import numpy as np # RDF from SIM data (8 radial shells) sim_rdf = np.array([0.8, 1.0, 1.2, 1.1, 0.9, 1.0, 1.3, 1.5]) # RDF from SXT data (same 8 shells) sxt_rdf = np.array([0.9, 1.1, 1.1, 1.0, 1.0, 1.1, 1.2, 1.4]) # Calculate similarity result = calculate_distributional_similarity(sim_rdf, sxt_rdf) print(f"Pearson r: {result['pearson_r']:.2f}") print(f"P-value: {result['p_value']:.4f}") print(f"Significant: {result['significant']}") # Output: # Pearson r: 0.91 # P-value: 0.0012 # Significant: True analyze_anchored_organelle_angles ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ .. autofunction:: ipa.analysis.analyze_anchored_organelle_angles Analyzes angles of anchored tubular organelles (e.g., F-actin) relative to the plasma membrane. **Biological Context**: Anchored organelles are defined by their proximity to the PM. For example, anchored F-actin filaments are those located within 120 nm of the PM. The angles between pairs of anchored filaments reveal the organization of the cortical cytoskeleton and its role in regulating cellular processes like secretion. **Algorithm**: 1. Use KD-tree to find filaments within ``distance_threshold`` of PM 2. For each anchored filament, calculate direction vector (from first to last point) 3. Calculate angles between all pairs of anchored filaments using dot product 4. Convert angles to 0-90° range (acute angles) 5. Return statistics (count, mean angle, std) **Parameters**: - ``filament_coords_dict`` (dict): Dictionary of filament coordinates {filament_id: [[x,y,z], ...]} - ``pm_mask`` (np.ndarray): Plasma membrane binary mask - ``distance_threshold_nm`` (float, optional): Distance threshold (nm) to define anchored organelles. Default: 120 - ``voxel_size_nm`` (tuple, optional): Voxel dimensions (nm). Default: (1, 1, 1) **Returns**: - ``dict``: Dictionary containing: - ``anchored_count`` (int): Number of anchored filaments - ``angles`` (list): List of angles between anchored filament pairs (degrees) - ``mean_angle`` (float): Mean angle - ``std_angle`` (float): Standard deviation of angles **Interpretation**: - **Small mean angle** (< 30°): Filaments are parallel/aligned (organized network) - **Large mean angle** (> 60°): Filaments are randomly oriented (disorganized) - **High anchored_count**: Extensive cortical network - **Low anchored_count**: Sparse cortical structure **Example**: .. code-block:: python from ipa.analysis import analyze_anchored_organelle_angles import numpy as np # Create sample filament data (coordinates in voxels) filaments = { 'filament_1': [[10, 20, 30], [11, 21, 31], [12, 22, 32]], 'filament_2': [[15, 25, 35], [16, 26, 36], [17, 27, 37]], 'filament_3': [[20, 30, 40], [21, 31, 41], [22, 32, 42]] } # Load PM mask pm_mask = load_plasma_membrane_mask() # Your loading function # Analyze anchored angles (SIM voxel size: 31.3 x 31.3 x 90.9 nm) results = analyze_anchored_organelle_angles( filaments, pm_mask, distance_threshold_nm=120, voxel_size_nm=(31.3, 31.3, 90.9) ) print(f"Anchored filaments: {results['anchored_count']}") print(f"Mean angle: {results['mean_angle']:.2f}°") print(f"Std angle: {results['std_angle']:.2f}°") print(f"All angles: {results['angles']}") reconstruct_4d_mask ^^^^^^^^^^^^^^^^^^^ .. autofunction:: ipa.analysis.reconstruct_4d_mask Reconstructs 4D mask data (x, y, z, time) from coordinate information for time-lapse analysis. **Parameters**: - ``data`` (pd.DataFrame): Coordinate data with columns for x, y, z, time - ``resolution`` (tuple): Image resolution (width, height, depth) - ``rescale_x`` (float, optional): X-axis rescaling factor. Default: 1.0 - ``rescale_y`` (float, optional): Y-axis rescaling factor. Default: 1.0 **Returns**: - ``reconstructed_image`` (np.ndarray): 4D reconstructed image (Z, Y, X, T) - ``ves_coords`` (list): Vesicle coordinate list per timepoint **Example**: .. code-block:: python from ipa.analysis import reconstruct_4d_mask import pandas as pd # Load coordinate data from WFM tracking data = pd.read_csv("vesicle_positions.csv", skiprows=3) # Reconstruct 4D data img_reconstructed, ves_coords = reconstruct_4d_mask( data, resolution=(512, 512, 100), rescale_x=2.1, rescale_y=1.7 ) print(f"Reconstructed shape: {img_reconstructed.shape}") analyze_vesicle_clusters ^^^^^^^^^^^^^^^^^^^^^^^^ .. autofunction:: ipa.analysis.analyze_vesicle_clusters Analyzes vesicle clustering patterns and spatial organization in time-lapse data. **Parameters**: - ``sample_name`` (str): Sample identifier - ``img_ch2`` (np.ndarray): Channel 2 image data (e.g., F-actin) - ``img_reconstructed`` (np.ndarray): Reconstructed 4D image from ``reconstruct_4d_mask()`` - ``ves_coords`` (list): Vesicle coordinate list **Returns**: - ``results`` (dict): Dictionary containing cluster analysis results including: - Cluster sizes - Cluster persistence across frames - Spatial distribution statistics **Example**: .. code-block:: python from ipa.analysis import analyze_vesicle_clusters, reconstruct_4d_mask import pandas as pd # Load and reconstruct data = pd.read_csv("vesicle_positions.csv", skiprows=3) img_reconstructed, ves_coords = reconstruct_4d_mask(data, resolution=(512, 512, 100)) # Analyze clusters results = analyze_vesicle_clusters( sample_name="sample_01", img_ch2=actin_channel, img_reconstructed=img_reconstructed, ves_coords=ves_coords ) print(f"Number of clusters: {len(results['clusters'])}") print(f"Mean cluster size: {results['mean_cluster_size']:.2f}") Enhanced Distribution Methods ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ actinangle2dist_enhanced ^^^^^^^^^^^^^^^^^^^^^^^^ .. autofunction:: ipa.analysis.actinangle2dist_enhanced Enhanced analysis of actin filament angle-distance distributions. This function calculates the relationship between actin filament orientation and their distance to reference structures. **Parameters**: - ``data_id`` (str): Dataset identifier - ``actin_file`` (str): Actin coordinates file - ``config`` (dict): Configuration dictionary **Returns**: - ``results`` (dict): Enhanced distribution metrics Complete RDF Workflow Example ------------------------------ This example demonstrates a complete RDF analysis workflow from partitioning to visualization: .. code-block:: python from ipa.processing.partitioning import Partitioning from ipa.analysis import ( calculate_rdf_from_xvg, load_partition_coords_from_xvg, save_radial_rdf_results ) from ipa.processing.partitioning import plot_radial_rdf import numpy as np import matplotlib.pyplot as plt # Step 1: Create partitions (if not already done) partitioner = Partitioning(root_dir="results/", n_slices=8) center, ne_edge, pm_edge = partitioner.extract_ne_pm_edges(pm_mask, ne_mask) partition_mask = partitioner.create_nepm_radial_partitions( ne_edge, pm_edge, shape=volume.shape, n_slices=8, pm_mask=pm_mask, ne_mask=ne_mask ) # Step 2: Extract coordinates partition_coords = partitioner.extract_partition_coordinates( partition_mask, sampling_density=0.05 ) # Prepare radial positions radial_positions = {} for i in range(1, 9): coords = partition_coords[i] norm_pos = (i - 0.5) / 8.0 for coord in coords: radial_positions[tuple(coord)] = norm_pos # Save partition coordinates partitioner.save_partition_coords_to_xvg( partition_coords, dataid="sample_01", output_dir="results/partitions/" ) # Step 3: Calculate RDF for multiple organelles organelles = { 'ISG': isg_mask, 'Mito': mito_mask, 'Actin': actin_mask } rdf_all = {} for name, mask in organelles.items(): rdf_results = calculate_rdf_from_xvg( mask, partition_coords, radial_positions, bins=8 ) rdf_all[name] = rdf_results # Save results save_radial_rdf_results( rdf_results, output_dir="results/rdf/", data_id=f"sample_01_{name.lower()}" ) # Step 4: Visualize comparison fig, ax = plt.subplots(figsize=(8, 6)) for name, rdf_results in rdf_all.items(): ax.plot(rdf_results['radii'], rdf_results['rdf'], 'o-', label=name) ax.axhline(y=1.0, color='k', linestyle='--', alpha=0.5, label='Random') ax.set_xlabel('Radial Position (0=NE, 1=PM)', fontsize=12) ax.set_ylabel('RDF', fontsize=12) ax.set_title('Organelle Radial Distributions', fontsize=14) ax.legend(fontsize=10) ax.grid(True, alpha=0.3) plt.tight_layout() plt.savefig('results/rdf/organelle_comparison.png', dpi=300) plt.show() print("RDF analysis complete!") for name, rdf_results in rdf_all.items(): print(f"{name}: mean RDF = {np.mean(rdf_results['rdf']):.3f}")