SPACEc: Patch proximity analysis

Patch proximity analysis (PPA) treats neighborhoods as patches of closely connected cells. The goal is to analyze what surrounds these patches within a user-defined radius. In our example, we use PPA to identify germinal centers as cellular neighborhood (CN) patches and then analyze what surrounds them depending on the tissue condition (tonsil vs. tonsillitis).
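To make the idea concrete, here is a minimal pure-Python sketch of the proximity step only. It is not SPACEc's implementation: the helper name `cells_near_patch` and the toy coordinates are hypothetical, and it assumes the patch cells are already known. It simply collects every non-patch cell that lies within a given radius of any patch cell.

```python
import math

def cells_near_patch(patch_cells, other_cells, radius):
    """Return the cells in other_cells within `radius` of any patch cell.

    Hypothetical helper for illustration; coordinates and radius must
    share the same unit (e.g. pixels).
    """
    return [
        cell for cell in other_cells
        if any(math.dist(cell, p) <= radius for p in patch_cells)
    ]

# Toy data: a small patch and three surrounding cells (arbitrary units)
patch = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
others = [(2.0, 0.0), (5.0, 5.0), (0.0, -1.0)]

nearby = cells_near_patch(patch, others, radius=1.5)
print(nearby)
```

The brute-force distance check is chosen for readability; for tens of thousands of cells a spatial index would be the usual choice.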

# import spacec
import spacec as sp

# import standard packages
import os
import scanpy as sc

# silencing warnings
import warnings
warnings.filterwarnings('ignore')
INFO:root: * TissUUmaps version: 3.1.1.6
/home/tim/miniforge3/envs/spacec/lib/python3.9/site-packages/numba/np/ufunc/parallel.py:371: NumbaWarning: The TBB threading layer requires TBB version 2021 update 6 or later i.e., TBB_INTERFACE_VERSION >= 12060. Found TBB_INTERFACE_VERSION = 12050. The TBB threading layer is disabled.
  warnings.warn(problem)
/home/tim/miniforge3/envs/spacec/lib/python3.9/site-packages/cudf/utils/gpu_utils.py:89: UserWarning: A GPU with NVIDIA Volta™ (Compute Capability 7.0) or newer architecture is required.
Detected GPU 0: NVIDIA GeForce GTX 1070
Detected Compute Capability: 6.1
  warnings.warn(
root_path = "/home/user/path/SPACEc/" # insert your own path

data_path = root_path + 'data/' # where the data is stored

# where you want to store the output
output_dir = root_path + '_outs/'
os.makedirs(output_dir, exist_ok=True)
sc.settings.set_figure_params(dpi=80, facecolor='white')
adata = sc.read(output_dir + 'adata_nn_demo_annotated.h5ad')
adata
AnnData object with n_obs × n_vars = 46789 × 59
    obs: 'DAPI', 'x', 'y', 'area', 'region_num', 'unique_region', 'condition', 'leiden_1', 'leiden_1_subcluster', 'cell_type_coarse', 'cell_type_coarse_subcluster', 'cell_type_coarse_f', 'cell_type_coarse_f_subcluster', 'cell_type_coarse_f_f', 'cell_type', 'CN_k20_n40', 'CN_k20_n30', 'CN_k20_n20', 'CN_k20_n25', 'CN_k20_n6', 'CN_k20_n6_annot'
    uns: 'CN_k20_n6_colors', 'Centroid_k20_n20', 'Centroid_k20_n25', 'Centroid_k20_n30', 'Centroid_k20_n40', 'Centroid_k20_n6', 'cell_type_coarse_f_colors', 'cell_type_colors', 'dendrogram_cell_type_coarse_f_subcluster', 'leiden', 'leiden_1_colors', 'neighbors', 'ppa_result_100', 'ppa_result_150', 'ppa_result_200', 'ppa_result_250', 'ppa_result_50', 'triDist', 'triDist_keyname', 'umap', 'unique_region_colors'
    obsm: 'X_pca', 'X_umap'
    layers: 'scaled'
    obsp: 'connectivities', 'distances'

Setting the plot parameter to True shows what the PPA function detects as patches. This is helpful when a suitable minimum cluster size is unknown or when the user wants to perform quality control. The results are stored as a DataFrame in adata.uns under the given key name.
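When a suitable minimum cluster size is unknown, tabulating how many cells fall into each detected cluster can complement the plot. The sketch below is an assumption-laden illustration, not SPACEc code: the label list is hypothetical, noise is assumed to be marked with -1 (a common clustering convention), and how SPACEc applies min_cluster_size internally is not shown in this tutorial.

```python
from collections import Counter

def cluster_sizes(labels, noise_label=-1):
    """Count cells per cluster, ignoring points labelled as noise."""
    return Counter(label for label in labels if label != noise_label)

def clusters_passing(labels, min_cluster_size, noise_label=-1):
    """Return the sorted cluster labels with at least min_cluster_size cells."""
    sizes = cluster_sizes(labels, noise_label)
    return sorted(label for label, n in sizes.items() if n >= min_cluster_size)

# Hypothetical labels: cluster 0 has 5 cells, cluster 1 has 2, one noise point
labels = [0, 0, 0, 0, 0, 1, 1, -1]
sizes = cluster_sizes(labels)
kept = clusters_passing(labels, min_cluster_size=3)
print(dict(sizes), kept)
```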

Compute for proximal cells

# this region result is also saved to adata.uns
region_results = sp.tl.patch_proximity_analysis(
    adata, 
    region_column = "unique_region", # column with the region information
    patch_column = "CN_k20_n6_annot", # column with the patch information (derive patches from this column)
    group="Germinal Center", # group to consider
    min_cluster_size=50, # minimum cluster size to consider
    x_column='x', y_column='y', # spatial coordinates
    radius = (5/0.5085), # 5 µm converted to pixels (0.5085 µm per pixel)
    edge_neighbours = 3, # number of neighbours to consider for edge detection
    key_name = 'ppa_result_5', # key name to store the result in adata.uns
    plot = True) # plot detection for demonstration purposes
Estimated number of clusters: 4
Estimated number of noise points: 7
../_images/c0a97aff5391caf794028cfeda6a52eda47eb9da5439b510cb1350080f4f49ef.png
Finished reg002_Germinal Center
Estimated number of clusters: 2
Estimated number of noise points: 1
../_images/8320fb9b2de01fa69c0fb49e6e1247500fe7e513e5ef7c94546114160354d223.png
Finished reg001_Germinal Center
region_results
(         DAPI            x            y   area region_num unique_region  \
 0  121.210000  2176.510000  1814.550000  100.0          1        reg002   
 2  117.117021  1995.148936  1462.212766   94.0          1        reg002   
 0  129.582524  2429.368932  1558.776699  103.0          0        reg001   
 
      condition leiden_1 leiden_1_subcluster cell_type_coarse  ...  \
 0  tonsillitis        6                 6,0              GCB  ...   
 2  tonsillitis       15                15,1           B cell  ...   
 0       tonsil       15                15,1           B cell  ...   
 
   cell_type_coarse_f_f cell_type CN_k20_n40 CN_k20_n30 CN_k20_n20  CN_k20_n25  \
 0                  GCB       GCB          1          1          1           1   
 2               B cell    B cell          1          1          1           1   
 0               B cell    B cell          1          1          1           1   
 
    CN_k20_n6  CN_k20_n6_annot  patch_id                  unique_patch_ID  
 0          1    Marginal Zone         2  reg002_Marginal Zone_patch_no_2  
 2          1    Marginal Zone         2  reg002_Marginal Zone_patch_no_2  
 0          1    Marginal Zone         1  reg001_Marginal Zone_patch_no_1  
 
 [3 rows x 23 columns],
              DAPI            x            y   area region_num unique_region  \
 1837   113.646724   396.188034   568.669516  351.0          1        reg002   
 1745    63.307339   382.293578   560.876147  218.0          1        reg002   
 1751    49.727679   383.803571   581.084821  224.0          1        reg002   
 1972    74.031519   413.765043   581.971347  349.0          1        reg002   
 2004    54.897098   419.155673   599.828496  379.0          1        reg002   
 ...           ...          ...          ...    ...        ...           ...   
 46358   49.146409  2494.881215  1466.886740  362.0          0        reg001   
 46285   58.472813  2485.087470  1501.004728  423.0          0        reg001   
 46525   50.608407  2512.601770  1455.918142  452.0          0        reg001   
 46358   49.146409  2494.881215  1466.886740  362.0          0        reg001   
 46648   84.843949  2525.611465  1475.210191  314.0          0        reg001   
 
          condition leiden_1 leiden_1_subcluster cell_type_coarse  ...  \
 1837   tonsillitis        0                   0           B cell  ...   
 1745   tonsillitis        6                 6,1              GCB  ...   
 1751   tonsillitis        6                 6,1              GCB  ...   
 1972   tonsillitis        6                 6,0              GCB  ...   
 2004   tonsillitis        6                 6,0              GCB  ...   
 ...            ...      ...                 ...              ...  ...   
 46358       tonsil        6                 6,3              GCB  ...   
 46285       tonsil        6                 6,3              GCB  ...   
 46525       tonsil        6                 6,3              GCB  ...   
 46358       tonsil        6                 6,3              GCB  ...   
 46648       tonsil        6                 6,3              GCB  ...   
 
       cell_type_coarse_f_subcluster cell_type_coarse_f_f cell_type CN_k20_n40  \
 1837                         B cell               B cell    B cell          4   
 1745                            GCB                  GCB       GCB          4   
 1751                            GCB                  GCB       GCB          4   
 1972                            GCB                  GCB       GCB          4   
 2004                            GCB                  GCB       GCB          4   
 ...                             ...                  ...       ...        ...   
 46358                           GCB                  GCB       GCB          4   
 46285                           GCB                  GCB       GCB          4   
 46525                           GCB                  GCB       GCB          4   
 46358                           GCB                  GCB       GCB          4   
 46648                           GCB                  GCB       GCB          4   
 
       CN_k20_n30  CN_k20_n20  CN_k20_n25  CN_k20_n6  CN_k20_n6_annot  cluster  
 1837           4           4           4          4  Germinal Center        0  
 1745           4           4           4          4  Germinal Center        0  
 1751           4           4           4          4  Germinal Center        0  
 1972           4           4           4          4  Germinal Center        0  
 2004           4           4           4          4  Germinal Center        0  
 ...          ...         ...         ...        ...              ...      ...  
 46358          4           4           4          4  Germinal Center        1  
 46285          4           4           4          4  Germinal Center        1  
 46525          4           4           4          4  Germinal Center        1  
 46358          4           4           4          4  Germinal Center        1  
 46648          4           4           4          4  Germinal Center        1  
 
 [939 rows x 22 columns])

Often it is more informative to derive the cellular content within a range of distances.
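The radii and key names for such a range can be generated from a single list of distances in µm. This is a minimal sketch: the helper `um_to_pixels` is hypothetical, and it assumes the x/y coordinates are in pixels with 0.5085 µm per pixel, as in this tutorial.

```python
PIXEL_TO_UM = 0.5085  # µm per pixel, the value used in this tutorial

def um_to_pixels(distance_um, um_per_pixel=PIXEL_TO_UM):
    """Convert a distance in µm to pixels (hypothetical helper)."""
    return distance_um / um_per_pixel

radii_um = [5, 10, 15, 20, 25]                      # radii of interest in µm
radii_px = [um_to_pixels(r) for r in radii_um]      # same radii in pixels
key_names = [f"ppa_result_{r}" for r in radii_um]   # one result key per radius

for r_px, key in zip(radii_px, key_names):
    print(f"{key}: radius {r_px:.2f} px")
```

Deriving both lists from `radii_um` keeps each key name in sync with the radius it stores.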

pixel_to_um = 0.5085 # conversion factor from pixel to µm
distances = [5/pixel_to_um, 10/pixel_to_um, 15/pixel_to_um, 20/pixel_to_um, 25/pixel_to_um] # radii of 5 to 25 µm, converted to pixels
key_names = ['ppa_result_5', 'ppa_result_10', 'ppa_result_15', 'ppa_result_20', 'ppa_result_25'] # key names to store the results

for distance, key_name in zip(distances, key_names): # loop over distances and key names
    # this region result is also saved to adata.uns
    region_results = sp.tl.patch_proximity_analysis(
        adata, 
        region_column = "unique_region", 
        patch_column = "CN_k20_n6_annot", 
        group="Germinal Center",
        min_cluster_size=50, 
        x_column='x', y_column='y', 
        radius = distance, # radius in pixels, corresponding to 5 to 25 µm
        edge_neighbours = 3,
        key_name = key_name,
        plot = False)
Estimated number of clusters: 4
Estimated number of noise points: 7
Finished reg002_Germinal Center
Estimated number of clusters: 2
Estimated number of noise points: 1
Finished reg001_Germinal Center
Estimated number of clusters: 4
Estimated number of noise points: 7
Finished reg002_Germinal Center
Estimated number of clusters: 2
Estimated number of noise points: 1
Finished reg001_Germinal Center
Estimated number of clusters: 4
Estimated number of noise points: 7
Finished reg002_Germinal Center
Estimated number of clusters: 2
Estimated number of noise points: 1
Finished reg001_Germinal Center
Estimated number of clusters: 4
Estimated number of noise points: 7
Finished reg002_Germinal Center
Estimated number of clusters: 2
Estimated number of noise points: 1
Finished reg001_Germinal Center
Estimated number of clusters: 4
Estimated number of noise points: 7
Finished reg002_Germinal Center
Estimated number of clusters: 2
Estimated number of noise points: 1
Finished reg001_Germinal Center
# save adata
adata.write(output_dir + 'adata_nn_demo_annotated_cn.h5ad')
... storing 'region_num' as categorical
... storing 'unique_region' as categorical
... storing 'condition' as categorical
... storing 'leiden_1' as categorical
... storing 'leiden_1_subcluster' as categorical
... storing 'cell_type_coarse' as categorical
... storing 'cell_type_coarse_subcluster' as categorical
... storing 'cell_type_coarse_f' as categorical
... storing 'cell_type_coarse_f_subcluster' as categorical
... storing 'cell_type_coarse_f_f' as categorical
... storing 'cell_type' as categorical
... storing 'CN_k20_n6_annot' as categorical

SPACEc can visualize the PPA results as a donut plot showing the percentages of cell types or CNs within a given radius around the patches. Percentages are averaged over all regions in the selected condition. The donut plot can show up to five distances.
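The averaging step can be pictured with a small pure-Python sketch. It is an illustration under stated assumptions, not the plotting function's code: the helper `mean_percentages`, the input format and the region names are hypothetical, and it assumes each region is weighted equally.

```python
from collections import Counter, defaultdict

def mean_percentages(cells):
    """Average per-region category percentages across regions.

    `cells` is a list of (region, category) pairs (hypothetical format).
    """
    by_region = defaultdict(Counter)
    for region, category in cells:
        by_region[region][category] += 1

    categories = sorted({category for _, category in cells})
    result = {}
    for category in categories:
        # Percentage of this category within each region, then the mean
        per_region = [
            100.0 * counts[category] / sum(counts.values())
            for counts in by_region.values()
        ]
        result[category] = sum(per_region) / len(per_region)
    return result

# Toy data: two regions of the same condition, four nearby cells each
toy = [
    ("region_A", "GCB"), ("region_A", "GCB"),
    ("region_A", "B cell"), ("region_A", "B cell"),
    ("region_B", "GCB"), ("region_B", "B cell"),
    ("region_B", "B cell"), ("region_B", "B cell"),
]
averaged = mean_percentages(toy)
print(averaged)
```

Averaging per-region percentages, rather than pooling all cells, keeps a region with many cells from dominating the result.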

Visualization

# Donut plots for cell types around Germinal Center
sp.pl.ppa_res_donut(adata, 
                palette=None, # color palette
                cat_col = "cell_type", # column with the cell type or CN information
                key_names = ['ppa_result_5', 'ppa_result_10', 'ppa_result_15', 'ppa_result_20', 'ppa_result_25'], # key names to plot
                radii = [5, 10, 15, 20, 25], # radii in µm
                unit = 'µm', # unit
                figsize = (10,10), # figure size
                add_guides = True, # add guides
                text = 'Cell types around Germinal Center', # text to display in the middle
                label_color='white', # distance label color
                subset_column = 'condition', # column to subset
                subset_condition = 'tonsil', # condition to subset
                title='Tonsil patch proximity analysis') # title

sp.pl.ppa_res_donut(adata, 
                palette=None,
                cat_col = "cell_type",
                key_names = ['ppa_result_5', 'ppa_result_10', 'ppa_result_15', 'ppa_result_20', 'ppa_result_25'],
                radii = [5, 10, 15, 20, 25],
                unit = 'µm',
                figsize = (10,10),  
                add_guides = True,
                text = 'Cell types around Germinal Center',
                label_color='white',
                subset_column = 'condition',
                subset_condition = 'tonsillitis',
                title='Tonsillitis patch proximity analysis')
Key 0: ppa_result_25
Key 0 has 1720 rows.
Key 1: ppa_result_20
Key 1 has 1271 rows.
Key 2: ppa_result_15
Key 2 has 850 rows.
Key 3: ppa_result_10
Key 3 has 456 rows.
Key 4: ppa_result_5
Key 4 has 3 rows.
../_images/a9a7d329968e4537b56706129ecc59c2acc3d80665832c80f1a62aa3ca10345d.png
Key 0: ppa_result_25
Key 0 has 1720 rows.
Key 1: ppa_result_20
Key 1 has 1271 rows.
Key 2: ppa_result_15
Key 2 has 850 rows.
Key 3: ppa_result_10
Key 3 has 456 rows.
Key 4: ppa_result_5
Key 4 has 3 rows.
../_images/e16ef069ff01061bd022878d7a51280867de4682ec01873b15e46bd948fa7024.png
# Donut plots for CNs around Germinal Center
sp.pl.ppa_res_donut(adata, 
                palette=None,
                cat_col = "CN_k20_n6_annot",
                key_names = ['ppa_result_5', 'ppa_result_10', 'ppa_result_15', 'ppa_result_20', 'ppa_result_25'],
                radii = [5, 10, 15, 20, 25],
                unit = 'µm',
                figsize = (10,10),  
                add_guides = True,
                text = 'CNs around Germinal Center',
                label_color='white',
                subset_column = 'condition',
                subset_condition = 'tonsil',
                title='Tonsil patch proximity analysis')

sp.pl.ppa_res_donut(adata, 
                palette=None,
                cat_col = "CN_k20_n6_annot",
                key_names = ['ppa_result_5', 'ppa_result_10', 'ppa_result_15', 'ppa_result_20', 'ppa_result_25'],
                radii = [5, 10, 15, 20, 25],
                unit = 'µm',
                figsize = (10,10),  
                add_guides = True,
                text = 'CNs around Germinal Center',
                label_color='white',
                subset_column = 'condition',
                subset_condition = 'tonsillitis',
                title='Tonsillitis patch proximity analysis')
Key 0: ppa_result_25
Key 0 has 1720 rows.
Key 1: ppa_result_20
Key 1 has 1271 rows.
Key 2: ppa_result_15
Key 2 has 850 rows.
Key 3: ppa_result_10
Key 3 has 456 rows.
Key 4: ppa_result_5
Key 4 has 3 rows.
../_images/26c683c74aa6fc346e0a673afd66da798842c02cc7006fddb6467965e01bc7a1.png
Key 0: ppa_result_25
Key 0 has 1720 rows.
Key 1: ppa_result_20
Key 1 has 1271 rows.
Key 2: ppa_result_15
Key 2 has 850 rows.
Key 3: ppa_result_10
Key 3 has 456 rows.
Key 4: ppa_result_5
Key 4 has 3 rows.
../_images/f3d70d408e5a689c2644d641959fdcd9e80c70f17d1c7b43a6305925704e1d6e.png