We followed the recommended settings and workflows for both methods, and further describe parameter choices below. Since protein neighbors are a mixture of cell types, there is substantial error between predicted and measured RNA expression. Cell 184, 3573-3587 (2021). We merged clusters that did not exhibit clear evidence of separation, or where the only differentially expressed features were ribosomal or mitochondrial genes. Keywords: single cell genomics, multimodal analysis, CITE-seq, immune system, T cell, reference mapping, COVID-19. p value is computed using an unpaired Wilcoxon test. Each of these approaches offers an exciting solution to overcome the inherent limitations of scRNA-seq and to explore how multiple cellular modalities affect cellular state and function (Zhu et al., 2020). Platelets are included as a positive control, as CD69 is constitutively expressed on these cells. p values are computed using an unpaired Wilcoxon test. We sought to design a robust analytical workflow for the integration of multiple measurements collected within the same cell. In addition, our circulating immune atlas was constructed from PBMCs and therefore contains few cells with no nuclei (erythrocytes) or multi-lobed nuclei (granulocytes). In each case, Basal_4 exhibits elevated accessibility at these motif sites. (J) Gene dropout curve for neighbors of regulatory T cells defined by RNA, ADT, and WNN analysis. Overall clonal diversity was stable across vaccination time points, consistent with an expected lack of a lymphoid response to vaccination within 7 days, and 97% of clones consisted only of a single cell. Essentially, in cases where two methods disagree based on an RNA classification, we attempt to classify the cell based on its protein levels to see if there is strong evidence for one annotation versus another.
The eluate was split and used as template for production of ADT and Hashtag libraries: Hashtag libraries were generated by PCR using Kapa Hifi Master Mix, 10 µM 10x Genomics SI-PCR primer (5′-AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTC), and 10 µM Illumina TruSeq DNA D7xx primer (5′-CAAGCAGAAGACGGCATACGAGATxxxxxxxxGTGACTGGAGTTCAGACGTGTGC). Red dots denote the k = 20 nearest neighbors to the target dendritic cell based on the transcriptome (A) or protein (B) modalities. Our final step integrates the modalities to create a WNN graph. (B) Expression of protein CD25 and CD57 in these five UMAP visualizations. We use LSI to reduce the dimensionality of normalized ATAC data, and PCA to reduce the dimensionality of protein data. Benchmarking and robustness analysis for WNN integration. (A) Analysis of a CITE-seq dataset of human bone marrow mononuclear cells and 25 surface proteins. To leverage multiple data types to define cellular identity, we developed WNN analysis, a computational method that learns the information content of each modality and generates an integrated representation of multimodal data. Each dot denotes an individual gene, and the axis scale of expression is based on default log-normalization in Seurat. Applying WNN to additional multimodal technologies, related to Figure 2. (A) Analysis of a publicly available dataset of 11,351 PBMCs processed with the 10x Genomics Multiome ATAC+RNA kit. We first filtered out cells that were outliers for the number of detected features from these modalities. Clustering and annotation: To cluster our multimodal dataset, we first used the KNN graph based on the weighted RNA and protein similarities (referred to as the WNN graph) to calculate the Jaccard index (neighborhood overlap) between every pair of cells.
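The Jaccard index step described above can be sketched in a few lines. This is a minimal illustration in plain Python, not the Seurat implementation: the `knn` dictionary, the cell names, and both function names are hypothetical, and the all-pairs loop is only practical for toy inputs.

```python
def jaccard_index(neighbors_a, neighbors_b):
    """Jaccard index (neighborhood overlap) of two neighbor collections."""
    a, b = set(neighbors_a), set(neighbors_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def shared_neighbor_weights(knn):
    """Map each cell pair (i, j), i < j, to the Jaccard index of their neighbor sets.

    `knn` is a hypothetical dict: cell id -> iterable of neighbor cell ids.
    Pairs that share no neighbors are omitted from the result.
    """
    cells = sorted(knn)
    weights = {}
    for idx, i in enumerate(cells):
        for j in cells[idx + 1:]:
            w = jaccard_index(knn[i], knn[j])
            if w > 0:
                weights[(i, j)] = w
    return weights


# Toy neighbor lists (k = 2), standing in for rows of a WNN graph.
toy_knn = {
    "c1": ["c2", "c3"],
    "c2": ["c1", "c3"],
    "c3": ["c1", "c2"],
    "c4": ["c3", "c5"],
    "c5": ["c4", "c3"],
}
weights = shared_neighbor_weights(toy_knn)
```

The resulting pairwise weights form a shared-nearest-neighbor graph that a graph-based clustering algorithm could then partition; that downstream step is not shown here.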
Integrated analysis of multimodal single-cell data. Quality control and doublet removal: We considered all cells that were detected in our RNA-seq, cell hashing, and ADT libraries. While PCA will identify the directions that explain maximal variance in the source data, sPCA can help pinpoint sources of variation that are of the greatest interest. Our approach represents a broadly applicable strategy to analyze single-cell multimodal datasets and to look beyond the transcriptome toward a unified and multimodal definition of cellular identity. Eight participants in this trial were selected for single cell analysis from Group 1 (no IL-12) and Group 3 (1000 mcg IL-12) based on sample availability. (B-D) UMAP visualization of 161,764 10x 3′ cells analyzed based on RNA data (B), protein data (C), or WNN analysis (D). Notably, we observed consistent trends when restricting our analysis only to individuals with either positive or negative CMV T cell responses (Figure S5). See also Figure S7. (E) We can integrate the modalities by constructing a weighted nearest neighbor (WNN) graph, based on a weighted average of protein and RNA similarities. This would enable the mapping of mass cytometry profiles to our multimodal reference, even in the absence of transcriptomic data. The separation of these clusters upon UMAP visualization (Figure S3) was consistent with the number of incorrect naive CD8+/CD4+ edges identified in each representation (RNA KNN: 984, ATAC KNN: 373, WNN: 322). B.Z.Y. is an employee at BioLegend Inc., which is the exclusive licensee of the New York Genome Center patent application related to this work.
The integrative analysis of single-cell multimodal data provides a powerful framework to determine the correlations that occur among different molecular signals in the various cell types and to quantify the impact of these relationships in defining cell identities. (H) Same as in (F) but for B cell states. For each cell, we calculate a new set of k-nearest cells based on a metric that reflects the weighted average of normalized RNA and protein similarities (STAR Methods). For both the 10x v3 (3′ scRNA-seq) and 10x Immune Profiling Solution (5′ scRNA-seq), we used Cell Ranger 3.1.0 to align reads to the GRCh38 human genome with default settings. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/). In contrast, conventional dendritic cells (cDCs), along with a rare population of erythroid progenitors and spiked-in murine 3T3 controls, formed distinct clusters when analyzing RNA but were intermixed with other cell types based on surface protein abundance. We utilize this dataset to identify and validate heterogeneous cell states in human lymphocytes and explore how the human immune system responds to vaccination and severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection. The protein data were withheld from the mapping but display the same patterns as in (A). (N) Absolute log2FC of differentially expressed genes between CD4 Naive and CD8 Naive clusters, where clusters were defined by either RNA or WNN analysis (STAR Methods). Cell 184, 3573-3587.e29 (2021). Multimodal single-cell technologies, which simultaneously profile multiple data types in the same cell, represent a new frontier for the discovery and characterization of cell states.
Argelaguet R., Arnol D., Bredikhin D., Deloro Y., Velten B., Marioni J.C., Stegle O. MOFA+: a statistical framework for comprehensive integration of multi-modal single-cell data. Affiliations: 1 Center for Genomics and Systems Biology, New York University, New York, NY 10003, USA; 2 New York Genome Center, New York, NY 10013, USA; 3 Technology Innovation Lab, New York Genome Center, New York, NY 10013, USA; 4 Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA; 5 Cape Town HVTN Immunology Lab, Hutchinson Cancer Research Institute of South Africa, Cape Town 8001, South Africa; 6 Department of Medicine, Stanford University School of Medicine, Stanford, CA 94305, USA; 7 Center for Data Visualization, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA; 8 BioLegend Inc., San Diego, CA 92121, USA; 9 Chan Zuckerberg Biohub, San Francisco, CA 94063, USA. These data demonstrate that while we can achieve substantial enrichment with small panels, isolating pure and homogeneous populations based on small marker panels remains challenging for some clusters. We report each of these panels in Table S2 to facilitate similar experiments for additional clusters in our dataset. We normalize protein expression levels within a cell using the centered-log ratio (CLR) transform, followed by dimensional reduction with PCA, and subsequently construct a KNN graph.
In Figure S2, we show that we obtain very similar results from the WNN procedure when varying k across a series of values ranging from 10 to 50. We identified nearest neighbors and performed UMAP visualization on the learned latent space. In particular, a key challenge in the analysis of single-cell multimodal data is to devise efficient computational strategies to integrate different data modalities [8,12,13,14,15]. For example, CD8+ and CD4+ T cells were partially blended together when analyzing the transcriptome but separated clearly in the protein data. Concordant cell types are identified between query and reference data with three exceptions, denoted with dashed rectangles. We next explored the performance of our WNN integration, assessed its robustness to fluctuations in data quality, and performed benchmarking against other recently developed methods. To assist the community in utilizing our resource, we have created a web application, freely available at https://azimuth.hubmapconsortium.org/, which enables users to rapidly map their own datasets online, automating the process of visualization and annotation. In Figure 4C we visualize the level of enrichment for each cluster based on panels of one to ten markers. We show precision and recall metrics for each panel in Figure S4, demonstrating that it remains challenging to sort truly homogeneous populations of high-resolution subsets using a small number of markers. Briefly, cells were washed with PBS (Rockwell) and resuspended in 25 mM cisplatin (Enzo, Farmingdale, NY, USA) for sixty seconds to stain for viability before being quenched with undiluted FBS.
In the reference dataset, we calculated the protein centroids for the CD4 Treg and NK clusters. In addition, we identified Siglec-1 (CD169) as a protein response biomarker that was robustly induced only in day 3 samples (Figure 6C). All other parameters were set to default settings. (I) Four basal subpopulations were identified from WNN clustering, and cells from each subpopulation are highlighted in the UMAP visualizations from (H). (D) Pathway enrichment (enrichR) of the top DE genes between day 0 and day 3 myeloid cells exhibits a clear enrichment for components of the interferon response. For Level 1 annotations, for each of the 24 samples, we calculated the percentage of each cell state in each sample, and ran two paired Wilcoxon tests: day 0 versus day 3, and day 0 versus day 7. All authors participated in interpretation and writing the manuscript. For example, although CITE-seq datasets can be analyzed by first identifying clusters based on gene expression values (Peterson et al., 2017; Stoeckius et al., 2017) and subsequently exploring their immunophenotypes, a multimodal computational workflow would define cell states based on both modalities. Instead, these subpopulations may represent cells that are preparing to become tissue-resident and have already begun to acquire distinguishing molecular characteristics. Heatmap displays pseudobulk averages where cells are grouped by cell type, donor, and vaccination time point and demonstrates that markers do not vary across different PBMC samples. In our final annotations, we considered 57 total clusters. (D) The integrated latent space defined by WNN most accurately reconstructs expression levels for 25 proteins. Cell annotations are derived from WNN analysis, which reflect distinct molecular states (see heatmaps in (G-H)). Cell annotations are derived from WNN analysis and reveal heterogeneity within T cells and progenitors that cannot be discovered by either modality independently.
We examined a recently generated dataset of human PBMCs prior to flu vaccination, which measured the transcriptomes of 53,099 cells alongside 82 surface proteins. Cells are grouped by their WNN-defined T cell level 2 annotations. Heatmap displays pseudobulk averages where cells are grouped by cell type, human donor, and technical replicate, and demonstrates that markers are repeatedly detected across samples and replicates. We then construct a KNN graph after dimensional reduction. Same as Figure 2D but showing Spearman correlation instead of Pearson correlation. Received 2020 Nov 3; Revised 2021 Mar 3; Accepted 2021 Apr 28. Modality weights were calculated for each cell without knowledge of cell type labels. Schematic overview of multimodal integration using weighted nearest neighbor analysis. If this percentage exceeded 20%, we reasoned that the cell's molecular profile was similar to a verified doublet, and therefore removed it from further analysis. (A) Heatmap of CD4+ T cell states. The dataset has been mapped onto the 3′-defined multimodal reference, allowing cells to be visualized in the same UMAP space as the reference, and cells are labeled based on transferred Level 2 annotations. Highlights: Weighted nearest neighbor analysis integrates multimodal single-cell data. A multimodal reference atlas of the circulating human immune system. Identification and validation of novel sources of lymphoid heterogeneity. These values were used as input to two paired Wilcoxon tests: day 0 versus day 3, and day 0 versus day 7.
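The paired Wilcoxon comparison described above can be illustrated with a small exact signed-rank test. This is a sketch under stated assumptions rather than the authors' code: the function names and the per-donor percentages are hypothetical, zero differences are dropped, and the p value is obtained by enumerating every sign assignment, which is only feasible for small cohorts such as eight donors.

```python
from itertools import product


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1  # mean of the 1-based ranks pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks


def paired_wilcoxon_exact(before, after):
    """Two-sided exact Wilcoxon signed-rank test for paired samples.

    Returns (w_plus, p_value). Pairs with zero difference are dropped.
    Enumerates all 2**n sign assignments, so keep n small.
    """
    diffs = [a - b for b, a in zip(before, after) if a != b]
    n = len(diffs)
    if n == 0:
        return 0.0, 1.0
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    center = sum(ranks) / 2  # expected value of W+ under the null
    observed = abs(w_plus - center)
    extreme = 0
    for signs in product((0, 1), repeat=n):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if abs(w - center) >= observed - 1e-12:
            extreme += 1
    return w_plus, extreme / 2 ** n


# Hypothetical percentages of one cell state in eight donors at two time points.
day0 = [4.1, 5.0, 3.2, 6.3, 4.8, 5.5, 3.9, 4.4]
day3 = [5.2, 5.9, 3.0, 7.8, 6.1, 6.0, 4.9, 5.6]
w_plus, p_value = paired_wilcoxon_exact(day0, day3)
```

In practice a statistics library would normally be used for this test; the enumeration here only makes the pairing and ranking logic explicit.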
While the samples contained cells across the full spectrum of hematopoietic differentiation, the antibody panel was designed to separate groups of terminally differentiated cells. We found that for CD8+ T cells, the most similar RNA neighbors often reflected a mix of CD8+ and CD4+ T cells (in the RNA KNN graph, there are a total of 944 incorrect edges that connect CD8+ to CD4+ T cells). (B-C) For each of the 57 clusters, we computed targeted immunophenotype panels using forward selection coupled with logistic regression. (C) For each of our 57 clusters, we calculated the optimal surface marker enrichment panels based on our CITE-seq data. (G) Relative abundance of monocyte populations as measured by flow cytometry. We validated the presence of the populations in independent healthy PBMC samples by performing flow cytometry for the same markers (Figures 5C and 5D). While the dataset contains measurements for 82 proteins alongside the transcriptome, we used only the transcriptome for reference mapping and the transfer of Level 2 annotations. Cells from the eight most highly represented clonotypes are highlighted as colored dots.
In Figure S2L we perform the same analysis, but utilize the standard deviation of gene expression across single cells as an alternative metric to dropout rate for defining variable genes (residual from trendline > 0.5). (D-E) Additional heterogeneity in the expression of inflammatory genes in monocyte populations. Post sorting, samples were each split into quintuplicates, and then cleaned up with 2x SPRI. The potential to catalog and characterize the rich diversity of cell types in the human immune system represents a powerful opportunity for single-cell genomics (Chen et al., 2019a; Gomes et al., 2019; Jaitin et al., 2014; Papalexi and Satija, 2018; Stubbington et al., 2017), yet also reveals the limitations of current approaches. (C-E) Same as Figure 5J, but after splitting the eight volunteers into five CMV+ (C) and three CMV- (D) samples (Table S3). The original publication performed unsupervised clustering on the full dataset and identified six T cell clusters (three CD4+ T, two CD8+ T, and γδ T cells). Per-cell quality control metrics were computed using the TSSEnrichment and NucleosomeSignal functions, and cells were retained with a nucleosome signal score < 2, TSS enrichment score > 1, and total RNA counts < 100,000 and > 25,000. The RNA and protein modality weights are non-negative, unique to each cell, and sum to 1. We demonstrate that WNN analysis substantially improves our ability to define cellular states in multiple biological contexts and data types. Therefore, the sPCA procedure can identify the set of principal components that can transform the data in a single modality to best capture the structure in a multimodal dataset. In addition to characterizing heterogeneity in mRNA and protein expression, we leveraged our 5′ dataset to explore the relationship between molecular state and TCR sequence (STAR Methods).
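The constraints on the modality weights stated above (non-negative, specific to each cell, summing to 1) and their use in a weighted average of similarities can be sketched as follows. This is an illustration only: the text here does not spell out how the weights are derived, so the softmax over per-modality scores is an assumption chosen because it satisfies those constraints, and every function name, score, and similarity value is hypothetical.

```python
import math


def modality_weights(scores):
    """Softmax over per-modality scores for one cell.

    The output is non-negative and sums to 1. How the input scores are
    obtained is NOT shown; they are a hypothetical stand-in.
    """
    peak = max(scores.values())  # subtract the max for numerical stability
    exps = {m: math.exp(s - peak) for m, s in scores.items()}
    total = sum(exps.values())
    return {m: e / total for m, e in exps.items()}


def weighted_similarity(weights, similarities):
    """Weighted average of per-modality similarities between two cells."""
    return sum(weights[m] * similarities[m] for m in weights)


def weighted_nearest_neighbors(weights, candidate_sims, k):
    """Return the k candidate cells with the highest weighted similarity.

    `candidate_sims` maps a candidate cell id to its per-modality similarities
    to the cell that owns `weights`.
    """
    ranked = sorted(
        candidate_sims,
        key=lambda cell: weighted_similarity(weights, candidate_sims[cell]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical cell for which the protein modality is more informative.
w = modality_weights({"rna": 0.2, "protein": 1.5})
candidates = {
    "a": {"rna": 0.9, "protein": 0.1},
    "b": {"rna": 0.2, "protein": 0.9},
    "c": {"rna": 0.1, "protein": 0.1},
}
neighbors = weighted_nearest_neighbors(w, candidates, k=2)
```

Because the weights are computed per cell, two cells can rank the same candidates differently depending on which modality is more informative for each of them.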
(A, B) Violin plot showing the upregulation of CD169 protein levels and a module of interferon response genes at day 3. See also Figure S3. We also profiled a total of 49,147 cells (54 antibodies) split across all samples using ECCITE-seq (Mimitou et al., 2019), which also enables immune repertoire profiling with the 10x 5′ technology. scMM effectively infers interpretable joint representations from multimodal single-cell data. We found that the five positive samples accounted for 91% of cells within expanded clones. Installation instructions, tutorials, and documentation for Seurat v4 are available at https://www.satijalab.org/seurat. The resulting m modality weights for each cell are non-negative and together sum to 1. For the 5P libraries, the samples were pooled in a ratio of 70% RNA, 12% ADT, 8% HTO, 5% of TCR libraries (with equal amounts of α/β and γ/δ libraries), and 5% of BCR libraries. These findings were reproducible in independent analyses of the 3′ and 5′ scRNA-seq experiments and persisted in both day 3 and day 7 samples (Figures 6F and S6). For each subject, PBMCs were collected at three time points: immediately before (day 0), 3 days, and 7 days following administration of a VSV-vectored HIV vaccine (Figure 3A). (I) Same as in (F) but for other cell states. B cell states are subdivided by their mutually exclusive expression of kappa or lambda light chain, with distinguishing markers including IGKC and IGLC3.
We encourage users to compute both to understand how their dataset can be interpreted in light of a reference, and also to flag any particular populations that may not be well represented. Our results indicate that this phenotype does not represent a strictly binary phenomenon and may not be specific to CMV response. Cells are labeled by the annotation that was transferred using each method. Samples were then gated as described below and sorted directly into Buffer RLT (QIAGEN). Sample integration (10x 3′ CITE-seq experiments): To facilitate the identification of shared cell types across datasets, we applied our previously developed anchor workflow (Stuart et al., 2019) to integrate the datasets. As recommended in the MOFA+ tutorial (https://raw.githack.com/bioFAM/MOFA2_tutorials/master/R_tutorials/10x_scRNA_scATAC.html), we used the z-scored data (scaled data) from the two assays as view1 and view2 for MOFA+. Clones present in donors who were classified as CMV-positive are colored in red. In our supervised analysis, we transferred our level 2 annotations, successfully dividing T cells into the 12 groups (Figures 7C and 7D). We conclude that WNN analysis is capable of sensitively and robustly characterizing populations that cannot be identified by a single modality, exhibits best-in-class performance, and can be flexibly applied to multiple data types for integrative and multimodal analysis. Cells are colored by their predicted level-2 annotations. We use the top 40 and 50 dimensions respectively to construct KNN graphs from the RNA and protein modalities, which are used as input to the WNN procedure described above.
Here, we present scMM, a mixture-of-experts deep generative model for integrated analysis of single-cell multiomics data. In addition, scMM learns underlying relationships across modalities, enabling crossmodal generation of single-cell data. Each column represents a replicate bulk RNA-seq profile. We apply the reference-based integrative analysis procedures described above to project the 5′ scRNA-seq data onto the UMAP visualization defined by the 3′ dataset, and also to transfer a discrete label. We acknowledge Dan Littman and members of the Satija and Technology Innovation Labs for general discussion. Grey dots represent T cells where TCR sequence was measured using the 10x 5′ assay. Although flow cytometry and cytometry by time of flight (CyTOF) are widely used and powerful approaches for making high-dimensional measurements of protein expression in immune cells (Bendall et al., 2011; Bodenmiller et al., 2012; Diggins et al., 2015; Saeys et al., 2016), CITE-seq's use of distinct oligonucleotide barcode sequences provides a unique opportunity to profile very large panels of antibodies alongside cellular transcriptomes. (K) CD38 and CD16 protein expression define two separate gradients and are uncorrelated in NK cells. Emulsions were then broken and nucleic acids recovered. Subsequent library preparation steps are detailed in the section below. UMAP visualizations are computed using RNA, protein, or WNN analysis. However, although these cell types are defined by both RNA and protein markers, the statistical power in unsupervised analysis of either modality separately was insufficient to identify these populations, demonstrating the importance of joint analysis.
(C) Enriched motifs within MAIT-specific open chromatin regions. This substantially improves the speed of the method. In this manuscript we analyze data falling into three categories: measurements of single-cell gene expression, single-cell surface protein expression, and single-cell chromatin accessibility (ATAC-seq). (E and F) Differentially expressed genes, and enriched gene ontology terms, between CD103+ CD49a and CD103 CD49+ populations. Post elution, BSA was added to a final concentration of 2%. We used stepwise variable selection coupled with logistic regression (STAR Methods) to identify the best antibody marker panels of different sizes (1-10 markers) for each subset, and calculated the level of enrichment in silico (Figure 4C).
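The stepwise panel selection described above can be illustrated with a greedy forward search. Note the substitution: to keep the example short and dependency-free, the logistic regression is replaced by a simple gating score (fold enrichment of the target cluster among cells positive for every marker in the panel). The function names, marker sets, and cluster labels are hypothetical, and this sketch is not the procedure detailed in STAR Methods.

```python
def enrichment(cells, labels, target, panel):
    """Fold enrichment of `target` among cells positive for every marker in `panel`.

    `cells` is a list of sets of positive markers; `labels` holds each cell's
    cluster. Returns 0.0 when the gate selects no cells.
    """
    baseline = sum(1 for lab in labels if lab == target) / len(labels)
    gated = [lab for cell, lab in zip(cells, labels) if panel <= cell]
    if not gated or baseline == 0:
        return 0.0
    purity = sum(1 for lab in gated if lab == target) / len(gated)
    return purity / baseline


def forward_select(cells, labels, target, markers, max_size):
    """Greedily add the marker that most improves enrichment, up to `max_size`."""
    panel = frozenset()
    best = 1.0  # an empty gate selects every cell, so its enrichment is 1
    history = []
    while len(panel) < max_size:
        candidates = [
            (enrichment(cells, labels, target, panel | {m}), m)
            for m in sorted(markers) if m not in panel
        ]
        if not candidates:
            break
        score, marker = max(candidates)
        if score <= best:
            break  # no remaining marker improves the panel
        panel, best = panel | {marker}, score
        history.append((marker, score))
    return history


# Toy data: each cell is the set of markers it is positive for.
toy_cells = [
    {"CD3", "CD4", "CD25"}, {"CD3", "CD4", "CD25"},
    {"CD3", "CD4"}, {"CD3", "CD4"},
    {"CD3", "CD8"}, {"CD3", "CD8", "CD25"},
    {"CD19"}, {"CD19"},
]
toy_labels = ["Treg", "Treg", "CD4 T", "CD4 T", "CD8 T", "CD8 T", "B", "B"]
all_markers = {"CD3", "CD4", "CD8", "CD19", "CD25"}
panel_history = forward_select(toy_cells, toy_labels, "Treg", all_markers, max_size=3)
```

Recording the score after each added marker gives the enrichment achieved by panels of increasing size, which mirrors how enrichment is reported per panel size in the text.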