Bulk RNA sequencing (RNA-seq) of blood is an established key technology for analyzing gene expression in health and disease. To interpret the immune system's behavior, it is necessary to additionally consider complete blood count (CBC) data, which offer insights into the abundance of immune cells. However, CBC data are frequently unavailable in published datasets. We employ multiple datasets of patients infected with various SARS-CoV-2 variants (240 samples in total, with a sequencing depth of up to 200 million reads) to show that computational cell-type deconvolution methods (e.g., MCP-counter, xCell, EPIC, quanTIseq) could make such datasets more insightful by estimating immune cell abundances. Furthermore, we observe varying levels of lymphocyte exhaustion and increased neutrophil levels across SARS-CoV-2 variants and with disease progression, indicating markers that could be used in everyday clinical practice to estimate the disease severity of a newly admitted patient once the use of RNA-seq becomes feasible in clinics. Additionally, we screen the data for B and T cell receptor (BCR/TCR) sequences using the tools MiXCR and TRUST4 to show that, combined with sequence alignments and pBLAST, these sequences could be used to classify a patient's disease. Finally, we investigate the sequencing depth necessary to perform such analyses and conclude that 10 million reads per sample are sufficient. In conclusion, our study reveals that computational cell-type deconvolution and BCR/TCR methods can supplement missing CBC data in bulk RNA-seq analyses and offer insights into immune responses, disease severity, and pathogen-specific immunity, all achievable with a sequencing depth of 10 million reads per sample.