Assessment of hepatitis B viral lymphotrophism using deep curation

Cite this dataset

Sharma, Prachie; Kumar, Kapila; Rawal, Kamal (2022). Assessment of hepatitis B viral lymphotrophism using deep curation [Dataset]. Dryad.


Background: The replicative forms of the hepatitis B virus (HBV) are found in several types of white blood cells within the host defense system. To determine the dimensionality of the extrahepatic manifestation of HBV in host white blood cells, it is important to understand the complete biology of its pathogenesis and lymphotropic nature.

Methods: Literature from the PubMed database pertaining to HBV manifestation in human host white blood cells was deeply curated and then manually filtered to determine the behavioral trend of the virus within these cells.

Results: The curation of 198 research articles identified 28 genes, 92 proteins, and 20 peripheral blood mononuclear cells (PBMCs) involved in HBV pathogenesis, while 20 immune cells were found to be permissive for viral penetration and replication. The presence of the replicative forms of HBV in the host immune cells led to the further elucidation of 28 genes and 92 proteins that interact with one or more viral genes and proteins.

Conclusions: A multi-dimensional analysis using deep curation identified a possible lymphotropic character of HBV. Moreover, certain pathways could aid the propagation of viral infection by allowing the virus to use immune cells to its advantage. Thus, instead of eliminating HBV, the immune system may contribute to the population expansion of the virus.


Evidence of direct or indirect interactions between HBV proteins and host PBMC proteins and genes was derived from scientific publications retrieved through keyword searches of repositories such as NCBI, PubMed, and Google Scholar. The initial keywords ‘Hepatitis B’ and ‘Human’ returned 88,113 research articles on epidemiology, clinical trials, vaccinations, and therapeutics that were beyond the scope of the hypothesis. Subsequently, the keywords ‘Hepatitis B’ and ‘lymphotropism’ retrieved 360 research articles; however, the literature was vague, and the information deviated from the key area of research. Similar issues were observed with the keywords ‘Hepatitis B’ and ‘Extrahepatic Manifestation’, which generated 29 hits but yielded limited concrete information pertaining to our hypothesis.

Eventually, an exhaustive list of keywords was prepared that contained all the known HBV genes and proteins, all host PBMC components such as dendritic cells and HLA, and synonyms for PBMCs such as white blood cells, lymphocytes, and leukocytes. The list also included relational keywords (inhibit, replicate, upregulate, downregulate, significant, and higher) to identify correlations between viral and host proteins during text mining. The keywords were extracted from research articles on HBV and human host proteins published in PubMed in the past ten years. Python scripts were then used to match the keywords across all the abstracts downloaded from PubMed with the search terms ‘HBV’ and ‘Human’.
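The keyword-matching step can be illustrated with a minimal Python sketch. This is not the authors' script: the three keyword lists are short, hypothetical excerpts (the full lists are not reproduced in this record), the function names `find_terms` and `screen_abstracts` are invented for illustration, and the rule that an abstract must mention at least one viral term, one host term, and one relational term is an assumption about how co-occurrence might be screened.

```python
import re
from typing import Dict, List

# Hypothetical, abbreviated keyword lists (assumption: the study's full
# lists were far longer and are not given in this dataset description).
VIRAL_TERMS = ["HBsAg", "HBeAg", "HBcAg", "HBx"]
HOST_TERMS = ["PBMC", "dendritic cell", "HLA", "lymphocyte",
              "leukocyte", "white blood cell"]
RELATION_TERMS = ["inhibit", "replicate", "upregulate",
                  "downregulate", "significant", "higher"]


def find_terms(text: str, terms: List[str]) -> List[str]:
    """Return the terms that occur in text (case-insensitive).

    Each term is matched as a word prefix, so "inhibit" also matches
    "inhibits" and "dendritic cell" also matches "dendritic cells".
    """
    found = []
    for term in terms:
        pattern = r"\b" + re.escape(term)
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.append(term)
    return found


def screen_abstracts(abstracts: Dict[str, str]) -> List[dict]:
    """Keep abstracts mentioning a viral, a host, and a relational term."""
    hits = []
    for abstract_id, text in abstracts.items():
        viral = find_terms(text, VIRAL_TERMS)
        host = find_terms(text, HOST_TERMS)
        relation = find_terms(text, RELATION_TERMS)
        if viral and host and relation:
            hits.append({"id": abstract_id, "viral": viral,
                         "host": host, "relation": relation})
    return hits


if __name__ == "__main__":
    # Made-up example abstracts with placeholder identifiers.
    example = {
        "A1": "HBx was found to inhibit interferon signalling in "
              "dendritic cells of infected patients.",
        "A2": "Vaccination coverage for hepatitis B remains low in "
              "rural regions.",
    }
    for hit in screen_abstracts(example):
        print(hit)
```

Abstracts that pass such a screen would still need the manual filtering described in the Methods, since co-occurrence of keywords alone does not establish an interaction.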