Data from: Molecular docking and dynamics studies to identify novel active compounds targeting potential breast cancer receptor proteins from an indigenous herb Euphorbia thymifolia Linn
Data files
Jan 22, 2024 version files 10.04 MB
- desmond_md_job_36912_TETRAOXATETRADECANE_114-DIYL_DIBENZOATE_mol_2-out_pl_IXKK.pdf
- desmond_md_job_SUCCINIC_ACID_2_DIMETHYLAMINOETHYL_4_ISOPROPYLPHENYL_ESTER_mol-out_pl_ER.pdf
- desmond_md_job_Tetraoxatatredecanedibenzoate_4QTB.pdf
- Docking_with_1XKK_protein.csv
- Docking_with_2IOG_protein.csv
- Docking_with_4J52_protein.csv
- Docking_with_4QTB___4EKL_proteins.xlsx
- Docking_with_5H2U.csv
- Docking_with_5K00.csv
- README.md
- Supplementary_data_file_1_GCMS_dataset.pdf
Apr 02, 2024 version files 10.04 MB
- desmond_md_job_36912_TETRAOXATETRADECANE_114-DIYL_DIBENZOATE_mol_2-out_pl_IXKK.pdf
- desmond_md_job_SUCCINIC_ACID_2_DIMETHYLAMINOETHYL_4_ISOPROPYLPHENYL_ESTER_mol-out_pl_ER.pdf
- desmond_md_job_Tetraoxatatredecanedibenzoate_4QTB.pdf
- Docking_with_1XKK_protein.csv
- Docking_with_2IOG_protein.csv
- Docking_with_4J52_protein.csv
- Docking_with_4QTB___4EKL_proteins.xlsx
- Docking_with_5H2U.csv
- Docking_with_5K00.csv
- README.md
- Supplementary_data_file_1_GCMS_dataset.pdf
Abstract
Breast cancer has become one of the most prevalent diseases, and its incidence has doubled in India. Targeted therapy with novel plant-derived compounds could be a promising approach to drug development. Euphorbia thymifolia L. is a widely growing tropical herb that has been reported to have various ethnopharmacological properties, including anticancer properties. The aim of the present study was to identify the active phytocompounds present in its methanolic extract using an in silico approach. The methanolic extract of E. thymifolia (ME.ET) was subjected to GC-MS analysis, and the identified compounds were docked with potential protein targets implicated in breast cancer: ERK1, AKT, EGFR/HER2, ER, MELK, PLK1 and PTK6. Compounds with good docking scores were further subjected to a molecular dynamics study to assess protein-ligand binding stability, together with ligand pathway calculation and molecular mechanics energies combined with Poisson-Boltzmann (MM/PBSA) calculation, using the Schrodinger suite. Out of 219 unique phytocompounds subjected to docking, two compounds, namely 3,6,9,12-tetraoxatetradecane-1,14-diyl dibenzoate (TTDB) and succinic acid, 2-(dimethylamino)ethyl 4-isopropylphenyl ester (SADPE), showed good docking scores. The molecular dynamics study showed high affinity and low binding energy for TTDB with HER2 and ERK1, and for SADPE with ER. This is the first study to identify and report active compounds from E. thymifolia Linn. Further in vitro and in vivo anticancer studies can be performed to confirm these results and to understand the molecular mechanism by which TTDB and SADPE exhibit anticancer activity against breast cancer.
README: Molecular docking and dynamics studies to identify novel active compounds targeting potential breast cancer receptor proteins from an indigenous herb Euphorbia thymifolia Linn
https://doi.org/10.5061/dryad.rn8pk0pjt
Description of the data and file structure
Euphorbia thymifolia is an ethnopharmacologically used herb that grows widely in tropical countries. The aerial parts of Euphorbia thymifolia L. were procured from uncultivated areas of Udupi District between April and June. The plant material was identified and authenticated by the taxonomist Dr K. Gopalakrishna Bhat, Professor of Botany (Retd.), Poorna Prajna College, Udupi. A plant sample has been deposited in the Herbarium of the Department of Pharmacognosy, Manipal College of Pharmaceutical Sciences (Herbarium Voucher no.: PP624). The collected aerial parts were washed thoroughly with distilled water to remove soil and debris, then shade dried at room temperature for 15 days until dry and crisp. The dried shoots of E. thymifolia L. were powdered and stored at 4 °C.
Methanolic extract by cold maceration method & GC-MS Analysis.
The crude extract (10% w/v; 35 g in 350 ml) was prepared by cold maceration using methanol as the solvent for 72 hours with intermittent shaking at room temperature. The extract was then filtered through Whatman No. 1 filter paper. The clear filtrate was concentrated in a rotary vacuum flash evaporator and freeze dried by lyophilisation. The lyophilised crude extract was stored at 4 °C until further use. To analyse the phytoconstituents present in the methanolic extract of E. thymifolia (ME.ET), the extract was subjected to GC-MS analysis at Analytical Research & Metallurgical Laboratories Pvt. Ltd., Bangalore, India. Supplementary data file 1 gives the results of the GC-MS analysis, listing all the phytoconstituents detected in ME.ET. There were 23 prominent peaks, each retention-time peak having several hits; each hit was identified by comparing its retention time, peak area (%) and height with those of known compounds in the National Institute of Standards and Technology (NIST) library. The names of all identified ligands are listed below their retention times in the GC-MS dataset.
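As a quick check of the stated concentration: percent weight/volume is grams of material per 100 ml of solvent. The sketch below only illustrates that arithmetic; the function name `percent_w_v` is hypothetical and not part of the dataset.

```python
def percent_w_v(mass_g: float, volume_ml: float) -> float:
    """Return the concentration in % w/v (grams per 100 ml)."""
    if volume_ml <= 0:
        raise ValueError("volume must be positive")
    return mass_g * 100.0 / volume_ml


# 35 g of plant powder in 350 ml of methanol, as stated in the text.
extract_concentration = percent_w_v(35, 350)
```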
Molecular docking:
After removing duplicates, a total of 219 unique ligands remained. The structures of these compounds were obtained from PubChem or drawn in ChemDraw and subjected to molecular docking with seven potential breast cancer receptor proteins (ERK1, AKT, EGFR/HER2, ER, MELK, PLK1, PTK6). The commercial Maestro software of the Schrodinger suite was used for the docking simulation studies to identify the best hit, that is, the best protein-ligand complex. The docking results are provided separately for each protein, in the same format in which they were exported from the software (Excel file).
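The README does not say how duplicate hits were detected. One plausible approach, sketched below purely as an assumption, is to normalise compound names (letter case, surrounding whitespace, repeated spaces) and keep the first occurrence of each. The function name and the placeholder names are illustrative only.

```python
def unique_ligands(names):
    """Keep the first occurrence of each name, ignoring case and extra spaces."""
    seen = set()
    unique = []
    for name in names:
        # Collapse runs of whitespace and compare case-insensitively.
        key = " ".join(name.split()).casefold()
        if key and key not in seen:
            seen.add(key)
            unique.append(name.strip())
    return unique


# Placeholder names, not compounds from the GC-MS dataset.
hits = ["Compound A", "compound  a ", "Compound B", "COMPOUND B", "Compound C"]
deduplicated = unique_ligands(hits)
```

Name matching of this kind misses synonyms; comparing structure identifiers would be stricter, but that also goes beyond what the README describes.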
To select the best protein-ligand complex, molecular docking was performed using the Schrodinger suite. The commercial Maestro software, version 11.8 (OPLS3e force field), was used for all simulation studies. The Protein Preparation Wizard was used to prepare each protein through its review-and-modify and refinement modules: missing side chains and residues, if any, were filled in, and water molecules beyond 3 Å were removed. The GLIDE receptor grid generation panel was used to create the grid and locate the binding site on the receptor for the ligands to be docked. The grid box generated at the site of the bound ligand for each protein measured 10 × 10 × 10 Å.
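Geometrically, a 10 Å cubic box at the bound-ligand site can be pictured as a cube centred on the ligand centroid. The sketch below illustrates only that geometry, under the assumption that the box is centred on the centroid; it is not a description of how Glide builds its grids internally, and the coordinates are made up.

```python
def centroid(coords):
    """Arithmetic mean of a list of (x, y, z) coordinates."""
    n = len(coords)
    if n == 0:
        raise ValueError("at least one atom is required")
    return tuple(sum(c[i] for c in coords) / n for i in range(3))


def grid_box(coords, edge=10.0):
    """Return (lower, upper) corners of a cube of side `edge` centred on the centroid."""
    centre = centroid(coords)
    half = edge / 2.0
    lower = tuple(c - half for c in centre)
    upper = tuple(c + half for c in centre)
    return lower, upper


def inside(point, box):
    """True if the point lies within the box (boundaries included)."""
    lower, upper = box
    return all(lo <= p <= hi for p, lo, hi in zip(point, lower, upper))


# Made-up ligand atom positions in angstroms.
ligand_atoms = [(1.0, 2.0, 3.0), (3.0, 2.0, 3.0), (2.0, 4.0, 3.0), (2.0, 0.0, 3.0)]
box = grid_box(ligand_atoms)
```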
Crystal structures of the following proteins were retrieved from the Protein Data Bank (PDB):
- ERK1 (PDB ID: 4QTB, resolution 1.40 Å), bound to a piperazine-phenyl-pyrimidine derivative
- AKT (PDB ID: 4EKL, resolution 2.00 Å), with an ATP-site inhibitor that is a piperazine-pyrimidine derivative
- EGFR/HER2 (PDB ID: 1XKK, resolution 2.40 Å), with bound lapatinib, a tyrosine kinase inhibitor in clinical development for cancer
- ER (PDB ID: 2IOG, resolution 1.60 Å), with an indole ligand
- MELK (PDB ID: 5K00, resolution 1.77 Å), with an amide derivative as ligand
- PLK1 (PDB ID: 4J52, resolution 2.30 Å), with a pyrimidodiazepinone as ligand
- PTK6 (PDB ID: 5H2U, resolution 2.24 Å), with dasatinib as ligand
The structures of all 219 phytocompounds were either retrieved from PubChem or drawn in ChemDraw. These 219 ligands were imported into Maestro, and the LigPrep panel was used for ligand preparation (Rathi et al. 2019). Docking was then performed using the Glide module of Maestro. All ligands were first docked in standard precision (SP) mode and then in extra precision (XP) mode to obtain an XP docking score for each ligand (Elokely and Doerksen 2013).
Ligand binding free energy calculation by Maestro (MM-GBSA)
All XP docking output files, one for each combination of phytocompound and protein, were then subjected to MM-GBSA binding free energy calculation using the Prime module, to estimate the binding energy of each of the seven target proteins listed above with each of the 219 phytocompounds.
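In the usual MM-GBSA end-point scheme, the binding free energy is the energy of the complex minus the energies of the free receptor and the free ligand. The sketch below shows only that bookkeeping; the energies are placeholders rather than values from this dataset, and the function name is hypothetical.

```python
def binding_free_energy(g_complex, g_receptor, g_ligand):
    """dG_bind = G(complex) - [G(receptor) + G(ligand)], all in kcal/mol."""
    return g_complex - (g_receptor + g_ligand)


# Placeholder energies in kcal/mol, chosen only to illustrate the sign convention.
dg_bind = binding_free_energy(-1250.0, -1100.0, -100.0)
```

A more negative result indicates that the complex is more favourable than the separated partners.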
Docking dataset description:
Each dataset lists the names of the protein and the ligands processed in the entry name column. For the protein processed, the Protein Data Bank ID and crystal structure details are given. The XP GScore / docking score column gives insight into the binding stability of the protein-ligand complex: the more negative the value, the greater the binding stability between protein and ligand. The glide energy column gives the binding energy of the complex.
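Because a more negative docking score means a more stable complex, ranking the ligands amounts to sorting in ascending order of score. The sketch below uses only the Python standard library. The header spellings `Entry Name` and `docking score` are assumptions and should be checked against the header row of each CSV file; the sample rows are invented for illustration.

```python
import csv
import io


def rank_by_docking_score(rows, score_col="docking score", name_col="Entry Name"):
    """Return (name, score) pairs sorted from most to least negative score."""
    scored = []
    for row in rows:
        raw = (row.get(score_col) or "").strip()
        try:
            score = float(raw)
        except ValueError:
            # Skip rows without a numeric score, e.g. a receptor-only entry.
            continue
        scored.append((row.get(name_col, ""), score))
    return sorted(scored, key=lambda pair: pair[1])


# Invented sample in the assumed layout; not taken from the dataset.
SAMPLE = """Entry Name,docking score,glide energy
ligand_A,-4.2,-30.1
ligand_B,-9.7,-55.3
receptor_only,,
ligand_C,-6.5,-41.0
"""

ranked = rank_by_docking_score(csv.DictReader(io.StringIO(SAMPLE)))
```

For a real file, the same function can be given `csv.DictReader(open(path, newline=""))` in place of the in-memory sample.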
Molecular dynamics (MD) simulations
To determine protein-ligand stability in a simulated physiological environment, the Desmond panel of Maestro was used for MD simulation. Starting from the XP docking files, simulations of 100 ns each were run for HER2 (1XKK) and ERK1 (4QTB) with 3,6,9,12-tetraoxatetradecane-1,14-diyl dibenzoate (TTDB; molecular formula C24H30O8) and for ER (2IOG) with succinic acid, 2-(dimethylamino)ethyl 4-isopropylphenyl ester (SADPE; molecular formula C17H25NO4). The process involved three steps: system building, minimisation and MD simulation. The predefined simple point charge (SPC) water model was used with an orthorhombic boundary for the XP docked complexes of TTDB with 1XKK and 4QTB and of SADPE with 2IOG. The systems were neutralised: three positive charges were neutralised by adding three chloride ions, and one negative charge was neutralised by adding one sodium ion. An isothermal-isobaric (NPT) ensemble at a constant temperature of 300 K and a pressure of 1 bar was maintained throughout the 100 ns simulation. To assess the stability of each complex, the root mean square deviation (RMSD) was analysed along with the protein-ligand contact timeline and the covalent/non-covalent interactions.
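RMSD measures how far a set of atoms has moved from a reference frame: it is the square root of the mean squared displacement over the selected atoms. The sketch below shows that calculation for a single frame, assuming the frame has already been superimposed on the reference; the alignment step is omitted, and the coordinates are made up.

```python
import math


def rmsd(coords, ref):
    """Root mean square deviation between two equally sized (x, y, z) lists."""
    if len(coords) != len(ref) or not coords:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = sum(
        (x - xr) ** 2 + (y - yr) ** 2 + (z - zr) ** 2
        for (x, y, z), (xr, yr, zr) in zip(coords, ref)
    )
    return math.sqrt(total / len(coords))


# Made-up coordinates in angstroms: every atom is shifted by 1 Å along z.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
frame = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
deviation = rmsd(frame, reference)
```

Applying the function to every frame of a trajectory gives an RMSD-versus-time curve of the kind analysed in the simulation reports.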
Molecular dynamics dataset description:
The protein-ligand complexes with the best docking scores were chosen for further dynamics study. Molecular dynamics is a computer-based simulation, performed here in the Desmond panel of Maestro, that analyses the physical interaction between protein and ligand over a fixed period of time. In our study, each simulation was run for 100 ns. The software outputs the result as a simulation interaction diagram report (readable PDF file), with the details of the simulation on the first page and the interaction reports thereafter.
Code/Software:
Molecular docking was performed using the Schrodinger suite. The commercial Maestro software, version 11.8 (OPLS3e force field), was used for all simulation studies.