Drug-Target Interaction Prediction for Cancer Genes Using the Lasso-XGBoost Method

Authors

  • Muh Fadhil Al-Haaq Ginoga, Institut Pertanian Bogor, Bogor
  • Wisnu Ananta Kusuma, Institut Pertanian Bogor, Bogor
  • Mushthofa Mushthofa, Institut Pertanian Bogor, Bogor

DOI:

https://doi.org/10.25126/jtiik.20231036603

Abstract

Cancer is currently often treated with chemotherapy, which uses chemical drugs that can cause side effects. Herbal compounds, which are known to have fewer side effects, offer an alternative treatment. Drug-target interaction (DTI) analysis can be used to determine how herbal compounds interact with cancer-associated proteins. In this study, a DTI prediction model was built by selecting features from the dataset with the Least Absolute Shrinkage and Selection Operator (LASSO), balancing the data with the Synthetic Minority Oversampling Technique (SMOTE), and predicting interactions with Extreme Gradient Boosting (XGBoost). Cancer-associated protein data were obtained from the Cancer Gene Census list; this list was then used to search the GDSC, DrugCentral, and DrugBank databases to produce a list of drug compounds that interact with these proteins. Herbal compounds were obtained from the HerbalDB and KNApSAcK databases. Tests were performed on several types of feature extraction, such as CTD, DC, PseAAC, and PSSM. The prediction results suggest that several herbal compounds, such as andrographolide, ursolic acid, and oleanolic acid, interact with cancer-associated proteins. In addition, LASSO-XGBoost was able to predict DTI in cancer with an F1 score of 0.861, AUROC of 0.927, recall of 0.857, precision of 0.866, and accuracy of 0.897.
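As a quick consistency check on the reported scores, the F1 score can be recomputed from precision and recall, since F1 is their harmonic mean. The sketch below assumes the rounded values given in the abstract (precision 0.866, recall 0.857, F1 0.861). The function name `f1_from_precision_recall` is illustrative and is not taken from the paper.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the F1 score, i.e. the harmonic mean of precision and recall."""
    if precision + recall == 0:
        # Convention: F1 is defined as 0 when both inputs are 0.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded values as reported in the abstract (assumed inputs for this check).
reported_precision = 0.866
reported_recall = 0.857
reported_f1 = 0.861

recomputed_f1 = f1_from_precision_recall(reported_precision, reported_recall)
print(f"recomputed F1 = {recomputed_f1:.4f}, reported F1 = {reported_f1}")
```

Because the inputs are already rounded to three decimals, the recomputed value can only be expected to match the reported F1 to within about 0.001, which it does here.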


Published

01-07-2023

Issue

Section

Computer Science

How to Cite

Prediksi Interaksi Drug Target pada Gen Kanker Menggunakan Metode Lasso-XGBoost. (2023). Jurnal Teknologi Informasi Dan Ilmu Komputer, 10(3), 531-542. https://doi.org/10.25126/jtiik.20231036603