SNOMED CT Annotation for Improved Pathological Decisions in Breast Cancer Domain
G. Johanna Johnsi Rani¹, Gladis D², Joy John Mammen³

¹G. Johanna Johnsi Rani, Department of Computer Science, Madras Christian College (Autonomous), University of Madras, Chennai, India.
²Gladis D, Principal, Bharathi Women’s College (Autonomous), University of Madras, Chennai, India.
³Joy John Mammen, Professor and Head, Department of Transfusion Medicine & Immunohaematology, Christian Medical College, Vellore, India.

Manuscript received on 02 August 2019. | Revised Manuscript received on 07 August 2019. | Manuscript published on 30 September 2019. | PP: 8400-8406 | Volume-8 Issue-3 September 2019 | Retrieval Number: C6519098319/2019©BEIESP | DOI: 10.35940/ijrte.C6519.098319

© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: Breast cancer pathology reports are used to diagnose the disease and to determine the pathological classification and cancer stage of a patient. These reports are written or electronically generated by the Pathologist in English, usually as unstructured natural language text. Extracting information and making pathological decisions from such text is complex because the structure and contents of the reports are heterogeneous. The reports can be homogenized using the global annotation standard Systematized Nomenclature of Medicine – Clinical Terms (SNOMED-CT), which enables consistent representation of medical terms and can be used in clinical decision support systems (CDSS) and cancer reporting. SNOMED is a vast repository, and its size and complexity necessitate extracting a subset for a particular domain before using it for annotation. Annotation is performed either in online mode, at the time the report is generated, or in offline mode, on a batch of archived reports. A CDSS prototype is developed for the breast cancer domain; it supports the Pathologist in determining the Pathological Classification and Cancer Staging from both natural language text and SNOMED-annotated text. To annotate the text, the system first extracts a SNOMED subset for the domain. With regard to pathological decisions, the hypothesis is formulated that annotation using SNOMED does not improve the system’s performance in determining the cancer stage of a patient. The performance of the decision support processes was analyzed by determining Precision, Recall, Specificity, Accuracy, F-measure and Error. The analysis indicates that annotation improved the accuracy of the automated pathological decisions that the CDSS presents to the Clinician for finalizing the decisions. In the future, the CDSS can be applied to other cancer domains and thus provide a means to improve decision-making in those domains.
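The six performance measures named in the abstract can be illustrated with a minimal sketch. It assumes the standard confusion-matrix definitions and takes Error to be 1 - Accuracy; the abstract does not state the exact formulas used, so these are assumptions. The names ConfusionCounts and evaluate, and the counts in the usage lines, are hypothetical and are not results from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of correct and incorrect decisions for one class (hypothetical container)."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def _ratio(numerator: float, denominator: float) -> float:
    """Divide safely, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def evaluate(c: ConfusionCounts) -> dict:
    """Compute the six measures from confusion-matrix counts.

    Assumption: standard textbook definitions, with Error = 1 - Accuracy.
    """
    total = c.tp + c.fp + c.tn + c.fn
    precision = _ratio(c.tp, c.tp + c.fp)
    recall = _ratio(c.tp, c.tp + c.fn)
    specificity = _ratio(c.tn, c.tn + c.fp)
    accuracy = _ratio(c.tp + c.tn, total)
    # F-measure is the harmonic mean of precision and recall.
    f_measure = _ratio(2 * precision * recall, precision + recall)
    error = 1.0 - accuracy if total else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "accuracy": accuracy,
        "f_measure": f_measure,
        "error": error,
    }


if __name__ == "__main__":
    # Illustrative counts only; not data from the study.
    for name, value in evaluate(ConfusionCounts(tp=40, fp=10, tn=45, fn=5)).items():
        print(f"{name}: {value:.3f}")
```

Running the same function on the decisions made from plain text and on those made from SNOMED-annotated text would give two sets of measures that can be compared side by side, which is the kind of comparison the hypothesis calls for.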
Keywords: Clinical Decision Support System, SNOMED-CT, Natural Language Processing, Information Extraction, Breast Cancer Pathology

Scope of the Article:
Natural Language Processing