Data-Driven Automated Cardiac Health Management with Robust Edge Analytics and De-Risking
Remote and automated health management has shown the potential to significantly improve patient prognosis. The Internet of Things (IoT) provides the development and implementation ecosystem needed to cater to a large number of relevant stakeholders. In this paper, we consider a cardiac health management system to demonstrate that data-driven techniques yield substantial gains in clinical efficacy by employing robust machine learning methods with a small set of relevant, selected signal processing features.
We consider the phonocardiogram (PCG), or heart sound, as the exemplary physiological signal. PCG carries a substantial cardiac health signature, which we use to establish our claim of superior, data-centric clinical utility. Our method demonstrates close to 85% accuracy on publicly available MIT-Physionet PCG data sets and outperforms the relevant state-of-the-art algorithm.
Because of its simple computational architecture, a shallow classifier with just three features, the proposed analytics method can be performed at the edge gateway. However, healthcare analytics deals with a large amount of sensitive data and subsequent inferences, which need privacy protection.
Accordingly, the problem of healthcare data privacy protection is addressed by de-risking sensitive data management using differential privacy, so that controlled privacy protection of sensitive healthcare data can be enabled. When a user opts for privacy protection, appropriate privacy preservation is guaranteed as a defense against privacy-breaching knowledge mining attacks. In this era of IoT and machine intelligence, this work is of practical importance because it enables on-demand automated screening of cardiac health while minimizing the privacy-breaching risk.
Cardiovascular diseases (CVD) are the leading cause of human death: more than 31% of human deaths are due to cardiac-related diseases. However, CVD is preventable when early warning signs are captured before the disease has manifested internally.
The development and deployment of computational methods for preventive, opportunistic, early-warning cardiac health management can ensure better prognosis and is likely to lower the number of lives lost to CVD. In turn, privacy-preserving data management makes such systems more acceptable to the patient community and related stakeholders.
With the large-scale availability of wearable sensors and powerful smartphones, realizing an automated cardiac health system on a mobile platform is the need of the hour. In fact, the Internet of Things (IoT) has an important role to play in realizing affordable cardiac health management solutions using artificial intelligence, machine learning and signal processing techniques.
In this paper, the focus is on developing, under an IoT infrastructure-based architecture, a data-driven computational model for detecting cardiac abnormality from heart sound, or phonocardiogram (PCG), signals collected from wearable sensors. In fact, capturing PCG signals through smartphones began several years ago and paved the way for IoT-based integration to realize an E-health ecosystem that connects all the stakeholders, such as doctors, hospitals, medical caregivers and clinical researchers, for immediate, timely, remote investigation and for prompt screening, diagnosis and treatment.
IoT serves as the infrastructure that allows the computational model (clinical inferencing and privacy analytics) to be deployed on edge devices (such as smartphones) or in the cloud, and that supports the deployment of the E-health system.
We propose predictive modeling of the presence of cardiac abnormality from PCG data, which enables the subject to get immediate medical attention rather than waiting until symptoms surface externally. However, sensitive healthcare data must be privacy protected, and we need to safeguard against the risk of sensitive data breaches. This protection needs to be on-demand, based on the user’s choice of privacy protection.
Privacy protection cannot be indiscriminate. To avoid data starvation for some of the stakeholders (we refer to them as non-critical stakeholders; they include social engineers, medical data researchers, statistical surveyors, etc.), a novel data characteristics-based privacy protection is proposed. When a user’s data are found to be ‘one in the crowd’, lighter obfuscation is applied, whereas if they are ‘unique in the crowd’, stronger protection is provided. The proposed scheme is an integrated approach to clinical utility and privacy protection: it derives the cardiac condition (equivalently, a classification task) and ensures controlled privacy protection of the patient’s sensitive healthcare information.
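The ‘one in the crowd’ versus ‘unique in the crowd’ idea can be illustrated with a minimal Python sketch. It is not the privacy analytics algorithm of this paper: the uniqueness measure (distance to the k-th nearest neighbour), the threshold, the two noise scales and all function names are illustrative assumptions.

```python
import numpy as np

def uniqueness_score(x, crowd, k=5):
    """Distance from x to its k-th nearest neighbour in the crowd.

    A larger value means x is more 'unique in the crowd'.
    (Assumed measure, chosen only for illustration.)
    """
    dists = np.sort(np.linalg.norm(crowd - x, axis=1))
    return float(dists[k - 1])

def obfuscate(x, crowd, rng, k=5, threshold=1.0,
              light_scale=0.05, strong_scale=0.5):
    """Add Laplace noise whose scale depends on how unique x is.

    threshold, light_scale and strong_scale are hypothetical values.
    Returns the obfuscated vector and the noise scale that was used.
    """
    score = uniqueness_score(x, crowd, k)
    scale = strong_scale if score > threshold else light_scale
    noisy = x + rng.laplace(0.0, scale, size=x.shape)
    return noisy, scale

# Synthetic placeholder data: 200 users, three features each.
rng = np.random.default_rng(0)
crowd = rng.normal(0.0, 0.1, size=(200, 3))
common_user = np.zeros(3)                  # sits inside the crowd
rare_user = np.array([5.0, 5.0, 5.0])      # far from everyone else

_, common_scale = obfuscate(common_user, crowd, rng)
_, rare_scale = obfuscate(rare_user, crowd, rng)
print("noise scale for common user:", common_scale)
print("noise scale for rare user:", rare_scale)
```

In this sketch the user who blends into the crowd receives the lighter noise scale, while the outlying user receives the stronger one.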
One of the salient aspects of the proposed scheme is its applicability in the context of edge analytics. To justify deploying analytics solutions on edge devices, two important criteria must be satisfied:
- When inferential analytics or training model generation is performed at the edge devices, the model construction needs to be lightweight, typically using shallow networks with a manageable number of feature dimensions. In fact, analytics at the source or at the edge is required particularly in the absence of private cloud infrastructure, owing to data privacy and security issues. In this paper, shallow network-based supervised learning with a very limited number of features (precisely, three) is performed, which satisfies the computational requirement of trained model generation at the edge devices.
- Because healthcare data are sensitive in nature, privacy protection needs to be carried out at the data origin: Our solution is privacy controlled. The user, or data owner, has the right to privacy-protect her healthcare data in a transparent manner. One of the main criteria of privacy protection for sensitive data is to ensure that utility is preserved. In our context, utility is described as the amount of information available from the privacy-protected data. More distortion leads to higher protection but lower utility of the transformed data, whereas less distortion invites more privacy attacks. The proposed privacy protection method obfuscates the sensitive data so that adequate protection is provided while utility is not severely compromised.
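The trade-off between distortion and utility in the second criterion can be made concrete with the standard Laplace mechanism from differential privacy. The sketch below is an assumption-laden illustration, not the obfuscation method of this paper: the feature values are synthetic, the sensitivity of 1.0 assumes features scaled to [0, 1], and mean absolute error is used as a stand-in for utility loss.

```python
import numpy as np

def laplace_mechanism(values, sensitivity, epsilon, rng):
    """Perturb values with Laplace noise of scale sensitivity / epsilon.

    A smaller epsilon means stronger privacy and therefore more noise.
    """
    scale = sensitivity / epsilon
    return values + rng.laplace(0.0, scale, size=values.shape)

def distortion(original, protected):
    """Mean absolute error between original and protected data
    (an assumed proxy for the loss of utility)."""
    return float(np.mean(np.abs(original - protected)))

# Synthetic placeholder features in [0, 1]; not real PCG features.
rng = np.random.default_rng(0)
features = rng.uniform(0.0, 1.0, size=(1000, 3))

results = {}
for epsilon in (0.1, 1.0, 10.0):
    protected = laplace_mechanism(features, sensitivity=1.0,
                                  epsilon=epsilon, rng=rng)
    results[epsilon] = distortion(features, protected)
    print(f"epsilon={epsilon}: distortion={results[epsilon]:.3f}")
```

Running the loop shows distortion growing as epsilon shrinks, which is the protection versus utility tension described above.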
Our main intention is to construct an accurate model of clinical analytics over PCG signals, such that the inference it draws can instill confidence in the patient as well as in other stakeholders such as doctors and medical caregivers. On the other hand, the important features and inferences provided by the clinical analytics algorithm need to be privacy-protected when shared with non-critical stakeholders who are not directly involved in treatment or diagnosis, specifically when the user or patient is conservative with respect to her privacy requirements.
Our method not only constructs a clinically reliable computational model, but also provides on-demand privacy protection as per the user’s requirement. Thus, the proposed privacy-protected integrated analytics method is positioned for practical acceptance by both the patient and medical communities. The workflow of the approach is:
- Capture the PCG signal from the PCG sensor, locally or over the Internet.
- Analyze the training PCG signals and develop the clinical computational model at the edge or in the cloud.
- Deploy the trained model at the edge or in the cloud.
- The clinical analytics module provides an inference as well as distinct features from field (or test) PCG signals.
- The user and other stakeholders such as doctors, clinical researchers and hospitals access the outcome of the clinical analytics model proactively (by entering the analytics platform portal) or reactively (through alarms sent by the platform to critical stakeholders such as doctors and hospitals) when the inference is ‘Abnormal’.
- The user sets the privacy requirement. When the privacy requirement is set to ‘1’, the distinct features are obfuscated for non-critical stakeholders such as clinical researchers, and the inference is removed. Non-critical stakeholders access only the privacy-preserved features, without any inference.
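The access rules in the last two workflow steps can be sketched in Python as follows. The role names, the `StakeholderView` type and the `build_view` function are hypothetical; the behaviour for non-critical stakeholders when the privacy requirement is not ‘1’ is also an assumption, since the workflow only states what happens when it is ‘1’.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence

# Assumed role names for critical stakeholders.
CRITICAL_ROLES = {"doctor", "hospital", "caregiver"}

@dataclass
class StakeholderView:
    features: List[float]
    inference: Optional[str]   # None when the inference is withheld
    alarm: bool                # True when an alarm should be raised

def build_view(features: Sequence[float],
               inference: str,
               role: str,
               privacy_flag: int,
               obfuscate: Callable[[Sequence[float]], Sequence[float]],
               ) -> StakeholderView:
    """Decide what a given stakeholder is allowed to see."""
    if role in CRITICAL_ROLES:
        # Critical stakeholders see everything and are alerted
        # when the inference is 'Abnormal'.
        return StakeholderView(list(features), inference,
                               inference == "Abnormal")
    if privacy_flag == 1:
        # Non-critical stakeholders get obfuscated features, no inference.
        return StakeholderView(list(obfuscate(features)), None, False)
    # Assumption: without a privacy request, data are shared as-is.
    return StakeholderView(list(features), inference, False)

# Placeholder obfuscation; a real system would use a privacy mechanism.
def shift_by_one(values: Sequence[float]) -> List[float]:
    return [v + 1.0 for v in values]

doctor_view = build_view([0.2, 0.5, 0.9], "Abnormal", "doctor", 1, shift_by_one)
researcher_view = build_view([0.2, 0.5, 0.9], "Abnormal", "researcher", 1,
                             shift_by_one)
print(doctor_view)
print(researcher_view)
```

Keeping the obfuscation function as a parameter lets the same access logic run with whichever privacy mechanism is deployed at the edge.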
Hence, we require:
- A powerful analytics method that accurately infers the cardiac condition from the PCG signals, so that alarm signals bring immediate medical service for timely treatment.
- Privacy-controlled information sharing with non-critical stakeholders to minimize the privacy-breaching risk of sensitive health information.
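To give a feel for how light a three-feature classifier is, the following Python sketch trains a logistic regression with plain gradient descent. It is only a stand-in: the classifier of this paper and its three features are not specified here, the data are synthetic placeholders rather than PCG features, and the learning rate and epoch count are arbitrary assumptions.

```python
import numpy as np

def train_logistic(X, y, lr=0.1, epochs=500):
    """Fit logistic regression weights (last entry is the bias)."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(Xb @ w)))   # predicted probabilities
        w -= lr * Xb.T @ (p - y) / len(y)     # gradient of the log loss
    return w

def predict(w, X):
    """Return 1 ('Abnormal') or 0 ('Normal') for each row of X."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return (Xb @ w > 0).astype(int)

# Synthetic placeholder data: two classes, three features each.
rng = np.random.default_rng(0)
normal = rng.normal(0.0, 1.0, size=(200, 3))
abnormal = rng.normal(2.0, 1.0, size=(200, 3))
X = np.vstack([normal, abnormal])
y = np.concatenate([np.zeros(200), np.ones(200)])

w = train_logistic(X, y)
train_accuracy = float(np.mean(predict(w, X) == y))
print("parameters in the model:", w.size)
print("training accuracy on synthetic data:", train_accuracy)
```

The model here has only four parameters, which is why training and inference of this kind fit comfortably on a smartphone-class edge device.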
This paper is organized as follows. In Section 2, related work and background material are presented; we find that although separate works on clinical analytics and on data privacy protection are available with mature research outcomes, an integrated approach, which is a critical requirement, has not yet been developed. The architecture of the proposed system is described in Section 3. In Section 4, we discuss our clinical analytics method, which identifies clinically abnormal subjects from PCG signals. In Section 5, a novel privacy analytics algorithm is presented that obfuscates the sensitive healthcare data when the subject demands it. In Section 6, the efficacy of the proposed model is demonstrated through extensive experiments on the expert-annotated, publicly available MIT-Physionet Challenge 2016 data. Finally, the paper is concluded in Section 7.