Development of a Mixture Model for Cure Rate Analysis of Breast Cancer with Missing Data

ABSTRACT: This research aims to develop a mixture model for cure rate analysis with missing data by (1) determining the mixture model for cure rate analysis with missing data, (2) estimating the parameters of that model, (3) studying the properties of the resulting estimators, and (4) applying the model to real data on the cure rate of patients with breast cancer (PBC) with missing data. For cure rate analysis with missing data, the mixture model is developed by combining the density function of the lifetime of cured patients with the density function of the lifetime of uncured patients. In this setting, the lifetime of cured patients is the observed data, whereas the lifetime of uncured patients includes missing data. Thus, in the mixture model, the density function of the complete data can be formulated as the product of the density function of the observed data and the conditional density function of the missing data given the observed data. Parameter estimation in the mixture model is based on the EM (Expectation Maximization) algorithm, an iterative approach to analyzing missing data. The EM algorithm consists of two steps, the E-step and the M-step. In the E-step, the expectation of the complete-data log-likelihood function is determined. In the M-step, the expectation obtained in the E-step is maximized to obtain the estimators of the model parameters. The baseline survival function cannot be eliminated in the EM algorithm. It is therefore estimated under the proportional hazards (PH) assumption, as in the Cox PH model. The baseline survival function is then used to calculate the survival rate at given times and for given patient characteristics.
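The factorization and the two EM steps described above can be written compactly as follows. This is only a notational sketch: the symbols $y_{\mathrm{obs}}$, $y_{\mathrm{mis}}$, $\theta$, $Q$, $S_0$, $x$ and $\beta$ are assumed here for illustration and are not taken from the paper, and the last line is the generic Cox PH form of the survival function rather than the paper's specific estimator.

```latex
% Complete-data density as observed-data density times conditional density
f(y_{\mathrm{obs}}, y_{\mathrm{mis}} \mid \theta)
  = f(y_{\mathrm{obs}} \mid \theta)\,
    f(y_{\mathrm{mis}} \mid y_{\mathrm{obs}}, \theta)

% E-step: expected complete-data log likelihood at the current iterate
Q(\theta \mid \theta^{(k)})
  = \mathbb{E}\!\left[ \log f(y_{\mathrm{obs}}, y_{\mathrm{mis}} \mid \theta)
      \,\middle|\, y_{\mathrm{obs}}, \theta^{(k)} \right]

% M-step: maximize the expectation from the E-step
\theta^{(k+1)} = \arg\max_{\theta}\, Q(\theta \mid \theta^{(k)})

% Proportional hazards form linking the baseline survival S_0 to covariates x
S(t \mid x) = S_0(t)^{\exp(\beta^{\top} x)}
```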
Each iteration of the EM algorithm does not decrease the log likelihood, and under the usual regularity conditions the algorithm converges to a local maximum of the likelihood function. The sequence of log-likelihood values converges if the log likelihood is bounded above. The estimators in the mixture model for cure rate analysis with missing data obtained via the EM algorithm converge, although convergence is very slow. The mixture model for the analysis of PBC treated at RSUP Dr. Sardjito Yogyakarta in 2004-2009 is a Weibull mixture model. This model combines two two-parameter Weibull distributions: one as the density function of the lifetime of cured PBC, and one as the density function of the lifetime of uncured PBC, which contains missing data. The data analysis was carried out with Maple, and the iteration programs were developed in Matlab. The study identifies several open issues that need further investigation: (1) other software for numerical iteration should be developed for parameter estimation in mixture models with particular lifetime distributions, where the maximizer of the expected log-likelihood function has no closed form; and (2) further research should focus on developing survival analysis tests for sufficient follow-up in cure rate analysis.
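To make the EM iteration for a two-component Weibull mixture concrete, the following is a minimal Python sketch. It is not the paper's Maple or Matlab program. It assumes a deliberately simplified setting: no censoring, no covariates, and the only missing data is each patient's unobserved component label. The data are synthetic, and the function names (`weibull_logpdf`, `weighted_weibull_mle`, `em_weibull_mixture`) are hypothetical. Because the Weibull shape has no closed-form maximizer, the M-step solves its score equation numerically by bisection.

```python
import math
import random


def weibull_logpdf(t, shape, scale):
    """Log density of a two-parameter Weibull(shape, scale) at t > 0."""
    z = math.log(t) - math.log(scale)
    # The exponent is capped only to avoid floating-point overflow.
    return (math.log(shape) - math.log(scale) + (shape - 1.0) * z
            - math.exp(min(shape * z, 700.0)))


def weighted_weibull_mle(log_t, w):
    """Weighted Weibull MLE; the shape is the root of a monotone score equation."""
    sw = sum(w)
    mean_log = sum(wi * li for wi, li in zip(w, log_t)) / sw

    def tilted(shape):
        # Stabilised sums of w_i * t_i^shape and w_i * t_i^shape * log(t_i).
        m = max(shape * li for wi, li in zip(w, log_t) if wi > 0.0)
        e = [wi * math.exp(shape * li - m) for wi, li in zip(w, log_t)]
        return m, sum(e), sum(ei * li for ei, li in zip(e, log_t))

    def score(shape):
        _, s0, s1 = tilted(shape)
        return s1 / s0 - 1.0 / shape - mean_log

    lo, hi = 1e-3, 1.0
    while score(hi) < 0.0 and hi < 1e3:  # expand until the root is bracketed
        hi *= 2.0
    for _ in range(100):  # bisection on the increasing score function
        mid = 0.5 * (lo + hi)
        if score(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    shape = 0.5 * (lo + hi)
    m, s0, _ = tilted(shape)
    scale = math.exp((m + math.log(s0) - math.log(sw)) / shape)
    return shape, scale


def em_weibull_mixture(times, n_iter=200, tol=1e-8):
    """EM for p * Weibull(k1, s1) + (1 - p) * Weibull(k2, s2) on uncensored times."""
    log_t = [math.log(t) for t in times]
    cut = sorted(times)[len(times) // 2]
    resp = [0.9 if t <= cut else 0.1 for t in times]  # crude starting weights
    history = []
    for _ in range(n_iter):
        # M-step: update the mixing proportion and both Weibull components.
        p = sum(resp) / len(resp)
        k1, s1 = weighted_weibull_mle(log_t, resp)
        k2, s2 = weighted_weibull_mle(log_t, [1.0 - r for r in resp])
        # E-step: posterior probability of component 1, plus the log likelihood.
        loglik = 0.0
        for i, t in enumerate(times):
            a = math.log(p) + weibull_logpdf(t, k1, s1)
            b = math.log(1.0 - p) + weibull_logpdf(t, k2, s2)
            m = max(a, b)
            li = m + math.log(math.exp(a - m) + math.exp(b - m))
            resp[i] = math.exp(a - li)
            loglik += li
        history.append(loglik)
        if len(history) > 1 and history[-1] - history[-2] < tol:
            break
    return {"p": p, "comp1": (k1, s1), "comp2": (k2, s2), "loglik": history}


# Synthetic lifetimes (illustrative only, not the hospital data).
random.seed(0)
times = ([random.weibullvariate(2.0, 1.5) for _ in range(150)]
         + [random.weibullvariate(10.0, 4.0) for _ in range(150)])
fit = em_weibull_mixture(times)
```

The list `fit["loglik"]` records the log likelihood after each iteration, so the non-decreasing behaviour and the speed of convergence can be inspected directly. Extending the sketch to censored lifetimes or to covariates under the PH assumption would change both the E-step and the M-step.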