Electronic Health Record (EHR) systems are highly regarded in the medical world. Using these systems, a medical institution can build a clinical data repository containing extensive records for a large number of patients, which makes retrospective research more efficient. However, human factors in the electronic data recording process introduce data quality problems. To address these problems, a data quality engineering framework based on similarity functions and master data is developed. The proposed framework is applied to a population-based cancer registry program. Finally, experimental results are presented to show the effectiveness of the proposed framework.
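
As an illustration only, the idea of combining a similarity function with master data can be sketched as follows. This is a minimal sketch under stated assumptions, not the framework itself: the master list `MASTER_SITES`, the helper names `similarity` and `standardize`, the choice of Python's `difflib` ratio as the similarity function, and the 0.8 threshold are all hypothetical and do not come from the paper.

```python
from difflib import SequenceMatcher

# Hypothetical master data: the approved values for one coded field.
MASTER_SITES = ["Stomach", "Colon", "Breast", "Lung"]


def similarity(a: str, b: str) -> float:
    """Return a score in [0, 1]; case and surrounding spaces are ignored."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def standardize(value: str, master: list[str], threshold: float = 0.8):
    """Map a recorded value to its closest master entry.

    Returns None when no master entry reaches the (assumed) threshold,
    so that the record can be flagged for manual review instead of
    being silently changed.
    """
    best = max(master, key=lambda entry: similarity(value, entry))
    return best if similarity(value, best) >= threshold else None
```

Calling `standardize(recorded_value, MASTER_SITES)` on each recorded value either returns a master entry or `None`. Returning `None` rather than the nearest entry is a deliberate choice in this sketch: a value that resembles nothing in the master data is more likely a genuinely new or wrong entry than a typing error.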