A global revolution

Rule of Data
By Dr S. Saraswathi
(Former Director, ICSSR, New Delhi)

The Union Health Ministry has no information on vaccine doses procured and administered by private hospitals. Despite the reservation of 25% of the vaccine allocation for the private sector, and its freedom to purchase vaccines directly from manufacturers, only 6% of the doses administered were given in private hospitals. This shows the urgent need to improve our database for better results.
We are living in the age of information, which researchers call data and from which inferences can be drawn and conclusions reached. The quality, quantum, and coverage of data determine how effectively the decisions arrived at will fulfil our aims and objectives. The value of data lies in their use. Every organ, agency, or institution in a democratic set-up is ultimately accountable to the public for every decision and action. Hence, there is virtually a global revolution towards the rule of data, or Data Raj.
The Covid-19 pandemic has taught us the importance of recording correct and complete facts pertaining to the prevalence and control of the disease. This lesson, that data quality is the key priority, is a by-product of the pandemic; it has made us realise that much of the work in epidemic control centres revolves around data generation, retrieval, and updating. Disparities between different agencies in their assessment of Covid-19 cases, the availability of vaccines and medicines, the number vaccinated, and the number of deaths raise the question of how reliable the available data are. Each State has its own reporting mechanism, and the quality of data is not comparable across States.
Data quality, users have found from experience, has six dimensions: accuracy, completeness, consistency, timeliness, validity, and uniqueness. Weakness in even one of these aspects will show in the results and make them questionable. But none of these can be guaranteed by any system of data collection. Our aim therefore has to be lowered to the elimination of errors, omissions, bias, inconsistencies, delay, and doubts in data. The properties or criteria used must also be common across the different areas or subjects covered within an investigation.
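To make these dimensions concrete, here is a minimal sketch in Python of how four of them (completeness, uniqueness, validity, and timeliness) might be scored on a small batch of records. Everything in it is an assumption made for illustration: the field names, the sample records, the three-day reporting limit, and the function names are hypothetical and do not describe any agency's actual system. Accuracy and consistency are left out because they can only be judged against an independent reference source.

```python
from datetime import date

# Hypothetical records; field names and values are invented for illustration.
RECORDS = [
    {"id": "A1", "state": "X", "doses": 120,
     "event": date(2021, 6, 1), "reported": date(2021, 6, 2)},
    {"id": "A2", "state": "X", "doses": None,
     "event": date(2021, 6, 1), "reported": date(2021, 6, 9)},
    {"id": "A2", "state": "Y", "doses": -5,
     "event": date(2021, 6, 3), "reported": date(2021, 6, 4)},
]

REQUIRED = ("id", "state", "doses", "event", "reported")


def completeness(records):
    """Share of required fields that are actually filled in."""
    filled = sum(r.get(f) is not None for r in records for f in REQUIRED)
    return filled / (len(records) * len(REQUIRED))


def uniqueness(records):
    """Ratio of distinct record ids to the number of records."""
    ids = [r["id"] for r in records]
    return len(set(ids)) / len(ids)


def validity(records):
    """Share of records whose dose count is a non-negative integer."""
    ok = sum(isinstance(r.get("doses"), int) and r["doses"] >= 0
             for r in records)
    return ok / len(records)


def timeliness(records, max_delay_days=3):
    """Share of records reported within the assumed delay limit."""
    on_time = sum((r["reported"] - r["event"]).days <= max_delay_days
                  for r in records)
    return on_time / len(records)


def quality_report(records):
    """Collect the four scores, each between 0 and 1, in one dictionary."""
    return {
        "completeness": completeness(records),
        "uniqueness": uniqueness(records),
        "validity": validity(records),
        "timeliness": timeliness(records),
    }


report = quality_report(RECORDS)
print(report)
```

A score below 1 on any dimension points to records that need attention before the data are used; what counts as an acceptable score is a matter for the investigation concerned.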
Public healthcare has at present become the context for widespread discussion of data quality. It is for this reason that the WHO extends its support to member-States to strengthen their capacity to collect, compile, manage, analyse, and use health data. Such data are drawn from population-based sources such as household enumeration and surveys, from civil registration systems recording vital events such as births and deaths, from the administrative and medical activities of health centres at different levels, and from records of the total usage of vaccines and medicines and of the number of available medical, paramedical and health workers.
Comparisons are possible between regions, countries, States/provinces, districts, zones, wards and so on, on the basis of available data. These help concerned authorities to make appropriate changes and adjustments in the methods adopted for containing the epidemic.
Estimates of deaths due to Covid-19 by different agencies vary widely. There is gross under-estimation in many nations, leaving them under-prepared for fresh challenges. Given the magnitude of the problem, accuracy of data on Covid-19 deaths is of paramount importance, but there are still controversies over the cause of death. A sense of shame in reporting deaths, and a tendency to claim undue credit for controlling the pandemic and setting an example, may have led to under-counting and manipulation of death data.
Regular and reliable data on existing health facilities are central to the quality and availability of health services. Health sector planning is impossible without data. But the public, which associates sickness only with doctors, hospitals, and medicines, has taken a long time to realise that a person becomes a patient in an environment subject to several social, economic and cultural influences.
The Government of India has released the National Guidelines for Data Quality in Surveys to provide comprehensive guiding principles and best practices for mitigating the errors and biases that may occur while designing a project, conducting the surveys and analysing the responses. The initiative for the guidelines came from the National Data Quality Forum (NDQF) housed at the ICMR. The guidelines are meant specifically for demographic, health and nutrition surveys, with the aim of advancing data quality monitoring and data analysis and improving the capacity of data collection agencies. Training in machine learning techniques is also imparted to guide the application of the guidelines.
Beyond health and disease, Data Raj has come to stay in all policy-making exercises, budgeting, fund distribution, and evaluations.
Another area clamouring for data is the “Quota Policy”, which links opportunities for education and employment with the numerical strength of castes and communities. The demand for caste data is a typical instance of the misapplication of data rule for political advantage. It is made knowing full well the administrative difficulties involved, the complicated nature of the caste phenomenon itself, which makes it hard to obtain a true picture of caste divisions, and the adverse impact of such a census on social unity.
Agricultural progress depends much on the database, its availability, and how well it is understood by those in the sector. Data-based decisions at farm level can enhance resource utilisation and conservation practices, which will improve production and lessen costs. Extending this to the regional level will help in the formulation of long-term policies. The Union Minister for Agriculture and Farmers’ Welfare has been emphasising the importance of preparing a farmers’ database. Since agriculture is not an organised sector, collecting data from individual land owners involves tremendous effort. But the outcome will be just as valuable.
Quality Control (QC) of data is given primary importance in the US. It refers to application of methods or processes that determine whether the data collected meet the requirements for reaching the set goals and meeting the quality criteria prescribed. It is necessary in any planning anywhere.
Censuses, sample surveys, and administrative data are the three main sources of data in modern countries. In India, the most important government sources of data include the Reserve Bank of India, Ministry of Statistics & Programme Implementation, Survey of India, India Weather Data, and National Portal of India. The Census, National Sample Survey, National Rural Health Survey, Election Commission, and National Crime Records Bureau produce enormous and continuous data. That they provide the government with a basis for short- and long-term planning and decision-making on related issues hardly needs mention. The National Data Bank of Socio-Religious Categories has also been developed.
The first major data treasury built on scientific principles is the Survey of India, set up in 1767 by English expansionists for “exploring the unknown territory”. It has grown into a rich resource for geospatial data and is the nation’s principal mapping agency. Countrywide census operations began in 1871. Along with these, ethnographic surveys have been conducted to study the population.
Data may be quantitative or qualitative, in the form of statistics or of documents and records. Court records and judgements, administrative reports and budget papers, proceedings of parliament and legislatures, reports of events and speeches, and many such records provide immense data. These have to be authentic and unbiased versions to be usable for policy-making or research.
The world is going through a data revolution touching creation, collection, preservation, and use in every sphere. Just as authentic data lead to improvement and progress, circulation of fake and false data will lead us to decline and fall. — INFA