With advances in monitoring devices, especially drones, remotely sensed imagery and sensors, the Internet of Things (IoT) and apps on mobile phones, the amount of data in the world is growing exponentially. Cisco estimates that by 2020 there will be 40 zettabytes of data in existence. Some of this data will be raw and unstructured, and only part of it may be turned into “intelligence” that could be valuable for creating knowledge, making decisions and assessing their effectiveness.
Today, considerable environmental data is available in the public domain, making us feel that we are data rich. But we are still poor in insights, as the data isn’t structured, harvested and analysed efficiently.
Environmental monitoring brings in data that is multivariate, ever increasing in volume and at the same time fuzzy. Much of the environmental data is still collected in silos and not as cohorts, in campaigns on the scale it deserves.
More and more organizations today spend resources on collecting data without paying attention to its quality. India’s environmental regulators are glaring examples of such organizations. The data collected is hardly analysed, shared or used for decision making. My direct experience of working with these regulatory institutions has been dismal. It is horrifying to see millions of rupees spent with no return on the investments made in environmental monitoring.
An important tactic to get the most value out of your data is data virtualization. Virtualization allows organizations to link data in different formats across different platforms and protocols. Users can build a visual representation of data patterns that encourages interrogation of the data. Tools like Artificial Intelligence powered infographics and GIS are useful here.
Virtualization provides a fast way of detecting trends, identifying hot spots and exploring associated data sets to find correlations. Environmental Impact Assessment and State of Environment Reporting should ideally use techniques of virtualization. Unfortunately, they don’t. The environmental regulators in the country do not have such capacities and often have no interest in data virtualization. Questions such as “What purpose do these data serve?” and “How will these data show us whether we are safeguarding our environment?” are not discussed.
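To make the idea of linking data across formats a little more concrete, here is a minimal sketch in Python. It assumes two hypothetical custodians: a regulator that exports daily PM2.5 readings as CSV and a health agency that publishes daily clinic visits as JSON. The field names, the helper functions and all the numbers are made up purely for illustration; they are not real monitoring data. The two sources are read in their own formats, joined by date only when queried, and then checked for a correlation.

```python
import csv
import io
import json

# Hypothetical source 1: a regulator's CSV export of daily PM2.5
# (micrograms per cubic metre). Values are illustrative, not real.
AIR_CSV = """date,pm25
2019-01-01,80
2019-01-02,120
2019-01-03,100
"""

# Hypothetical source 2: a health agency's JSON feed of daily clinic visits.
HEALTH_JSON = (
    '[{"day": "2019-01-01", "visits": 10},'
    ' {"day": "2019-01-02", "visits": 18},'
    ' {"day": "2019-01-03", "visits": 14}]'
)


def read_air(text):
    """Yield (date, pm25) pairs from the CSV source."""
    for row in csv.DictReader(io.StringIO(text)):
        yield row["date"], float(row["pm25"])


def read_health(text):
    """Yield (date, visits) pairs from the JSON source."""
    for record in json.loads(text):
        yield record["day"], float(record["visits"])


def virtual_view(air, health):
    """Join the two sources on date at query time, without altering either."""
    visits_by_day = dict(health)
    return [(day, pm, visits_by_day[day]) for day, pm in air if day in visits_by_day]


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
    spread_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
    return cov / (spread_x * spread_y)


rows = virtual_view(read_air(AIR_CSV), read_health(HEALTH_JSON))
r = pearson([pm for _, pm, _ in rows], [visits for _, _, visits in rows])
print(rows)
print(round(r, 3))
```

A real virtualization layer would of course sit on top of live databases and services rather than strings in a script, and a correlation on three made-up points proves nothing. The sketch only shows the principle: each custodian keeps its own format, and the link is made at the time of the question.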
My Professor Friend feels that India needs an independent National Environmental Monitoring Organization (NEMO).
“Why a NEMO, Professor?” I protested. “We already have enough baggage of existing environmental organizations that are not serving their purpose.”
“Well Dr Modak, please understand that these organizations do not look at the environment as a ‘system’. Environmental monitoring is still visualized from a narrow perspective and in silos. For example, it is not enough to monitor only ambient air quality in cities without linking it, through concurrent and cohort monitoring, with data from health and epidemiological surveys. We should be addressing multiple media. Sulphur dioxide in air, for instance, is just a partial description of the sulphur pathway; we also need to know sulphur in the soil and sulphates in water to understand the sulphur flux. Only then will we understand the slow but dangerous chain of environmental impacts that are not easily foreseen. It’s ecosystem monitoring that I am asking for.”
I understood the Professor’s point.
I said, “Professor, in that case, we will need to engage and connect all the institutions that generate water data and associated data to understand what is happening to the Ganga basin. This may include the Ministry of Water Resources, the Ministry of Environment, Forests and Climate Change, the Ministry of Agriculture, the Ministry of Health etc. This list will be large.”
Professor smiled. I continued.
“So if I wanted to understand the particulates in the air, then I would probably need to pull in data on atmospheric turbidity, particle concentrations in various size fractions (e.g. PM2.5), emission sources (local as well as regional), meteorology (wind speed and direction, number of wet days, rainfall), reporting of chest illnesses, sale of inhalers and masks, instances of vegetation injury (e.g. loss of chlorophyll), indoor air quality etc. Indeed, environmental data is much more than conventional pollution data.”
Professor took a deep puff after lighting his cigar and said,
“Well Dr Modak, remember that much of this data will be with different data custodians, in different formats, collected at different frequencies and even following different monitoring protocols. You need a Supra Organization to streamline, set protocols, ensure quality, analyse, interpret and report to stakeholders the status, trends, concerns and the outlook ahead, so that management decisions can be taken. This organization should also address the ‘data gluttony’, i.e. stop collecting data that does not make sense! This Supra Organization should be NEMO. Today, we are living in a chaos of useful, useless and poor-quality environmental data.”
I could now understand his argument.
I recalled that one of my friends in the US had once said that environmental and associated regulations are the major drivers of environmental data generation, processing and use. He cited a list of relevant environmental regulations in the United States, such as:
- Atomic Energy Act
- Clean Air Act
- Clean Water Act
- Coastal Zone Management Act
- Comprehensive Environmental Response Compensation and Liability Act (CERCLA)
- Emergency Planning and Community Right to Know Act (EPCRA)
- Endangered Species Act
- Federal Food, Drug and Cosmetic Act
- Federal Land Policy and Management Act
- Federal Insecticide, Fungicide and Rodenticide Act
- Food Quality Protection Act
- Fisheries Conservation and Management Act
- Marine Mammal Protection Act
- National Environmental Policy Act (NEPA)
- Oil Pollution Act
- Resource Conservation and Recovery Act (RCRA)
- Safe Drinking Water Act
- Surface Mining Control and Reclamation Act
- Toxic Substances Control Act
- California Environmental Quality Act (CEQA)
- California Global Warming Solutions Act (Assembly Bill (AB)
All these regulations generate billions of environmental data points every year.
I realized that a similar situation exists in India. But will NEMO be a solution? I wasn’t sure.
Professor saw that I was a bit stunned and lost in thought, trying to comprehend such “data drowning”. He got up from his chair to fetch his ashtray. He looked out of the window and said,
“Dr Modak, today is the era of crowdsourced and participatory ‘big data’ that is now placed before the judiciary for environmental justice. The burden of scientific proof of environmental harm falls on affected communities, not polluters. The environmental justice movement has worked with ‘citizen–expert alliances’ to make credible scientific claims about environmental exposures in their communities. This is a new and rapidly expanding source of data, asking for environmental justice and sometimes questioning data from the regulators. So, we have two major categories: institutionally generated data (by regulators, corporations, states and other institutions) and citizen-generated (crowdsourced or mined) data. Often the two sources of data contradict each other, making the task of the judiciary very difficult.”
He then pointed to a picture on the wall. It was a picture of an orchestra.
“Do you know the conductor, Dr Modak?” Professor asked.
“It’s Zubin Mehta, Professor,” I answered. That was easy.
“So, it’s Zubin Mehta I am talking about. We need a NEMO like him to play the sweet music of environmental data: music that is creative, integrated, well scripted and coordinated, and that can inspire!”
He got up to go and ended the conversation saying,
“The outcome will then be music and not noise!”