In the Crimean War, more soldiers died from unsanitary conditions than on the battlefield. Florence Nightingale (Florence, 1820 - London, 1910), the Victorian pioneer of modern nursing, demonstrated this with statistics. It is a historical example of how data is present in every activity of life, to save it, ease it, understand it or steer it. Far beyond the algorithms that shape social networks or guide our shopping, big data makes today's world possible. It is used to design pipe networks, make industrial processes more efficient and distribute traffic or tourist flows, and it lies behind the first message we read on our phone each morning or the first light we switch on. The list is endless. "In a single day," according to Rafael Pardo, director of the BBVA Foundation, "as much information is generated today as in the aggregate in the five millennia since writing began."
Last Wednesday, the first awards of the Society of Statistics and Operational Research (SEIO) and the BBVA Foundation were presented. They recognize the most innovative research contributions made in Spain in this field. The winners have applied data science to health, energy and industry, and even to detecting gender-based violence that had gone unnoticed. Jesús López Fidalgo, a 58-year-old from León, president of the SEIO and director of the Institute of Data Science and Artificial Intelligence of the University of Navarra, warns: "Statistics and operational research, with their models and algorithms, are, like the air, everywhere, even if we can't see them."
Q. Do algorithms rule our world?
A. They are present in our daily actions. Although we do not realize it, we make many decisions automatically, without thinking much about them, by using a simple algorithm. For example, when you go to the supermarket and, at the moment of paying, you assess whether the people queuing ahead of you have more or fewer products, or whether they are older or younger. Almost unintentionally, we apply a statistical model to make that decision. If we carry this over to problems more complex than choosing between three lines in a supermarket, we need the help of computers and more sophisticated statistical models.
Q. The awards of the society that you chair and the BBVA Foundation cover the use of data in all aspects of our lives.
A. Statistics and operational research, with their models and algorithms, are, like the air, everywhere. Although we do not see them, they are there when we use our phones, when we open an application or search on Google.
Q. In addition to governing our present, can they predict the future?
A. In fact, trying to predict is an important part of modelling. It is not about knowing what is going to happen to me tomorrow, but about making general predictions, for example to help those who govern make one decision or another with the common good in mind.
Q. And will algorithms be able to convict someone in a trial or determine the likelihood of reoffending?
A. The power of data depends, to a large extent, on what we let them do. Very important ethical problems arise here and I would say that they have not been fully resolved because this is going too fast. Lawyers are a bit behind and those who work on ethical issues, too. But I insist that, to a large extent, the data does what we let it do. I often say that everything a computer can do could be done by a human as long as they had enough time and memory. The information revolution allows us to make these calculations and make decisions very quickly, but there is a philosophical problem for which it is difficult to give an answer. A self-driving car can make decisions that, perhaps, we would not have made. This is an example of how data can take us where we don't want to go. But human responsibility will always be there.
Q. Because numbers have no morals.
A. Indeed. In fact, so-called biased algorithms lead to unwanted decisions, such as hiring a person on the basis of gender. In the end, it is enough to investigate a little and look inside the workings of the algorithm to realize that it is doing nothing other than what society does: if society is sexist, it will make a decision with that bias. We learn from that, and we are trying to untangle these algorithms from a mathematical point of view, to explain why they make some decisions and not others.
Q. Some examples cast doubt on data science, such as electoral polls or many of the pandemic prediction models, which have been proven wrong.
A. The first thing to say is that, when the human factor intervenes, making predictions is very difficult because we are quite unpredictable. With physical processes it is easier. Knowing how a tumor is going to evolve or how a machine is going to behave is, to a certain extent, simple. But when the human factor intervenes, things get very complicated. In addition, any published result from an electoral poll later influences the vote. We could say that a poll shows the opinion of the voters at the time it was taken, but that opinion then changes because of the poll itself or for other reasons. Although I don't like the name, it is important to "cook" the data. That does not mean manipulating it, but adjusting some estimates to correct for the lack of representativeness of a sample, so that they fit reality better. On covid, there has been a fundamental problem with the quality of the data. There is no need to blame anyone. It simply caught us off guard. The priority has been to cure people and to prevent infection rather than to collect quality data, in the face of something as unknown and new as the pandemic has been, in a globalized world with more mobility than ever before in history. We have all been overwhelmed, especially the health workers and those who have to make the decisions.
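The adjustment López Fidalgo describes, correcting a sample that does not represent the population, is commonly done by reweighting. The sketch below illustrates one such technique, post-stratification; it is not a method the interviewee names. The function names, the age groups, the sample and the population shares are all hypothetical, chosen only for illustration.

```python
def raw_share(sample):
    """Unadjusted share of supporters in the sample."""
    return sum(1 for _, supports in sample if supports) / len(sample)


def poststratified_estimate(sample, population_shares):
    """Weight each group's support rate by that group's share of the population.

    sample: list of (group, supports) pairs, supports being True or False.
    population_shares: dict mapping group -> share of the population (sums to 1).
    """
    counts = {}
    supporters = {}
    for group, supports in sample:
        counts[group] = counts.get(group, 0) + 1
        supporters[group] = supporters.get(group, 0) + (1 if supports else 0)

    estimate = 0.0
    for group, share in population_shares.items():
        if group not in counts:
            raise ValueError(f"no respondents in group {group!r}")
        estimate += share * supporters[group] / counts[group]
    return estimate


# Hypothetical poll: older respondents are over-represented (8 of 10).
sample = (
    [("young", True)] * 2
    + [("older", True)] * 2
    + [("older", False)] * 6
)
# Assumed population shares, invented for this example.
population_shares = {"young": 0.5, "older": 0.5}

print(raw_share(sample))                                   # 0.4
print(poststratified_estimate(sample, population_shares))  # 0.625
```

With these invented numbers the raw sample understates support, because the group that supports the option most is under-sampled; reweighting by population share moves the estimate from 0.4 to 0.625.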
Q. Can the data manipulate us?
A. It is very easy to manipulate data using statistical models or optimization algorithms in a perverse way. But it is easier to manipulate without data. As long as there is data, we can do something, but without it we can manipulate whatever we want.
Q. Do you have to study statistics to understand God, as Nightingale said?
A. What is clear is that it helps us to better understand the world, nature and how things work. She was a very religious person and dedicated herself to nursing and statistics. What she was saying is that if we want to understand why God has made this world and why he has made it this way, statistics can help us.
Q. You also turn to George Box, who said: "The more I work, the luckier I am."
A. Picasso also said that inspiration always found him working. One might think that it comes while walking through the garden or riding the subway, but inspiration often comes from working on things that have no clear immediate application.
Q. Therefore, chance does not exist, as your book is titled.
A. That title came out of a bit of annoyance, because they refused to put "statistics" in the title and I said: "Well, put Chance doesn't exist." It is the beginning of the famous phrase attributed to Einstein, which continues: "God does not play dice." The key is that those who work are rewarded in the end and that, when things are studied, they are better understood, and what we used to call chance we are now able to explain.
Q. Does statistics have no adverse effects?
A. One example would be insurance companies, which have always used a lot of statistics because the better we know what can happen to a person, the better we can predict the probability that they will have an episode that involves a high cost for the company. This leads to many people not getting insurance. In Spain we have the advantage of a universal health system, but in other countries, like the United States, there are sometimes terrible situations in which people go without care because they have no insurance. It is perfectly fine for an insurance company to use statistics to adjust its premiums, but the principle of solidarity must prevail so that, together, we can help someone who needs it.
Q. It also has positive effects…
A. They must be highlighted above all else. When there are complaints about supposed permanent surveillance, we must bear in mind that our data is used to feed the models. The perverse thing is when it is used against us, but statistics plays no part in that: it is a science of the population, not of the individual, although it later affects the good of each person. There is a very important legal and ethical issue, obviously, but it is not a consequence of statistics.
Q. It can be a tool or a weapon.
A. Yes. Once again, what is decisive is the human capacity to use it in one way or another.
Q. It is clearly a science on the rise, because in mathematics there is no unemployment.
A. There never was. Lately we have more students in mathematics and statistics, but not enough, and we have a very serious problem because too few make it to our degrees, despite the recent boom. We also have the additional problem that we do not have enough people to train them. In addition, statistics and operational research are hardly taught in pre-university education, and this has consequences when students choose a career. It is very difficult for someone to choose something they do not know. In Secondary and Baccalaureate education, statistics is barely taught, with some exceptions, and so it is difficult to make it attractive. It is also often professionals from other fields who end up teaching it, and I think a statistician will always be able to teach this science with more enthusiasm. Moreover, what is not evaluated is devalued, and in practical terms it could be said that statistics is barely evaluated in university entrance exams. It is an essential issue. We have a contest called Incubator of Surveys and Experiments, aimed at teachers and students, and it's amazing what even first-year ESO kids are capable of doing when given the opportunity.