David de Souza
The Lessons from Data Science
Updated: Feb 28
"While metrics are important in measuring "success" in any endeavour including life, they hide the extreme complexity of all of the many choices and interactions that produced any given result."
Gary M is a data scientist. I asked him six questions about his field and what lessons and principles can be taken from data science and applied to other areas in a multidisciplinary way:
1. What life lessons have you learned from data science?
Data science is all about numerical abstraction. In data science we measure the accuracy of our models by condensing complex ideas into just one or at most a few numbers. While metrics are important in measuring "success" in any endeavour including life, they hide the extreme complexity of all of the many choices and interactions that produced any given result. For people, this includes numbers such as one's time spent in education, current income, bank balance, and the number of Facebook friends and likes one has.
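As a minimal sketch of this point, consider two hypothetical classifiers. The labels and predictions below are invented purely for illustration and do not come from any real model. Both score the same accuracy, yet they make opposite kinds of mistakes, which the single number does not show.

```python
# Hypothetical labels and predictions, invented for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
model_a = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]  # misses two positives
model_b = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]  # raises two false alarms


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def error_breakdown(y_true, y_pred):
    """Count false negatives and false positives separately."""
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    return {"false_negatives": fn, "false_positives": fp}


for name, preds in [("A", model_a), ("B", model_b)]:
    print(name, accuracy(y_true, preds), error_breakdown(y_true, preds))
```

Whether a missed positive or a false alarm matters more depends on the problem, and that is exactly the kind of detail a headline metric leaves out.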
2. What do you consider ‘beauty’ in your field of work?
When the AI models produced perform better than theory or even prior practice predicts. There are rare days when everything seems to come together and my results exceed any reasonable expectations. Sometimes AI models produce human-like qualities of intuition, and then measurably improve at a rate where they outperform anything a human could ever do unaided. As a future-focused data scientist, I can't help extrapolating into the future of error-free self-driving cars, highly personalised AI assistants, AI-generated metaverses, and other completely unimaginable AI technologies that will revolutionise our human experience more than the Internet has over the past two decades.
3. What are the main principles of data science? What principles from data science can be applied to other fields?
Data science is changing faster than almost any other scientific field. Hundreds of research papers are published every week. Good data science is 50% scientific method extracted from the most recent research and 50% intuition in choosing which of the many possible experiments to try next. A lot of my data science feels like pure scientific research. Much of my time is spent in systematic trial and error, slowly building up an understanding of what tools and techniques I need to use to solve my problem. I've managed to generate results that are "too good to publish", only to then have to spend a month verifying I haven't made a single simple mistake. Other fields tend to be more bound by tradition and "what worked yesterday". Data science reinvents itself every year.
4. What small things make a big difference in data science?
"Know your data." A talented data scientist has an intimate understanding of the raw material she is working with: data. She needs to systematically review, clean, organise and arrange her data extremely thoughtfully. The choice of which of the latest and greatest data science tools to use is actually less important than the many subtle choices of how to present one's data to those same tools. The combination of a careful and comprehensive understanding of one's data with the choice of the optimal tools makes for leading-edge data science.
5. What is the biggest misconception or the biggest mistake that people make about data science?
As a general rule, things that humans do well are hard to solve with data science. Conversely, the things that humans do poorly often aren't particularly difficult to solve with computers. For the foreseeable future the combination of data science prediction with human judgment will be both necessary and complementary.
6. Which single concept from data science deserves to be more widely known?
Exponential acceleration in computing technology. Today's tools are being used to design tomorrow's tools. As long as the tools continue to improve there are no theoretical limits to what data science can achieve. Even data scientists find it difficult to predict what will happen in a few years' time. A good data scientist is a humble data scientist when predicting the future.