Data Trends - Repository by SynerScope

AI models are increasingly applied in high-stakes domains like health and conservation. Data quality carries an elevated significance in high-stakes AI due to its heightened downstream impact, impacting predictions like cancer detection, wildlife poaching, and loan allocations. Paradoxically, data is the most under-valued and de-glamorised aspect of AI. In this paper, we report on data practices in high-stakes AI, from interviews with 53 AI practitioners in India, East and West African countries, and USA. We define, identify, and present empirical evidence on Data Cascades (compounding events causing negative, downstream effects from data issues) triggered by conventional AI/ML practices that undervalue data quality. Data cascades are pervasive (92% prevalence), invisible, delayed, but often avoidable. We discuss HCI opportunities in designing and incentivizing data excellence as a first-class citizen of AI, resulting in safer and more robust systems for all.

All of the hype around dark data and AI-driven dark-analytics is missing the first crucial step, which is to identify what is real and what is useful, and for what application. While it may sound exciting, using AI to automatically process vast amounts of data in ultra-fast operations is not going to provide much insight.
We have to remove and classify this noise which, not unlike the Higgs Boson, is an elementary particle of information that is unable to be examined using existing knowledge. Once we get to the “good data,” that is, data that is useful to analytics and thus for gaining insight, only then can we use targeted applications and methods to not only yield insights, but manage for potentially wrong conclusions and, consequently, make better decisions.
Professor David Hand

Sharing Knowledge

Repository

(Dark) Data: Market Trends

The AI with a two track mind

European public sector seeks multi-cloud approach to services

Large Language Models as Zero-shot Labelers

How Data Governance is Central to Effective Data Analytics

OpenAI Is Now Everything It Promised Not to Be: Corporate, Closed-Source, and For-Profit

Are You Making These Deadly Mistakes With Your AI Projects?

Data ethics: What it means and what it takes

Why Analytics Is So Hard

NAO guide for senior government leaders flags barriers to better data use

Venture Investors See Potential In Data Observability

The Overestimation of neural networks and deep learning

Fixing the GDPR: Towards Version 2.0

How AI, data analytics could help identify insurance fraud

Companies are losing revenue and customers due to AI bias

Dark data creates a black hole of carbon emission: report

Bringing Dark Data to Light: How to Handle the Next Great Business Resource

“Everyone wants to do the model work, not the data work”: Data Cascades in High-Stakes AI

Avoiding Data Disasters

Data Classification Market to Grow at a CAGR of 27.6 % to reach US$ 3,577.4 Million from 2020 to 2027

5 Challenges of Big Data Analytics in 2021

Blog: Dark Data (European Mathematical Society)

Illuminating Insights Hidden in Dark Data

Blog: Dark data – the hidden gold

Presentation: Control Dark Data in the New Age of Compliance and Security

Book: Dark Data – Why what you don’t know matters

What is dark data? (a clear explanation)

Dark Data: Why What You Don’t Know Matters

Lighting up Dark Content

The Parallel Universe of Dark Data and Dark Matter

Dark Data. Bright Light
