Repository

(Dark) Data: Market Trends

If you are interested in knowing what is happening in the world of Dark Data, Data Compliance, AI, Data Quality and other data issues, feel free to browse the information below.

Here you can read materials written by independent writers. This should give you a good insight into why Ixivault, our app to uncover your (dark) data, fits within this market and is a solution to your (dark) data problems.

Enjoy the read!

The AI with a two track mind

Speaking to a group of analysts recently, a product marketer for one of those vendors at the forefront of generative AI app assistance suggested that without the use of real business data to back it, the technology in enterprise terms is “largely a parlour trick”. We agree.
Learn more...

European public sector seeks multi-cloud approach to services

By allocating different workloads to different cloud providers, governments can create digital ecosystems for innovation that are both flexible and safe.
Although there had already been a growing call among Europeans for digital sovereignty, the war in Ukraine and its geopolitical impact have accelerated the need for governments to control their technology, operations and data.
Learn more...

Large Language Models as Zero-shot Labelers

Labeling data is a critical step in building supervised machine learning models, as the quantity and quality of labels is often the main factor that determines model performance.
However, labeling data can be very time-consuming and expensive, especially for complex tasks that involve domain knowledge or reading large amounts of data.
Learn more...

How Data Governance is Central to Effective Data Analytics

In the publishing industry, there are a lot of things we can measure. However, if there is no strategy underlining how and why we collect data and who can access it, the value is lost. Not only that, but we can put our business at serious risk of non-compliance.
Ultimately, data governance is central to effective data analytics and underpins everything we do.
Learn more...

OpenAI Is Now Everything It Promised Not to Be: Corporate, Closed-Source, and For-Profit

OpenAI is today unrecognizable, with multi-billion-dollar deals and corporate partnerships. Will it seek to own its shiny AI future?
There's no question that OpenAI's generative AI is now big business. It wasn't always planned to be this way.
Learn more...

Are You Making These Deadly Mistakes With Your AI Projects?

“Too Much of the Wrong Data, and Not Enough of the Right Data is Killing AI Projects” is the title of a great blog by Kathleen Walch of Cognilytica in Forbes. Just recently I posted a blog in the same vein by Paul Mah, "B2B tech storyteller". So from New York to Amsterdam to Singapore the message is ‘know your data, know it well’. #ai #opengovernment #gdpr #archiving all demand that organizations get to know their #data much better and stay on top of it. Know what’s in the data you’re using before you pass any of it through AI, or face the risks if you don’t.
Learn more...

Data ethics: What it means and what it takes

Every company must establish its own best practices for managing its data. Here are five pitfalls to avoid based on our conversations with experts and early adopters.
Learn more...

Why Analytics Is So Hard

Data has become synonymous with modern life. But as many organizations quickly realize, data-driven success doesn’t come by wishful thinking or corporate declarations, but by focused, sustained investment and effort.
Why is finding success with analytics so difficult? And what are some strategies that organizations adopt to get ahead?
Learn more...

NAO guide for senior government leaders flags barriers to better data use

A National Audit Office guide for government chiefs on improving data use points to difficulties in achieving data sharing benefits and laments variability in cross-government data quality.
Learn more...

Venture Investors See Potential In Data Observability

It’s unlikely the term “data observability” gets most folks too excited. Unless they are investors. We’ve seen several startups in the space raise serious cash over the last couple months.
In fact, in the span of one week three companies alone raised more than $400 million. This shows the significance of a sector that helps companies review, evaluate, index and, in general, control what has become the lifeblood of so many large enterprises—their data.
Learn more...

The Overestimation of neural networks and deep learning

Neural networks and deep learning are overestimated in terms of what they can do. They have been handy for natural language processing. However, they see a diminishing marginal utility of the enormous investment made in them as the primary AI technique of the third bubble in AI.
Learn more...

Fixing the GDPR: Towards Version 2.0

The promises of the General Data Protection Regulation (GDPR) are manifold. It is supposed to protect privacy and guarantee the self-determination of the individual. It is supposed to put digital gatekeepers in their place. It is supposed to be a bulwark against the surveillance state and surveillance capitalism. The law is - for its advocates - the new gold standard for data protection. If you are trying to make an honest assessment of the GDPR three years after its application, you will however also hear very different views.
Download PDF...

How AI, data analytics could help identify insurance fraud

People are becoming more sophisticated in perpetrating fraud: filing claims online and operating from around the world, they are approaching fraud digitally and at higher frequencies. Fortunately, new methods using artificial intelligence ease the burden and help insurance companies stay one step ahead.

The right AI can help insurance companies detect fraud as it occurs, and can help connect data sets that would usually be siloed.
Learn more...

Companies are losing revenue and customers due to AI bias

New survey finds that 80% of U.S. firms found problems despite having bias monitoring or algorithm tests already in place. These same organizations are already feeling the impact of this problem as well in the form of lost customers and lost revenue. In addition to measuring the state of AI bias, the survey probed attitudes about regulations. Surprisingly, 81% of respondents think government regulations would be helpful to address two particular components of this challenge: defining and preventing bias.
Learn more...

Dark data creates a black hole of carbon emission: report

Massive sprawls of dark data pollute data centers worldwide; deleting data waste could help to reduce the carbon footprint of digitalization.
Digitization can be part of the solution to climate change but storing digital data that is never used can also consume an enormous amount of energy and, as a result, produce carbon dioxide that need never have been wasted.
According to a report by Veritas, about 5.8 million tonnes of CO2 will be unnecessarily pumped into the atmosphere as a result of powering the storage of this kind of data this year alone. In order to protect the planet from this waste, businesses need to get on top of their data management strategies, use the right tools to identify which data is valuable, and rid their data centers of ‘dark data’.
Learn more...

Bringing Dark Data to Light: How to Handle the Next Great Business Resource

Dark data is a hot topic in the field of data management. Many perceive it as scary and aren’t sure where to start in making it something of value. To back up, dark data is defined as data collected during business operations that otherwise goes unused. This unmanaged content is difficult to monitor, meaning it’s hard to notice when information has been replicated, leaked, tampered with, lost, or stolen. It’s easy to understand the ominous nature of the discussion around it.
Learn more...

“Everyone wants to do the model work, not the data work”: Data Cascades in High-Stakes AI

AI models are increasingly applied in high-stakes domains like health and conservation. Data quality carries an elevated significance in high-stakes AI due to its heightened downstream impact, impacting predictions like cancer detection, wildlife poaching, and loan allocations. Paradoxically, data is the most under-valued and de-glamorised aspect of AI. In this paper, we report on data practices in high-stakes AI, from interviews with 53 AI practitioners in India, East and West African countries, and USA. We define, identify, and present empirical evidence on Data Cascades (compounding events causing negative, downstream effects from data issues) triggered by conventional AI/ML practices that undervalue data quality. Data cascades are pervasive (92% prevalence), invisible, delayed, but often avoidable. We discuss HCI opportunities in designing and incentivizing data excellence as a first-class citizen of AI, resulting in safer and more robust systems for all.
Download PDF...

Avoiding Data Disasters

Things can go disastrously wrong in data science and machine learning projects when we undervalue data work, use data in contexts that it wasn’t gathered for, or ignore the crucial role that humans play in the data science pipeline. A new multi-university centre focused on Information Resilience, funded by the Australian government’s top scientific funding body (ARC), has recently launched. Information Resilience is the capacity to detect and respond to failures and risks across the information chain in which data is sourced, shared, transformed, analysed, and consumed.
Learn more...

Data Classification Market to Grow at a CAGR of 27.6% from 2020 to 2027 to Reach US$ 3,577.4 Million

Data classification is a tool that classifies enormous amounts of data into varied tabs to get the most out of it while maintaining the privacy of confidential data. The commercialization of data protection laws such as GDPR and the Health Insurance Portability and Accountability Act (HIPAA) directly affects the scope of data classification worldwide. Growing business data and cloud migrations, the ramp-up of personal data identification, threats to privacy in email, the leverage of machine learning, and the convergence of data management and data protection are among the factors that play a significant role in accelerating the scope of data classification.
Learn more...

5 Challenges of Big Data Analytics in 2021

In today’s digital world, companies embrace big data business analytics to improve decision-making, increase accountability, raise productivity, make better predictions, monitor performance, and gain a competitive advantage. However, many organizations have problems using business intelligence analytics on a strategic level. According to Gartner, 87% of companies have low BI (business intelligence) and analytics maturity, lacking data guidance and support. The problems with business data analysis are not only related to analytics by itself, but can also be caused by deep system or infrastructure problems.
Learn more...

Blog: Dark Data (European Mathematical Society)

In this blog of the European Mathematical Society, Adhemar Bultheel addresses the question: in what kind of situations are we dealing with dark data? He describes, with many examples, fifteen different phenomena that can lead to dark data.
Learn more...

Illuminating Insights Hidden in Dark Data

How can your organization derive value and reap benefits from the dark data using advanced analytics and techniques like AI, Deep Learning, and NLP?
In this post, Vivek Sinha, Vice President, Technology at GlobalLogic, explores the benefits and challenges of activating dark data and shares examples of how dark data is delivering meaningful value.
Learn more...

Blog: Dark data – the hidden gold

A blog on dark data: data that sloshes around in our databases, file systems and clouds without being put to any use because we either don’t have the means to mine it, or we don’t even know it exists.
Read the blog...

Presentation: Control Dark Data in the New Age of Compliance and Security

A Bloomberg presentation on dark data and the risks associated with it.
Download presentation...

Book: Dark Data – Why what you don’t know matters

This book explores the Achilles heel of data science. Going beyond the data you have, it examines the data you don’t have, illustrating with many real-life examples how lack of awareness of what you are missing can lead to distorted understanding, incorrect conclusions, and mistaken actions.
Learn more...

What is dark data? (a clear explanation)

We've heard of big data or small data, but what is this concept of dark data?
Learn more...

Dark Data: Why What You Don’t Know Matters

In this talk, Professor David Hand will speak about his book Dark Data: Why What You Don’t Know Matters. The book explores how lack of awareness of what you are missing can lead to distorted understanding, incorrect conclusions, and mistaken actions.
Watch video...

Lighting up Dark Content

The data science community likes to speak of dark data and how they, as experts, can bring tremendous insight to any business by simply shining their analytical light on those hidden bits, bytes, and characters.
So, what constitutes dark data?
Learn more...

The Parallel Universe of Dark Data and Dark Matter

Scientists estimate that 95% of the matter in the universe is dark… Evidence now reveals that 90 percent of all data across the enterprise is also dark. Often meticulously gathered and collected to meet changing compliance mandates, it is invisible and little used past its creation, taking up terabytes of space that is difficult to access and analyze.
Learn more...

Dark Data. Bright Light

All of the hype around dark data and AI-driven dark-analytics is missing the first crucial step, which is to identify what is real and what is useful, and for what application. While it may sound exciting, using AI to automatically process vast amounts of data in ultra-fast operations is not going to provide much insight.
We have to remove and classify this noise which, not unlike the Higgs Boson, is an elementary particle of information that is unable to be examined using existing knowledge. Once we get to the “good data,” that is data that is useful to analytics and thus for gaining insight, only then can we use targeted applications and methods to not only yield insights, but manage for potentially wrong conclusions–and consequently, make better decisions.
Professor David Hand
Learn more...