The artificial intelligence train keeps picking up speed: ChatGPT has put AI in everyone's hands. The resulting hype is also having its effect in the boardrooms and management teams of companies. Almost every organization can benefit from AI, and many have been doing so for a long time.
These advances in AI make secure and reliable data more important than ever, because the datasets used to train AI depend on reliable input combined with knowledge of the business. No AI without HI, human intelligence.
The Netherlands is a forerunner in Europe in terms of cloud migration. Around 44% of companies have already migrated to the cloud, with flexibility as the main reason. However, migrating to the cloud, just like using AI, requires well-labeled data. That includes dark data: unknown or forgotten data, which is extremely important here.
A parallel can be drawn between the SBS 6 program “your house in order” and “your data in order”. If you store too many things in the cluttered structure of your home, cleaning up will be difficult or impossible.
The existence of certain items is often forgotten, which means that creating order takes too much effort. The program makers come to the rescue with a sports hall and a team that sorts everything.
An organization’s virtual home of data suffers from a similarly messy structure. In general, the contents of folders and documents are quickly forgotten. SynerScope offers the virtual ‘sports hall’ in the Microsoft Azure cloud, after which automatic sorting, labeling and tidying become a breeze.
For this purpose we, at SynerScope, have developed a visual data scanner that makes ‘cluster labeling’ possible through AI. By working with entire groups of data at the same time, we greatly accelerate the labeling process.
Data, even without metadata, i.e. dark data or forgotten data, is automatically sorted by content. Possible labels are calculated for each cluster based on its content, after which a domain expert chooses from these labels. The data generates the labels, and people choose from them.
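To make the flow above concrete, here is a minimal sketch in Python. It is an illustration only, not SynerScope's actual method: the greedy Jaccard clustering, the document-frequency ranking, the sample documents and every function name are assumptions chosen to keep the example small. It groups documents by content, proposes labels per cluster, and leaves the final choice to a person.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list for this illustration.
STOPWORDS = {"the", "a", "an", "of", "for", "and", "to", "in", "on", "is"}

def tokens(text):
    """Lowercase word tokens with a few common stopwords removed."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]

def jaccard(a, b):
    """Overlap between two token sets, from 0.0 to 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster_by_content(docs, threshold=0.2):
    """Greedy single pass: join the most similar cluster or start a new one."""
    clusters = []  # each cluster: {"vocab": set of terms, "members": doc indices}
    for i, doc in enumerate(docs):
        toks = set(tokens(doc))
        best, best_sim = None, threshold
        for c in clusters:
            sim = jaccard(toks, c["vocab"])
            if sim >= best_sim:
                best, best_sim = c, sim
        if best is None:
            clusters.append({"vocab": set(toks), "members": [i]})
        else:
            best["vocab"] |= toks
            best["members"].append(i)
    return [c["members"] for c in clusters]

def candidate_labels(docs, members, top_n=3):
    """Propose labels: the terms found in the most documents of the cluster."""
    counts = Counter()
    for i in members:
        counts.update(set(tokens(docs[i])))
    # Highest document frequency first, alphabetical order breaks ties.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:top_n]]

def apply_expert_choice(members, chosen_label):
    """Label a whole cluster at once with the label the domain expert picked."""
    return {i: chosen_label for i in members}

# Made-up documents without any metadata.
docs = [
    "Invoice for office supplies payment due",
    "Payment reminder for overdue invoice",
    "Employment contract for new engineer",
    "Signed employment contract and salary terms",
]

clusters = cluster_by_content(docs)
suggestions = [candidate_labels(docs, members) for members in clusters]
# The expert picks one suggestion per cluster; here simply the first one.
labels = {}
for members, options in zip(clusters, suggestions):
    labels.update(apply_expert_choice(members, options[0]))
```

With these four sample documents the sketch forms two clusters and suggests "invoice" and "contract" as the first candidate labels. The point of the design is that one human decision labels every document in a cluster at once, which is where the time saving comes from.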
In this way, well-labeled data increases the quality of the AI outcome, with less risk of bias or unethical data use, and with less effort and lower costs. Direct costs also drop, because less expensive compute power is wasted on data that may later turn out to be worthless and uninformative.
As an organization you have your own data and knowledge under control, so focus on that. This allows you to tame AI and use it for your goals, with more success, less risk and lower costs.