Data curation: the management and enrichment of collections of data sets, intended to facilitate data accessibility.
Instead, the future of data management lies in “data curation,” which he describes as being “aimed directly at the ‘long tail’ – the hundreds or thousands of data silos not captured within the traditional data warehouse, and which can only be captured and integrated at scale by applying automation and machine-learning based on statistical patterns.”
Data curators interact frequently with the data catalog and fill a critical role in making it useful and valuable. Evaluate the richness of the catalog's curation capabilities, including the ability to add data sets, hide or remove data sets, add annotations, create metadata, add search terms and tags, identify stewards and subject matter experts (SMEs), tag security- and compliance-sensitive data, share tips and techniques, and encourage crowdsourcing of metadata.
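To make a few of these curation capabilities concrete, here is a minimal Python sketch of a catalog that supports adding data sets, hiding them, tagging, annotating, naming stewards, and flagging sensitive data. It does not reflect any particular catalog product: the `CatalogEntry` and `Catalog` classes, their fields, and the sample data set names are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """Hypothetical catalog record for one data set (illustrative only)."""
    name: str
    hidden: bool = False                                   # hidden entries are skipped by search
    sensitive: bool = False                                # security/compliance flag
    tags: set[str] = field(default_factory=set)            # search terms and tags
    annotations: list[str] = field(default_factory=list)   # free-text curator notes
    stewards: set[str] = field(default_factory=set)        # stewards and SMEs


class Catalog:
    """Hypothetical in-memory catalog exposing a few curation operations."""

    def __init__(self) -> None:
        self._entries: dict[str, CatalogEntry] = {}

    def add(self, name: str) -> CatalogEntry:
        # Return the existing entry if the data set is already cataloged.
        if name not in self._entries:
            self._entries[name] = CatalogEntry(name)
        return self._entries[name]

    def hide(self, name: str) -> None:
        self._entries[name].hidden = True

    def remove(self, name: str) -> None:
        del self._entries[name]

    def search(self, term: str) -> list[str]:
        # Match visible entries whose name contains the term or that carry it as a tag.
        term = term.lower()
        return sorted(
            e.name
            for e in self._entries.values()
            if not e.hidden
            and (term in e.name.lower() or term in {t.lower() for t in e.tags})
        )


catalog = Catalog()

orders = catalog.add("orders_2023")
orders.tags.add("sales")
orders.stewards.add("alice")
orders.annotations.append("Refreshed nightly")

customers = catalog.add("customer_pii")
customers.tags.update({"sales", "pii"})
customers.sensitive = True

catalog.add("scratch_tmp").tags.add("sales")
catalog.hide("scratch_tmp")

print(catalog.search("sales"))  # ['customer_pii', 'orders_2023']
```

The hidden `scratch_tmp` entry stays in the catalog but no longer appears in search results, which is one plausible way to separate "hide" from "remove"; a real catalog would likely add persistence, access control, and audit history on top of this.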
Data curation is the work of creating and managing collections of datasets that are of interest to data consumers. The goals of curation are to make data available to those who need it, to promote data sharing, and to discourage unnecessary data redundancy and replication.