Data Wrangling

Funding

  • Competition Funded Project (European/UK Students Only)

Project description

Data wrangling is the process by which the data required by an application is identified, extracted, cleaned and integrated to yield a data set that is suitable for exploration and analysis. Although Extract, Transform and Load (ETL) techniques and platforms are widely used, they often require manual work from technical and domain experts at different stages of the process. When confronted with the four Vs of big data (volume, velocity, variety and veracity), this manual intervention can make ETL prohibitively expensive. As a result, we are interested in providing cost-effective, highly automated approaches to data wrangling. This involves significant research challenges that require fundamental changes to established areas, including data integration and data cleaning, and to the ways in which these areas are brought together. To enable automated techniques to make well-informed decisions, we propose to investigate comprehensive support for context awareness within data wrangling, building on adaptive, pay-as-you-go solutions that automatically tune the wrangling process to the requirements and resources of the specific application.