Stats NZ faces a further challenge: shoring up its Integrated Data Infrastructure
15 April, 2019 08:45
Liz MacPherson (Stats NZ)
Statistics NZ's Integrated Data Infrastructure (IDI) is the envy of the world, but is facing some serious challenges as user demand soars.
Demand for the IDI has grown so much that its current infrastructure is struggling to cope.
That infrastructure is also key to helping Stats NZ patch together a full set of results from the deeply flawed 2018 census. One in seven Kiwis failed to fully complete the census, which was taken online for the first time.
Stats NZ told Parliament it needs to ensure it has a stable IDI platform because demand to load new datasets cannot be fully met, creating a backlog of requests.
"The IDI is a specialist tool designed for statisticians and researchers and currently requires a degree of specialist knowledge and skills for access and insights," Stats NZ said. "There are opportunities to support ease of access for the targeted user groups."
As a prototype, the IDI was experimental and was not fully featured, Stats NZ said. There is, however, scope to increase its processing capabilities to meet the current and future demand.
Stats NZ said it needs to scale up the IDI from an "evolved prototype" to a larger-scale data warehouse model that better reflects customer demands.
"Customers have signalled an interest in increasing the frequency and flexibility of IDI data uploads, whilst also accommodating an increase in datasets (number of and data volume)," Stats NZ said. "A revised data linking methodology within the IDI may support this objective."
User groups would also like Stats NZ to make it easier to work with IDI data and to gather insights through the development of more accessible projections and interfaces.
Such improvements would help Stats NZ meet its vision of “unleashing the power of data to change lives” by increasing the amount and type of insights from a wider range of researchers.
However, the investment required is above Stats NZ's current operating baseline budget.
The agency has worked to make data more discoverable by developing a training dataset, built from synthetic data, that can be used outside the data lab to help new users learn how to use the IDI.
Stats NZ will also publish metadata on the datasets and their variables, giving the public and researchers a better understanding of the data available.
Additionally, Stats NZ has sought to make data more accessible by supporting user collaboration and working to find ways that help communities share insights, experience and resources with each other.