How to access: Using our technology

Overview

The Integrated Data Service (IDS) is a cloud-based data platform. This means it stores data online and provides the tools to analyse it.  

You do not need to be an expert in our technology to access the IDS. Guidance and tutorials are provided for new users. However, you will need some coding skills to run a successful project, so training in Structured Query Language (SQL) and in R or Python ahead of time would be beneficial.

The IDS uses different technologies that work through Google Cloud to provide its service. As the IDS evolves and responds to the needs of our users, we may introduce new tools to help analysts and researchers work with their data efficiently. 

Once your account has been approved and you are a member of a project, you will have access to the following tools. 

Your workspace

This is the analytical area where you can write code to access data, perform analyses and create graphs, tables or other work outputs.  

For each project you are a member of, you will have one or more of the following workspaces: 

  • Vertex AI Workbench – a JupyterLab environment that supports Python
  • R Studio – a virtual version of the RStudio environment supporting R
  • CodeOSS – a virtual code editor similar to VS Code, supporting Python

You can mix your code with formatted text to explain your project, using documents called Notebooks in the Vertex AI Workbench, or R Notebooks or R Markdown files in R Studio.

More user guidance on using your workspace is available once you have been approved as a member of an IDS project.

Remote browsing

Remote browsing is a security measure used by the IDS. For this, we use Cloudflare Remote Browser Isolation (RBI). In practice, you will see a small blue address bar at the top of your internet browser.

This should happen without any action on your part, but there are some points to note:

  • IDS features will only open within the remote browser
  • other websites will not open within the remote browser
  • copying and pasting into or out of the remote browser is disabled
  • copying and pasting between remote browser windows is allowed, but the remote browser's clipboard currently needs to be activated first by copying some text outside of the remote browser
  • the remote browser may ask you to sign in a second time 

Read more about the steps your organisation must take to set up remote browsing

Storing data

The IDS stores its data in a central database that runs on BigQuery, a data storage platform by Google.

We hold a range of de-identified datasets, which you can view through our data catalogue.

However, accredited researchers can only access de-identified data that has been approved for a project they are a member of. The metadata (that is, the data about the data) can be viewed using BigQuery Studio.

You can access BigQuery Studio directly through the IDS Hub to inspect your data, or interact with BigQuery through code in your Notebook to retrieve and store data.

The IDS database uses Structured Query Language (SQL) as the standard way to interact. Guidance on this, as well as tutorials and example code, is available inside the IDS.
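
As a rough illustration of interacting with BigQuery through code in a Notebook, the sketch below builds a simple SQL statement and runs it through a client object. It is not taken from the IDS guidance. The table name and both function names are hypothetical, and the client is only assumed to behave like the `Client` class of Google's `google-cloud-bigquery` Python package, where `client.query(sql).result()` returns the result rows.

```python
from typing import Any, Dict, List

# Hypothetical table name, used only for illustration.
EXAMPLE_TABLE = "my-project.my_dataset.my_table"


def preview_query(table: str, limit: int = 10) -> str:
    """Build a SQL statement that returns the first few rows of a table."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    return f"SELECT * FROM `{table}` LIMIT {limit}"


def run_query(client: Any, sql: str) -> List[Dict[str, Any]]:
    """Run a query and return each result row as a plain dictionary.

    Assumption: `client` behaves like google.cloud.bigquery.Client, so
    client.query(sql) starts a query job and .result() waits for its rows.
    """
    rows = client.query(sql).result()
    return [dict(row.items()) for row in rows]
```

In a notebook, `client` would typically be created with `bigquery.Client()` after `from google.cloud import bigquery`. Whether any extra set-up is needed inside the IDS is not covered here, so check the guidance available inside the service.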

Interacting with data

To interact with BigQuery, you will need to use Structured Query Language (SQL), a coding language used mainly when interacting with databases.  

Pieces of SQL code are often called “queries”, as their purpose is to query a database. Data can be manipulated within the query itself, for example by aggregating or filtering it. The output you see is then already processed, which can save memory in your work environment.
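
To make the idea of filtering and aggregating inside a query concrete, here is a minimal sketch. It uses Python's built-in `sqlite3` module as a stand-in for BigQuery so that it can run anywhere. The `survey` table, its columns and its values are invented for illustration, and BigQuery's SQL dialect differs from SQLite's in some details.

```python
import sqlite3


def build_example_db() -> sqlite3.Connection:
    """Create a small in-memory table of made-up survey responses."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE survey (region TEXT, year INTEGER, responses INTEGER)")
    conn.executemany(
        "INSERT INTO survey VALUES (?, ?, ?)",
        [
            ("North", 2022, 10),
            ("North", 2023, 15),
            ("South", 2022, 7),
            ("South", 2023, 9),
            ("South", 2023, 4),
        ],
    )
    return conn


def totals_by_region(conn: sqlite3.Connection, year: int) -> list:
    """Filter to one year and sum responses per region inside the query."""
    sql = """
        SELECT region, SUM(responses) AS total
        FROM survey
        WHERE year = ?          -- filtering happens in the database
        GROUP BY region         -- so does the aggregation
        ORDER BY region
    """
    return conn.execute(sql, (year,)).fetchall()


if __name__ == "__main__":
    connection = build_example_db()
    # Only one summary row per region is returned, not the five raw rows.
    print(totals_by_region(connection, 2023))
```

Only the summary rows reach the work environment, which is the memory saving described above.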

You will need some skills in SQL coding to effectively access, process and link your data. 

Find out about the training available for using SQL 

Analysing data

The coding tools currently available for analysing data are R and Python. 

Both are free, open-source programming languages, and plenty of free resources are available for learning to code in them.

You should be comfortable with using these to: 

  • manipulate data
  • perform analysis using code
  • create basic data visualisations 
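
As a rough guide to the level involved, the three tasks above can be sketched in a few lines of Python. This example uses only the standard library and invented data. The function names are hypothetical, and a text bar chart stands in for the plotting library you would normally use.

```python
from collections import defaultdict
from statistics import mean

# Invented example records: a group label and a numeric value.
RECORDS = [
    {"group": "A", "value": 2.0},
    {"group": "A", "value": 4.0},
    {"group": "B", "value": 6.0},
    {"group": "B", "value": 10.0},
    {"group": "C", "value": 1.0},
]


def filter_records(records, minimum):
    """Manipulate data: keep only records at or above a minimum value."""
    return [r for r in records if r["value"] >= minimum]


def mean_by_group(records):
    """Perform analysis: compute the mean value for each group."""
    grouped = defaultdict(list)
    for r in records:
        grouped[r["group"]].append(r["value"])
    return {group: mean(values) for group, values in sorted(grouped.items())}


def text_bar_chart(summary, width=16):
    """Create a basic visualisation: one line of '#' characters per group."""
    largest = max(summary.values())
    lines = []
    for label, value in summary.items():
        length = round(value / largest * width)  # scale bars to the largest value
        lines.append(f"{label} | {'#' * length} {value:.1f}")
    return lines


if __name__ == "__main__":
    summary = mean_by_group(filter_records(RECORDS, minimum=2.0))
    print("\n".join(text_bar_chart(summary)))
```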

Find out about the training available for using Python and R 

Storing code

Code is stored and shared on GitHub, a platform for storing code while maintaining version control.

The IDS has a dedicated, private GitHub server that allows complex collaboration as well as backups of any work stored on it. It supports a folder structure and is the only permanent storage area for user code on the IDS.

When a user clones or creates a repository, any changes made to files in their local copy of the repository will be tracked. Every time a user pushes code to their repository, the changes are added to the repository's history, so that old versions of the code can be brought back. It also allows different people to work on the same code at the same time and merge their changes later.
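
The way version history lets you bring back old code can be sketched with a throwaway local repository. This example assumes the `git` command-line tool is installed. The file name, commit messages and helper functions are invented for illustration. Pushing is left out because it needs a remote server such as the IDS GitHub server, so only the local commit history is shown.

```python
import subprocess
import tempfile
from pathlib import Path


def git(repo: Path, *args: str) -> str:
    """Run a git command inside `repo` and return its output as text."""
    result = subprocess.run(
        # A placeholder identity is supplied so commits work on any machine.
        ["git", "-c", "user.name=Example User",
         "-c", "user.email=user@example.com", *args],
        cwd=repo, check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def demo_history():
    """Commit two versions of a file, then read the history and the old version."""
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp)
        git(repo, "init", "--quiet")
        script = repo / "analysis.py"

        script.write_text("print('version 1')\n")
        git(repo, "add", "analysis.py")
        git(repo, "commit", "--quiet", "-m", "First version")

        script.write_text("print('version 2')\n")
        git(repo, "commit", "--quiet", "-am", "Second version")

        # The log lists commit messages, newest first.
        messages = git(repo, "log", "--format=%s").splitlines()
        # The previous version of the file is still stored in the history.
        old_version = git(repo, "show", "HEAD~1:analysis.py")
        return messages, old_version


if __name__ == "__main__":
    print(demo_history())
```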