How to create BigQuery tables on top of external data?

Abhishek Raviprasad
1 min read · Nov 2, 2022

In this article, let's see how to create an external BigQuery table on top of files stored in a Google Cloud Storage bucket, using the Python client.

Script location: https://github.com/abhr1994/Personal_Scripts/tree/main/BigqueryTables

The google-cloud-bigquery library must be installed before running the script. Install it with the following command:

pip install google-cloud-bigquery

There are two ways to create BigQuery tables on top of existing data files:

First Method

Create the external table with a CREATE OR REPLACE EXTERNAL TABLE query. In this case the table is a definition on top of the data residing in the Google Cloud Storage bucket. The data stays where it is, in its original format, and is not converted to BigQuery's proprietary storage format.

The source file format can be Parquet, CSV, JSON, or ORC, and the schema is inferred automatically.
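As a rough illustration of the first method, here is a minimal sketch that builds the DDL statement and runs it through the Python client. The helper names (`build_external_table_ddl`, `create_external_table`), the table ID, and the bucket path are hypothetical and are not taken from the linked script. It also assumes that application default credentials are configured for the client.

```python
# Maps the format names mentioned in the article to BigQuery DDL format values.
DDL_FORMATS = {
    "PARQUET": "PARQUET",
    "CSV": "CSV",
    "JSON": "NEWLINE_DELIMITED_JSON",
    "ORC": "ORC",
}


def build_external_table_ddl(table_id, source_uris, file_format):
    """Return a CREATE OR REPLACE EXTERNAL TABLE statement (hypothetical helper)."""
    fmt = DDL_FORMATS[file_format.upper()]
    uri_list = ", ".join(f"'{uri}'" for uri in source_uris)
    # No column list is given, so BigQuery infers the schema from the files.
    return (
        f"CREATE OR REPLACE EXTERNAL TABLE `{table_id}`\n"
        f"OPTIONS (format = '{fmt}', uris = [{uri_list}])"
    )


def create_external_table(table_id, source_uris, file_format):
    """Run the DDL with the BigQuery client (hypothetical helper)."""
    # Imported here so the DDL builder can be used without the library installed.
    from google.cloud import bigquery

    client = bigquery.Client()  # assumes default credentials and project
    ddl = build_external_table_ddl(table_id, source_uris, file_format)
    client.query(ddl).result()  # wait for the DDL job to finish
    return ddl
```

For example, calling `create_external_table("my-project.my_dataset.sales", ["gs://my-bucket/sales/*.parquet"], "parquet")` with placeholder names would define the table without copying any data out of the bucket.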

Second Method

Load the contents of the external data files into a new BigQuery table. Unlike the first method, this copies the data from the source path into BigQuery storage, so the resulting table no longer depends on the files in the bucket.
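The loading approach could look roughly like the sketch below, which uses the client's `load_table_from_uri` method with a `LoadJobConfig`. The helper names (`resolve_source_format`, `load_files_into_table`), the table ID, and the bucket path are hypothetical, and default credentials are assumed. Schema auto-detection is only switched on for CSV and JSON here, since Parquet and ORC files carry their own schema.

```python
# Maps the format names mentioned in the article to load-job source formats.
# These strings match the values of bigquery.SourceFormat.
LOAD_FORMATS = {
    "PARQUET": "PARQUET",
    "CSV": "CSV",
    "JSON": "NEWLINE_DELIMITED_JSON",
    "ORC": "ORC",
}


def resolve_source_format(file_format):
    """Return (source_format, autodetect) for a format name (hypothetical helper)."""
    source_format = LOAD_FORMATS[file_format.upper()]
    # Parquet and ORC are self-describing; CSV and JSON need schema detection.
    autodetect = source_format in ("CSV", "NEWLINE_DELIMITED_JSON")
    return source_format, autodetect


def load_files_into_table(table_id, source_uri, file_format):
    """Load files from Cloud Storage into a BigQuery table (hypothetical helper)."""
    # Imported here so the format helper can be used without the library installed.
    from google.cloud import bigquery

    source_format, autodetect = resolve_source_format(file_format)
    job_config = bigquery.LoadJobConfig(
        source_format=source_format,
        autodetect=autodetect,
    )
    client = bigquery.Client()  # assumes default credentials and project
    load_job = client.load_table_from_uri(source_uri, table_id, job_config=job_config)
    load_job.result()  # wait for the load job to finish
    return client.get_table(table_id).num_rows
```

For example, `load_files_into_table("my-project.my_dataset.sales_loaded", "gs://my-bucket/sales/*.csv", "csv")` with placeholder names would return the number of rows in the table after the load completes.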

Abhishek Raviprasad

Senior Solution Engineer at Infoworks.io, 4+ years of big data/ETL data warehouse experience building data pipelines