How to create BigQuery tables on top of external data?

Abhishek Raviprasad
1 min read · Nov 2, 2022


In this article, let’s see how to create an external BigQuery table on top of files stored in a Google Cloud Storage bucket, using the Python client.

Script location: https://github.com/abhr1994/Personal_Scripts/tree/main/BigqueryTables

The script needs the google-cloud-bigquery library, which can be installed with the following command:

pip install google-cloud-bigquery

There are two ways to create BigQuery tables on top of existing data files:

First Method

Create the external table with a CREATE OR REPLACE EXTERNAL TABLE query. In this case the table is defined on top of the data residing in the Google Cloud Storage bucket. The data stays in its original format and is not converted to BigQuery's proprietary storage format.

The source files can be in Parquet, CSV, JSON, or ORC format, and the schema is inferred automatically.
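
As a rough illustration of the first method, here is a minimal sketch that builds the DDL statement and runs it through the Python client. The helper names (build_external_table_ddl, create_external_table), the table ID, and the bucket path are hypothetical and not taken from the linked script. The sketch also assumes application default credentials are configured.

```python
from typing import Sequence

# Formats named in this article; the DDL accepts these values for the `format` option.
SUPPORTED_FORMATS = {"PARQUET", "CSV", "JSON", "ORC"}


def build_external_table_ddl(table_id: str, source_uris: Sequence[str], file_format: str) -> str:
    """Return a CREATE OR REPLACE EXTERNAL TABLE statement (hypothetical helper)."""
    fmt = file_format.upper()
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"Unsupported format: {file_format}")
    uris = ", ".join(f"'{uri}'" for uri in source_uris)
    # No column list is given, so BigQuery infers the schema from the files.
    return (
        f"CREATE OR REPLACE EXTERNAL TABLE `{table_id}`\n"
        f"OPTIONS (format = '{fmt}', uris = [{uris}])"
    )


def create_external_table(table_id: str, source_uris: Sequence[str], file_format: str) -> None:
    """Run the DDL with the BigQuery client (assumes default credentials)."""
    # Imported here so the DDL builder above works without the library installed.
    from google.cloud import bigquery

    client = bigquery.Client()
    ddl = build_external_table_ddl(table_id, source_uris, file_format)
    client.query(ddl).result()  # block until the DDL statement finishes


if __name__ == "__main__":
    # Placeholder identifiers; replace with a real project, dataset and bucket.
    print(build_external_table_ddl(
        "my-project.my_dataset.events",
        ["gs://my-bucket/events/*.parquet"],
        "parquet",
    ))
```

Keeping the statement builder separate from the client call makes it easy to inspect the generated SQL before running it against a real project.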

Second Method

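Load the contents of the external data files into a new BigQuery table. This method copies the data from the source path into BigQuery storage, so the resulting table is a native BigQuery table and no longer reads from the original files.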
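
The second method could look roughly like the sketch below, which uses a load job from the Python client. The helper names (source_format_name, load_into_table) and any table ID or bucket path passed to them are hypothetical and not taken from the linked script. Schema auto-detection and application default credentials are assumptions.

```python
# Maps the format names used in this article to the string values
# the client library expects for LoadJobConfig.source_format.
_SOURCE_FORMATS = {
    "PARQUET": "PARQUET",
    "CSV": "CSV",
    "JSON": "NEWLINE_DELIMITED_JSON",
    "ORC": "ORC",
}


def source_format_name(file_format: str) -> str:
    """Translate a short format name into a BigQuery source format (hypothetical helper)."""
    try:
        return _SOURCE_FORMATS[file_format.upper()]
    except KeyError:
        raise ValueError(f"Unsupported format: {file_format}") from None


def load_into_table(table_id: str, source_uri: str, file_format: str) -> int:
    """Load files from Cloud Storage into a BigQuery table and return its row count."""
    # Imported here so the format helper above works without the library installed.
    from google.cloud import bigquery

    client = bigquery.Client()  # assumes application default credentials
    job_config = bigquery.LoadJobConfig(
        source_format=source_format_name(file_format),
        autodetect=True,  # assumption: let BigQuery infer the schema
    )
    load_job = client.load_table_from_uri(source_uri, table_id, job_config=job_config)
    load_job.result()  # block until the load job completes
    return client.get_table(table_id).num_rows
```

A call such as load_into_table("my-project.my_dataset.events", "gs://my-bucket/events/*.csv", "csv") would then copy the matching files into the table, with both identifiers being placeholders.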

Written by Abhishek Raviprasad

Senior Solution Engineer at Infoworks.io, 4+ years of big data/ETL data warehouse experience building data pipelines
