Querying your data sources
Data sources are either external tables (see CREATE EXTERNAL TABLE) or external database (see CREATE EXTERNAL DATABASE).
- Listing data sources
- Qualifying data sources
- Querying multiple data sources
- Querying data sources with GlareDB Cloud
Listing data sources
Listing data sources connected to your deployment can be done with the following SQL query:
SELECT datasource, name, object_type
FROM glare_catalog.external_datasources;
datasource | name | object_type
------------+-------+-------------
postgres | users | table
bigquery | bq | database
mongo | atlas | database
postgres | qa | database
(4 rows)
Qualifying data sources
The object_type
column indicates whether the data source is an external table or an external database. In this example, there’s three external databases and one external table.
When a data source is an external table, it may be queried with just its name. For example, querying the data source named users
:
select * from users;
When a data source is an external database, a fully qualified name must be used to reference the table of interest. For example, querying an account_signup
table inside a events
schema in the bq
data source:
select * from bq.events.account_signup;
Querying multiple data sources
The real power of GlareDB is being able to work with multiple data sources in a single query. Let’s work through an example using a BigQuery and Postgres data source.
select datasource, name, object_type
from glare_catalog.external_datasources;
datasource | name | object_type
------------+-------+-------------
bigquery | bq | database
postgres | prod | database
(2 rows)
Internally at GlareDB, we want to be able be able to answer the question “How many GlareDB deployments were created per organization after the organization has upgraded their billing plan?”
Our production Postgres holds information about deployments and organizations, and our BigQuery instance stores all events about when things happened. So to answer this question, we can join a table from our BigQuery data source onto tables inside our Postgres data source:
select count(*) as num_deployments, o.id
from bq.events.billing_plan_upgraded as b
inner join prod.public.organizations as o on b.org_id = o.id
inner join prod.public.deployments as d on o.id = d.org_id
where d.created_at > b.timestamp
group by o.id
Notice we’re able reference data sources everywhere you would normally reference tables in a typically SQL database.
Querying data sources with GlareDB Cloud
GlareDB Cloud has an integrated SQL workspace. For more information refer to Querying your data sources in GlareDB Cloud.