Agent skill

sql-queries-bigquery-google-cloud

Sub-skill of sql-queries: BigQuery (Google Cloud) (+2).

Stars 4
Forks 4

Install this agent skill to your Project

npx add-skill https://github.com/vamseeachanta/workspace-hub/tree/main/.claude/skills/_archive/data/analytics/sql-queries/bigquery-google-cloud

SKILL.md

BigQuery (Google Cloud) (+2)

BigQuery (Google Cloud)

Date/time:

sql
-- Current date/time
CURRENT_DATE(), CURRENT_TIMESTAMP()

-- Date arithmetic
DATE_ADD(date_column, INTERVAL 7 DAY)
DATE_SUB(date_column, INTERVAL 1 MONTH)
DATE_DIFF(end_date, start_date, DAY)
TIMESTAMP_DIFF(end_ts, start_ts, HOUR)

-- Truncate to period
DATE_TRUNC(created_at, MONTH)
TIMESTAMP_TRUNC(created_at, HOUR)

-- Extract parts
EXTRACT(YEAR FROM created_at)
EXTRACT(DAYOFWEEK FROM created_at)  -- 1=Sunday

-- Format
FORMAT_DATE('%Y-%m-%d', date_column)
FORMAT_TIMESTAMP('%Y-%m-%d %H:%M:%S', ts_column)

String functions:

sql
-- No ILIKE, use LOWER()
LOWER(column) LIKE '%pattern%'
REGEXP_CONTAINS(column, r'pattern')
REGEXP_EXTRACT(column, r'pattern')

-- String manipulation
SPLIT(str, delimiter)  -- returns ARRAY
ARRAY_TO_STRING(array, delimiter)

Arrays and structs:

sql
-- Array operations
ARRAY_AGG(column)
UNNEST(array_column)
ARRAY_LENGTH(array_column)
value IN UNNEST(array_column)

-- Struct access
struct_column.field_name

Performance tips:

  • Always filter on partition columns (usually date) to reduce bytes scanned
  • Use clustering for frequently filtered columns within partitions
  • Use APPROX_COUNT_DISTINCT() for large-scale cardinality estimates
  • Avoid SELECT * -- billing is per-byte scanned
  • Use DECLARE and SET for parameterized scripts
  • Preview query cost with dry run before executing large queries

Redshift (Amazon)

Date/time:

sql
-- Current date/time
CURRENT_DATE, GETDATE(), SYSDATE

-- Date arithmetic
DATEADD(day, 7, date_column)
DATEDIFF(day, start_date, end_date)

-- Truncate to period
DATE_TRUNC('month', created_at)

-- Extract parts
EXTRACT(YEAR FROM created_at)
DATE_PART('dow', created_at)

String functions:

sql
-- Case-insensitive
column ILIKE '%pattern%'
REGEXP_INSTR(column, 'pattern') > 0

-- String manipulation
SPLIT_PART(str, delimiter, position)
LISTAGG(column, ', ') WITHIN GROUP (ORDER BY column)

Performance tips:

  • Design distribution keys for collocated joins (DISTKEY)
  • Use sort keys for frequently filtered columns (SORTKEY)
  • Use EXPLAIN to check query plan
  • Avoid cross-node data movement (watch for DS_BCAST and DS_DIST)
  • ANALYZE and VACUUM regularly
  • Use late-binding views for schema flexibility

Databricks SQL

Date/time:

sql
-- Current date/time
CURRENT_DATE(), CURRENT_TIMESTAMP()

-- Date arithmetic
DATE_ADD(date_column, 7)
DATEDIFF(end_date, start_date)
ADD_MONTHS(date_column, 1)

-- Truncate to period
DATE_TRUNC('MONTH', created_at)
TRUNC(date_column, 'MM')

-- Extract parts
YEAR(created_at), MONTH(created_at)
DAYOFWEEK(created_at)

Delta Lake features:

sql
-- Time travel
SELECT * FROM my_table TIMESTAMP AS OF '2024-01-15'
SELECT * FROM my_table VERSION AS OF 42

-- Describe history
DESCRIBE HISTORY my_table

-- Merge (upsert)
MERGE INTO target USING source
ON target.id = source.id
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *

Performance tips:

  • Use Delta Lake's OPTIMIZE and ZORDER for query performance
  • Leverage Photon engine for compute-intensive queries
  • Use CACHE TABLE for frequently accessed datasets
  • Partition by low-cardinality date columns

Expand your agent's capabilities with these related and highly-rated skills.

Didn't find tool you were looking for?

Be as detailed as possible for better results