INNER JOIN two BigQuery tables based on common timestamps within a range of each other? -
i have 2 table in bigquery. 1 has bunch of data based around timestamp. second set of data has feature add first set, timestamp. however, timestamps aren't same. however, know within 30 seconds of each other.
i thinking join... on abs(timestamp1 - timestamp2) < 30, that's not working.
you can cross join. not efficient, work if tables relatively small. syntax (also using standard sql, see mikhail's link how enable it):
select ts1, x, ts1, y maintable cross join secondtable abs(ts1 - ts2) < 30
for large tables, might have more elaborate, bucket both sides minute, , equality join. you'll need support case when join crosses neighbor bucket, like:
select ts1, x, ts2, y maintable join (select *, round(ts2/30) bucket secondtable union select *, round(ts2/30-1) bucket secondtable union select *, round(ts2/30+1) bucket secondtable) on round(ts1/30) = bucket abs(ts1-ts2) < 30
if there more 1 match , need select best one, like
select * ( select *, min(abs(ts1-ts2)) over(partition ts1) min_diff ( <previous query> )) abs(ts1-ts2) = min_diff;
Comments
Post a Comment