Newest 'google-bigquery' Questions - Stack Overflow

Questions tagged [google-bigquery]

Google BigQuery is a Google Cloud Platform product providing serverless queries of petabyte-scale data sets using SQL. BigQuery provides multiple read-write pipelines, and enables data analytics that transform how businesses analyze data.

0
votes
1 answer
6 views

How do I combine array columns into array of struct in BigQuery?

How can we construct an array of struct, in which each row in the array corresponds to several array columns having the same row number? I will show you an example of what I meant to better ...
0
votes
0 answers
5 views

DSL-like SQL in Airflow's BigQueryOperator

As a typed-language enthusiast, I hate writing plain SQL queries. Unfortunately, Airflow's BigQueryOperator does not provide any DSL support out of the box. So here comes my question - is it possible ...
0
votes
1 answer
10 views

How to query Google BigQuery in Apache Airflow and return results as a pandas DataFrame?

I'm trying to save the result of a BigQuery query to a DataFrame in a custom Airflow operator. I've tried using airflow.contrib.hooks.bigquery_hook and the get_pandas_df method. The task gets stuck on ...
1
vote
1 answer
18 views

How do I write a query to cast the values in a column for all partitioned tables in BigQuery?

I am curious whether there is a way to query and write to all the partitioned tables in BigQuery. I want to cast a single column to a different data type and apply it to all the values across the ...
0
votes
1 answer
19 views

What is the difference between a shared Saved Query and a View in BigQuery?

What is the difference between a shared Saved Query and a View in BigQuery? The documentation says that "one advantage of [a saved query] is that you can share a query that is incomplete." Is that ...
0
votes
0 answers
19 views

How to export a Firestore sub-collection to a BigQuery table?

We need to export Firestore data to BigQuery for Data Studio reporting. We are following the process below: we export the entire Firestore database to a Google Cloud Storage bucket. Through a scheduled job ...
0
votes
1 answer
16 views

How is pricing calculated if I query a BigQuery view?

Say I have a BigQuery view MyView: select col1, col2, col3, col4, col5, col6, col7 from mytable; Now if I query my view: select col1 from MyView; So in this case, will the pricing be ...
0
votes
0 answers
15 views

How do I track users from a mobile app to a website?

I have a mobile app. I also have a website. The mobile app has links that direct users to a page on the website. I want to be able to use BigQuery or Google Analytics to track the user from the mobile ...
1
vote
0 answers
11 views

Firebase events dedup in BigQuery - best practices?

There seem to be 1-2% duplicates in the Firebase Analytics events exported to BigQuery. What are the best practices to remove these? At the moment, the client does not send a counter with the events (per ...
0
votes
0 answers
17 views

Google Data Studio + Firebase: event value automatically imported in USD. How to get a different currency?

I'm working on a Data Studio dashboard connected to a daily export of Firebase data contained in BigQuery. When initially importing the data source, the connector provides a convenient template to use ...
1
vote
0 answers
32 views

How to perform a fast join in BigQuery with Apache Beam

According to Beam's programming guide, and according to many threads, a join can be achieved with CoGroupByKey or KeyedPCollectionTuple (cookbook). No one is talking about the performance of these kinds of ...
0
votes
0 answers
12 views

How many events are lost on Firebase -> BigQuery export? [on hold]

We are using the Firebase Analytics events export to BigQuery. It seems that it can take up to 3 days for the events table in BigQuery to contain all the events we can see on the Firebase Analytics side. ...
0
votes
0 answers
26 views

What is the equivalent statement in BigQuery to set search_path?

I would like to know the BigQuery equivalent of the statement that sets search_path to a schema. In Redshift we use set search_path to schema_name; In SQL Server we use the statement USE Schema_name; ...
0
votes
2 answers
33 views

SQL - finding identical data in the dataset by joining on the table itself

I was trying to write a SQL query to find the following: User_info column1 column2 userId1 pete katie katie pete ...
1
vote
0 answers
20 views

How to have a single date field or write this more efficiently

We wrote a query that calculates the amount of time it takes to route an email from Gmail to a third-party security service and then back to Gmail. Now we want to graph it in Data Studio, but the way ...
0
votes
1 answer
20 views

Get list of tables in a BigQuery dataset using Python and the BigQuery API

How can I query a BigQuery dataset and get a list of all the tables in the dataset? As far as I know, I can only use the BigQuery API, but I cannot authenticate, despite passing an API key. url = ...
0
votes
0 answers
21 views

How do the initial k-means points work in BigQuery ML?

I'm using BigQuery for machine learning, more specifically the k-means method on unlabeled data, where I'm trying to find clusters. I'd like to know if someone has discovered how the BQ ML ...
0
votes
1 answer
31 views

BigQuery: join a Firestore collection with its subcollection

I imported my data from Firestore and I have a collection users with a subcollection profiles. The key of the users can be found in matchingUsers.__key__.name (e.g. "USER_KEY"), while the __key__....
0
votes
0 answers
21 views

BigQuery: Keep Diacritics

I'm working through a data set where there exist a certain number of entries with diacritics and I would like to know if there is a way to keep the diacritics on BigQuery and not have them turned into ...
0
votes
1 answer
36 views

Insert data into BigQuery from a Google Script: Encountered “”

I am trying to import data from a Google Spreadsheet to BigQuery, via Google Apps Script. I can download data, but I get an error when I try to do INSERT INTO. The error message is Encountered "" ...
0
votes
0 answers
26 views

BigQuery exports NUMERIC data type as binary data type in Avro

I am exporting data from a BigQuery table that has a column named prop12 defined as the NUMERIC data type. Please note that the destination format is Avro and can't be changed. bq extract --...
-2
votes
0 answers
22 views

How to make BigQuery load fast on Rails

cloud-bigquery gem with Rails, but for some reason the query is loading really slowly. On the BigQuery console it takes about a second to load the query, but using the google cloud bigquery gem takes about 5-...
1
vote
1 answer
30 views

Unique row number per date with partition

I have data containing product codes and the dates the data was updated. I'd like to number the rows per product code, so that I can select only the most recent update. I tried the following code: ...
0
votes
0 answers
17 views

Ingest data from Salesforce to BigQuery: Based on a variable

I'm relatively new to the Data Engineering side of things and I'm facing a rather specific obstacle for an exercise. The underlying question is to find a direct way to stream data from a platform like ...
2
votes
3 answers
34 views

How can I extract the number of occurrences where a value occurs as the MAX?

I have a summary table as below user_id service no_of_trx 1 A 56 1 C 43 1 B 22 2 C 10 2 A 3 ...
0
votes
0 answers
30 views

How to fix “field units already exist in schema” for pandas-gbq

Versions: macOS Mojave 10.14.5 Python 3.6.5 Pandas 0.24.2 pandas-gbq 0.10.0 I am trying to pull data from the ShipStation API and load it into BigQuery to use in our BI platform (Tableau). I have ...
1
vote
0 answers
20 views

Load partitioned (Spark) Parquet into a BigQuery table

I have data written out from Spark to Parquet files in GCS, partitioned on a date column. The data in GCS looks like this: gs://mybucket/dataset/fileDate=2019-06-17/000.parquet gs://mybucket/dataset/...
0
votes
1 answer
32 views

BigQuery - scheduled query through CLI

Simple question regarding the bq CLI tool. I am fairly confident the answer is, as of the writing of this question, no, but I may be wrong. Is it possible to create a scheduled query (similar to what is seen in ...
0
votes
1 answer
18 views

How to include table name in result for wildcard query?

I'm querying a bunch of tables in BigQuery using a wildcard query. I'd like each result row to show which table it's from. I've tried to include _TABLE_SUFFIX in the select, but it won't compile:...
0
votes
1 answer
57 views

7-day user count: BigQuery self-join to get date range and count?

My Google Firebase event data is integrated with BigQuery and I'm trying to fetch from it one of the metrics that Firebase gives me automatically: 1-day, 7-day, 28-day user count. 1-day count is quite ...
0
votes
0 answers
25 views

How can I insert a predefined value in a particular column of a BigQuery table?

I want to write a small Python utility job which will load data from 2 types of files (say one is "SourceA", another is "SourceB") present in a GCS bucket. Both CSV files have the same number of columns, ...
0
votes
1 answer
32 views

How to merge Google Ads and Firebase data in BigQuery [on hold]

I want to report on whole Google Ads campaigns, but there is a problem: I can't report data from Google Ads because 100 events in Firebase = 120-150 events in Google Ads (Google support didn't help). So ...
0
votes
0 answers
24 views

How to calculate session length and number of sessions in Firebase raw data?

I have linked a Firebase project with BigQuery. How can I calculate session length and number of sessions from the data? Is there any column or event_name similar to '.session_start / .session_stop ' which ...
0
votes
2 answers
36 views

How to group user sessions by converted row

I'm doing simple multichannel attribution exploration and got stuck with grouping user sessions. For example, I have simple sessions table: client channel time converted 1 social 1 0 1 cpc ...
0
votes
2 answers
31 views

Avoid duplication when joining tables without unique id using foreign keys

I'm facing an issue where I don't really know how to handle duplicate rows when joining two tables. I have two tables I'd like to join. Value_x table: ID Campaign Value_x foo ...
0
votes
0 answers
18 views

Not able to connect to BigQuery using Simba JDBC driver 42_1.2.0.1000

Encountering the following exception when trying to connect to BigQuery using JDBC Simba driver version 42_1.2.0.1000 Exception in thread "main" java.lang.NoSuchMethodError: com.google.common.base....
0
votes
1 answer
22 views

How to UNNEST and write ALL repeated rows into single line

Currently, I have a nested table with product names and ingredients, but I need to UNNEST() it and write all ingredients in a single line. SELECT productTitle, ingridientTitle FROM `TABLE`, UNNEST(...
0
votes
1 answer
30 views

Apache Beam Saving to BigQuery using Scio and explicitly specifying TriggeringFrequency

I'm using Spotify Scio to create a Scala Dataflow pipeline which is triggered by a Pub/Sub message. It reads from our private DB and then inserts information into BigQuery. The problem is: I need to ...
-1
votes
1 answer
41 views

What is the generic term for technologies like AWS Athena (Presto) and GCP BigQuery?

From a user perspective, Athena and BigQuery both accept a SQL-like query, they both query files stored on disk (without needing to have set up a relational database), and they both return results (...
0
votes
4 answers
47 views

How to extract only alphanumeric characters from a string? (SQL Google BigQuery)

Say I have a column called merchants containing these values: Al's Coffee Belinda & Mark Bakery Noodle Shop 38 How can I get it to extract: alscoffee belindamarkbakery ...
0
votes
2 answers
26 views

BigQuery: IN a temporary table

I have a long SQL query looking like this: SELECT AVG(total_sum) AS avg_total_sum, COUNT(*) AS cnt FROM ( SELECT order_id, ... FROM `project.dataset.orders` WHERE order_id NOT IN ( SELECT ...
0
votes
1 answer
34 views

BigQuery can't query some CSVs in a Cloud Storage bucket

I created a permanent BigQuery table that reads some CSV files from a Cloud Storage bucket sharing the same prefix name (filename*.csv) and the same schema. There are some CSVs, however, that make fail ...
0
votes
1 answer
34 views

BigQuery SQL: Execute query over rolling time window

Using BigQuery and Standard SQL, I am trying to calculate the retention rate for users seen in one period, compared to users seen in the period after. I want to calculate this daily, using the same period ...
0
votes
0 answers
31 views

Get shared datasets with the API

I am trying to get a list of all the datasets that are shared with me in Google BigQuery using the API. I tried using bigquery.projects.list, but the API only returns my own projects. How can I get all ...
0
votes
0 answers
38 views

How to call BigQuery from server side Swift?

I have an iOS app written in Swift. I would like to implement a web service in server-side Swift to be called from the iOS app. I would use Kitura installed as a Docker app on Google Cloud Run. The web ...
1
vote
1 answer
32 views

BigQuery connector ClassNotFoundException in PySpark on Dataproc

I'm trying to run a script in PySpark, using Dataproc. The script is kind of a merge between this example and what I need to do, as I wanted to check if everything works. Obviously, it doesn't. The ...
0
votes
0 answers
26 views

Scheduled query not appending any rows

I have a scheduled query to refresh an existing BQ table. BQ says the job runs, and confirms the time it finished. However, the rows never actually get appended. There are no sorts of errors or ...
0
votes
0 answers
48 views

Dataprep flow from CSV in Cloud Storage to BigQuery table incomplete (not all records loaded)

I set up a scheduled Dataprep job flow that daily copies and processes some CSV and JSON files stored in a Cloud Storage bucket into BigQuery tables. It was working fine, but for the past few days the job ...
0
votes
0 answers
67 views

How to utilize API call to overwrite/update Google Sheet?

My current API request will post a new Google Sheet every time it runs, but I just want it to either overwrite the existing sheet or just update it with new data. I've tried using different request ...
0
votes
3 answers
48 views

LIMIT per group - Google BigQuery/Standard SQL

I have a table like the following (example here): CREATE TABLE topics ( name varchar(64), url varchar(253), statistic integer, pubdate timestamp ); INSERT INTO topics VALUES ('a', 'b', 100,...