Forums › Active Intelligence Server › Google BigQuery connection

This topic contains 7 replies, has 4 voices, and was last updated by jit.mehta 9 months, 2 weeks ago.

    #339

    jroeterd
    Participant

    Hello,

    I am trying to connect to Google BigQuery (I have just started with that). In the settings.properties file I see an example for BigQuery as follows:

    #external.db.4=bigquery,rosy-ratio-132710,100

    Assuming that 'rosy-ratio-132710' is the project you are pointing to, what is the 100? And how can you add a username/password or a JSON key file?

    Thanks for any answer.

    Regards,
    Jeroen

    #341

    debbie_harry
    Participant

    Hi Jeroen,

    Thank you for your query. Yes, you are correct: you add the connection properties in the settings.properties file. Below is the explanation:

    external.db.4=bigquery,rosy-ratio-132710,100

    Project name: rosy-ratio-132710
    Wait time: 100 (in milliseconds)

    Properties such as keys and the project ID are usually stored in a JSON file, which is downloaded when you create a key. To create a key for your project, follow these steps:

    Log on to your console:

    https://console.developers.google.com/iam-admin/

    Go to the Service Accounts tab in the left menu bar.

    While in the Service Accounts tab, click the ellipsis on the right-hand side to expand the options; there you will find the option to create a key.

    Once the key is generated, it will automatically download a JSON file with all the properties.

    You are also required to create a system environment variable with the following name:

    GOOGLE_APPLICATION_CREDENTIALS

    and set its value to the location of the JSON file you downloaded.

    Once the above steps have been followed, you can now update your settings.properties file.
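    For example, on Linux or macOS you could set the variable like this (the key-file path below is only an example; use the location where your own JSON key was saved):

```shell
# Point Google client libraries (and AIS) at the service-account key.
# NOTE: this path is an example; substitute your own downloaded JSON key file.
export GOOGLE_APPLICATION_CREDENTIALS="$HOME/keys/rosy-ratio-132710-key.json"

# Check that the variable is set correctly.
echo "$GOOGLE_APPLICATION_CREDENTIALS"
```

    On Windows you would instead add it under System Properties > Environment Variables.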

    I hope the above helps you connect to Google BigQuery. You can get step-by-step help for BigQuery from the following link:

    https://cloud.google.com/bigquery/docs/how-to

    BR, Debbie

    #344

    jroeterd
    Participant

    Hello Debbie,

    A very late followup.
    I succeeded in making a connection to Google BigQuery as you described. When I create a query on a table with 12 columns and about 40k records, I see the correct column names in the Output Columns tab of the dataset. So the connection is there.

    But the data is not getting there! When I add a limit 100 to the query and try to preview it, it shows the 'loading…' text forever (or at least for the 15 minutes I tried). In another tool we are evaluating, the same query takes 2 seconds.

    I tried looking in the logs but there is nothing to see.

    In the Apache logs directory there is a localhost_access_log for that date which shows one entry from the time I was running the preview.

    0:0:0:0:0:0:0:1 – – [20/Sep/2017:15:11:08 +0200] “POST /ais/rest/dataset/preview HTTP/1.1” 500 5213

    Maybe the 500 or the 5213 means something.

    Any help is appreciated.

    Kind regards,
    Jeroen

    #345

    joe_dallesandro
    Participant

    Hi Jeroen,

    Thank you for your query; we are really sorry that you are facing an issue connecting to Google BigQuery. What I can figure out from the error 500 is that AIS isn't getting any data (in that access log line, 500 is the HTTP status code, an internal server error, and 5213 is the size of the response in bytes). This could be a connection issue; however, since you can pull the data from another tool/system, I do not see it as one.
    We have tried various datasets with 50k+ records and all seem to work fine. I would suggest that you take our free live technical support, which will enable our technical team to provide you remote assistance and help you overcome the problem you are currently facing extracting the data. This will enable them to run some troubleshooting and find the error point.
    Kindly drop an email to support@activeintelligence.co.uk with your availability and they will arrange everything for you.

    Thanks
    Joe Dallesandro

    #346

    jroeterd
    Participant

    Hello Joe,

    I experimented a little further. In BigQuery I have a json field (type STRING) which contains a JSON string.

    I created a view in which I select the fields from the JSON that I want to use, using the following SQL:

    json_extract_scalar(json, '$.category.name') as categoryName

    All fields work up to this one; then it says "no records to show".

    I also see some timestamp fields in the generated view that stay empty (although the other data is shown). I generated them using:

    timestamp(substr(json_extract_scalar(json, '$.callDate'), 0, 22)) as callDate,
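    As a side note, what these two expressions do can be sketched in plain Python (the sample JSON string below is made up for illustration, it is not real data):

```python
import json

# A made-up sample of the kind of JSON string stored in the `json` column.
row = '{"category": {"name": "billing"}, "callDate": "2017-09-20T15:11:08.123+0200"}'

doc = json.loads(row)

# Rough equivalent of json_extract_scalar(json, '$.category.name')
category_name = doc["category"]["name"]

# Rough equivalent of substr(json_extract_scalar(json, '$.callDate'), 0, 22):
# keep only the first 22 characters before casting to a timestamp.
call_date_prefix = doc["callDate"][:22]

print(category_name)     # billing
print(call_date_prefix)  # 2017-09-20T15:11:08.12
```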

    I will investigate further.

    Jeroen

    #347

    joe_dallesandro
    Participant

    Hi Jeroen,

    Could you please share some sample data, so that I can try it at my end?

    #404

    jroeterd
    Participant

    Hello, Finally a follow up on this …

    I saw the new AIS 2.1 on your site and installed it. I added a BigQuery data source from the GUI (thanks for that) and tried again to get my data out of BigQuery.

    But then I realized I was querying a view that was created with BigQuery's standardSQL, while the default SQL that AIS expects (as does BigQuery itself) is legacySQL.

    So by adding #standardSQL before my actual query I get the results I expected. My dates are also shown in the correct way. So AIS was doing it right all along :-).

    But I have found another issue I would like to present to you.

    When I run a query in the BigQuery Query Editor (in the browser), using this query on the Public Datasets > nyc-tlc:green dataset:

    #standardSQL
    select pickup_datetime from `nyc-tlc.green.trips_2014`
    limit 10000

    Then the timestamp fields are shown as:
    2014-03-27 10:40:00.000 UTC
    2014-01-08 06:45:20.000 UTC

    When I copy the same query to AIS and run it, the field is also recognized as a timestamp field, but the results are shown like this:
    1.3959168E9
    1.38916352E9

    changing the query to:
    #standardSQL
    select datetime(pickup_datetime) from `nyc-tlc.green.trips_2014`
    limit 10000

    fixes this, but it would be nice if timestamp fields returned a human-readable format, like the query does when run in BigQuery itself.
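    For what it is worth, those floats are just the UNIX epoch seconds of the same instants; a quick check in Python (not AIS code, just an illustration):

```python
from datetime import datetime, timezone

# The first value AIS showed for pickup_datetime,
# read as seconds since the UNIX epoch.
epoch_seconds = 1.3959168e9

# Converting it back gives exactly the timestamp BigQuery displayed.
dt = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
print(dt)  # 2014-03-27 10:40:00+00:00
```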

    I tend to write the queries in the BQ Query Editor (color-coded, auto-complete, etc.) and copy them to my visualization tool of choice.

    If this is the desired behaviour I respect that, but as it is known that the field is of type timestamp, it seems strange to me to present it as a float.

    Kind regards,
    Jeroen

    #405

    jit.mehta
    Participant

    Hi Jeroen,

    You can describe TIMESTAMP data as either UNIX timestamps or calendar datetimes. BigQuery stores TIMESTAMP data internally as a UNIX timestamp with microsecond precision. Check here for more information: https://cloud.google.com/bigquery/data-types

    Currently AIS supports legacy SQL queries; standard SQL support should be added shortly. If this is blocking you, just send an email to support@activeintelligence.co.uk and they should provide you a patch. Alternatively, you can use the workaround as you did, or format the timestamp using the function FORMAT_UTC_USEC.

    Kind Regards,
    Jitendra

