How can I load a CSV file from Google Datalab into a pandas DataFrame?
Here is what I tried (IPython notebook, Python 2.7):

    import gcp
    import gcp.storage as storage
    import gcp.bigquery as bq
    import matplotlib.pyplot as plt
    import pandas as pd
    import numpy as np

    sample_bucket_name = gcp.Context.default().project_id + '-datalab'
    sample_bucket_path = 'gs://' + sample_bucket_name
    sample_bucket_object = sample_bucket_path + '/myFile.csv'
    sample_bucket = storage.Bucket(sample_bucket_name)
    df = bq.Query(sample_bucket_object).to_dataframe()

This fails. Do you have any leads on what I am doing wrong?
Based on the Datalab source code, bq.Query() is primarily used to execute BigQuery SQL queries, so it expects SQL text rather than the path of an object in a bucket. For reading a file from Google Cloud Storage (GCS), one potential solution is to use the Datalab %storage line magic to read the CSV from GCS into a local variable. Once you have the data in a variable, you can use pd.read_csv() to convert the CSV-formatted data into a pandas DataFrame. The following should work:

    import pandas as pd
    from StringIO import StringIO

    # Read csv file from GCS into a variable
    %storage read --object gs://cloud-datalab-samples/cars.csv --variable cars

    # Store in a pandas dataframe
    df = pd.read_csv(StringIO(cars))

There is also a related Stack Overflow question: Reading in a file with Google datalab
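The pd.read_csv(StringIO(...)) step can also be tried on its own, outside Datalab. The following is only a minimal sketch: it assumes the variable filled by %storage read holds the CSV as text, and csv_text with its column names is a made-up stand-in, not the actual contents of cars.csv.

```python
import pandas as pd

try:
    from StringIO import StringIO  # Python 2.7, as in the question
except ImportError:
    from io import StringIO  # Python 3

# Hypothetical stand-in for the text that %storage read would store in `cars`.
csv_text = "Make,Model,Year\nFord,Focus,2012\nHonda,Civic,2010\n"

# Wrap the string in a file-like object so read_csv can parse it.
df = pd.read_csv(StringIO(csv_text))
print(df.shape)
```

Wrapping the string in StringIO is needed because pd.read_csv() treats a plain string argument as a file path, not as CSV content.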