How can I load my CSV from Google Datalab into a pandas DataFrame?
Here is what I tried (IPython notebook, with Python 2.7):

    import gcp
    import gcp.storage as storage
    import gcp.bigquery as bq
    import matplotlib.pyplot as plt
    import pandas as pd
    import numpy as np

    sample_bucket_name = gcp.Context.default().project_id + '-datalab'
    sample_bucket_path = 'gs://' + sample_bucket_name
    sample_bucket_object = sample_bucket_path + '/myFile.csv'
    sample_bucket = storage.Bucket(sample_bucket_name)
    df = bq.Query(sample_bucket_object).to_dataframe()

This fails. Do you have any leads on what I am doing wrong?
Based on the Datalab source code, bq.Query() is primarily used to execute BigQuery SQL queries. For reading a file from Google Cloud Storage (GCS), one potential solution is to use Datalab's %storage line magic to read the CSV from GCS into a local variable. Once you have the data in a variable, you can then use the pd.read_csv() function to convert the CSV-formatted data into a pandas DataFrame. The following should work:

    import pandas as pd
    from StringIO import StringIO

    # Read csv file from GCS into a variable
    %storage read --object gs://cloud-datalab-samples/cars.csv --variable cars

    # Store in a pandas dataframe
    df = pd.read_csv(StringIO(cars))

There is also a related Stack Overflow question at the following link: Reading in a file with Google datalab
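The final conversion step can be sketched in isolation, assuming %storage read has already placed the CSV text in a string variable. The sample data below is made up for illustration, and on Python 3 the import comes from io rather than the StringIO module:

```python
import pandas as pd
from io import StringIO  # Python 2.7: from StringIO import StringIO

# Hypothetical stand-in for the string that `%storage read`
# would place in the `cars` variable
cars = "make,model,year\nFord,Focus,2015\nToyota,Corolla,2016\n"

# pd.read_csv accepts any file-like object, so wrapping the raw
# CSV text in StringIO is enough to parse it into a DataFrame
df = pd.read_csv(StringIO(cars))
print(df.shape)  # (2, 3)
```

This works because pd.read_csv does not care whether its input is a file on disk or an in-memory buffer, only that it is file-like.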