google-cloud-datalab


How can I load my CSV from Google Datalab into a pandas DataFrame?


Here is what I tried:
(IPython notebook, with Python 2.7)
import gcp
import gcp.storage as storage
import gcp.bigquery as bq
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
sample_bucket_name = gcp.Context.default().project_id + '-datalab'
sample_bucket_path = 'gs://' + sample_bucket_name
sample_bucket_object = sample_bucket_path + '/myFile.csv'
sample_bucket = storage.Bucket(sample_bucket_name)
df = bq.Query(sample_bucket_object).to_dataframe()
This fails.
Do you have any leads on what I am doing wrong?
Based on the Datalab source code, bq.Query() is primarily used to execute BigQuery SQL queries, so passing it a Google Cloud Storage (GCS) object path instead of SQL is unlikely to work. To read a file from GCS, one potential solution is to use the Datalab %storage line magic to read the CSV from GCS into a local variable. Once you have the data in a variable, you can use the pd.read_csv() function to convert the CSV-formatted data into a pandas DataFrame. The following should work:
import pandas as pd
from StringIO import StringIO
# Read csv file from GCS into a variable
%storage read --object gs://cloud-datalab-samples/cars.csv --variable cars
# Store in a pandas dataframe
df = pd.read_csv(StringIO(cars))
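The parsing step on its own can also be written so that it does not depend on the Python 2 only StringIO module. The following is a minimal sketch, not part of the original answer: the helper name csv_text_to_dataframe and the sample CSV content are hypothetical, the string assigned to cars merely stands in for whatever %storage read placed in the variable, and the data is assumed to be UTF-8 encoded.

```python
import io

import pandas as pd


def csv_text_to_dataframe(data):
    """Parse CSV content held in memory (text or bytes) into a DataFrame."""
    if isinstance(data, bytes):
        # Assumption: the file is UTF-8 encoded; adjust if yours is not.
        data = data.decode("utf-8")
    # io.StringIO wraps the text in a file-like object that read_csv accepts.
    return pd.read_csv(io.StringIO(data))


# Hypothetical stand-in for the variable filled by `%storage read`.
cars = "make,model,year\nFord,Focus,2012\nHonda,Civic,2010\n"

df = csv_text_to_dataframe(cars)
print(df.shape)
```

In a notebook you would pass the variable populated by %storage read in place of the sample string, then inspect the result with df.head().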
There is also a related Stack Overflow question at the following link:
Reading in a file with Google datalab
