How are Long ids used in Google Datastore insert/update queries?


Our product is using Google Datastore as the application database. Most of the entities use IDs of type Long and some of type String. I noticed that the IDs of type Long are not in consecutive order.
We are now exporting some big tables, with around 30-40 million entries each, to JSON files for business purposes. Initially we expected that a simple query like "ofy().load().type(ENTITY.class).startAt(cursor).limit(BATCH_LIMIT).iterator()" would let us iterate through the entire content of a table, starting from the first entry and ending with the most recently created one. We work in batches and store the cursor after every batch, so that the next task can load the cursor and resume from it.
However, we noticed that an entity created a few minutes ago can have an ID smaller than the ID of an entity created a week ago, so we are wondering whether we should impose a content freeze during the export. On one hand, it is critical to produce a complete export and not miss any data older than a specific date. On the other hand, a content freeze longer than one day is a problem for our customers.
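The concern above can be illustrated with a small simulation. This is only a sketch and does not use Objectify or Datastore: it assumes the query returns entities in ascending ID order and that the cursor marks a position in that order. The class CursorExportDemo and the method nextBatch are hypothetical names made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Toy simulation of a key-ordered, cursor-resumed export (hypothetical, not Objectify code).
class CursorExportDemo {
    // Returns up to `limit` IDs strictly greater than `cursor`, in ascending order.
    static List<Long> nextBatch(TreeMap<Long, String> table, long cursor, int limit) {
        List<Long> batch = new ArrayList<>();
        for (Long id : table.tailMap(cursor, false).keySet()) {
            if (batch.size() == limit) {
                break;
            }
            batch.add(id);
        }
        return batch;
    }

    public static void main(String[] args) {
        TreeMap<Long, String> table = new TreeMap<>();
        table.put(10L, "a");
        table.put(20L, "b");
        table.put(30L, "c");
        table.put(40L, "d");

        List<Long> exported = new ArrayList<>();
        long cursor = 0L;

        // First batch: exports IDs 10 and 20, then stores the cursor.
        List<Long> first = nextBatch(table, cursor, 2);
        exported.addAll(first);
        cursor = first.get(first.size() - 1);

        // A new entity is created with an ID below the stored cursor.
        table.put(15L, "late");

        // Remaining batches resume from the stored cursor.
        List<Long> batch;
        while (!(batch = nextBatch(table, cursor, 2)).isEmpty()) {
            exported.addAll(batch);
            cursor = batch.get(batch.size() - 1);
        }

        System.out.println(exported); // [10, 20, 30, 40], ID 15 is not exported
    }
}
```

Under these assumptions, an entity written behind the cursor is never visited by the remaining batches, which is exactly the case we are worried about.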
What do you advise us to do?
Thanks,
Cristian.
I do not think you need to worry about the uniqueness of your IDs. Datastore is built on top of Bigtable and uses six tables:
the first stores entities
the second stores entities by kind
the third stores indexes for the property values in ascending order
the fourth stores indexes for the property values in descending order
the fifth stores indexes for multiple properties together
the sixth keeps track of the next unique ID for each kind
The entity key format is something like this:
[application ID]-[namespace]-[Kind]-[ID]
This guarantees the uniqueness of each entity.
Yes, the key format in that table is [Application ID]-[Kind Name], and the value is the next ID. Let's say you have a kind products; the table will look like this: |key(yourapp-products), Next ID(3)|. When you create a new entity of kind products, it is assigned ID(3), and the row in that table gets the new value |key(yourapp-products), Next ID(4)|. Note that the table has only one row here, since we have only one kind, products.
Do you specify the ID yourself, or do you let Datastore generate it? It sounds like you have a "pre-allocating IDs" issue. I am just speculating, but for every batch you need something like Kind.allocate_ids(size=blah); that way you can keep the sequence.
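As a rough illustration of the counter row described above, here is a toy model in plain Java. It is only a sketch of the description in this answer, not Datastore's actual implementation; the class KindIdCounter and the method allocate are hypothetical names, and the counter is assumed to start at 1.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the "next ID" table described above (hypothetical, not Datastore code).
class KindIdCounter {
    // Row key: "[application ID]-[kind name]"; value: the next ID to hand out.
    private final Map<String, Long> nextIdByKind = new HashMap<>();

    // Returns the current next ID for the kind and advances its counter by one.
    long allocate(String appId, String kind) {
        String rowKey = appId + "-" + kind;
        long id = nextIdByKind.getOrDefault(rowKey, 1L); // assumption: counters start at 1
        nextIdByKind.put(rowKey, id + 1);
        return id;
    }

    public static void main(String[] args) {
        KindIdCounter counter = new KindIdCounter();
        System.out.println(counter.allocate("yourapp", "products")); // 1
        System.out.println(counter.allocate("yourapp", "products")); // 2
        System.out.println(counter.allocate("yourapp", "orders"));   // 1
    }
}
```

Each kind has its own row, so IDs are unique per kind in this model, and a second kind starts its own sequence.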
