How are Long ids used in Google Datastore insert/update queries?
Our product uses Google Datastore as the application database. Most of the entities use IDs of type Long and some use IDs of type String. I noticed that the IDs of type Long are not in consecutive order. We are now exporting some big tables, with around 30-40 million entries, to JSON files for business purposes. Initially we expected that a simple query like "ofy().load().type(ENTITY.class).startAt(cursor).limit(BATCH_LIMIT).iterator()" would let us iterate through the entire content of that table, starting from the first entry and ending with the most recently created one. We work in batches and store the cursor after every batch, so that the next task can load the cursor and resume. But after noticing that an entity created a few minutes ago can have an ID smaller than that of an entity created a week ago, we are wondering whether we should consider a content freeze during the export. On one hand, it is critical to produce a good export and not to miss older data up to a specific date; on the other hand, a content freeze longer than one day is a problem for our customers. What do you advise us to do? Thanks, Cristian.
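The batch-and-resume loop described above, and the concern about late entities with small IDs, can be modeled with a minimal in-memory sketch. It assumes the scan runs in ascending ID order and that the cursor simply marks the last ID read. The names `BatchExportSketch`, `Batch`, and `nextBatch` are hypothetical, and a `TreeMap` stands in for the kind, so this is not the Objectify or Datastore API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for one kind, ordered by Long ID.
// Assumption: the export scans in ascending key order; this is not the Objectify API.
class BatchExportSketch {
    // Result of one batch: the exported IDs plus the cursor to store for the next task.
    static final class Batch {
        final List<Long> ids;
        final Long cursor; // last ID read, or the previous cursor if nothing was read

        Batch(List<Long> ids, Long cursor) {
            this.ids = ids;
            this.cursor = cursor;
        }
    }

    // Read up to `limit` entities whose ID is strictly greater than the cursor.
    static Batch nextBatch(TreeMap<Long, String> table, Long cursor, int limit) {
        Map<Long, String> rest = (cursor == null) ? table : table.tailMap(cursor, false);
        List<Long> ids = new ArrayList<>();
        Long last = cursor;
        for (Long id : rest.keySet()) {
            if (ids.size() == limit) {
                break;
            }
            ids.add(id);
            last = id;
        }
        return new Batch(ids, last);
    }

    public static void main(String[] args) {
        TreeMap<Long, String> table = new TreeMap<>();
        table.put(10L, "a");
        table.put(20L, "b");
        table.put(30L, "c");
        table.put(40L, "d");

        Batch first = nextBatch(table, null, 2);
        // An entity created mid-export that receives an ID below the stored cursor.
        table.put(15L, "late");
        Batch second = nextBatch(table, first.cursor, 2);

        System.out.println("first batch:  " + first.ids);
        System.out.println("second batch: " + second.ids);
    }
}
```

In this model the entity with ID 15 is never exported, because it lands behind the stored cursor. Whether a real Datastore cursor behaves the same way depends on the ordering of the query, so this is an illustration of the concern rather than a statement about the service.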
I do not think you need to worry about the uniqueness of your IDs. Datastore is built on top of Bigtable and uses six tables:
1. The first stores entities.
2. The second stores entities by kind.
3. The third stores indexes for the property values in ascending order.
4. The fourth stores indexes for the property values in descending order.
5. The fifth stores indexes for multiple properties together.
6. The sixth keeps track of the next unique ID for each kind.
The key format is something like [application ID]-[namespace]-[Kind]-[ID], which guarantees that each entity is unique.
Yes, the key format in that last table is [Application ID]-[Kind Name], and the value is the next ID. Let's say you have a kind called products; the table will look like this: |key(yourapp-products), Next ID(3)|. When you create a new entity of kind products, it is assigned ID(3) and the row in that table gets a new value: |key(yourapp-products), Next ID(4)|. Note also that the table has only one row, since we have only one kind, products.
Do you specify the ID yourself or let Datastore generate it? It sounds like you have a "Pre-allocating IDs" issue. I am just speculating, but for every batch you need some sort of Kind.allocate_ids(size=blah) call; that way you can keep the sequence.
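The "next ID" row described above can be illustrated with a small in-memory model: a minimal sketch, assuming one counter per [Application ID]-[Kind Name] key. The class `IdSequenceTable` and its methods `allocateId` and `allocateIds` are hypothetical names for illustration only; they are not Datastore's implementation and not the `allocate_ids` API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of the "next ID" table described in the answer.
// This is not Datastore's implementation or API.
class IdSequenceTable {
    // Row key "[application ID]-[kind]" -> next unused ID for that kind.
    private final Map<String, Long> nextIdByKind = new HashMap<>();

    private static String rowKey(String appId, String kind) {
        return appId + "-" + kind;
    }

    // Hand out a single ID and advance the counter, as in the products example.
    synchronized long allocateId(String appId, String kind) {
        return allocateIds(appId, kind, 1)[0];
    }

    // Reserve a contiguous block of IDs; returns {first, last}, both inclusive.
    synchronized long[] allocateIds(String appId, String kind, int size) {
        if (size < 1) {
            throw new IllegalArgumentException("size must be >= 1");
        }
        String key = rowKey(appId, kind);
        long first = nextIdByKind.getOrDefault(key, 1L);
        long last = first + size - 1;
        nextIdByKind.put(key, last + 1);
        return new long[] {first, last};
    }

    public static void main(String[] args) {
        IdSequenceTable table = new IdSequenceTable();
        table.allocateId("yourapp", "products"); // ID 1
        table.allocateId("yourapp", "products"); // ID 2, so the row now holds Next ID(3)
        long id = table.allocateId("yourapp", "products");
        long[] block = table.allocateIds("yourapp", "products", 100);
        System.out.println("single ID: " + id);
        System.out.println("reserved block: " + block[0] + " to " + block[1]);
    }
}
```

A counter like this hands out IDs in creation order within one kind. The question reports IDs that are not consecutive, so the real allocation evidently differs from this simple model; the sketch only makes the table layout in the answer concrete.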