Filtering a specific row in Kudu using a Kudu scanner


The target table in Kudu is huge. I have the following Scala code and would like to check whether a given row exists in Kudu. These four columns form the primary key of the Kudu table, but when I define an upper bound I seem to get all the rows.
How do I select a particular row in Kudu? Here I expect only one row to be returned.
val table2: KuduTable = kuduClient.openTable("event-sets")
val eventColumns: util.List[String] = List(
  OccurrenceSchema.SetId.name,
  OccurrenceSchema.Period.name,
  OccurrenceSchema.Event.name,
  OccurrenceSchema.Date.name).asJava
val end: PartialRow = table2.getSchema.newPartialRow()
end.addInt(OccurrenceSchema.Period.name, 1476)
end.addInt(OccurrenceSchema.SetId.name, 82)
end.addInt(OccurrenceSchema.Event.name, 3195167)
end.addLong(OccurrenceSchema.Date.name, 1367922840000L)
val kuduScanner: KuduScanner = kuduClient.newScannerBuilder(table2)
  .setProjectedColumnNames(eventColumns)
  .lowerBound(end)
  .exclusiveUpperBound(end)
  .build()
assert(kuduScanner.hasMoreRows)
while (kuduScanner.hasMoreRows) {
  val resultIterator: RowResultIterator = kuduScanner.nextRows()
  while (resultIterator.hasNext) {
    val result: RowResult = resultIterator.next()
    assert(result != null)
    logger.info(" : SetId Value -- " + result.getInt(OccurrenceSchema.SetId.name))
    logger.info(" : Period Value -- " + result.getInt(OccurrenceSchema.Period.name))
    logger.info(" : Event Value -- " + result.getInt(OccurrenceSchema.Event.name))
    logger.info(" : Date Value -- " + result.getLong(OccurrenceSchema.Date.name))
  }
}
From my understanding, you are looking for exactly one record in your table.
Using a scanner with bounds and/or a limit didn't work for me either. Instead, I solved the problem by defining a KuduPredicate for each primary key column.
Below is my solution.
// assumes the Kudu client classes are in scope, e.g. import org.apache.kudu.client._
val schema = table2.getSchema
val builder: KuduScanner.KuduScannerBuilder = kuduClient.newScannerBuilder(table2)
// define the columns you want to select
builder.setProjectedColumnNames(eventColumns)
// add one equality predicate per primary key column;
// newComparisonPredicate expects a ColumnSchema, so each column is looked up by name
val pkPeriod: KuduPredicate = KuduPredicate.newComparisonPredicate(
  schema.getColumn(OccurrenceSchema.Period.name), KuduPredicate.ComparisonOp.EQUAL, 1476L)
builder.addPredicate(pkPeriod)
val pkSetId: KuduPredicate = KuduPredicate.newComparisonPredicate(
  schema.getColumn(OccurrenceSchema.SetId.name), KuduPredicate.ComparisonOp.EQUAL, 82L)
builder.addPredicate(pkSetId)
val pkEvent: KuduPredicate = KuduPredicate.newComparisonPredicate(
  schema.getColumn(OccurrenceSchema.Event.name), KuduPredicate.ComparisonOp.EQUAL, 3195167L)
builder.addPredicate(pkEvent)
val pkDate: KuduPredicate = KuduPredicate.newComparisonPredicate(
  schema.getColumn(OccurrenceSchema.Date.name), KuduPredicate.ComparisonOp.EQUAL, 1367922840000L)
builder.addPredicate(pkDate)
val kuduScanner: KuduScanner = builder.build()
while (kuduScanner.hasMoreRows) {
  val resultIterator: RowResultIterator = kuduScanner.nextRows()
  while (resultIterator.hasNext) {
    val result: RowResult = resultIterator.next()
    // do whatever you have to do with the selected record
    logger.info(" : SetId Value -- " + result.getInt(OccurrenceSchema.SetId.name))
  }
}
I'm new to Kudu, so I'm not sure whether this solution is the most efficient one, but at least it returns the expected result.
My original code was written and tested in Java. I ported it to Scala by hand and have not tested the Scala version yet.
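A side note on the bounds approach from the question: taking the method names at face value, lowerBound is inclusive and exclusiveUpperBound is exclusive, so passing the same row to both describes an empty range rather than a single row. The sketch below models that half-open range with plain Scala tuples instead of the Kudu client. The key order (SetId, Period, Event, Date), the extra rows, and the names HalfOpenScan, scan and successor are my own assumptions for illustration. It does not explain why the scan in the question returned every row.

```scala
object HalfOpenScan {
  // Hypothetical composite key, assumed to be ordered as (setId, period, event, date).
  type Key = (Int, Int, Int, Long)

  private val ord: Ordering[Key] = implicitly[Ordering[Key]]

  // Keep every key k with lower <= k < upper (inclusive lower, exclusive upper).
  def scan(keys: Seq[Key], lower: Key, upper: Key): Seq[Key] =
    keys.sorted.filter(k => ord.gteq(k, lower) && ord.lt(k, upper))

  // Smallest key strictly greater than k, assuming the last column is below Long.MaxValue.
  def successor(k: Key): Key = k.copy(_4 = k._4 + 1)

  // The key from the question plus a few made-up neighbours.
  val target: Key = (82, 1476, 3195167, 1367922840000L)
  val rows: Seq[Key] = Seq(
    (82, 1476, 3195166, 1367922840000L),
    target,
    (82, 1476, 3195167, 1367922840001L),
    (83, 1, 1, 1L))

  def main(args: Array[String]): Unit = {
    // Same row as both bounds: the half-open range [target, target) is empty.
    println(scan(rows, target, target))
    // Upper bound moved to the next possible key: only the target row matches.
    println(scan(rows, target, successor(target)))
  }
}
```

If you want to stay with bounds, the upper bound would therefore have to be the next possible key after the target row. The equality predicates above avoid having to compute that key.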
