Filtering a specific row in kudu using kudu scanner


The target table in Kudu is huge. I have the following Scala code and would like to check whether a given row exists in Kudu. These four columns make up the table's primary key, but when I define an upper bound I seem to get all the rows back.
How do I select one particular row in Kudu? Here I expect only one row to be returned.
    val table2: KuduTable = kuduClient.openTable("event-sets")
    val eventColumns: util.List[String] = List(
      OccurrenceSchema.SetId.name,
      OccurrenceSchema.Period.name,
      OccurrenceSchema.Event.name,
      OccurrenceSchema.Date.name).asJava
    val end: PartialRow = table2.getSchema.newPartialRow()
    end.addInt(OccurrenceSchema.Period.name, 1476)
    end.addInt(OccurrenceSchema.SetId.name, 82)
    end.addInt(OccurrenceSchema.Event.name, 3195167)
    end.addLong(OccurrenceSchema.Date.name, 1367922840000L)
    val kuduScanner: KuduScanner = kuduClient.newScannerBuilder(table2)
      .setProjectedColumnNames(eventColumns)
      .lowerBound(end)
      .exclusiveUpperBound(end)
      .build()
    assert(kuduScanner.hasMoreRows)
    while (kuduScanner.hasMoreRows) {
      val resultIterator: RowResultIterator = kuduScanner.nextRows()
      while (resultIterator.hasNext) {
        val result: RowResult = resultIterator.next()
        assert(result != null)
        logger.info(" : SetId Value -- " + result.getInt(OccurrenceSchema.SetId.name))
        logger.info(" : Period Value -- " + result.getInt(OccurrenceSchema.Period.name))
        logger.info(" : Event Value -- " + result.getInt(OccurrenceSchema.Event.name))
        logger.info(" : Date Value -- " + result.getLong(OccurrenceSchema.Date.name))
      }
    }
From my understanding, you are looking for exactly one record in your table.
Using a scanner with bounds and/or a limit didn't work for me either. Instead, I solved the problem by adding a KuduPredicate for each primary key column.
My solution is below.
    import org.apache.kudu.client.KuduPredicate

    val builder: KuduScanner.KuduScannerBuilder = kuduClient.newScannerBuilder(table2)
    // define the columns you want to select
    builder.setProjectedColumnNames(eventColumns)
    // add one equality predicate per primary key column;
    // newComparisonPredicate takes a ColumnSchema, so look each column up in the table schema
    val schema = table2.getSchema
    // values are passed as Long literals; the long overload covers all integer column types
    val pkPeriod: KuduPredicate = KuduPredicate.newComparisonPredicate(schema.getColumn(OccurrenceSchema.Period.name), KuduPredicate.ComparisonOp.EQUAL, 1476L)
    builder.addPredicate(pkPeriod)
    val pkSetId: KuduPredicate = KuduPredicate.newComparisonPredicate(schema.getColumn(OccurrenceSchema.SetId.name), KuduPredicate.ComparisonOp.EQUAL, 82L)
    builder.addPredicate(pkSetId)
    val pkEvent: KuduPredicate = KuduPredicate.newComparisonPredicate(schema.getColumn(OccurrenceSchema.Event.name), KuduPredicate.ComparisonOp.EQUAL, 3195167L)
    builder.addPredicate(pkEvent)
    val pkDate: KuduPredicate = KuduPredicate.newComparisonPredicate(schema.getColumn(OccurrenceSchema.Date.name), KuduPredicate.ComparisonOp.EQUAL, 1367922840000L)
    builder.addPredicate(pkDate)
    val kuduScanner: KuduScanner = builder.build()
    while (kuduScanner.hasMoreRows) {
      val resultIterator: RowResultIterator = kuduScanner.nextRows()
      while (resultIterator.hasNext) {
        val result: RowResult = resultIterator.next()
        // do whatever you have to do with the selected record
        logger.info(" : SetId Value -- " + result.getInt(OccurrenceSchema.SetId.name))
      }
    }
I'm new to Kudu, so I'm not sure whether this is the most efficient solution, but it does return the expected result.
My original code was written and tested in Java. I ported it to Scala by hand and have not tested the Scala version yet!
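
As a side note on the bounds approach: lowerBound is documented as inclusive and exclusiveUpperBound as exclusive, so together they describe a half-open key range. The sketch below is plain Scala rather than Kudu client code. It only models that rule on an in-memory list, and the Key case class, its column order, and the scan helper are hypothetical names made up for illustration. Under this model, passing the same key as both bounds describes an empty range, and a range containing exactly one key needs the upper bound to be the next possible key. It does not explain why the scan in the question returned every row.

```scala
object HalfOpenRangeSketch {
  // Hypothetical composite key. The real column order is whatever the
  // table schema defines for its primary key; this order is an assumption.
  final case class Key(setId: Int, period: Int, event: Int, date: Long)

  // Compare keys column by column, left to right.
  val keyOrdering: Ordering[Key] =
    Ordering.by((k: Key) => (k.setId, k.period, k.event, k.date))

  // Keep the keys k with lower <= k < upperExclusive.
  def scan(rows: Seq[Key], lower: Key, upperExclusive: Key): Seq[Key] =
    rows.filter(k => keyOrdering.gteq(k, lower) && keyOrdering.lt(k, upperExclusive))

  def main(args: Array[String]): Unit = {
    val target = Key(82, 1476, 3195167, 1367922840000L)
    val rows = Seq(
      target.copy(event = target.event - 1), // sorts before the target
      target,
      target.copy(date = target.date + 1)    // sorts right after the target
    )

    // Same key for both bounds: nothing satisfies target <= k < target.
    println(scan(rows, target, target).isEmpty)

    // Upper bound is the next possible key: only the target matches.
    println(scan(rows, target, target.copy(date = target.date + 1)) == Seq(target))
  }
}
```

With equality predicates on every key column, as in the solution above, there is no successor key to compute, which is one reason predicates are the simpler choice for a single-row lookup.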