The input data is in the following format:
the key is the imageid for which the nearest neighbor will be computed;
the value is a 100-dimensional vector of floating-point values separated by spaces or tabs.
The mapper reads in the query (itself a 100-dimensional vector) and each line of the input, and for every input line emits a (key2, value2) pair, where key2 is a floating-point value indicating the distance and value2 is the imageid.
The number of reducers is set to 1, and the reducer is the identity reducer.
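For concreteness, here is a minimal sketch of such a mapper in Python. It assumes the query vector is shipped to each task as a local file named query.txt and that "distance" means Euclidean distance; both the file name and the metric are placeholders for illustration, not details confirmed above.

#!/usr/bin/env python
# Sketch of the nearest-neighbor streaming mapper described above.
# Assumptions (illustrative only): the query vector lives in a local
# file called query.txt, and the distance is Euclidean.
import sys
import math

def parse_vector(text):
    # Components may be separated by spaces or tabs.
    return [float(x) for x in text.split()]

def main():
    with open("query.txt") as f:          # hypothetical file name
        query = parse_vector(f.read())

    for line in sys.stdin:
        line = line.strip()
        if not line:
            continue
        # Each input line: imageid followed by its 100-dimensional vector.
        parts = line.split()
        imageid = parts[0]
        vector = [float(x) for x in parts[1:]]
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(query, vector)))
        # Streaming expects "key<TAB>value" on stdout:
        # key2 = distance, value2 = imageid.
        print("%f\t%s" % (dist, imageid))

if __name__ == "__main__":
    main()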
I tried to use the following command:
bin/hadoop jar ./mapred/contrib/streaming/
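For reference, a complete Hadoop Streaming invocation for a job like this looks roughly as follows; the jar name, HDFS paths, and the mapper and query file names here are placeholders, not the exact ones from my setup:

bin/hadoop jar ./mapred/contrib/streaming/hadoop-streaming-*.jar \
    -input /user/hadoop/image_vectors \
    -output /user/hadoop/nn_output \
    -mapper mapper.py \
    -reducer org.apache.hadoop.mapred.lib.IdentityReducer \
    -file mapper.py \
    -file query.txt \
    -numReduceTasks 1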
The output is shown below. The failure is in the mapper itself, more specifically in the TextOutputReader. I am not sure how to fix this. The logs are attached below:
11/04/13 13:22:15 INFO security.Groups: Group mapping impl=org.apache.hadoop.
11/04/13 13:22:15 WARN conf.Configuration: mapred.used.
STREAM: addTaskEnvironment=
STREAM: shippedCanonFiles_=[]
STREAM: shipped: false /usr/local/hadoop/file1
STREAM: cmd=file1
STREAM: cmd=null
STREAM: shipped: false /usr/local/hadoop/org.apache.
STREAM: cmd=org.apache.hadoop.mapred.
11/04/13 13:22:15 WARN conf.Configuration: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
STREAM: Found runtime classes in: /usr/local/hadoop-hadoop/
packageJobJar: [/usr/local/hadoop-hadoop/
JarBuilder.addNamedStream META-INF/MANIFEST.MF
JarBuilder.addNamedStream org/apache/hadoop/typedbytes/ (one line per typedbytes class; class names truncated here)
JarBuilder.addNamedStream org/apache/hadoop/streaming/ (one line per streaming class; class names truncated here)
STREAM: ==== JobConf properties:
STREAM: (the full dump of dfs.*, fs.*, io.*, ipc.*, mapreduce.*, s3.*, and stream.* properties is truncated here; it includes dfs.replication=1, mapreduce.job.maps=2, and mapreduce.job.reduces=1)
STREAM: ====
STREAM: submitting to jobconf: localhost:54311
11/04/13 13:22:17 INFO mapred.FileInputFormat: Total input paths to process : 1
11/04/13 13:22:17 WARN conf.Configuration: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
11/04/13 13:22:17 INFO mapreduce.JobSubmitter: number of splits:2
11/04/13 13:22:17 INFO mapreduce.JobSubmitter: adding the following namenodes' delegation tokens:null
11/04/13 13:22:17 INFO streaming.StreamJob: getLocalDirs(): [/usr/local/hadoop-hadoop/
11/04/13 13:22:17 INFO streaming.StreamJob: Running job: job_201104131251_0002
11/04/13 13:22:17 INFO streaming.StreamJob: To kill this job, run:
11/04/13 13:22:17 INFO streaming.StreamJob: /usr/local/hadoop/bin/hadoop job -Dmapreduce.jobtracker.
11/04/13 13:22:17 INFO streaming.StreamJob: Tracking URL: http://localhost:50030/
11/04/13 13:22:18 INFO streaming.StreamJob: map 0% reduce 0%
11/04/13 13:23:19 INFO streaming.StreamJob: map 100% reduce 100%
11/04/13 13:23:19 INFO streaming.StreamJob: To kill this job, run:
11/04/13 13:23:19 INFO streaming.StreamJob: /usr/local/hadoop/bin/hadoop job -Dmapreduce.jobtracker.
11/04/13 13:23:19 INFO streaming.StreamJob: Tracking URL: http://localhost:50030/
11/04/13 13:23:19 ERROR streaming.StreamJob: Job not Successful!
11/04/13 13:23:19 INFO streaming.StreamJob: killJob...
Streaming Command Failed!
I looked at the output of the mapper and it fails with the following exception:
java.lang.NullPointerException
    at java.lang.String.
    at org.apache.hadoop.streaming.
    at org.apache.hadoop.streaming.
    at org.apache.hadoop.streaming.
    at org.apache.hadoop.streaming.
    at org.apache.hadoop.mapred.
    at org.apache.hadoop.streaming.
    at org.apache.hadoop.mapred.
    at org.apache.hadoop.mapred.
    at org.apache.hadoop.mapred.
    at java.security.
    at javax.security.auth.Subject.
    at org.apache.hadoop.security.
    at org.apache.hadoop.mapred.