Showing posts with the label Hadoop

Why Am I Getting These Strange Connection Errors When Reading Or Writing To Hadoop File System With A Python Script?

I wrote a Python script to read and write to a Hadoop file system with IP hdfs_ip. It takes 3 argumen…

Remove Empty Line Printed From Hive Query Output Using Python

I am performing a Hive query and storing the output in a TSV file in the local FS. I am running a f…

Hadoop-streaming : Reduce Task In Pending State Says "no Room For Reduce Task."

My map task completes successfully and I can see the application logs, but reducer stays in pending…

How Does Spark Running On YARN Account For Python Memory Usage?

After reading through the documentation I do not understand how Spark running on YARN account …

MapReduce: How To Allow Mapper To Read An XML File For Lookup

In my MapReduce jobs, I pass a product name to the Mapper as a string argument. The Mapper.py scrip…

Encountered IOException While Registering Python UDF In Pig. File Helloworld.py Does Not Exist

Python UDF: @outputSchema('word:chararray') def helloworld(): return 'Hello, World…