Tuesday, 13 August 2013

Update Mongo Collection Using mongo-hadoop & Pig

I am using the mongo-hadoop project with Pig to achieve the following.
Right now I have two Pig LOAD statements. Both relations share catid as a
common field; all other fields are unique to each. I want a single record to
be created per catid, containing the fields from both.
For example:
Collection1: { catid, key1, key2 }
Collection2: { catid, key3, key4 }
and the output should be stored in a Mongo collection as:
{ _id, catid, key1, key2, key3, key4 }
I tried:
STORE A INTO '$DB.tablename' USING com.mongodb.hadoop.pig.MongoStorage('update [catid]', '{catid:1},{unique:false}');
STORE B INTO '$DB.tablename' USING com.mongodb.hadoop.pig.MongoStorage('update [catid]', '{catid:1},{unique:false}');
but it always inserts. A and B each have 10 records with matching catid
values, yet the output in Mongo is 20 records. It doesn't upsert. Any help on
this? Thanks.
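
One possible way to get a single record per catid, sketched here rather than verified, is to join the two relations inside Pig and store the joined relation once, so nothing needs to be upserted. This is a different technique from the update-style MongoStorage call above. The jar paths, the collection names collection1 and collection2, and the chararray field types are all assumptions for illustration; only catid, key1 to key4, $DB and tablename come from the question.

```pig
-- Assumption: jar locations are placeholders; point them at your own install.
REGISTER /path/to/mongo-java-driver.jar;
REGISTER /path/to/mongo-hadoop-core.jar;
REGISTER /path/to/mongo-hadoop-pig.jar;

-- Assumption: collection names and chararray types are illustrative only.
A = LOAD '$DB.collection1'
    USING com.mongodb.hadoop.pig.MongoLoader('catid:chararray, key1:chararray, key2:chararray');
B = LOAD '$DB.collection2'
    USING com.mongodb.hadoop.pig.MongoLoader('catid:chararray, key3:chararray, key4:chararray');

-- Inner join on the shared field: one output tuple per matching catid pair.
J = JOIN A BY catid, B BY catid;

-- Flatten the joined tuple so catid appears once, followed by the other keys.
M = FOREACH J GENERATE
        A::catid AS catid,
        A::key1  AS key1,
        A::key2  AS key2,
        B::key3  AS key3,
        B::key4  AS key4;

-- A single STORE writes one document per joined tuple; Mongo assigns _id.
STORE M INTO '$DB.tablename' USING com.mongodb.hadoop.pig.MongoStorage();
```

With 10 records in A and 10 in B that share the same 10 catid values, this should write 10 documents, provided catid is unique within each relation. An inner join drops any catid that appears in only one relation; JOIN A BY catid FULL OUTER, B BY catid would keep those, with nulls for the missing fields. Note that this approach writes new documents and does not update ones already in the target collection.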
