How to Join 2 different variables in Pig?
I am a total newbie to Pig and have written the following Pig script:
define format `format_text.py $emoji $acronym` ship ('$stream_file_path/format_text.py');
define parse `parse.sh` ship ('$stream_file_path_syntaxnet/parse.sh');
define process_roots `process_roots.py` ship ('$stream_file_path_syntaxnet/process_roots.py');
input_data = load '$data_input';
result1 = stream input_data through format;
result2 = stream result1 through parse;
result3 = stream result2 through process_roots;
result4 = foreach result1 generate concat (result1, result3);
store result1 into '$data_output';
store result2 into '$syntaxnet_output';
store result4 into '$syntaxnet_results';

So, input_data is a JSON file of tweets.
format formats the "text" field of the JSON to clean the tweet. parse runs the cleaned JSON through SyntaxNet to generate dependency relations. The output, result2, looks like:
2 bank _ noun nnp _ 3 nn _ _
for each word of the tweet (where the second column is the word).
process_roots does more processing on result2 and generates result3, a JSON field that looks like:
avl_tags_syntaxnet: [{'pos_tag': 'nnp', 'position': '1', 'dep_rel': 'nn', 'parent': '3', 'word': 'us'}, ....................... {'pos_tag': '.', 'position': '30', 'dep_rel': 'punct', 'parent': '23', 'word': '...'}]
Now I want to append the newly created JSON field (result3) to result1 and store it somewhere. I read about CONCAT in Pig and wrote the code for result4, but the Pig script throws an error. Please tell me the right way to do it.
Concatenation is for fields, and you are trying to combine multiple aliases.
The way to combine aliases is a JOIN.
Based on how your code is structured, use (or add) a unique identifier for each line that remains intact during the various operations. In the final phase, join the aliases together on that identifier.
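
As a rough sketch of that approach, the script could look something like the one below. It assumes the streaming scripts are changed to emit the tweet id as their first tab-separated column and to pass it through unchanged. That behaviour, and the field names id, tweet_json and tags_json, are hypothetical and not part of the original scripts.

```pig
-- Sketch only. Assumes format_text.py and process_roots.py emit the tweet id as
-- the first tab-separated column (hypothetical change to the original scripts),
-- and that parse.sh carries that id through so process_roots.py can still emit it.
DEFINE format `format_text.py $emoji $acronym` SHIP ('$stream_file_path/format_text.py');
DEFINE parse `parse.sh` SHIP ('$stream_file_path_syntaxnet/parse.sh');
DEFINE process_roots `process_roots.py` SHIP ('$stream_file_path_syntaxnet/process_roots.py');

input_data = LOAD '$data_input';

-- Declare a schema on the streamed output so the id column can be referenced by name.
result1 = STREAM input_data THROUGH format AS (id:chararray, tweet_json:chararray);
result2 = STREAM result1 THROUGH parse;
result3 = STREAM result2 THROUGH process_roots AS (id:chararray, tags_json:chararray);

-- JOIN combines the two aliases on the shared id.
joined = JOIN result1 BY id, result3 BY id;

-- After a JOIN, fields are disambiguated as alias::field.
result4 = FOREACH joined GENERATE
    result1::id AS id,
    result1::tweet_json AS tweet_json,
    result3::tags_json AS tags_json;

STORE result1 INTO '$data_output';
STORE result2 INTO '$syntaxnet_output';
STORE result4 INTO '$syntaxnet_results';
```

This keeps the original tweet JSON and the new SyntaxNet field as separate columns in result4. If they need to end up inside a single JSON object, that merge would likely have to happen in another streaming step or a UDF, since CONCAT on the two strings would not produce valid JSON.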