json - How to join 2 different variables in Pig?
I am a total newbie in Pig, and I have written the following Pig script:
define format `format_text.py $emoji $acronym` ship ('$stream_file_path/format_text.py');
define parse `parse.sh` ship ('$stream_file_path_syntaxnet/parse.sh');
define process_roots `process_roots.py` ship ('$stream_file_path_syntaxnet/process_roots.py');

input_data = load '$data_input';
result1 = stream input_data through format;
result2 = stream result1 through parse;
result3 = stream result2 through process_roots;
-- this is the statement that throws the error
result4 = foreach result1 generate concat (result1, result3);

store result1 into '$data_output';
store result2 into '$syntaxnet_output';
store result4 into '$syntaxnet_results';
So, input_data is a JSON file of tweets. The format step cleans the "text" field of the JSON to produce a clean tweet. The parse step runs the cleaned JSON through SyntaxNet to generate dependency relations. Its output, result2, looks like this:
2 bank _ noun nnp _ 3 nn _ _
with one such line for each word of the tweet (where the second column is the word).
The process_roots step does some more processing on result2 and generates result3, a JSON field that looks like this:
avl_tags_syntaxnet: [{'pos_tag': 'nnp', 'position': '1', 'dep_rel': 'nn', 'parent': '3', 'word': 'us'}, ....................... {'pos_tag': '.', 'position': '30', 'dep_rel': 'punct', 'parent': '23', 'word': '...'}]
Now, I want to append the newly created JSON field (result3) to result1 and store it somewhere. I read about concat in Pig and wrote the code for result4 in my Pig script, but it throws an error. Please tell me the right way to do it.
concat is for the concatenation of fields, whereas you are trying to combine multiple aliases.
The way to combine aliases is with a join.
Based on how your code is structured, you will need to use (or add) a unique identifier for each line that remains intact during the various operations. Then, in the final phase, you can join all of them together.
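As a rough sketch of the identifier idea, and assuming each line handed to a streaming script is laid out as a tweet id, a tab, and then the tweet JSON (a hypothetical layout, nothing in the question confirms it), a script such as process_roots.py could carry the id through untouched and emit it next to the new field. The function name attach_id and the word_count field are made up for illustration; the real script would emit avl_tags_syntaxnet instead.

```python
import json


def attach_id(lines):
    """Yield 'id<TAB>new_field_json' for each 'id<TAB>tweet_json' input line.

    Assumption: the first tab-separated column is a unique tweet id.
    """
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        # Split only on the first tab so tabs inside the JSON are preserved.
        tweet_id, payload = line.split("\t", 1)
        tweet = json.loads(payload)
        # Placeholder for the real processing: count the words in "text".
        new_field = {"word_count": len(tweet.get("text", "").split())}
        # The id is written back unchanged, so it can serve as the join key.
        yield tweet_id + "\t" + json.dumps(new_field, sort_keys=True)


# In a real streaming script the input would be sys.stdin, e.g.:
#   for out in attach_id(sys.stdin): print(out)
```

If both result1 and result3 keep that column, and it is named id in each schema (for example by adding an as (id:chararray, ...) clause to the stream statements), the last step could be something like result4 = join result1 by id, result3 by id; rather than concat.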