python 3.x - Unicode characters to Turkish characters


(Edit: the original question is posted here; the issue has been resolved, and the code below is correct.) I am looking for advice on how to convert Unicode characters to Turkish characters. The following code (posted online) scrapes the tweets of an individual user and outputs them to a CSV file, but the Turkish characters come out as escaped characters, e.g. \xc4. I am using Python 3 on a Mac.

import csv

import tweepy  # https://github.com/tweepy/tweepy

# Python 3 strings are already Unicode and the default encoding is UTF-8,
# so the Python 2 reload(sys) / sys.setdefaultencoding() workaround is not needed.

# Twitter API credentials
consumer_key = ""
consumer_secret = ""
access_key = ""
access_secret = ""


def get_all_tweets(screen_name):
    # Twitter only allows access to a user's most recent 3240 tweets with this method

    # authorize Twitter, initialize tweepy
    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_key, access_secret)
    api = tweepy.API(auth)

    # initialize a list to hold all the tweepy tweets
    alltweets = []

    # make the initial request for the most recent tweets (200 is the maximum allowed count)
    new_tweets = api.user_timeline(screen_name=screen_name, count=200)

    # save the most recent tweets
    alltweets.extend(new_tweets)

    # save the id of the oldest tweet less one
    oldest = alltweets[-1].id - 1

    # keep grabbing tweets until there are no tweets left to grab
    while len(new_tweets) > 0:
        # print("getting tweets before %s" % oldest)

        # all subsequent requests use the max_id param to prevent duplicates
        new_tweets = api.user_timeline(screen_name=screen_name, count=200, max_id=oldest)

        # save the most recent tweets
        alltweets.extend(new_tweets)

        # update the id of the oldest tweet less one
        oldest = alltweets[-1].id - 1

    # transform the tweepy tweets into a 2D array that will populate the CSV
    outtweets = [[tweet.id_str, tweet.created_at, tweet.text] for tweet in alltweets]

    # write the CSV; the text is written as str, not as encoded bytes
    with open('%s_tweets.csv' % screen_name, 'w', newline='', encoding='utf-8-sig') as f:
        writer = csv.writer(f)
        writer.writerow(["id", "created_at", "text"])
        writer.writerows(outtweets)


if __name__ == '__main__':
    # pass in the username of the account you want to download
    get_all_tweets("")
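The question's code opens the file with encoding='utf-8-sig', which writes a byte order mark (BOM) ahead of the UTF-8 data; some spreadsheet programs use that mark to detect the encoding. The sketch below is only an illustration of that behaviour, not part of the original script: the sample row, the temporary path and the file name are made up, and no Twitter access is involved. It writes a row containing Turkish letters, checks for the BOM, and reads the text back.

```python
import csv
import os
import tempfile

# Hypothetical sample row; the Turkish text stands in for tweet.text.
rows = [["1", "2016-01-01 12:00:00", "çığ öğüş İstanbul"]]

# Hypothetical output location in a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "sample_tweets.csv")

# 'utf-8-sig' writes a BOM before the UTF-8 encoded text.
with open(path, "w", newline="", encoding="utf-8-sig") as f:
    writer = csv.writer(f)
    writer.writerow(["id", "created_at", "text"])
    writer.writerows(rows)

# The raw file starts with the three BOM bytes.
with open(path, "rb") as f:
    raw = f.read()
starts_with_bom = raw.startswith(b"\xef\xbb\xbf")

# Reading with 'utf-8-sig' strips the BOM and restores the original text.
with open(path, newline="", encoding="utf-8-sig") as f:
    read_back = list(csv.reader(f))

print(starts_with_bom)
print(read_back[1][2])
```

If the file is only ever read by other Python code, plain 'utf-8' should work just as well; the BOM mainly matters for programs that guess the encoding.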

The csv module docs recommend that you specify the encoding when you open the file (and that you use newline='' so the csv module can do its own handling of newlines). Don't encode the Unicode strings yourself when writing rows.

import csv

with open('test.csv', 'w', newline='', encoding='utf-8') as f:
    writer = csv.writer(f)
    writer.writerow(['id', 'created_at', 'text'])
    writer.writerows([[123, 456, 'Äβç']])
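To see why encoding by hand is a problem, here is a small sketch of what probably produced the \xc4 output in the question. It is an assumption about the cause, not something the original post confirms. In Python 3, calling .encode() gives a bytes object, and csv.writer writes the str() of any non-string value, so the escaped bytes end up in the file literally. The sample text and the to_csv helper are made up for the illustration.

```python
import csv
import io

text = "ğüş"  # hypothetical sample text with Turkish letters

# Encoding by hand produces bytes; 'ğ' becomes the two bytes \xc4\x9f.
encoded = text.encode("utf-8")


def to_csv(row):
    """Write one row to an in-memory CSV and return the resulting text."""
    buf = io.StringIO(newline="")
    csv.writer(buf).writerow(row)
    return buf.getvalue()


wrong = to_csv([encoded])  # csv writes str(encoded), i.e. the b'...' repr
right = to_csv([text])     # the str is written unchanged

print(wrong)
print(right)
```

The first line printed contains the literal b'\xc4\x9f...' escapes, while the second contains the Turkish letters themselves; the encoding given to open() then takes care of turning them into bytes on disk.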
