How do I extract a nested tuple from a tuple?
I'm using snscrape to scrape Instagram. snscrape returns the data as a tuple, but the Instagram data ends up in a nested tuple. For example:
for b in enumerate(sninstagram.InstagramUserScraper(username='houston_2731').get_items()):
    print(b)
Output:
(0, InstagramPost(url='https://www.instagram.com/p/CUdFfjEImHN/', date=datetime.datetime(2021, 9, 30, 17, 39, 20, tzinfo=datetime.timezone.utc), content='"Hardwork plus patience. A symbol of my sacrifice I\'m doing waiting." Nipsey Hussle \n\nIt\'s hard to believe what 5 months and a disciplined diet and hitting the gym hard can do. The first pic in the collage is me at a challenging point in my life. Depression and what not but I had to snap out of it and get in the gym and do the work. As I continue to embark on this fitness journey. I hope to inspire some to join me on this journey. \n\n#fitness #weightloss #muscles #gymmotivation #gymrat #intermittentfasting #fitnessmotivation #fitnessjourney #tenpercentbodyfat #shredded #fitnessgoals #hardwork #patience #discipline #dedication #hunger', thumbnailUrl='https://instagram.fjnb12-1.fna.fbcdn.net/v/t51.2885-15/243385646_584565779558058_6508985384396360110_n.webp?stp=dst-jpg_e35_s640x640_sh0.08&_nc_ht=instagram.fjnb12-1.fna.fbcdn.net&_nc_cat=106&_nc_ohc=nrtaOwxdg64AX8NQE-Z&edm=ABfd0MgBAAAA&ccb=7-4&oh=00_AT_xE-O75IP4MezdzoHM_WxAgbXiivb3aBFUMopAkxxJSA&oe=621D237E&_nc_sid=7bff83', displayUrl='https://instagram.fjnb12-1.fna.fbcdn.net/v/t51.2885-15/243385646_584565779558058_6508985384396360110_n.webp?stp=dst-jpg_e35&_nc_ht=instagram.fjnb12-1.fna.fbcdn.net&_nc_cat=106&_nc_ohc=nrtaOwxdg64AX8NQE-Z&edm=ABfd0MgBAAAA&ccb=7-4&oh=00_AT8JXpM2XKqA_d06LV10Qy_Jt1GYnvpjUEeVZZMRIdwgnQ&oe=621D237E&_nc_sid=7bff83', username='houston_2731', likes=1, comments=0, commentsDisabled=False, isVideo=False))
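Because of this, the output cannot be inserted into the database: the nested tuple causes a value error because of its type. The database does not recognize the type and the insert fails. What I want to do is extract the nested tuple and use it as the main tuple. How do I go about doing that?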
import psycopg2
import snscrape.modules.instagram as sninstagram

class insta():
    def instagram(self):
        dbname = '******'
        user = '******'
        password = '******'
        host = '******'
        port = ****
        cur = None
        conn = None
        try:
            conn = psycopg2.connect(
                dbname = dbname,
                user = user,
                password = password,
                host = host,
                port = port
            )
            cur = conn.cursor()
            cur.execute('DROP TABLE IF EXISTS Machine_instagram')
            create_table = '''CREATE TABLE IF NOT EXISTS Machine_instagram (
                id serial PRIMARY KEY,
                url char,
                date timestamp,
                content char,
                thumbnailUrl char,
                displayUrl char,
                username char,
                likes int,
                comments int,
                commentsDisabled bool,
                isVideo bool)'''
            cur.execute(create_table)
            # enumerate() yields (index, InstagramPost) pairs, so b has two elements
            for b in enumerate(sninstagram.InstagramUserScraper(username='houston_2731').get_items()):
                insert_insta = 'INSERT INTO Machine_instagram (url, date, content,thumbnailUrl, displayUrl, username, likes, comments, commentsDisabled, isVideo) VALUES (%s, %s, %s, %s,%s, %s, %s, %s, %s, %s)'
                insert_values = [(b)]
                for records in insert_values:
                    cur.execute(insert_insta, records)
            conn.commit()
            print('completed')
        except Exception as error:
            print(error)
        finally:
            if cur is not None:
                cur.close()
            if conn is not None:
                conn.close()

insta1 = insta()
insta1.instagram()
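One possible way to get a flat row is sketched below. It assumes the second element of each pair is an object whose attributes match the field names in the printed output (url, date, content, and so on). The `InstagramPost` class here is a hypothetical stand-in so the example runs without snscrape or network access, and `COLUMNS`, `post_to_row` and `rows_from_items` are names made up for illustration.

```python
import datetime
from dataclasses import dataclass
from operator import attrgetter

# Hypothetical stand-in for snscrape's InstagramPost, using the fields
# visible in the printed output. Not the library's own class.
@dataclass
class InstagramPost:
    url: str
    date: datetime.datetime
    content: str
    thumbnailUrl: str
    displayUrl: str
    username: str
    likes: int
    comments: int
    commentsDisabled: bool
    isVideo: bool

# Same order as the column list in the INSERT statement.
COLUMNS = ("url", "date", "content", "thumbnailUrl", "displayUrl",
           "username", "likes", "comments", "commentsDisabled", "isVideo")

# attrgetter with several names returns a callable that produces a flat tuple.
post_to_row = attrgetter(*COLUMNS)

def rows_from_items(items):
    """Yield one flat 10-value tuple per post."""
    # enumerate() gives (index, post); unpack the pair and keep only the post.
    for _index, post in enumerate(items):
        yield post_to_row(post)

sample = [InstagramPost(
    url="https://www.instagram.com/p/CUdFfjEImHN/",
    date=datetime.datetime(2021, 9, 30, 17, 39, 20, tzinfo=datetime.timezone.utc),
    content="example caption",
    thumbnailUrl="https://example.com/thumb.jpg",
    displayUrl="https://example.com/display.jpg",
    username="houston_2731",
    likes=1, comments=0, commentsDisabled=False, isVideo=False,
)]

rows = list(rows_from_items(sample))
print(len(rows[0]))  # 10
```

Each row built this way has ten values for the ten `%s` placeholders, so it could be passed as the second argument to `cur.execute`. If the index is not needed, `enumerate` can simply be dropped. Separately, a bare `char` column in PostgreSQL means `character(1)`, so the URL and caption columns would probably need `text` or `varchar` instead.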