How do I extract a nested tuple from a tuple?

Posted on 2025-01-09 20:25:05

I'm using snscrape to scrape Instagram. snscrape returns the data in tuple format, but it puts the Instagram data in a nested tuple. For example:

for b in enumerate(sninstagram.InstagramUserScraper(username='houston_2731').get_items()):
    print(b)

Output:

(0, InstagramPost(url='https://www.instagram.com/p/CUdFfjEImHN/', date=datetime.datetime(2021, 9, 30, 17, 39, 20, tzinfo=datetime.timezone.utc), content='"Hardwork plus patience. A symbol of my sacrifice I\'m doing waiting." Nipsey  Hussle \n\nIt\'s hard to believe what 5 months and a disciplined diet and hitting the gym hard can do. The first pic in the collage is me at a challenging point in my life. Depression and what not but I had to snap out of it and get in the gym and do the work. As I continue to embark on this fitness journey. I hope to inspire some to join me on this journey. \n\n#fitness #weightloss #muscles #gymmotivation #gymrat #intermittentfasting #fitnessmotivation #fitnessjourney #tenpercentbodyfat #shredded #fitnessgoals #hardwork #patience #discipline #dedication #hunger', thumbnailUrl='https://instagram.fjnb12-1.fna.fbcdn.net/v/t51.2885-15/243385646_584565779558058_6508985384396360110_n.webp?stp=dst-jpg_e35_s640x640_sh0.08&_nc_ht=instagram.fjnb12-1.fna.fbcdn.net&_nc_cat=106&_nc_ohc=nrtaOwxdg64AX8NQE-Z&edm=ABfd0MgBAAAA&ccb=7-4&oh=00_AT_xE-O75IP4MezdzoHM_WxAgbXiivb3aBFUMopAkxxJSA&oe=621D237E&_nc_sid=7bff83', displayUrl='https://instagram.fjnb12-1.fna.fbcdn.net/v/t51.2885-15/243385646_584565779558058_6508985384396360110_n.webp?stp=dst-jpg_e35&_nc_ht=instagram.fjnb12-1.fna.fbcdn.net&_nc_cat=106&_nc_ohc=nrtaOwxdg64AX8NQE-Z&edm=ABfd0MgBAAAA&ccb=7-4&oh=00_AT8JXpM2XKqA_d06LV10Qy_Jt1GYnvpjUEeVZZMRIdwgnQ&oe=621D237E&_nc_sid=7bff83', username='houston_2731', likes=1, comments=0, commentsDisabled=False, isVideo=False))

Because of this, the output cannot be inserted into the database: the nested tuple causes a value error because of its type. The database doesn't recognize the type and the insert fails. What I want to do is extract the nested tuple and use it as the main tuple. How do I go about doing that?
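One way to read the output above: `enumerate` yields `(index, item)` pairs, and the second element is an `InstagramPost` object rather than a plain tuple. The following is a minimal sketch, assuming `InstagramPost` behaves like a dataclass. The `InstagramPost` class and `fake_get_items` function below are hypothetical stand-ins that only mirror the field names shown in the output, so the example runs without snscrape or network access.

```python
import dataclasses
import datetime


# Hypothetical stand-in for snscrape's InstagramPost; it only mirrors the
# field names visible in the printed output.
@dataclasses.dataclass
class InstagramPost:
    url: str
    date: datetime.datetime
    content: str
    thumbnailUrl: str
    displayUrl: str
    username: str
    likes: int
    comments: int
    commentsDisabled: bool
    isVideo: bool


def fake_get_items():
    """Stand-in for get_items(): yields one made-up post."""
    yield InstagramPost(
        url='https://example.com/p/1/',
        date=datetime.datetime(2021, 9, 30, 17, 39, 20, tzinfo=datetime.timezone.utc),
        content='example caption',
        thumbnailUrl='https://example.com/thumb.jpg',
        displayUrl='https://example.com/display.jpg',
        username='example_user',
        likes=1,
        comments=0,
        commentsDisabled=False,
        isVideo=False,
    )


rows = []
# Unpack the (index, item) pair instead of keeping it as one tuple `b`.
for index, post in enumerate(fake_get_items()):
    # astuple() flattens the dataclass into a plain tuple of its field values.
    rows.append(dataclasses.astuple(post))

print(len(rows[0]))  # prints 10, one value per field
```

If the real object turns out not to be a dataclass, reading the attributes one by one (`post.url`, `post.date`, and so on) gives the same flat tuple.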

import psycopg2
import snscrape.modules.instagram as sninstagram

class insta():

    def instagram(self):

        dbname = '******'
        user = '******'
        password = '******'
        host = '******' 
        port = ****
        cur = None
        conn = None
        
        try:
            conn = psycopg2.connect(
                    dbname = dbname,
                    user = user,
                    password = password,
                    host = host, 
                    port = port   
            )
            
            cur = conn.cursor()

            cur.execute('DROP TABLE IF EXISTS Machine_instagram')

            create_table =  '''CREATE TABLE IF NOT EXISTS Machine_instagram (
                id   serial PRIMARY KEY,
                url  char,
                date timestamp,
                content char,
                thumbnailUrl char,
                displayUrl char,
                username char,
                likes int,
                comments int,
                commentsDisabled bool,
                isVideo bool)'''

            cur.execute(create_table)

            # b is the (index, InstagramPost) pair produced by enumerate
            for b in enumerate(sninstagram.InstagramUserScraper(username='houston_2731').get_items()):
                insert_insta = 'INSERT INTO Machine_instagram (url, date, content, thumbnailUrl, displayUrl, username, likes, comments, commentsDisabled, isVideo) VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s)'
                insert_values = [(b)]
            for records in insert_values:
                cur.execute(insert_insta, records)

            conn.commit()
            print('completed')

        except Exception as error:
                print(error)  

        finally:
            if cur is not None:
                cur.close()
            if conn is not None:
                conn.close()

insta1 = insta()

insta1.instagram()
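The insert step could be sketched as follows, under some loud assumptions. `sqlite3` is used only so the example runs without a PostgreSQL server; with psycopg2 the placeholders would be `%s` instead of `?`. The `posts` list is a hypothetical stand-in for the scraped items, with made-up values. The text columns are declared as `TEXT` here because, in PostgreSQL, a bare `char` holds a single character, which may be a second source of errors in the table above.

```python
import sqlite3
from types import SimpleNamespace

# Hypothetical stand-in for scraped posts; attribute names follow the
# fields in the printed output, the values are made up.
posts = [
    SimpleNamespace(url='https://example.com/p/1/', date='2021-09-30 17:39:20',
                    content='example caption', thumbnailUrl='https://example.com/t.jpg',
                    displayUrl='https://example.com/d.jpg', username='example_user',
                    likes=1, comments=0, commentsDisabled=False, isVideo=False),
]

conn = sqlite3.connect(':memory:')  # stand-in for psycopg2.connect(...)
cur = conn.cursor()
cur.execute('''CREATE TABLE Machine_instagram (
    id INTEGER PRIMARY KEY,
    url TEXT, date TEXT, content TEXT, thumbnailUrl TEXT, displayUrl TEXT,
    username TEXT, likes INTEGER, comments INTEGER,
    commentsDisabled INTEGER, isVideo INTEGER)''')

insert_insta = ('INSERT INTO Machine_instagram (url, date, content, thumbnailUrl, '
                'displayUrl, username, likes, comments, commentsDisabled, isVideo) '
                'VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)')

for index, post in enumerate(posts):
    # Build one flat tuple with exactly ten values, matching the ten placeholders.
    record = (post.url, post.date, post.content, post.thumbnailUrl, post.displayUrl,
              post.username, post.likes, post.comments, post.commentsDisabled, post.isVideo)
    cur.execute(insert_insta, record)  # executed inside the loop, once per post

conn.commit()
stored = cur.execute('SELECT username, likes FROM Machine_instagram').fetchall()
print(stored)
conn.close()
```

Calling `execute` inside the loop means every post is inserted, rather than only the last value left over after the loop finishes.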
