Trying to scrape 2 tags with BeautifulSoup and put them in the same CSV

Posted on 2025-01-21 10:28:27

I'm currently learning Python and trying to build my own projects by adapting pieces of other people's code, so please bear with me while I'm learning.

I'm taking a list of stocks from tickers.csv and scraping a website to get each stock's sector and industry, then writing them to stocks.csv.

The problem is that I can only get either the sector or the industry (by choosing one) into stocks.csv with:

if __name__ == '__main__':
    to_csv(list(map(lambda ticker: get_sector(ticker), get_stocks())))
    # to_csv(list(map(lambda ticker: get_industry(ticker), get_stocks())))

I would like to get both the sector and the industry at the same time.
Here is the whole code:

# dependencies
import pandas as pd
import requests
from bs4 import BeautifulSoup as bs

LSE = 'https://csimarket.com/stocks/at_glance.php?code='


def get_stocks():
    df = pd.read_csv('watchlist/tickers.csv')
    return list(df['ticker'])


def to_csv(stocks):
    df = pd.DataFrame(stocks)
    df.to_csv('stocks.csv', index=False)


def get_soup(url):
    return bs(requests.get(url).text, 'html.parser')


def get_sector(ticker):
    soup = get_soup(LSE + ticker)
    try:
        sector = soup.find('span', text='Sector').find_next('a').text.replace('\n', '').replace('•', '').strip()
    except:
        print('No sector information available for ', ticker)
        return {'ticker': ticker, 'sector': ''}

    print(ticker, sector)
    return {'ticker': ticker, 'sector': sector}


def get_industry(ticker):
    soup1 = get_soup(LSE + ticker)
    try:
        industry = soup1.find('span', text='Industry').find_next('a').text.replace('\n', '').replace('•', '').strip()
    except:
        print('No industry information available for ', ticker)
        return {'ticker': ticker, 'industry': ''}

    print(ticker, industry)
    return {'ticker': ticker, 'industry': industry}


if __name__ == '__main__':
    to_csv(list(map(lambda ticker: get_sector(ticker), get_stocks())))
    # to_csv(list(map(lambda ticker: get_industry(ticker), get_stocks())))

Here is tickers.csv:

ticker,
A
AA
AADI
AAIC
AAL
AAN
AAOI
AAON
AAP
AAPL
AAT
AAU
AAWW
AB
ABB
ABBV
ABC
ABCB
ABCL
ABEO
ABEV
ABG
ABIO
ABM
ABMD
ABNB
ABOS
ABR
ABSI
ABST
ABT
ABTX
ABUS
ACA
ACAD
ACB
ACC
ACCD
ACCO
ACEL
ACER
ACET
ACEV
ACGL
ACH
ACHC
ACHR
ACHV
ACI
ACIU

Here is stocks.csv when I get the sectors:

ticker,sector
A,Healthcare
AA,Basic Materials
AADI,
AAIC,Services
AAL,Transportation
AAN,Services
AAOI,Technology
AAON,Capital Goods
AAP,Retail
AAPL,Technology
AAT,Financial
AAU,Basic Materials
AAWW,Transportation
AB,Financial
ABB,Consumer Discretionary
ABBV,Healthcare
ABC,Retail
ABCB,Financial
ABCL,Healthcare
ABEO,Healthcare
ABEV,Consumer Non Cyclical
ABG,Retail
ABIO,Healthcare
ABM,Services
ABMD,Healthcare
ABNB,Services
ABOS,Healthcare
ABR,Financial
ABSI,Healthcare
ABST,
ABT,Healthcare
ABTX,Financial
ABUS,Healthcare
ACA,Basic Materials
ACAD,Healthcare
ACB,
ACC,Financial
ACCD,Financial
ACCO,Basic Materials
ACEL,Services
ACER,Healthcare
ACET,Retail
ACEV,Technology
ACGL,Financial
ACH,Basic Materials
ACHC,Healthcare
ACHR,Capital Goods
ACHV,Healthcare
ACI,Energy
ACIU,

Here is stocks.csv when I get the industries:

ticker,industry
A,Laboratory Analytical Instruments
AA,Aluminum
AADI,
AAIC,Real Estate Operations
AAL,Airline
AAN,Rental & Leasing
AAOI,Computer Networks
AAON,Industrial Machinery and Components
AAP,Automotive Aftermarket
AAPL,Computer Hardware
AAT,Real Estate Investment Trusts
AAU,Metal Mining
AAWW,Special Transportation Services
AB,Investment Services
ABB,Electric & Wiring Equipment
ABBV,Biotechnology & Pharmaceuticals
ABC,Pharmacy Services & Retail Drugstore
ABCB,Regional Banks
ABCL,Major Pharmaceutical Preparations
ABEO,Major Pharmaceutical Preparations
ABEV,Nonalcoholic Beverages
ABG,Automotive Aftermarket
ABIO,In Vitro & In Vivo Diagnostic Substances
ABM,Professional Services
ABMD,Medical Equipment & Supplies
ABNB,Real Estate Operations
ABOS,Biotechnology & Pharmaceuticals
ABR,Real Estate Investment Trusts
ABSI,Medical Laboratories
ABST,
ABT,Major Pharmaceutical Preparations
ABTX,Commercial Banks
ABUS,Major Pharmaceutical Preparations
ACA,Miscellaneous Fabricated Products
ACAD,Major Pharmaceutical Preparations
ACB,
ACC,Real Estate Investment Trusts
ACCD,Blank Checks
ACCO,Paper & Paper Products
ACEL,Casinos & Gaming
ACER,Major Pharmaceutical Preparations
ACET,Pharmacy Services & Retail Drugstore
ACEV,Semiconductors
ACGL,Property & Casualty Insurance
ACH,Aluminum
ACHC,Healthcare Facilities
ACHR,Aerospace & Defense
ACHV,In Vitro & In Vivo Diagnostic Substances
ACI,Coal Mining
ACIU,

Comments (1)

明月松间行 2025-01-28 10:28:27

Just combine your two existing functions into one, and return both results parsed from a single soup object:

import pandas as pd
import requests
from bs4 import BeautifulSoup as bs

LSE = 'https://csimarket.com/stocks/at_glance.php?code='

def get_stocks():
    df = pd.read_csv('watchlist/tickers.csv')
    return list(df['ticker'])


def to_csv(stocks):
    df = pd.DataFrame(stocks)
    df.to_csv('stocks.csv', encoding='utf-8-sig', index=False)


def get_soup(url):
    return bs(requests.get(url, headers = {'User-Agent':'Mozilla/5.0'}).text, 'html.parser')


def get_data(ticker):
    # Fetch the page once and parse both fields from the same soup object.
    soup = get_soup(LSE + ticker)
    try:
        sector = soup.find('span', text='Sector').find_next('a').text.replace('\n', '').replace('•', '').strip()
    except AttributeError:
        # find() or find_next() returned None; keep the row with an empty value.
        print('No sector information available for', ticker)
        sector = ''

    try:
        industry = soup.find('span', text='Industry').find_next('a').text.replace('\n', '').replace('•', '').strip()
    except AttributeError:
        print('No industry information available for', ticker)
        industry = ''

    print(ticker, sector, industry)
    # Always return all three keys so every CSV row has the same columns.
    return {'ticker': ticker, 'sector': sector, 'industry': industry}

if __name__ == '__main__':
    to_csv([get_data(ticker) for ticker in get_stocks()])
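
If the two original functions are kept as they are, their outputs could also be combined afterwards by ticker. The following is only a minimal sketch of that idea and is not part of the answer above: `merge_by_ticker` and `rows_to_csv_text` are hypothetical helpers, and the sample rows are copied from the two stocks.csv listings in the question. It uses only the standard library, so the merge step can be checked without requesting any pages.

```python
import csv
import io


def merge_by_ticker(sector_rows, industry_rows):
    """Combine two lists of per-ticker dicts into one list of rows.

    Hypothetical helper: every output row has the same three keys, so a
    missing value becomes an empty string instead of a dropped column.
    """
    merged = {}
    for row in sector_rows + industry_rows:
        record = merged.setdefault(
            row['ticker'],
            {'ticker': row['ticker'], 'sector': '', 'industry': ''},
        )
        # Copy over whichever of the two fields this row carries.
        for key in ('sector', 'industry'):
            if key in row:
                record[key] = row[key]
    return list(merged.values())


def rows_to_csv_text(rows):
    """Render rows as CSV text with a fixed column order."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=['ticker', 'sector', 'industry'],
        lineterminator='\n',
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Sample rows taken from the two stocks.csv listings in the question.
sectors = [
    {'ticker': 'A', 'sector': 'Healthcare'},
    {'ticker': 'AADI', 'sector': ''},
]
industries = [
    {'ticker': 'A', 'industry': 'Laboratory Analytical Instruments'},
    {'ticker': 'AADI', 'industry': ''},
]

combined = merge_by_ticker(sectors, industries)
csv_text = rows_to_csv_text(combined)
print(csv_text)
```

Merging after the fact means each page is still fetched twice (once per function), so parsing both fields from one soup object is the cheaper option when the scraper can be changed.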