How to group items in a list until a certain element is found?
Going to try my best to explain this.
Currently I am scraping this web page using Selenium. I am just trying to get the chords from the page so I am using this code:
for elem in driver.find_elements(By.XPATH, './/span[@class = "_3PpPJ OrSDI"]'):
    print(elem.text)
but this makes them print in a long list.
The issue is I need them to be organized the way they are on the website.
For example:
[ Verse 1 ]
G Em C G
[ Verse 2 ]
G Em C G
I'm not sure if I should use following-sibling or something else. My full code is below:
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.chrome.options import Options
from selenium_stealth import stealth
from fake_useragent import UserAgent
from bs4 import BeautifulSoup as BS
import pandas as pd
import time
from collections import Counter
import undetected_chromedriver as uc
from selenium import webdriver
from cleantext import clean
from selenium.common.exceptions import NoSuchElementException

options = Options()
#options.headless = True
options.add_argument("start-maximized")
options.add_argument('--no-sandbox')
driver = uc.Chrome(options=options)

url6 = 'https://tabs.ultimate-guitar.com/tab/olivia-rodrigo/drivers-license-chords-3504560'
driver.get(url6)
driver.implicitly_wait(30)

try:
    verse_1 = driver.find_element(
        By.XPATH, "//span[contains(text(), '[Verse 1]')]").text
    print(verse_1)
except NoSuchElementException:
    time.sleep(1)

for elem in driver.find_elements(By.XPATH, './/span[@class = "_3PpPJ OrSDI"]'):
    print(elem.text)

verse_2 = driver.find_element(
    By.XPATH, "//span[contains(text(), '[Verse 2]')]").text
print(verse_2)
1 Answer
I changed your code a little. This should work out of the box:
This is the output:
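The answer's code and output were not preserved on this page. As a minimal sketch of the grouping idea (the `group_sections` helper is my own illustration, not the answerer's code): since `find_elements` returns the spans in document order, the flat list of texts can be partitioned in plain Python, starting a new group at every `[Section]`-style header.

```python
import re

def group_sections(lines):
    """Partition a flat list of span texts into (header, chords) pairs.

    Any text of the form "[...]" (e.g. "[Verse 1]") starts a new
    section; every other text is treated as a chord belonging to the
    most recent header. Chords seen before any header are dropped.
    """
    sections = []
    for text in lines:
        text = text.strip()
        if re.fullmatch(r"\[[^\]]*\]", text):
            sections.append((text, []))       # new section header
        elif sections:
            sections[-1][1].append(text)      # chord under current header
    return sections

# A flat list like the one printed by the find_elements loop:
flat = ["[Verse 1]", "G", "Em", "C", "G", "[Verse 2]", "G", "Em", "C", "G"]
for header, chords in group_sections(flat):
    print(header, " ".join(chords))
# [Verse 1] G Em C G
# [Verse 2] G Em C G
```

With Selenium this would be fed something like `[e.text for e in driver.find_elements(By.XPATH, ...)]`; doing the grouping after scraping avoids the trickier `following-sibling` XPath entirely.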