AJAX search and scraping a bus page // maybe Selenium?

Posted on 2025-01-17 10:57:10


I am trying to get the prices of routes on a bus page:

import requests
from bs4 import BeautifulSoup

def get_headers(session):
    res = session.get("https://new.turbus.cl/turbuscl/inicio-compra")
    if res.status_code == 200:
        print("Got headers")
        return res.text
    else:
        print("Failed to get headers")

def search(session):
    data = {
        'origenInputModal': 'Santiago',
        'destinoInputModal': 'Calama',
        'fechaRegreso': '03-04-2021',
        'fechaIda': '31-03-2021',
    }
    res = session.post(
        "https://new.turbus.cl/turbuscl/seleccion-itinerario",
        data=data)  # not sure if this is the search link
    if res.status_code == 200:
        print("Search succeeded")
        return res.text
    else:
        print("Search failed with error:", res.reason)

def get_popup_link(html):
    soup = BeautifulSoup(html, "html.parser")
    for t in soup.find_all('div', {'class': 'ticket_price-value'}):
        # Tag.find() does not accept CSS selectors; the matched div
        # itself holds the price text.
        precio = t.get_text(strip=True)
        print(f"{precio=}")
        return precio

def main():
    with requests.Session() as s:
        get_headers(s)
        html = search(s)
        popup_link = get_popup_link(html)
        print(popup_link)

main()
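Once a price string is scraped, it usually needs cleaning before it can be compared numerically. A small sketch, assuming Chilean-style price strings such as "$12.300" (the actual format on the page may differ):

```python
import re

def parse_price(text):
    """Strip everything that is not a digit, e.g. "$12.300" -> 12300."""
    digits = re.sub(r"\D", "", text)
    return int(digits) if digits else None

print(parse_price("$12.300"))  # 12300
```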

This is the link: https://new.turbus.cl/turbuscl/inicio-compra

So right now I am able to find the input boxes of the search, but I am not sure where to submit it.

I am getting this error: ValueError: too many values to unpack (expected 2).

So I am not sure where I am failing.

Would you try to enlighten me so I can succeed?

I have been trying all day, and took a new approach with Selenium in order to run the search...

Is what I am doing right, or was my first approach better?


# -*- coding: utf-8 -*-
"""
Created on Tue Mar 29 16:20:40 2022

@author: christian marcos
"""

from selenium import webdriver as wd
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait



# Select and fill the first field (origin)
driver = wd.Chrome('C:\\chromedriver.exe')
driver.maximize_window()
driver.get('https://new.turbus.cl/turbuscl/inicio-compra')

driver.implicitly_wait(20)
driver.find_element(By.XPATH, '//*[@id="origen"]').click()
wait = WebDriverWait(driver, 30)

# Pick the first city in the origin modal
driver.implicitly_wait(10)
driver.find_element(By.XPATH, '//*[@id="modalOriginCity"]/div/div/div[2]/div[2]/ul/li[1]').click()

Best regards,


Answer by 梦明, 2025-01-24 10:57:10:


The post data needed is different. In this case, you need:

{
  "fechaSalidaTramo": "31/03/2022",
  "mnemotecnicoCiudadOrigenTramo": "stgo",
  "mnemotecnicoCiudadDestinoTramo": "aric",
  "horaSalidaTramo": 0,
  "horaSalidaTramoMaxima": 0,
  "codigoLinea": 90,
  "numeroViaje": 0,
  "numeroCuentaCorrienteCliente": 0,
  "codigoIdaRegreso": 1,
  "cantidadAsientos": 1,
  "numeroRegistros": 0
}
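Note that the date in this payload uses DD/MM/YYYY with slashes, unlike the DD-MM-YYYY strings in the form attempt above. A sketch of building this body for other dates and routes; the city mnemonics ("stgo", "aric") and the fixed field values are just the ones from the captured request:

```python
from datetime import date

def build_payload(origin, destination, travel_date):
    """Build the JSON body for the itinerary endpoint.

    origin/destination are the site's city mnemonics (e.g. "stgo", "aric");
    the endpoint expects the date formatted as DD/MM/YYYY.
    """
    return {
        "fechaSalidaTramo": travel_date.strftime("%d/%m/%Y"),
        "mnemotecnicoCiudadOrigenTramo": origin,
        "mnemotecnicoCiudadDestinoTramo": destination,
        "horaSalidaTramo": 0,
        "horaSalidaTramoMaxima": 0,
        "codigoLinea": 90,
        "numeroViaje": 0,
        "numeroCuentaCorrienteCliente": 0,
        "codigoIdaRegreso": 1,
        "cantidadAsientos": 1,
        "numeroRegistros": 0,
    }

print(build_payload("stgo", "aric", date(2022, 3, 31))["fechaSalidaTramo"])  # 31/03/2022
```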

And the link is https://new.turbus.cl/turbuscl/recursos/vtwst76/web1.

In Python, it'll look like this:

import requests

LINK = "https://new.turbus.cl/turbuscl/recursos/vtwst76/web1"

HEADERS = {
    "Content-Type": "application/json",
}

def get_route(origin, destination):
    # origin/destination are the site's city mnemonics, e.g. "stgo" and "aric"
    data = {
        "fechaSalidaTramo": "31/03/2022",
        "mnemotecnicoCiudadOrigenTramo": origin,
        "mnemotecnicoCiudadDestinoTramo": destination,
        "horaSalidaTramo": 0,
        "horaSalidaTramoMaxima": 0,
        "codigoLinea": 90,
        "numeroViaje": 0,
        "numeroCuentaCorrienteCliente": 0,
        "codigoIdaRegreso": 1,
        "cantidadAsientos": 1,
        "numeroRegistros": 0,
    }
    res = requests.post(LINK, json=data, headers=HEADERS)
    if res.status_code == 200:
        print("Getting routes")
        return res.json()
    else:
        print(res)

def main():
    info = get_route("stgo", "aric")
    print(info)

if __name__ == "__main__":
    main()

How I got to the answer:

  1. Go to the site.
  2. Open the network tab, so I can see the requests.
  3. Do a search, and find the request that matches.
  4. Copy the request as a curl request and import it into Postman.
  5. Remove headers, and see if the request still succeeds. Repeat until only the needed headers are left.
  6. Copy the needed headers and data, and test them using requests.