Why does my web scraping function return something unexpected?

Posted on 2025-01-23 21:06:01, 608 words, 0 views, 0 comments

My goal: I'm trying to build a function, retrieve_title(html), that takes a string of HTML as input and returns the text of the title element.

I've imported BeautifulSoup to complete this task. Any guidance is appreciated, as I'm still learning.

My attempted function:

def retrieve_title(html):
    soup = [html]
    result = soup.title.text
    return(result)

Using the function:

html = '<title>Jack and the bean stalk</title><header>This is a story about x y z</header><p>talk to you later</p>'
print(retrieve_title(html))

Unexpected outcome:

"AttributeError: 'list' object has no attribute 'title'"

Expected outcome:

"Jack and the bean stalk"

我不会写诗 2025-01-30 21:06:01

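The error occurs because soup = [html] creates a plain Python list, not a BeautifulSoup object, so it has no .title attribute. Parse the string with BeautifulSoup first. The title text is then a text node directly inside the title tag, so you can grab it with .find(string=True) (older code spells the argument text=True).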

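    from bs4 import BeautifulSoup

    html = '''
    <title>
     Jack and the beanstalk
    </title>
    <header>
     This is a story about x y z
    </header>
    <p>
     Once upon a time
    </p>
    '''

    # Parse the raw string into a BeautifulSoup object
    soup = BeautifulSoup(html, 'html.parser')

    # print(soup.prettify())

    # Take the first text node inside <title> and trim the surrounding whitespace
    title = soup.title.find(string=True)
    print(title.strip())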

Output:

 Jack and the beanstalk
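
Applied to the function from the question, the same idea might look like the sketch below. It assumes the bs4 package is installed and uses the built-in html.parser. Returning None when there is no title tag is a choice made here for illustration, not something the question asked for.

```python
from bs4 import BeautifulSoup


def retrieve_title(html):
    """Return the text of the first <title> element, or None if there is none."""
    # Parse the string; wrapping it in a list ([html]) does not parse anything
    soup = BeautifulSoup(html, "html.parser")
    if soup.title is None:
        # Assumption for this sketch: signal a missing title with None
        return None
    # get_text(strip=True) drops leading and trailing whitespace around the text
    return soup.title.get_text(strip=True)


html = '<title>Jack and the bean stalk</title><header>This is a story about x y z</header><p>talk to you later</p>'
print(retrieve_title(html))
```

With the HTML string from the question, this should print the title text without the AttributeError, because soup is now a parsed document rather than a list.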
