
python - Converting a BeautifulSoup element into a Selenium element

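I want to use Selenium to fetch a page's source, use BeautifulSoup to find a specific element, and then convert that element into a selenium.webdriver.remote.webelement.WebElement object.
Something like this: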

driver.get("https://www.google.com")
soup = BeautifulSoup(driver.page_source, "html.parser")
element = soup.find(title="Search")

# Pseudocode: the conversion this question is asking for
element = Selenium.webelement(element)
element.click()

How can I achieve this?

Solution:

A general approach that has worked for me is to compute the XPath of the bs4 element and then use that XPath to locate the element in Selenium:

from selenium.webdriver.common.by import By  # find_element_by_xpath was removed in Selenium 4.3

xpath = xpath_soup(soup_element)
selenium_element = driver.find_element(By.XPATH, xpath)

def xpath_soup(element):
    """
    Generate the absolute XPath of a BeautifulSoup element.
    :param element: bs4 text node or tag
    :return: XPath as a string
    """
    components = []
    # A text node has no tag name, so start from its parent tag.
    child = element if element.name else element.parent
    for parent in child.parents:  # each parent is a bs4.element.Tag
        xpath_tag = child.name
        # Count earlier siblings with the same tag name (XPath positions are 1-based).
        xpath_index = sum(1 for sib in child.previous_siblings if sib.name == xpath_tag) + 1
        components.append(xpath_tag if xpath_index == 1 else '%s[%d]' % (xpath_tag, xpath_index))
        child = parent
    components.reverse()
    return '/%s' % '/'.join(components)
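
To see what kind of path this produces, here is a minimal sketch that runs the same sibling-counting idea on a small static document, without a browser. The HTML snippet, the `HTML` and `target` names, and the use of the built-in `html.parser` are assumptions made for illustration; the function is a compact restatement of `xpath_soup`, not a different algorithm.

```python
from bs4 import BeautifulSoup


def xpath_soup(element):
    """Return an absolute XPath for a bs4 tag (or for the parent tag of a text node)."""
    components = []
    child = element if element.name else element.parent
    for parent in child.parents:
        # 1-based position among earlier siblings that share the tag name.
        index = sum(1 for sib in child.previous_siblings if sib.name == child.name) + 1
        components.append(child.name if index == 1 else '%s[%d]' % (child.name, index))
        child = parent
    components.reverse()
    return '/' + '/'.join(components)


# Hypothetical sample page: the target link is the second <a> in the second <div>.
HTML = """<html><body>
<div><a href="/a">first</a></div>
<div><a href="/b">second</a><a href="/c" title="Search">third</a></div>
</body></html>"""

soup = BeautifulSoup(HTML, "html.parser")
target = soup.find(title="Search")
xpath = xpath_soup(target)
print(xpath)  # /html/body/div[2]/a[2]
```

With a live session, the resulting string could then be passed to `driver.find_element(By.XPATH, xpath)`. One caveat: the path describes the tree that BeautifulSoup parsed, so it may not match if the browser's DOM differs from the page source (for example, after scripts modify the page or the browser inserts implied elements such as `tbody`).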
