*************************
Selenium
*************************

:Official Selenium documentation: https://www.selenium.dev/documentation/en/
:Selenium Python docs: https://selenium-python.readthedocs.io/
:Python API reference: https://www.selenium.dev/selenium/docs/api/py/
:Docs (Chinese translation): https://selenium-python-zh.readthedocs.io/en/latest/
:Driver downloads: https://www.selenium.dev/documentation/en/webdriver/driver_requirements/#quick-reference
:Zhihu article: https://zhuanlan.zhihu.com/p/363313659?
:GitHub: https://github.com/seleniumhq/selenium
:Docker installation: https://github.com/SeleniumHQ/docker-selenium
:JSON wire protocol: https://www.selenium.dev/documentation/legacy/json_wire_protocol/

.. toctree::
   :maxdepth: 2
   :caption: Related reading

    How WebDriver works <http://t.zoukankan.com/clarke157-p-8400610.html>
    Cui Qingcai's blog <https://cuiqingcai.com/>
    Separating page code from test code with the Page Object Model (POM) <https://zhuanlan.zhihu.com/p/395168349>

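The driver must be reachable through the ``PATH`` environment variable. On Windows, it is recommended to keep all drivers together in a ``WebDriver\bin\`` directory.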

.. _selenium: http://github.com/SeleniumHQ/selenium/

.. note::

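    When installing Chrome in a Docker image, the base image needs to be bullseye or newer. Do not try to shrink the image with a multi-stage build: many of the shared libraries Chrome links against would be missing.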

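Drawback: when page content is generated dynamically by popular front-end frameworks such as Vue or React, locating elements is relatively difficult.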

Using Selenium on a headless Linux server
=============================================

To do: turn these steps into an Ansible playbook.

Installing Chrome
-------------------------------------

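1. Download Chrome

Download the Linux build of Chrome from the official site and upload it to the server.

.. tip::

    The official site does not expose a direct download link, and a specific version cannot be selected.

2. Run on the server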

.. code-block:: console

    dpkg -i google-chrome-stable_current_amd64.deb

If ``dpkg`` reports dependency errors:

.. code-block:: console

    apt-get install -f

Check the installed version:

.. code-block:: console

    root@VM-12-10-ubuntu:/home/ubuntu/chromedriver-linux64# google-chrome --version
    Google Chrome 116.0.5845.96

Installing WebDriver
-------------------------------

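Selenium 4.6+ downloads the driver automatically, but to avoid network problems it is safer to download the driver yourself and pin its version.

Download ChromeDriver: https://chromedriver.chromium.org/downloads

However, the Chrome site only offers the latest release (116, for example), while that download page only goes up to ChromeDriver 114.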

The ChromeDriver build for 116 is available from a different site: https://googlechromelabs.github.io/chrome-for-testing/
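
Because the browser and the driver have to share the same major version, it can be worth checking the two before pinning them. The following is a minimal sketch, assuming the version strings come from ``google-chrome --version`` and ``chromedriver --version``; the helper names and the sample ChromeDriver strings are illustrative only.

```python
import re


def major_version(version_output):
    """Extract the major version from output such as 'Google Chrome 116.0.5845.96'."""
    match = re.search(r"(\d+)\.\d+\.\d+\.\d+", version_output)
    if match is None:
        raise ValueError(f"no version number found in {version_output!r}")
    return int(match.group(1))


def versions_match(chrome_output, driver_output):
    """True when the browser and the driver share the same major version."""
    return major_version(chrome_output) == major_version(driver_output)


# Sample strings for illustration; real values come from the two --version commands
print(versions_match("Google Chrome 116.0.5845.96", "ChromeDriver 116.0.5845.96"))
print(versions_match("Google Chrome 116.0.5845.96", "ChromeDriver 114.0.5735.90"))
```

The first call reports a match and the second does not, which is the 116 versus 114 situation described above.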

Add the driver to the ``PATH`` environment variable: https://www.selenium.dev/documentation/webdriver/troubleshooting/errors/driver_location/#use-the-path-environment-variable
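
Before starting a session, it may help to confirm that the driver really can be found on ``PATH``. This is a small sketch using only the standard library; ``find_driver`` is a name made up for this note.

```python
import shutil


def find_driver(name="chromedriver"):
    """Return the full path of the driver executable found on PATH, or None."""
    return shutil.which(name)


path = find_driver()
if path is None:
    print("chromedriver is not on PATH; add its directory to PATH first")
else:
    print("chromedriver found at", path)
```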

Verifying with code
------------------------

.. code-block:: python

    from selenium import webdriver

    options = webdriver.ChromeOptions()
    # Required when Chrome is started as root
    options.add_argument("--no-sandbox")
    options.add_argument("--headless")
    service = webdriver.ChromeService(executable_path='/home/ubuntu/chromedriver-linux64/chromedriver')
    driver = webdriver.Chrome(options=options, service=service)
    driver.get("https://www.baidu.com")
    print(driver.title)
    driver.quit()

Fixing garbled text in Chrome on WSL
----------------------------------------

Adding the language in the browser settings fixes the garbled text in Chrome.

Waits
======================================

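There are three kinds of waits: forced waits (``time.sleep``), explicit waits, and implicit waits. Forced and explicit waits are usually combined; implicit waits are used less often.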

.. warning::

    The official documentation advises against using explicit and implicit waits together.

Explicit waits
--------------------------------------

+------------------------------------------+----------------------------------------------------------+
| Method                                   | Expected condition                                       |
+==========================================+==========================================================+
| alert_is_present()                       | An alert is present                                      |
+------------------------------------------+----------------------------------------------------------+
| element_to_be_clickable(locator)         | The element is visible and enabled, so it can be clicked |
+------------------------------------------+----------------------------------------------------------+
| invisibility_of_element_located(locator) | The element is either invisible or absent from the DOM   |
+------------------------------------------+----------------------------------------------------------+
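
Conceptually, an explicit wait polls a condition until it returns something truthy or a timeout expires. The sketch below is a simplified, Selenium-free stand-in for that idea, not Selenium's implementation: ``wait_until`` is a name made up for this note, and the real ``WebDriverWait`` additionally handles ignored exceptions and raises its own ``TimeoutException``.

```python
import time


def wait_until(condition, timeout=5.0, poll=0.5):
    """Call condition() repeatedly until it returns a truthy value or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        value = condition()
        if value:
            return value
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll)


# Stand-in for a page that becomes ready on the third poll
calls = {"count": 0}


def ready():
    calls["count"] += 1
    return calls["count"] >= 3


wait_until(ready, timeout=2.0, poll=0.1)
print(calls["count"])  # 3
```

A forced wait would have slept for the full two seconds, whereas the polling loop returns as soon as the condition holds.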

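Wait until an element is visible, then click it
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    ele = WebDriverWait(driver, 30).until(
        EC.visibility_of_element_located((By.TAG_NAME, 'td'))
    )
    ele.click()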

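Wait until the second element under the same node is visible
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

    # IndexError is ignored so that polling continues until a second <img> exists
    WebDriverWait(driver, 5, ignored_exceptions=(IndexError,)).until(
        lambda d: d.find_elements(By.TAG_NAME, 'img')[1].is_displayed()
    )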

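Wait for the loading element to disappear
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Suppose the front end uses the Element UI framework, whose loading overlay carries ``class="el-loading-mask"``.

.. code-block:: python

    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait

    def loading_visible(driver):
        # Re-query on every poll: True if any loading mask is currently displayed
        masks = driver.find_elements(By.CLASS_NAME, 'el-loading-mask')
        return any(mask.is_displayed() for mask in masks)

    # Precondition: click the Refresh button to trigger the loading overlay
    driver.find_element(By.XPATH, '//span[text()="Refresh"]').click()

    # Make sure the loading mask appears first, then wait for it to disappear
    WebDriverWait(driver, 30).until(loading_visible)
    WebDriverWait(driver, 30).until_not(loading_visible)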

Implicit waits
-------------------

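An implicit wait is the second kind of wait, distinct from an explicit wait. It tells WebDriver to poll the DOM for a certain amount of time when trying to find an element or elements that are not immediately available, which is useful when parts of the page need time to load. The default is 0, meaning disabled. It has to be enabled per session, and once set it applies for the life of the session.

Warning: do not mix implicit and explicit waits. Doing so makes wait times unpredictable: a wait may sleep for the maximum time even though the element is available or the condition is already true. For example, an implicit wait of 10 seconds combined with an explicit wait of 15 seconds can cause a timeout after 20 seconds.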

.. code-block:: python

    from selenium.webdriver import Firefox
    from selenium.webdriver.common.by import By

    driver = Firefox()
    driver.implicitly_wait(10)  # seconds; applies to every later find_element call
    driver.get("http://somedomain/url_that_delays_loading")
    my_dynamic_element = driver.find_element(By.ID, "myDynamicElement")

Scrolling
==========================================================

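Method 1: execute JavaScript (recommended)

Scroll automatically until the element is visible::

    div = driver.find_element(By.CLASS_NAME, 'classname')
    driver.execute_script('arguments[0].scrollIntoView()', div)

Scroll a container all the way to the end::

    # Horizontal scrollbar: scroll all the way to the right.
    # scrollLeftMax only exists in Firefox; the browser clamps scrollLeft to its
    # maximum, so assigning scrollWidth has the same effect in other browsers.
    driver.execute_script('arguments[0].scrollLeft = arguments[0].scrollWidth', div)

Method 2: send the Down arrow key to the browser through WebDriver::

    body = driver.find_element(By.CSS_SELECTOR, 'body')
    body.click()  # required: gives the body element focus
    time.sleep(1)
    body.send_keys(Keys.DOWN)
    time.sleep(1)
    body.send_keys(Keys.DOWN)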

Working with drop-down lists
==========================================================

.. code-block:: python

    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.select import Select

    ele = driver.find_element(By.NAME, "${select-name}")
    Select(ele).select_by_index(1)  # indexes start at 0, so this picks the second option

Switching between iframes and frames
==========================================================

.. code-block:: python

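    # By name or id
    driver.switch_to.frame('frame_name')
    # By index
    driver.switch_to.frame(1)
    # By WebElement
    driver.switch_to.frame(driver.find_elements(By.TAG_NAME, "iframe")[0])
    # Wait until the frame is available, then switch to it
    WebDriverWait(driver, 5).until(EC.frame_to_be_available_and_switch_to_it((By.TAG_NAME, 'iframe')))

Switch back to the parent frame, or all the way out to the top-level document::

    driver.switch_to.parent_frame()
    driver.switch_to.default_content()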
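
Forgetting to switch back out of a frame is an easy mistake to make, so the two steps can be paired in a context manager. This is only a sketch: ``inside_frame`` is a hypothetical helper, not part of Selenium, and it relies solely on the ``switch_to.frame`` and ``switch_to.parent_frame`` calls shown above.

```python
from contextlib import contextmanager


@contextmanager
def inside_frame(driver, frame_reference):
    """Switch into a frame for the duration of a with-block, then switch back out."""
    driver.switch_to.frame(frame_reference)
    try:
        yield driver
    finally:
        # Return to the parent frame even if the body raised an exception
        driver.switch_to.parent_frame()


# Usage with a real driver:
#   with inside_frame(driver, 'frame_name'):
#       driver.find_element(By.TAG_NAME, 'td').click()
```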

Handling alert dialogs
=====================================

Click OK (accept the alert):

.. code-block:: python

    driver.switch_to.alert.accept()
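
An alert can also be read or dismissed before it is closed. A minimal sketch, assuming an alert is already open; ``close_alert`` is a hypothetical helper built on the alert's ``text``, ``accept()`` and ``dismiss()`` members.

```python
def close_alert(driver, accept=True):
    """Read the alert's message, then accept (OK) or dismiss (Cancel) it."""
    alert = driver.switch_to.alert
    message = alert.text
    if accept:
        alert.accept()
    else:
        alert.dismiss()
    return message


# Usage with a real driver:
#   message = close_alert(driver, accept=False)
```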

-----------------------------------------------------------

FAQ
===========================================================

How to call send_keys on an input with the readonly attribute
-----------------------------------------------------------------

Remove the readonly attribute::

    ele = driver.find_element(By.TAG_NAME, 'input')
    driver.execute_script("arguments[0].removeAttribute('readonly');", ele)

``send_keys`` now works on the element.

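How to trigger a click event
------------------------------------------

Suppose clicking the body fires an onclick handler.

Method 1: the click method::

    driver.find_element(By.TAG_NAME, 'body').click()

Method 2: execute a JavaScript snippet::

    ele = driver.find_element(By.TAG_NAME, 'body')
    driver.execute_script("arguments[0].click();", ele)

.. note::

    Method 2 is the recommended one here: calling ``click()`` on the element passed in through ``arguments`` skips WebDriver's interactability checks, so it is faster and still works when the element is covered. The trade-off is that it no longer behaves like a real user click.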

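Emulating mobile devices
----------------------------

Driving web pages on a phone would ideally use the selendroid library, but the latest Android SDK is currently incompatible with selendroid. Revisit this once the compatibility problem is resolved.

Method 1: spoof the user agent

Reference: https://www.cnpython.com/qa/122284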

.. code-block:: python

    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    chrome_options = Options()
    chrome_options.add_argument('--user-agent=Mozilla/5.0 (iPhone; CPU iPhone OS 10_3 like Mac OS X) AppleWebKit/602.1.50 (KHTML, like Gecko) CriOS/56.0.2924.75 Mobile/14E5239e Safari/602.1')
    driver = webdriver.Chrome(options=chrome_options)
    driver.get('https://www.baidu.com')

Method 2: Chrome's device emulation. Reference: https://blog.csdn.net/minzhung/article/details/102964125

.. code-block:: python

    from selenium import webdriver
    options = webdriver.ChromeOptions()
    # deviceName values are listed in Chrome DevTools: toggle device emulation and pick a device
    mobileEmulation = {'deviceName': 'iPhone X'}
    options.add_experimental_option('mobileEmulation', mobileEmulation)

    # Selenium 4 takes options=; the old chrome_options= keyword is no longer accepted
    driver = webdriver.Chrome(options=options)
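
Besides a preset ``deviceName``, ChromeDriver's ``mobileEmulation`` option also accepts explicit ``deviceMetrics`` and a ``userAgent``, which helps when the wanted device is not in the DevTools list. A minimal sketch: the helper name and the sample metrics are illustrative assumptions.

```python
def build_mobile_emulation(width, height, pixel_ratio, user_agent=None):
    """Build a mobileEmulation dict from explicit metrics instead of a device name."""
    emulation = {
        "deviceMetrics": {"width": width, "height": height, "pixelRatio": pixel_ratio},
    }
    if user_agent is not None:
        emulation["userAgent"] = user_agent
    return emulation


# Illustrative metrics for a 375x812 screen at 3x pixel density
emulation = build_mobile_emulation(375, 812, 3.0)
print(emulation)

# Passed to Chrome the same way as a device name:
#   options.add_experimental_option('mobileEmulation', emulation)
```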

selenium.common.exceptions.ElementClickInterceptedException
----------------------------------------------------------------------

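The click target is covered by another element, so the click lands on the wrong element. Reference: https://blog.csdn.net/weixin_44321116/article/details/105118565.

An example that triggers this error:

After a file is imported, the page shows a loading overlay; the script waits and then clicks the load button.

.. code-block:: python

    driver.find_element(By.XPATH, "//span[contains(text(), 'load')]").click()

If the load button is clicked before loading has finished, ``ElementClickInterceptedException`` is raised, because the loading element also matches this XPath.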


**Solution: ActionChains**

.. code-block:: python

    text = driver.find_element(By.XPATH, locator)
    text.click()

Change it to:

.. code-block:: python

    from selenium.common.exceptions import ElementClickInterceptedException
    from selenium.webdriver import ActionChains

    text = driver.find_element(By.XPATH, locator)
    try:
        text.click()
    except ElementClickInterceptedException:
        # Fall back to moving the pointer onto the element and clicking there
        ActionChains(driver).move_to_element(text).click(text).perform()


More on ActionChains: https://blog.csdn.net/huilan_same/article/details/52305176
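
Since the overlay is temporary, another option is simply to retry the click until the overlay has gone. This is a different technique from the ActionChains fallback, sketched here under the assumption that the interception is transient; ``click_with_retry`` is a hypothetical helper, and the exception class to retry on is passed in by the caller.

```python
import time


def click_with_retry(click, retry_on, attempts=5, delay=0.5):
    """Call click() until it succeeds, retrying whenever retry_on is raised."""
    for attempt in range(1, attempts + 1):
        try:
            return click()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: let the last error propagate
            time.sleep(delay)


# Usage with a real element:
#   click_with_retry(text.click, ElementClickInterceptedException)
```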

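The text value cannot be read
---------------------------------

This happens because the text is not visible on the page; the scrollbar has to be moved so that the element comes into view.

**Solution: scroll the element into view**

.. code-block:: python

    for th in driver.find_elements(By.TAG_NAME, 'th'):
        if not th.is_displayed():
            # Move the scrollbar until this element is visible
            driver.execute_script("arguments[0].scrollIntoView();", th)
        print(th.text)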


Source code notes
===============================

Project layout

.. code-block:: text

    selenium/
    common/exceptions.py 'all exceptions that WebDriver code may raise
    webdriver/android/ 'Android WebDriver
             /chrome/  'Chrome WebDriver
             /firefox/ 'Firefox WebDriver
             /common/by.py 'locator strategies
             /support/expected_conditions.py 'expected-condition definitions
             /support/wait.py 'explicit wait: WebDriverWait

How WebDriver works
-------------------------------------------------------------------

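Selenium calls the API exposed by the driver to make the browser perform actions automatically.

The class Selenium uses to talk to the driver: ``selenium.webdriver.remote.remote_connection::RemoteConnection``

Selenium starts the driver process with ``Popen`` (``selenium.webdriver.common.service::Service::start``)

.. tip::

    With debugging and logging enabled you can see that the port is different on every run. Selenium obtains a free port through the utility function ``selenium.webdriver.common.utils::free_port``: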

    .. code-block:: python

        def free_port() -> int:
            """
            Determines a free port using sockets.
            """
            free_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            free_socket.bind(('127.0.0.1', 0))
            free_socket.listen(5)
            port: int = free_socket.getsockname()[1]
            free_socket.close()
            return port

Knowing this, we can try to bypass Selenium and talk to the driver directly with hand-built HTTP requests (Selenium itself sends its HTTP requests with urllib3).
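
As a rough illustration of that idea, the sketch below builds W3C WebDriver requests by hand. It uses the standard library's ``urllib.request`` rather than urllib3 and only constructs the requests without sending them. The helper ``build_request`` is made up for this note, and the base URL assumes a chromedriver started manually with ``chromedriver --port=9515``.

```python
import json
import urllib.request


def build_request(base_url, method, path, payload=None):
    """Build (but do not send) one HTTP request for a WebDriver endpoint."""
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json; charset=utf-8"},
    )


# Assumption: chromedriver was started by hand with `chromedriver --port=9515`
BASE_URL = "http://127.0.0.1:9515"

new_session = build_request(
    BASE_URL,
    "POST",
    "/session",
    {"capabilities": {"alwaysMatch": {
        "browserName": "chrome",
        "goog:chromeOptions": {"args": ["--headless", "--no-sandbox"]},
    }}},
)
print(new_session.get_method(), new_session.full_url)

# With a running driver, the requests would be sent like this:
#   with urllib.request.urlopen(new_session) as response:
#       session_id = json.load(response)["value"]["sessionId"]
#   navigate = build_request(BASE_URL, "POST", f"/session/{session_id}/url",
#                            {"url": "https://www.baidu.com"})
#   title = build_request(BASE_URL, "GET", f"/session/{session_id}/title")
#   quit_session = build_request(BASE_URL, "DELETE", f"/session/{session_id}")
```

Each Selenium call such as ``driver.get`` or ``driver.title`` corresponds to one request of this kind against the session URL.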