Fixing the ansible unarchive module failing to upload and extract an archive

The unarchive module fails with the following error:

Unexpected error when accessing exploded file: [Errno 2] No such file or directory

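Environment

While uploading an installation package with ansible-playbook, the task failed with Unexpected error when accessing exploded file: [Errno 2] No such file or directory. The same playbook runs correctly and produces the expected result on CentOS 7, Ubuntu 14.04, Ubuntu 16.04, and Ubuntu 20.04, but not on the Chinese domestic operating system Kylin 4.0.2 (the same error occurs on both x86_64 and aarch64).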


  • ansible-core version: 2.12.1

  • ansible version: 5.2.0

  • python version: 3.8.12

  • Control node operating system: Manjaro Linux (Linux akiya-laptop 5.15.12-1-MANJARO #1 SMP PREEMPT Wed Dec 29 18:08:07 UTC 2021 x86_64 GNU/Linux)

  • Target server: Kylin 4.0.2 (Linux Kylin 4.4.58-20171113.kylin.5.all-generic #5 SMP Fri Nov 17 14:38:03 CST 2017 x86_64 x86_64 x86_64 GNU/Linux)

  • playbook snippet

    main.yaml
    - name: "Unarchive {{ (ansible_architecture, docker_package) | path_join }} into {{ docker_deploy_path }}"
      ansible.builtin.unarchive:
        src: "{{ (ansible_architecture, docker_package) | path_join }}"
        dest: "{{ docker_deploy_path }}"
        owner: root
        group: root
        extra_opts:
          - --strip-components=1

Troubleshooting

My first suspicion was a permissions problem, so I removed owner and group and ran the playbook again:

main.yaml
- name: "Unarchive {{ (ansible_architecture, docker_package) | path_join }} into {{ docker_deploy_path }}"
  ansible.builtin.unarchive:
    src: "{{ (ansible_architecture, docker_package) | path_join }}"
    dest: "{{ docker_deploy_path }}"
    extra_opts:
      - --strip-components=1

The result:

ansible-playbook
TASK [docker : Unarchive x86_64/docker-20.10.8.tgz into /usr/local/bin] ************************************************************************************************************
ok: [test-server]

Listing the directory on the target server shows:

shell
# ls -l /usr/local/bin/{docker*,containerd*,ctr,runc}
ls: 无法访问'/usr/local/bin/docker*': 没有那个文件或目录
ls: 无法访问'/usr/local/bin/containerd*': 没有那个文件或目录
ls: 无法访问'/usr/local/bin/ctr': 没有那个文件或目录
ls: 无法访问'/usr/local/bin/runc': 没有那个文件或目录

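The task now reported ok, but nothing had been extracted 😭

I then searched Google and Baidu and found the same problem on Stack Overflow: Ansible unarchive module fails with Unexpected error when accessing exploded file: [Errno 2] No such file or directory. One answer suggests setting environment on the task, but that had no effect either 😭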

main.yaml
- name: "Unarchive {{ (ansible_architecture, docker_package) | path_join }} into {{ docker_deploy_path }}"
  ansible.builtin.unarchive:
    src: "{{ (ansible_architecture, docker_package) | path_join }}"
    dest: "{{ docker_deploy_path }}"
    owner: root
    group: root
    extra_opts:
      - --strip-components=1
  environment:
    LANG: C
    LC_ALL: C
    LC_MESSAGES: C

Running the playbook still produced the same error:

ansible-playbook
TASK [docker : Unarchive x86_64/docker-20.10.8.tgz into /usr/local/bin] *********************************************************************************************
fatal: [test-server]: FAILED! => {"changed": false, "dest": "/usr/local/bin", "gid": 0, "group": "root", "handler": "TgzArchive", "mode": "0755", "msg": "Unexpected error when accessing exploded file: [Errno 2] 没有那个文件或目录: b'/usr/local/bin/docker-proxy'", "owner": "root", "size": 4096, "src": "/home/akiya/.ansible/tmp/ansible-tmp-1646098653.2653158-1453361-209743558196063/source", "state": "directory", "uid": 0}

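Next I searched the ansible issues on GitHub for a fix, but nothing I found solved my problem [○・`Д´・ ○]

After a lot of fruitless fiddling, I remembered that this playbook used to work. What had changed was that I had packaged ansible into an AppImage tool using the latest ansible release, and had upgraded ansible in my development environment at the same time. Since the earlier version had worked, I switched back to it to check.

  • ansible-core version: 2.11.0

  • ansible version: 4.0.0

With the older version the playbook ran as shown in the log that follows, and it did work: the file was uploaded and extracted correctly.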

ansible-playbook
TASK [docker : Unarchive x86_64/docker-20.10.8.tgz into /usr/local/bin] *********************************************************************************************
changed: [test-server]

Listing the directory on the target server shows:

shell
# ls -l /usr/local/bin/{docker*,containerd*,ctr,runc}
-rwxr-xr-x 1 root root 33900136 Jul 30 2021 /usr/local/bin/containerd
-rwxr-xr-x 1 root root 6508544 Jul 30 2021 /usr/local/bin/containerd-shim
-rwxr-xr-x 1 root root 8609792 Jul 30 2021 /usr/local/bin/containerd-shim-runc-v2
-rwxr-xr-x 1 root root 21131264 Jul 30 2021 /usr/local/bin/ctr
-rwxr-xr-x 1 root root 52883072 Jul 30 2021 /usr/local/bin/docker
-rwxr-xr-x 1 root root 64758664 Jul 30 2021 /usr/local/bin/dockerd
-rwxr-xr-x 1 root root 708616 Jul 30 2021 /usr/local/bin/docker-init
-rwxr-xr-x 1 root root 2784649 Jul 30 2021 /usr/local/bin/docker-proxy
-rwxr-xr-x 1 root root 11943136 Jul 30 2021 /usr/local/bin/runc

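🤔 That was odd: why did downgrading fix it? Out of curiosity I looked at the ansible source and found the following change to unarchive.py in the modules directory on the stable-2.12 branch.

unarchive.py change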

[stable-2.12] unarchive: fix non-english locales (#76542) (#76933)

  • unarchive: fix non-english locales

For GNU Gettext, the LANGUAGE environment variable takes precedence over LANG or LC_ALL. On systems where LANGUAGE was set to a non-english locale, the output of the tar command therefore not understood and the module failed silently (“changed”: false, but the archive was not extracted).

  • add tests

  • changelog
    (cherry picked from commit 49e1cb9)

Co-authored-by: Jonathan Neuhauser jonathan.hofinger@gmx.de

Co-authored-by: Jonathan Neuhauser jonathan.hofinger@gmx.de

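In short:

With GNU gettext, the LANGUAGE environment variable takes precedence over LANG and LC_ALL. On a system where LANGUAGE is set to a non-English locale, the module cannot parse the output of the tar command and fails silently (the ansible task returns "changed": false, but the archive is not extracted).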
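
The precedence rule from the commit message can be sketched as a small Python function. This is a simplified model of how GNU gettext chooses the message language, not its actual implementation; the function name message_languages and the sample values are made up for illustration.

```python
def message_languages(env):
    """Return the languages GNU gettext would try for messages (simplified model)."""
    # The message locale is resolved as LC_ALL, then LC_MESSAGES, then LANG.
    locale_name = "C"
    for var in ("LC_ALL", "LC_MESSAGES", "LANG"):
        if env.get(var):
            locale_name = env[var]
            break
    # gettext ignores LANGUAGE when the locale is plain C (POSIX is treated the same here).
    if locale_name in ("C", "POSIX"):
        return ["C"]
    # Otherwise a non-empty LANGUAGE wins; it is a colon-separated priority list.
    if env.get("LANGUAGE"):
        return env["LANGUAGE"].split(":")
    return [locale_name]


# Hypothetical target environment with a Chinese LANGUAGE setting.
target = {"LANG": "zh_CN.UTF-8", "LANGUAGE": "zh_CN:zh"}

print(message_languages(target))
# Forcing LANG and LC_ALL to an English locale is not enough: LANGUAGE still wins.
print(message_languages({**target, "LANG": "en_US.UTF-8", "LC_ALL": "en_US.UTF-8"}))
# Overriding LANGUAGE itself changes the outcome.
print(message_languages({**target, "LANGUAGE": "en_US"}))
```

In this model only the last call yields an English message language, which matches the idea of overriding LANGUAGE rather than LANG or LC_ALL.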

So would setting the environment variable LANGUAGE=en_US on the task be enough? I changed the playbook to:

main.yaml
- name: "Unarchive {{ (ansible_architecture, docker_package) | path_join }} into {{ docker_deploy_path }}"
  ansible.builtin.unarchive:
    src: "{{ (ansible_architecture, docker_package) | path_join }}"
    dest: "{{ docker_deploy_path }}"
    extra_opts:
      - --strip-components=1
  environment:
    LANGUAGE: en_US

After switching ansible-core back to 2.12.1, the result was:

ansible-playbook
TASK [docker : Unarchive x86_64/docker-20.10.8.tgz into /usr/local/bin] ************************************************************************************************************
changed: [test-server]

The task succeeded. Checking the directory on the target server again:

shell
# ls -l /usr/local/bin/{docker*,containerd*,ctr,runc}
-rwxr-xr-x 1 root root 33900136 Jul 30 2021 /usr/local/bin/containerd
-rwxr-xr-x 1 root root 6508544 Jul 30 2021 /usr/local/bin/containerd-shim
-rwxr-xr-x 1 root root 8609792 Jul 30 2021 /usr/local/bin/containerd-shim-runc-v2
-rwxr-xr-x 1 root root 21131264 Jul 30 2021 /usr/local/bin/ctr
-rwxr-xr-x 1 root root 52883072 Jul 30 2021 /usr/local/bin/docker
-rwxr-xr-x 1 root root 64758664 Jul 30 2021 /usr/local/bin/dockerd
-rwxr-xr-x 1 root root 708616 Jul 30 2021 /usr/local/bin/docker-init
-rwxr-xr-x 1 root root 2784649 Jul 30 2021 /usr/local/bin/docker-proxy
-rwxr-xr-x 1 root root 11943136 Jul 30 2021 /usr/local/bin/runc

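Conclusion

Judging by the result, the problem is solved. I was still curious why a playbook that ran fine on 2.11.0 broke on 2.12.1, so I went looking for the cause in the code.

  • In commit cd0ebed875, the change to unarchive.py adds LANGUAGE=locale to the run_command call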

    unarchive.py
    rc, out, err = self.module.run_command(cmd, cwd=self.b_dest, environ_update=dict(LANG=locale, LC_ALL=locale, LC_MESSAGES=locale, LANGUAGE=locale))

    This fixes tar's output not being understood on non-English systems.

  • In commit 61900c7672, the run_command call in unarchive.py changed LANG='C', LC_ALL='C', LC_MESSAGES='C' to LANG=locale, LC_ALL=locale, LC_MESSAGES=locale

    unarchive.py
    rc, out, err = self.module.run_command(cmd, cwd=self.b_dest, environ_update=dict(LANG=locale, LC_ALL=locale, LC_MESSAGES=locale))

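    Likewise, on the stable-2.11 branch the run_command call in unarchive.py still uses LANG='C', LC_ALL='C', LC_MESSAGES='C'.

Analysis: the 2.11 release could extract the archive directly because the module set LANG and the related variables for us. In 2.12 a bug fix introduced LANGUAGE, and because LANGUAGE takes precedence over LANG and LC_ALL, an unsuitable LANGUAGE value makes extraction fail. GNU gettext also ignores LANGUAGE when the locale is C, which is consistent with the hard-coded C values in 2.11 being unaffected.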
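
To see why the task-level LANG, LC_ALL, and LC_MESSAGES setting had no effect while LANGUAGE did, it may help to model how the environment is layered. The sketch below rests on several assumptions: that 2.12.1 passes only LANG, LC_ALL, and LC_MESSAGES in environ_update (the second commit above, without the LANGUAGE fix), that the task's environment reaches the module process, and that environ_update is applied last. The function command_environment and the en_US.UTF-8 value are hypothetical.

```python
def command_environment(target_env, task_env, environ_update):
    """Layer the three sources; later layers override earlier ones."""
    env = dict(target_env)
    env.update(task_env)        # `environment:` set on the task
    env.update(environ_update)  # applied by run_command just before tar runs
    return env


# Values taken from the Kylin target described in this post.
target = {"LANG": "zh_CN.UTF-8", "LANGUAGE": "zh_CN:zh"}

# Hypothetical: the module chooses the real locale at run time.
module_locale = "en_US.UTF-8"
update_without_language = {
    "LANG": module_locale,
    "LC_ALL": module_locale,
    "LC_MESSAGES": module_locale,
}

# First attempt: LANG/LC_ALL/LC_MESSAGES=C on the task gets overwritten,
# and the target's LANGUAGE is never touched.
attempt = command_environment(
    target, {"LANG": "C", "LC_ALL": "C", "LC_MESSAGES": "C"}, update_without_language
)
print(attempt["LC_ALL"], attempt["LANGUAGE"])

# Second attempt: LANGUAGE set on the task survives, because nothing overrides it.
fixed = command_environment(target, {"LANGUAGE": "en_US"}, update_without_language)
print(fixed["LANGUAGE"])
```

Under these assumptions the first attempt still runs tar with LANGUAGE=zh_CN:zh, while the second runs it with LANGUAGE=en_US.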

The locale on Kylin 4.0.2:

shell
# locale
LANG=zh_CN.UTF-8
LANGUAGE=zh_CN:zh
LC_CTYPE="zh_CN.UTF-8"
LC_NUMERIC="zh_CN.UTF-8"
LC_TIME="zh_CN.UTF-8"
LC_COLLATE="zh_CN.UTF-8"
LC_MONETARY="zh_CN.UTF-8"
LC_MESSAGES="zh_CN.UTF-8"
LC_PAPER="zh_CN.UTF-8"
LC_NAME="zh_CN.UTF-8"
LC_ADDRESS="zh_CN.UTF-8"
LC_TELEPHONE="zh_CN.UTF-8"
LC_MEASUREMENT="zh_CN.UTF-8"
LC_IDENTIFICATION="zh_CN.UTF-8"
LC_ALL=

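LANGUAGE is set to zh_CN:zh 🤔, whereas CentOS 7, Ubuntu 14.04, Ubuntu 16.04, and Ubuntu 20.04 do not set this variable by default. Comparison tests showed that LANGUAGE=zh_CN:zh is what triggers the problem here. However, setting it with localectl set-locale LANGUAGE=zh_CN:zh on CentOS and Ubuntu did not reproduce the failure, so there may be some other factor on the domestic operating system that I have not found.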
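
One way to spot this setting on other targets is to inspect LANGUAGE in the output of locale. A minimal Python sketch, assuming the KEY=value format shown in the locale output; the helper names are hypothetical, and the "starts with en" check is only a rough heuristic for an English entry.

```python
def parse_locale_output(text):
    """Parse `locale` output (KEY=value lines) into a dict, stripping quotes."""
    settings = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip().strip('"')
    return settings


def non_english_languages(settings):
    """Return the LANGUAGE entries that are neither English nor C/POSIX."""
    entries = [e for e in settings.get("LANGUAGE", "").split(":") if e]
    return [e for e in entries if not (e.startswith("en") or e in ("C", "POSIX"))]


# Abbreviated copy of the locale output from the Kylin target.
sample = """LANG=zh_CN.UTF-8
LANGUAGE=zh_CN:zh
LC_MESSAGES="zh_CN.UTF-8"
LC_ALL="""

settings = parse_locale_output(sample)
print(non_english_languages(settings))
```

A non-empty result suggests the target may need the LANGUAGE override on the unarchive task.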

Solution

Either of the following works:

  • Set the environment variable LANGUAGE to en_US

    main.yaml
    - name: "Unarchive {{ (ansible_architecture, docker_package) | path_join }} into {{ docker_deploy_path }}"
      ansible.builtin.unarchive:
        src: "{{ (ansible_architecture, docker_package) | path_join }}"
        dest: "{{ docker_deploy_path }}"
        extra_opts:
          - --strip-components=1
      environment:
        LANGUAGE: en_US

  • Leave the environment variable LANGUAGE empty

    main.yaml
    - name: "Unarchive {{ (ansible_architecture, docker_package) | path_join }} into {{ docker_deploy_path }}"
      ansible.builtin.unarchive:
        src: "{{ (ansible_architecture, docker_package) | path_join }}"
        dest: "{{ docker_deploy_path }}"
        extra_opts:
          - --strip-components=1
      environment:
        LANGUAGE: