linux - Unable to pass wget arguments through variables that contain quotes

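I am trying to write a wget script that downloads a web page together with all of its attachments, JPEGs, and so on.

When I type the commands in by hand it works, but I need to run this more than 35,000 times to archive an old website that is not under my control (international company politics, but I am the owner of the data).

My problem is with changing the session parameters.

My script so far is as follows: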

cnt=35209
# initialise the headers
general_settings='-4 -P xyz --restrict-file-names=windows -nc --limit-rate=250k'
html_page_specific='--convert-links --html-extension'
proxy='--proxy-user=xxxxxx --proxy-password=yyyyyyy' 
session="--header=\'Host: mywebsite.com:9090\' --header=\'User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:20.0) Gecko/20100101 Firefox/20.0\'"
address=http://mywebsite.com:9090/browse/item-$cnt

echo $general_settings $proxy $session $cookie $address
echo
echo
echo Getting item-$cnt...

#while [ $cnt -gt 0 ]
#do
#  # get the page
  wget --debug $general_settings $html_page_specific $proxy $session $cookie $address

  # Now get the attachments, pdf, txt, jpg, gif, sql, etc...
#  wget -A.pdf  $general_settings -r $proxy $session $cookie $address
#  wget -A.txt  $general_settings -r $proxy $session $cookie $address
#  wget -A.jpg  $general_settings -r $proxy $session $cookie $address
#  wget -A.gif  $general_settings -r $proxy $session $cookie $address
#  wget -A.sql  $general_settings -r $proxy $session $cookie $address
#  wget -A.doc  $general_settings -r $proxy $session $cookie $address
#  wget -A.docx $general_settings -r $proxy $session $cookie $address
#  wget -A.xls  $general_settings -r $proxy $session $cookie $address
#  wget -A.xlsm $general_settings -r $proxy $session $cookie $address
#  wget -A.xlsx $general_settings -r $proxy $session $cookie $address
#  wget -A.xml  $general_settings -r $proxy $session $cookie $address
#  wget -A.ppt  $general_settings -r $proxy $session $cookie $address
#  wget -A.pptx $general_settings -r $proxy $session $cookie $address
#  wget -A.png  $general_settings -r $proxy $session $cookie $address
#  wget -A.ps   $general_settings -r $proxy $session $cookie $address
#  wget -A.mdb  $general_settings -r $proxy $session $cookie $address
#  ((cnt=cnt-1))
#
#done

But when I run the script, I get the following output:

Getting item-35209...
Setting --inet4-only (inet4only) to 1
Setting --directory-prefix (dirprefix) to xyz
Setting --restrict-file-names (restrictfilenames) to windows
Setting --no (noclobber) to 1
Setting --limit-rate (limitrate) to 250k
Setting --convert-links (convertlinks) to 1
Setting --html-extension (htmlextension) to 1
Setting --proxy-user (proxyuser) to xxxxx
Setting --proxy-password (proxypassword) to yyyyy
Setting --header (header) to \'Host:
Setting --header (header) to 'Cookie:
DEBUG output created by Wget 1.11.4 Red Hat modified on linux-gnu.

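As you can see, the Host and Cookie headers are malformed, so the wget command cannot log in and extract the data.

I have been reading the bash man page, searching Google, and trying several related suggestions, but I still cannot get the command to execute.

Would anybody out there be kind enough to show me the correct way to quote the quotes?

Thanks,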

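Solution:

Quotes inside a quoted string or inside a variable are ordinary characters, not quoting characters. There is no way to change this. Use an array instead: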

A=(a b 'c d' 'e f')
cmd "${A[@]}"

This calls cmd with the four arguments a, b, c d, and e f.
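To see the difference concretely, here is a minimal bash sketch that counts the arguments a command receives from an unquoted string versus from an array. The helper name count_args is made up for this illustration and is not part of bash or wget.

```shell
#!/usr/bin/env bash
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() { echo "$#"; }

# Quotes stored in a string are literal characters. The unquoted expansion
# is split on whitespace only, so 'c d' becomes the two words 'c and d'.
s="a b 'c d' 'e f'"
from_string=$(count_args $s)

# Each array element stays exactly one argument when expanded as "${A[@]}".
A=(a b 'c d' 'e f')
from_array=$(count_args "${A[@]}")

echo "string: $from_string arguments, array: $from_array arguments"
# prints: string: 6 arguments, array: 4 arguments
```

This is the same thing that happens in the debug output in the question, where the header value stops at the first space.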

(You can achieve a similar effect with eval, but that is more error-prone. In your case, using an array is more convenient.)
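Applied to the script in the question, the array approach might look like the sketch below. It assumes bash (arrays are not available in plain sh) and reuses the option values from the question unchanged. The cookie variable is left out because its definition is not shown in the question; it would presumably be built as an array in the same way. The actual wget call is commented out so that the argument boundaries can be checked first.

```shell
#!/usr/bin/env bash
cnt=35209

# Each option group is an array, so every element is passed as one argument.
general_settings=(-4 -P xyz --restrict-file-names=windows -nc --limit-rate=250k)
html_page_specific=(--convert-links --html-extension)
proxy=(--proxy-user=xxxxxx --proxy-password=yyyyyyy)
session=(
  --header='Host: mywebsite.com:9090'
  --header='User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:20.0) Gecko/20100101 Firefox/20.0'
)
address="http://mywebsite.com:9090/browse/item-$cnt"

# Assemble the full command line in a single array.
cmd=(wget --debug "${general_settings[@]}" "${html_page_specific[@]}" \
     "${proxy[@]}" "${session[@]}" "$address")

# Print one argument per line to verify the word boundaries.
printf '<%s>\n' "${cmd[@]}"

# Uncomment to actually run the download:
# "${cmd[@]}"
```

The quotes around each header are consumed by the shell when the array is defined, so wget receives `--header=Host: mywebsite.com:9090` as a single argument with no stray quote characters.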
