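I'm trying to remove a few columns and then reduce the file to its unique lines. The columns I want to remove are the month, day, time, and epoch time; they differ on every line, which prevents me from deduplicating the file contents.
Sample contents of sample.log: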
Jun 5 05:13:13 AAA AAA AAAA 1433495593.306611 XXXX CCCC CCCC AAAA Sdddd DFFFFF111
Jun 5 05:13:14 AAA AAA AAAA 1433495594.306612 XXXX CCCC CCCC AAAA Sdddd DFFFFF222
Jun 5 05:13:13 AAA AAA AAAA 1433495593.306611 XXXX CCCC CCCC AAAA Sdddd DFFFFF111
Jun 5 05:13:15 AAA AAA AAAA XXXXX 1433495596.306614 XXXX CCCC CCCC AAAA Sdddd DFFFFF111
Jun 5 05:13:16 AAA AAA AAAA XXXXX 1433495597.306615 XXXX CCCC CCCC AAAA Sdddd DFFFFF333
Jun 5 05:13:17 AAA AAA AAAA XXXXX 1433495598.306616 XXXX CCCC CCCC AAAA Sdddd DFFFFF444
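Problem:
The month, day, and time are in fixed columns, but the epoch time switches between columns 7 and 8. I'd like to know how to handle that.
Sample output: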
Jun 5 05:13:13 AAA AAA AAAA 1433495593.306611 XXXX CCCC CCCC AAAA Sdddd DFFFFF111
Jun 5 05:13:13 AAA AAA AAAA 1433495593.306611 XXXX CCCC CCCC AAAA Sdddd DFFFFF111
Jun 5 05:13:15 AAA AAA AAAA XXXXX 1433495596.306614 XXXX CCCC CCCC AAAA Sdddd DFFFFF111
If the above is too much to ask, then something like the following:
AAA AAA AAAA 1433495593.306611 XXXX CCCC CCCC AAAA Sdddd DFFFFF111
AAA AAA AAAA 1433495593.306611 XXXX CCCC CCCC AAAA Sdddd DFFFFF111
AAA AAA AAAA XXXXX 1433495596.306614 XXXX CCCC CCCC AAAA Sdddd DFFFFF111
I've been trying along the following lines, but it doesn't work very well.
while read line
do
    seven=$(echo $line |awk '{print $7}')
    eight=$(echo $line |awk '{print $8}')
    if [[ "$seven" =~ "^[0-9]" ]];then
        #echo "seventh column starts with number"
        echo $line|awk '$1=$2=$3=$7=" " {print}'
    else
        #echo "Eighth column starts with number"
        echo $line|awk '$1=$2=$3=$8=" " {print}'
    fi
done < $1
More examples:
Jun 5 05:13:13 AAA BBB CCC 142222222222.000 DDD EEE FFFF
Jun 5 05:13:13 AAA BBB CCC 142222222223.000 DDD EEE FFFF
Jun 5 05:13:14 AAA BBB CCC 142222222224.000 DDD EEE GGGG
Jun 5 05:13:13 AAA BBB CCC XXX 142222222225.000 DDD EEE GGGG
Jun 5 05:13:13 AAA BBB CCC XXX 142222222225.000 DDD EEE FFFF
Jun 5 05:13:13 AAA BBB CCC XXX 142222222226.000 DDD EEE FFFF
Output:
Jun 5 05:13:13 AAA BBB CCC 142222222223.000 DDD EEE FFFF
Jun 5 05:13:13 AAA BBB CCC 142222222223.000 DDD EEE GGGG
Jun 5 05:13:13 AAA BBB CCC XXX 142222222225.000 DDD EEE GGGG
Jun 5 05:13:13 AAA BBB CCC XXX 142222222225.000 DDD EEE FFFF
Or
Output:
AAA BBB CCC DDD EEE FFFF
AAA BBB CCC DDD EEE GGGG
AAA BBB CCC XXX DDD EEE GGGG
AAA BBB CCC XXX DDD EEE FFFF
Solution:
If I've understood the question correctly, you don't need Bash here, just Awk:
% awk '
{
    for (f = 4; f <= NF; ++f) {                 # Start at column 4
        if (f == 7 || f == 8) {                 # Treat columns 7 or 8 differently
            if ($f !~ /^[0-9]+\.[0-9]+$/) {     # Only print if non-numeric
                printf $f " "
            }
        } else {
            printf $f " "
        }
    }
    printf "\n"
}
' sample.log
AAA AAA AAAA XXXX CCCC CCCC AAAA Sdddd DFFFFF111
AAA AAA AAAA XXXX CCCC CCCC AAAA Sdddd DFFFFF222
AAA AAA AAAA XXXX CCCC CCCC AAAA Sdddd DFFFFF111
AAA AAA AAAA XXXXX XXXX CCCC CCCC AAAA Sdddd DFFFFF111
AAA AAA AAAA XXXXX XXXX CCCC CCCC AAAA Sdddd DFFFFF333
AAA AAA AAAA XXXXX XXXX CCCC CCCC AAAA Sdddd DFFFFF444
To get the unique lines:
% awk '
{
    for (f = 4; f <= NF; ++f) {                 # Start at column 4
        if (f == 7 || f == 8) {                 # Treat columns 7 or 8 differently
            if ($f !~ /^[0-9]+\.[0-9]+$/) {     # Only print if non-numeric
                printf $f " "
            }
        } else {
            printf $f " "
        }
    }
    printf "\n"
}
' sample2.log | sort -u
AAA BBB CCC DDD EEE FFFF
AAA BBB CCC DDD EEE GGGG
AAA BBB CCC XXX DDD EEE FFFF
AAA BBB CCC XXX DDD EEE GGGG
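The question's first preference was to keep whole lines, timestamps included, with one line per distinct remainder. A minimal sketch of that, assuming the same column layout as above: build a key from the fields that are kept and print a line only the first time its key is seen. The function name dedupe_log is hypothetical, chosen only for illustration.

```shell
# Hypothetical helper (not from the original answer): print the first full
# line for each distinct key, where the key is the line minus the month,
# day, time and epoch fields.
dedupe_log() {
    awk '
    {
        key = ""
        for (f = 4; f <= NF; ++f) {             # Start at column 4
            # Skip the epoch timestamp, which sits in column 7 or 8
            if ((f == 7 || f == 8) && $f ~ /^[0-9]+\.[0-9]+$/)
                continue
            key = key $f " "
        }
        # seen[key]++ evaluates to 0 (false) only the first time a key appears
        if (!seen[key]++)
            print                               # Print the untouched input line
    }' "$@"
}
```

Called as dedupe_log sample2.log, and assuming sample2.log holds the six lines from the second example, this should print four of them in their original input order, since no sort is involved.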
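Dealing with % characters …
If the input file contains % characters, as per your comment, you need to escape them before passing them to printf, because the code above uses each field as the format string. You can do that with a function like this…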
% awk '
function escape_percents(s)
{
    gsub("%", "%%", s)
    return s
}
{
    for (f = 4; f <= NF; ++f) {                 # Start at column 4
        if (f == 7 || f == 8) {                 # Treat columns 7 or 8 differently
            if ($f !~ /^[0-9]+\.[0-9]+$/) {     # Only print if non-numeric
                printf escape_percents($f) " "
            }
        } else {
            printf escape_percents($f) " "
        }
    }
    printf "\n"
}
' sample2.log | sort -u
AAA BBB CCC DDD %E%E%E FFFF
AAA BBB CCC DDD %E%E%E GGGG
AAA BBB CCC XXX DDD %E%E%E FFFF
AAA BBB CCC XXX DDD %E%E%E GGGG
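An alternative to escaping, sketched here rather than taken from the original answer: give printf a fixed format string and pass the field as an argument, so a % in the data is never interpreted as a format specifier. The function name strip_columns is hypothetical.

```shell
# Hypothetical helper (an assumption, not the original answer's code):
# same column filter, but every field goes through a fixed "%s " format,
# so "%" characters in the data are printed literally.
strip_columns() {
    awk '
    {
        for (f = 4; f <= NF; ++f) {             # Start at column 4
            # Drop the epoch timestamp found in column 7 or 8
            if ((f == 7 || f == 8) && $f ~ /^[0-9]+\.[0-9]+$/)
                continue
            printf "%s ", $f                    # Field is data, not a format
        }
        printf "\n"
    }' "$@"
}
```

It would be used the same way as the pipeline above, for example strip_columns sample2.log | sort -u.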