Recommended: fixnames, a ksh program for fixing odd filenames that vi or rm cannot handle

Posted 2004-11-25 11:32
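Function: searches the current directory (and, with -r, all subdirectories) for problem filenames,
and for each one prompts you to rename (Move), delete (Remove), or ignore (Skip) it.
For every problem name it also prints the file type reported by file(1) and a long listing,
which helps when choosing a new name. The listing below does not itself check whether a suffix matches the content, so a C program saved as a.txt is not flagged.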

Tested on SCO.
Basic usage:
$ fixnames.ksh -h    show the help text
$ fixnames.ksh -r    check the current directory and all subdirectories
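
To try the script without touching real data, one option is to build a scratch directory of awkward names first. The following is a minimal sketch in plain POSIX sh; the function name make_problem_names, the scratch path, and the sample names are made up for illustration and are not part of fixnames.ksh.

```shell
#!/bin/sh
# Hypothetical helper: fill a directory with names that are awkward to type.
make_problem_names() {
    dir=$1
    mkdir -p "$dir" || return 1
    : > "$dir/-starts-with-dash"
    : > "$dir/has space"
    : > "$dir/star*name"
    : > "$dir/$(printf 'tab\tname')"
    : > "$dir/ordinary.txt"      # control case: should not be flagged
}

scratch=${TMPDIR:-/tmp}/fixnames-demo.$$
make_problem_names "$scratch"

# Removing a leading-dash name by hand: prefix it with ./ (or end the
# option list with --) so that rm does not parse the name as options.
(cd "$scratch" && rm ./-starts-with-dash)
```

Running fixnames.ksh inside that directory should then offer Move/Remove/Skip for the three remaining awkward names and leave ordinary.txt alone.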

The contents of fixnames.ksh follow; they give a glimpse of how deep UNIX goes.

#!/bin/ksh
# @(#) fixnames.ksh 1.1 95/07/30
# 95/07/16 John H. DuBois III (john@armory.com)
# 95/07/20 Catch filenames that start with -
# 95/07/21 Deal with bad symlinks & names that contain whitespace correctly.
#          Print filenames that contain \ correctly.  Added xH options.
# 95/07/30 Added r option.

name=${0##*/}
Usage="Usage: $name [-hHilnr] [-c<character-list>] [name ...]"
ExpandDirs=true
ReadInput=false
ListOnly=false
Debug=false
Recurse=false

### Argument processing

while getopts :hlinc:rxH opt; do
    case $opt in
    h)
        echo \
"$name: deal with filenames that contain problematic characters.
$name generates a file list from the list of file and directory names given
on the command line, expanding directory names into a list of the files they
contain.  If no names are given, the file list is generated from the contents
of the current working directory.  The names in the list are then checked for
the existence of characters which may be difficult to enter at the command
line, due to lack of keyboard support or because they are special to one of the
standard shells.  Any names that contain such characters are presented to the
user, with options given to rename or remove them.  When names are printed, any
unprintable characters are presented in ^X (for characters below ASCII 128) or
octal (for characters >= ASCII 128) notation.
$Usage
Options:
-c<character-list>: Add the given characters to the list of legal characters.
    The existence of these characters in a filename will then not cause the
    filename to be considered a problem.
-h: Print this help.
-H: Print a description of the problems that certain characters can cause.
-i: Read a list of filenames from the standard input, one per line.  Directory
    names are not expanded into the files they contain.  NOTE -- filenames
    containing newlines cannot be read from the standard input!
-r: Recursively search any of the named files that are directories, or the
    current directory if no names are given, searching for problem names.
-l: List problem filenames along with information about the files, but do not
    do anything to them.
-n: Do not expand directory names given on the command line into the files
    they contain.  This allows the directory names themselves to be acted on."
        exit 0
        ;;
    H)  print -r \
"The following lists the characters considered to be problem characters when
used in filenames, and the reasons they cause difficulties:
\"Non-printing\" characters (most of those with ASCII values below 33 and above
  126) will print in a manner dependent on the display, possibly not printing
  at all or being interpreted as screen control characters.  Some are difficult
  to enter at a keyboard because they have special meaning to the serial driver
  or line discipline.
Tab and space may be confused with each other, and require quoting when entered
  at the command line.
Newline requires quoting, is difficult to quote in some shells, and confuses
  any utility that expects a newline-separated list of filenames.
! is special to csh and some other shells, with its special meaning having a
  high enough precedence that it can be difficult for novices to escape.
# is the comment character, special to the shells when appearing at the start
  of a word.
~ is the home directory expansion character, special to csh, ksh, and some
  other shells when appearing at the start of a word.
- at the start of an argument list is special to most commands.
*?[]<>|$&();\` are used for filename globbing, IO redirection, variable
  expansion, backgrounding, subshell statements, statement separation, and
  command line substitution.  They must be quoted or escaped.
\"'\\ are quoting and escape characters, and can be tricky to quote or escape.
{} are used to generate argument lists in csh, ksh, and some other shells.
^ is the old pipe character, still used by sh.
Characters with ASCII values of 128 and above cannot be entered at most
  keyboards without special support, and have no standard display
  representation."
        exit 0;
        ;;
    c)
        ExtraChars="$OPTARG"
        ;;
    l)
        ListOnly=true
        ;;
    n)
        ExpandDirs=false
        ;;
    i)
        ReadInput=true
        ;;
    x)
        Debug=true
        ;;
    r)
        Recurse=true
        ;;
    +?)
        print -u2 "$name: options should not be preceded by a '+'."
        exit 1
        ;;
    :)
        print -r -u2 -- \
        "$name: Option '$OPTARG' requires a value.  Use -h for help."
        exit 1
        ;;
    ?)
        print -u2 "$name: $OPTARG: bad option.  Use -h for help."
        exit 1
        ;;
    esac
done

# remove args that were options
let OPTIND=OPTIND-1
shift $OPTIND

### Function definitions

# uSelect.ksh 1.0 95/01/02
# 95/01/02 john h. dubois iii (john@armory.com)

# Usage: uSelect prefix prompt-string word ...
# uSelect presents the words in a select list.
# If prefix is non-null, it is printed before the select list.
# If prompt-string is null, a default prompt is used.
# Each word must have at least one capital letter in it.
# A word may be selected by number or by entering the first capital letter
# in it (in upper or lower case).  The letter (always capitalized) is returned
# in the global var uSelect_ret.  If more than one word is entered, the extra
# words are returned in the global var uSelect_params.
# If a word contains '#', the part before the '#' is used in the select list.
# If no extra parameters are entered, then the part after the '#' is used as a
# secondary prompt, and the words entered in response are assigned to
# uSelect_params.  The selection fails if no words are entered.
# The entire selection string is returned in the global variable
# uSelect_selected.
# The return status is the number of the word selected (starting with 1),
# or 0 if EOF or 'q' is entered.
# Example usage:
#     uSelect "Current file: $file" "" "View" "Move#Move to:" && Finish
#     case "$uSelect_ret" in
#         V)  foo "$file";;
#         M)  [ -n "$uSelect_params" ] && $mv -- "$file" "$uSelect_params"
#     esac
# This would produce the display:
#     Current file: bsize
#     1) View
#     2) Move
#     Select by # or letter (hit return for options); use Q to quit:
# The final line would be replaced by the prompt-string if one was given.

typeset -uL1 uSelect_ret
function uSelect {
    typeset PS3 Cmd word letters= Options FullOpts Prefix="
$1"
    typeset -uL1 l
    typeset -u U
    typeset -i CmdNum=1

    # Quit is not added as a regular menu item so that one more regular item
    # will fit on the screen.  Instead, list it in prompt.
    # Had to shorten this prompt because ksh wouldn't let it be longer!
    [ -n "$2" ] && PS3=$2 || PS3=\
"Select by # or letter (hit return for options); use Q to quit: "
    shift 2
    set -A FullOpts -- "" "$@"
    for word; do
        Options[CmdNum]=${word%#*}
        l=${word##*([!A-Z])}
        letters=$letters$l
        let CmdNum+=1
    done
    # Do the elif test in two [[]] so that the one eval'ed is only done if
    # $REPLY is an upper-case letter
    # Use -r wherever Prefix is used because it might contain \ sequences
    # that should be printed literally.
    print -r -u2 "$Prefix"
    select Cmd in "${Options[@]}"; do
        set -- $REPLY
        U=$1
        [ "$U" = Q ] && return 0        # Quit option was selected by letter
        if [ -n "$Cmd" ]; then
            CmdNum=$1
#            # If Quit option was selected by number...
#            [ CmdNum -gt ${#Options[*]} ] && return 0
            uSelect_ret=${Cmd##*([!A-Z])}
        elif [[ "$1" = [a-zA-Z] ]] && eval [[ "$letters" = "*$U*" ]]; then
            uSelect_ret=$U
            letters=${letters%%$U*}
            CmdNum=${#letters}+1
        else
            print -u2 "Invalid selection."
            print -r -u2 "$Prefix"
            continue
        fi
        shift
        uSelect_params="$*"
        word=${FullOpts[CmdNum]}
        if [[ -z "$uSelect_params" && "$word" = *#* ]]; then
            print -nr -u2 "${word##*#} "
            read
            if [ -z "$REPLY" ]; then
                print -u2 "Must give a non-null response."
                print -r -u2 "$Prefix"
                continue
            fi
            uSelect_params=$REPLY
        fi
        uSelect_selected=${Options[CmdNum]}
        return $CmdNum
    done
    return 0
}

# Can't pipe names through awk because before uncontrol they might have
# embedded newlines.  So, must pass all names on cmd line.  Use a separate
# invocation for each filename so that argument space isn't exceeded;
# using this utility on current dir might generate a very large name list.
function Uncontrol {
awk '
BEGIN {
    MakeUncontrolTable()
    print Uncontrol(ARGV[1])
}
# @(#) uncontrol.awk 92/11/09
# Uncontrol(S): Convert control characters in S to symbolic form.
# Characters in S with values < 32 and with value 127 are converted to the form
# ^X.  Characters with value >= 128 are converted to the octal form \nnn.
# The resulting string is returned.
# Global variables: UncTable[] must be initialized to a lookup table by
# MakeUncontrolTable() before using this function.
function Uncontrol(S,  i,len,Output) {
    len = length(S)
    Output = ""
    for (i = 1; i <= len; i++)
        Output = Output UncTable[substr(S,i,1)]
    return Output
}

# MakeUncontrolTable: Make a table for use by Uncontrol().
# Global variables:
# UncTable[] is made into a character -> symbolic character lookup table.
function MakeUncontrolTable(  i) {
    for (i = 0; i < 32; i++)
        UncTable[sprintf("%c",i)] = "^" sprintf("%c",i + 64)
    for (i = 32; i < 127; i++)
        UncTable[sprintf("%c",i)] = sprintf("%c",i)
    UncTable[sprintf("%c",127)] = "^?"
    for (i = 128; i < 256; i++)
        UncTable[sprintf("%c",i)] = "\\" sprintf("%03o",i)
}
' "$1"
}

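# Usage: GetInfo filename
# Prints the file(1) type and a long listing (minus the name) for filename.
function GetInfo {
    typeset file=$1 out ftype

    # The original called SCO's "l" command (the same as "ls -l");
    # "ls -ld" is the portable spelling.
    info=$(ls -ld -- "$file")
    info="${info% $file}"

    ftype=$(file -- "$file")
    ftype="${ftype##$file:*([         ])}"
    print -r "File type: $ftype
$info"
}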

# Usage: FixName filename printable-name
function FixName {
    typeset File=$1 PrintableName=$2
    typeset info=$(GetInfo "$File")

    if $ListOnly; then
        print -r -- "$PrintableName: $info $PrintableName"
        return
    fi
    uSelect "File: $PrintableName
$info $PrintableName" "" "Move#Move to:" Remove Skip && exit 0
    # Act on the local $File rather than on whatever $file the caller has set.
    case "$uSelect_ret" in
    M)  [ -n "$uSelect_params" ] && mv -- "$File" "$uSelect_params";;
    R)  if [ -d "$File" ]; then
            rmdir -- "$File" ||  print -u2 \
"$name: The directory could not be removed.  If the directory is not empty,
you should instead use the 'm' option to rename it so that its contents can
be dealt with."
        else
            rm -- "$File"
        fi;;
    esac
}

# Usage: CheckName filename
# Globals: foundBad
function CheckName {
    typeset file=$1

    $Debug && print -r -u2 "Checking file: $(Uncontrol "$file")"
    if [ ! -a "$file" ]; then
        print -r -u2 "$name: file does not exist: $(Uncontrol "$file")"
    # All of the other characters have meaning to various shells.
    # Check only the filename part of the path to be tested.
    elif TestFilename "$file"; then
        FixName "$file" "$(Uncontrol "$file")"
        foundBad=true
    fi
    return 0
}

# Usage: TestFilename filename
# Returns 0 if the last component of filename is a problem name, else 1.
# Globals: ExtraChars
function TestFilename {
    # Take the name from the argument, not from the caller's $file.
    typeset file="${1##*/}"

    # Difficult to add metachars held in shell var to a pattern, since the
    # statement must then be eval'ed, causing problems.  So, use awk
    # to do the test.
    if [ -n "$ExtraChars" ]; then
        awk '
BEGIN {
    Chars = ARGV[1]
    Name = ARGV[2]
    len = length(Name)
    # Since the extra chars are liable to be metachars, avoid using them in
    # any pattern match.  Instead, just remove them from the filename before
    # comparing it to the legal name pattern.
    Output = ""
    for (i = 1; i <= len; i++)
        if (!index(Chars,c = substr(Name,i,1)))
            Output = Output c
    if (Output !~ /^[-a-zA-Z0-9@%_+=:.,~#]*$/ || Output ~ /^[-~#]/)
        exit 0
    else
        exit 1
}
' "$ExtraChars" "$file"
    else
        [[ "$file" != +([-a-zA-Z0-9@%_+=:.,~#]) || "$file" = [-~#]* ]]
    fi
}

# @(#) ftw.ksh 1.0 95/07/30
# 95/07/30 john h. dubois iii
# Usage: ftw Path Command [optional-arg ...]
# ftw will recursively search the directory tree rooted at Path, invoking
# on each filename found, starting with Path itself:
# Command optional-args Filename
# where Command is the command name given, optional-args are any fixed args
# given for Command, and Filename is the filename found.  ftw does not follow
# symlinks.  If Path is not a directory, Command is invoked once, with that
# filename.  ftw does a depth-first search, processing each directory before
# the files in the directory.  ftw is liable to use a significant
# amount of memory, causing the script interpreter to permanently grow in size.
# ftw skips the . and .. entries in each directory, except for the Path given,
# which will be processed even if it is . or ..
# ftw will abort if Command ever exits nonzero.
# Return value:
# 0 on success.  1 if invoked incorrectly.
# 2 if Command ever exits nonzero.
function ftw {
    set +o noglob        # Turn on globbing; local to this function
    typeset Root=$1 file pat OIFS=$IFS IFS Command

    [ $# -lt 2 ] && return 1

    shift
    set -A Command -- "$@"
    "${Command[@]}" "$Root" || return 2
    # ksh directory test returns true for symlink that points to dir, so must
    # test for that everywhere that we test for dir
    [ -L "$Root" -o ! -d "$Root" ] && return 0
    for pat in "$Root/.[!.]*" "$Root/*"; do
        IFS=
        set -- $pat
        IFS=$OIFS
        [ $# = 1 -a "$1" = "$pat" -a ! -a "$1" ] && continue
        for file; do
            if [ ! -L "$file" -a -d "$file" ]; then
                ftw "$file" "${Command[@]}" || return 2
            else
                "${Command[@]}" "$file" || return 2
            fi
        done
    done
    return 0
}

### Start of main program

unset Files[*]
if $ReadInput; then
    :
elif $Recurse; then
    [ $# -eq 0 ] && set -A Files . || set -A Files -- "$@"
elif [ $# -eq 0 ]; then
    # Find all files except . and ..
    set -A Files -- .[!.]* *
    typeset -i i=0 nf=${#Files[*]}
    while [ i -lt nf ]; do
        file=${Files[i]}
        if [ ! -a "$file" ]; then
            if [ "$file" = "*" -o "$file" = ".[!.]*" ]; then
                unset Files[i]
            elif [ -L "$file" ]; then
                # If a filename that results from globbing does not exist
                # and is a symlink, assume it is a bad symlink rather than
                # a globbing failure, since bad symlinks are normal while
                # globbing should never fail.
                print -r -u2 \
                "$name: Note: found bad symlink: $(Uncontrol "$file")"
                unset Files[i]
            else
                print -r -u2 \
"$name: globbing failure - got filename '$(Uncontrol "$file")'
from directory listing, but it does not exist.  Continuing..."
            fi
        fi
        let i+=1
    done
    if [ ${#Files[*]} -eq 0 ]; then
        print -u2 "$name: No files found in current directory.  Exiting."
        exit 1
    fi
else
    if $ExpandDirs; then
        # Expand directory names given on command line into a list of the files
        # they contain.
        for file in "$@"; do
            if [ -d "$file" ]; then
                set -- "$file"/*
                # If no files in dir, skip it
                [[ $# -eq 1 && "$1" = *"/*" && ! -a "$1" ]] && continue
                set -A Files -- "${Files[@]}" "$@"
                TestFilename "$file" && print -r -u2 \
"$name: note: the directory name '$(Uncontrol "$file")' contains problem
characters.  Use the -n option to act on directory names."
            else
                set -A Files -- "${Files[@]}" "$file"
            fi
        done
    else
        set -A Files -- "$@"
    fi
fi

set -o noglob
foundBad=false

if $ReadInput; then
    while read -r file; do
        if $Recurse; then
            ftw "$file" CheckName </dev/tty
        else
            CheckName "$file" </dev/tty
        fi
    done
else
    for file in "${Files[@]}"; do
        if $Recurse; then
            ftw "$file" CheckName
        else
            CheckName "$file"
        fi
    done
fi

$foundBad || print -- "$name: No problem filenames found."
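
The heart of the script is the legal-name test in TestFilename. As a rough illustration, the same test can be sketched in plain POSIX sh with a case statement. The function name is_problem_name is made up here, the -c extra-character handling is left out, and the character ranges assume the C/POSIX locale.

```shell
#!/bin/sh
# is_problem_name NAME: succeed (status 0) if the last path component of
# NAME would be flagged, using the same legal character set as the listing.
# Illustrative sketch only; it ignores the script's -c option.
is_problem_name() {
    base=${1##*/}
    case $base in
        '')
            return 0 ;;   # empty component (e.g. trailing slash)
        [-~#]*)
            return 0 ;;   # leading -, ~ or # is special to commands or shells
        *[!-a-zA-Z0-9@%_+=:.,~#]*)
            return 0 ;;   # contains a character outside the legal set
    esac
    return 1
}

for n in 'notes.txt' 'has space' '-rf' 'a~b' 'semi;colon'; do
    if is_problem_name "$n"; then
        printf 'problem: %s\n' "$n"
    else
        printf 'ok:      %s\n' "$n"
    fi
done
```

Like the listing, the sketch looks only at the part after the last slash, so a clean filename inside an awkwardly named directory passes.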

Reply, posted 2004-11-26 10:52


Whew, that is long...
I usually just run ls > log
and then edit log into a script that does the deleting; that feels a bit easier.
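
Plain ls output is hard to turn back into exact names once they contain spaces, quotes, or control characters. The sketch below is a variation on the idea in this reply, not the reply's actual commands: it writes rm lines that are already quoted, so the log can be reviewed and then run. The helper names sh_quote and write_rm_script are hypothetical, and a name that ends in a newline is not handled.

```shell
#!/bin/sh
# sh_quote STRING: print STRING wrapped in single quotes so that a POSIX
# shell reads it back as one literal word.  An embedded ' becomes '\''.
# Limitation: trailing newlines in STRING are lost in command substitution.
sh_quote() {
    printf "'%s'\n" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}

# write_rm_script: print one "rm" line per entry in the current directory,
# including dot files, for review before anything is deleted.
write_rm_script() {
    for f in ./* ./.[!.]*; do
        # Skip patterns that matched nothing; keep dangling symlinks.
        [ -e "$f" ] || [ -L "$f" ] || continue
        printf 'rm -- %s\n' "$(sh_quote "$f")"
    done
}
```

A typical session would be write_rm_script > log, then deleting from log the lines for files that should be kept, then sh log from the same directory.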