Chinaunix

Title: String splitting problem

Author: 93237984 Time: 2007-07-25 21:23
cat file

12345678912345
23456789900086
12345678903424
32445434325767
...

Split into:

12,345,678,9123,45
23,456,789,9000,86
12,345,678,9034,24
32,445,434,3257,67
...

Which method would be the more efficient way to do this?

Author: doctorjxd Time: 2007-07-25 21:46
I wouldn't dare claim which method is the most efficient.

sed 's/\(..\)\(...\)\(...\)\(....\)\(..\)/\1,\2,\3,\4,\5/' urfile
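
The pattern above is not anchored, so on a line longer than 14 characters it would still rewrite the first 14. A minimal sketch of a stricter variant, assuming that only lines of exactly 14 digits should be split and that anything else should pass through unchanged (the function name `split_strict` is made up for illustration):

```shell
# Hypothetical helper, not from the thread: split only lines that consist of
# exactly 14 digits into groups of 2,3,3,4,2; print every other line as-is.
split_strict() {
    sed 's/^\([0-9]\{2\}\)\([0-9]\{3\}\)\([0-9]\{3\}\)\([0-9]\{4\}\)\([0-9]\{2\}\)$/\1,\2,\3,\4,\5/' "$@"
}

# Reads the named files, or standard input when no file is given.
printf '%s\n' 12345678912345 123 | split_strict
```

Whether the extra anchoring is worth it depends on how clean the input really is; for data that is guaranteed to be 14 digits per line, the shorter command does the same job.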


Author: 93237984 Time: 2007-07-25 22:14
awk 'BEGIN{OFS=","}{a1=substr($1,1,2);a2=substr($1,3,3);a3=substr($1,6,3);a4=substr($1,9,4);a5=substr($1,13);print a1,a2,a3,a4,a5}' file

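The file to process is about 1 MB and it has to be processed over and over without pause, so performance matters, especially CPU and I/O.
I don't know how the method above compares with yours.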

Could someone experienced analyze this?
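
One way to settle it is to measure both commands on data of the stated size. The following is only a rough sketch under assumptions: the file name `sample.txt`, the 70000-line test data, the run count of 20, and the helper names `split_sed`, `split_awk`, and `bench` are all invented for illustration, and `date +%s` only resolves whole seconds, so the numbers are coarse.

```shell
# The two candidates from the thread, wrapped as functions.
split_sed() { sed 's/\(..\)\(...\)\(...\)\(....\)\(..\)/\1,\2,\3,\4,\5/' "$@"; }
split_awk() { awk -v OFS=, '{print substr($1,1,2), substr($1,3,3), substr($1,6,3), substr($1,9,4), substr($1,13)}' "$@"; }

# Build roughly 1 MB of test data: 70000 lines of 14 digits plus a newline.
awk 'BEGIN { for (i = 0; i < 70000; i++) print "12345678912345" }' > sample.txt

# Confirm that both commands produce identical output before timing them.
if [ "$(split_sed sample.txt | cksum)" = "$(split_awk sample.txt | cksum)" ]; then
    echo "outputs match"
fi

# bench LABEL COMMAND: run COMMAND on sample.txt 20 times, report elapsed seconds.
bench() {
    label=$1; shift
    start=$(date +%s)
    i=0
    while [ "$i" -lt 20 ]; do
        "$@" sample.txt > /dev/null
        i=$((i + 1))
    done
    end=$(date +%s)
    echo "$label: $((end - start)) s for 20 runs"
}

bench sed split_sed
bench awk split_awk
```

The result depends on the sed and awk implementations installed, so it is worth running on the machine that will do the real work. In bash, the `time` keyword gives finer resolution than `date +%s`.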

Author: 寂寞烈火 Time: 2007-07-25 23:16
while IFS= read -r str; do echo "${str:0:2},${str:2:3},${str:5:3},${str:8:4},${str:12:2}"; done < urfile

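Author: bitbull Time: 2007-07-26 10:17
1 MB is not large. If the data format never changes, you could use this rather inflexible approach; it should be somewhat faster than a script, though I have not compared them.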

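#include <stdio.h>
#include <string.h>

int main(int argc, char *argv[])
{
    char buff[128];
    FILE *fp;

    if (argc != 2)
    {
        printf("USAGE: %s file.txt\n", argv[0]);
        return 1;
    }
    if ((fp = fopen(argv[1], "r")) == NULL)
    {
        printf("can't open %s\n", argv[1]);
        return 1;
    }
    /* %127s caps each read so an over-long token cannot overflow buff */
    while (fscanf(fp, "%127s", buff) == 1)
    {
        /* skip tokens too short to split, so no unread bytes are printed */
        if (strlen(buff) < 14)
            continue;
        /* fixed 2,3,3,4,2 grouping of the first 14 characters */
        printf("%c%c,%c%c%c,%c%c%c,%c%c%c%c,%c%c\n",
               buff[0], buff[1], buff[2], buff[3], buff[4], buff[5], buff[6],
               buff[7], buff[8], buff[9], buff[10], buff[11], buff[12], buff[13]);
    }
    fclose(fp);
    return 0;
}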


gcc a.c -o a.out
./a.out test.txt


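Author: luo118 Time: 2007-07-26 11:11
I'd recommend the method from the second post: it is convenient and simple, a single command does the job, and its efficiency is decent too.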

Welcome to Chinaunix (http://bbs.chinaunix.net/)