

Post #1

Posted 2010-01-27 00:28

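When reading a file, is it possible to read in just one character from each line, instead of reading it line by line the way it is usually done?
The files hold a lot of data, and there are a lot of files, so this idea suddenly occurred to me. If it is feasible, it should improve efficiency, right?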


chenhao392


Post #2

Posted 2010-01-27 00:54

There may be cases where you need to read a file only a few characters at a time instead of line by line. This is often the case for binary data. To do just that, you can use the read function.
open FILE, "<", "picture.jpg" or die "Cannot open picture.jpg: $!";
binmode FILE;                 # read raw bytes, with no newline translation
my ($buf, $data, $n);
while (($n = read FILE, $data, 4) != 0) {   # read returns 0 at end of file
    print "$n bytes read\n";
    $buf .= $data;            # append this chunk to the accumulated contents
}
close(FILE);
There is a lot going on here, so let's take it step by step. In the first line of the above code fragment a file is opened. As you can guess from the filename, it is a binary file. Binary files need to be treated differently than text files on some operating systems (e.g., Windows). The reason is that on these platforms a newline "character" is actually represented within text files by the two-character sequence \cM\cJ (that's control-M, control-J). When reading a text file, Perl converts the \cM\cJ sequence into a single \n newline character. The converse holds when writing files. Clearly, this behavior is undesirable when reading binary data, and calling binmode on the filehandle makes sure that the conversion is avoided.
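
To make the newline translation visible on any platform, here is a minimal sketch that writes the same string once through Perl's :crlf I/O layer (the translation that text mode applies on Windows) and once through :raw (what binmode gives you), then compares file sizes. The written_size helper and the temporary file are assumptions made for illustration, not part of the original example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: write a fixed two-line string through the given
# PerlIO layer and return the size of the resulting file in bytes.
sub written_size {
    my ($layer) = @_;
    my ($tmp, $path) = tempfile(UNLINK => 1);
    close $tmp;
    open(my $out, ">$layer", $path) or die "Cannot open $path: $!";
    print {$out} "one\ntwo\n";
    close $out or die "Cannot close $path: $!";
    return -s $path;
}

my $text_size = written_size(':crlf');  # each "\n" is written as "\r\n"
my $raw_size  = written_size(':raw');   # bytes are written untouched
print "crlf layer: $text_size bytes, raw: $raw_size bytes\n";
```

The string is 8 characters long, so the raw file is 8 bytes while the :crlf file gains one extra byte per newline. In a JPEG, a byte that happens to equal a newline would be altered in the same way, which is why binmode matters.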

The read function takes either 3 or 4 arguments. The 3-argument form is:
read FILEHANDLE, SCALAR, LENGTH
while the 4-argument form is:
read FILEHANDLE, SCALAR, LENGTH, OFFSET
In the first case, up to LENGTH characters of data are read from FILEHANDLE into the variable specified by SCALAR. The return value of read is the number of characters actually read, 0 at the end of the file, or undef in the case of an error. Returning to our example above, the read call in the while condition reads at most 4 characters of data into the $data variable, and the number of characters read is stored in $n. Each read advances the current file position, so the next read on the same filehandle starts at the first unread character. Thus the code above reads the entire contents of picture.jpg into $buf, printing the number of characters read at every iteration.
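
The three possible return values described above can be handled separately. The following is only a sketch: read_chunks is a hypothetical helper name, and a 10-byte temporary file stands in for picture.jpg so the snippet runs on its own.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: read a file in chunks of $size bytes and
# return the chunks as a list.
sub read_chunks {
    my ($path, $size) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    binmode $fh;
    my @chunks;
    while (1) {
        my $n = read($fh, my $data, $size);
        die "Read error on $path: $!" unless defined $n;  # undef: error
        last if $n == 0;                                  # 0: end of file
        push @chunks, $data;                              # otherwise: $n bytes
    }
    close $fh;
    return @chunks;
}

# Demo data: a 10-byte temporary file
my ($tmp, $path) = tempfile(UNLINK => 1);
binmode $tmp;
print {$tmp} "0123456789";
close $tmp or die "Cannot close $path: $!";

my @chunks = read_chunks($path, 4);
print join('|', @chunks), "\n";   # prints 0123|4567|89
```

Note that the last chunk is shorter than LENGTH: read returns whatever is left before reporting 0 on the following call.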

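If OFFSET is specified, the characters read are placed starting at that position within SCALAR instead of at its beginning. Taking advantage of this, we could rewrite the loop above as follows, accumulating the data directly in $data without a separate $buf:
my ($data, $n);
my $offset = 0;   # start writing at the beginning of $data
while (($n = read FILE, $data, 4, $offset) != 0) {
    print "$n bytes read\n";
    $offset += $n;   # the next chunk goes right after the bytes read so far
}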
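
As a self-contained check that the OFFSET form really rebuilds the whole file in one scalar, here is a sketch under the same assumptions as before: slurp_with_offset is a hypothetical name, and a small temporary file replaces picture.jpg.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: read a file 4 bytes at a time, letting read()
# place each chunk at the end of $buf through the OFFSET argument.
sub slurp_with_offset {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    binmode $fh;
    my $buf    = '';
    my $offset = 0;
    while (1) {
        my $n = read($fh, $buf, 4, $offset);
        die "Read error on $path: $!" unless defined $n;
        last if $n == 0;
        $offset += $n;   # next chunk lands right after this one
    }
    close $fh;
    return $buf;
}

# Demo data: a 10-byte temporary file
my ($tmp, $path) = tempfile(UNLINK => 1);
binmode $tmp;
print {$tmp} "abcdefghij";
close $tmp or die "Cannot close $path: $!";

my $contents = slurp_with_offset($path);
print length($contents), " bytes: $contents\n";   # prints 10 bytes: abcdefghij
```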

Even though the example above demonstrates binary reading, read works just as well on text files. Just call binmode for binary data and leave it out for text.
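
For text, it is worth knowing that read counts bytes by default and counts characters only when the filehandle carries an encoding layer. The sketch below assumes a UTF-8 file, which the thread does not specify; count_units is a hypothetical helper that reads one unit at a time and counts how many reads succeed.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: open $path with the given layer and count how
# many one-unit reads it takes to reach end of file.
sub count_units {
    my ($path, $layer) = @_;
    open(my $fh, "<$layer", $path) or die "Cannot open $path: $!";
    my $count = 0;
    while (1) {
        my $n = read($fh, my $unit, 1);
        die "Read error on $path: $!" unless defined $n;
        last if $n == 0;
        $count++;
    }
    close $fh;
    return $count;
}

# Demo data: "h", e-acute, "llo" is 5 characters but 6 bytes in UTF-8
my ($tmp, $path) = tempfile(UNLINK => 1);
close $tmp;
open(my $out, '>:encoding(UTF-8)', $path) or die "Cannot open $path: $!";
print {$out} "h\x{e9}llo";
close $out or die "Cannot close $path: $!";

my $bytes = count_units($path, ':raw');
my $chars = count_units($path, ':encoding(UTF-8)');
print "bytes: $bytes, characters: $chars\n";   # prints bytes: 6, characters: 5
```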


panwenbo363


Post #3

Posted 2010-01-27 00:58

Re: post #2 by chenhao392

Reading that makes my head spin. Could you add some explanatory comments in Chinese?


chenhao392


Post #4

Posted 2010-01-27 01:22

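Buddy, I remembered seeing this somewhere, so I looked it up for you. I'm far from an expert myself, but you can give it a try.
open FILE, "<", "picture.jpg" or die "Cannot open picture.jpg: $!";   # open the file for reading
binmode FILE;             # switch the filehandle to binary mode
my ($buf, $data, $n);     # declare the variables
# This is the read FILEHANDLE, SCALAR, LENGTH form. As far as I can tell,
# it reads the file's contents into $data in groups of 4 characters.
while (($n = read FILE, $data, 4) != 0) {
    print "$n bytes read\n";   # print how many characters were read
    $buf .= $data;             # append $data to $buf
}
close(FILE);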

[ Last edited by chenhao392 on 2010-1-27 01:27 ]