
[mooseFS] The definitive MFS (MooseFS) guide: a one-stop distributed file system solution (deployment, performance testing), continuously updated

Posted on 2010-01-14 12:55
#!/bin/tony

1. I ran into some problems during the performance tests. My time is limited, so I hope everyone will test and troubleshoot them together; please point out any issues promptly, since I am still feeling my way through this as well.
2. I will not introduce MFS at length here; for the details, see the MFS hands-on article on this board, http://bbs.chinaunix.net/thread-1643863-1-1.html , or search Baidu/Google for the keyword 田逸.
3. I hope people can propose better test models for storage/file systems so we can improve this document together (test scripts and test cases are warmly welcome).
4. I hope people can share real production cases: configurations, environments, scripts, monitoring mechanisms, and so on.
5. I hope those familiar with the code will look into how MFS is implemented internally.
6. Special thanks to 田逸 for his documentation: http://sery.blog.51cto.com/10037/263515
7. Special thanks to QQ group comrades tt, 灵犀, 流云风 and hzqbbc for sharing their valuable experience with everyone in the QQ group.

8. Special thanks to the storage expert 冬瓜头, author of 《大话存储》, for his guidance while I was running the performance tests.
9. Special thanks to QQ group comrade 高性能架构 (CU ID: leo_ss_pku) for producing a more professional and polished PDF edition: MooseFS权威指南.pdf (3.32 MB, downloads: 5568)

Posted on 2010-01-14 14:15

Continued (1)

6. References:
6.1 Documents
http://sery.blog.51cto.com/10037/263515  田逸
http://bbs.chinaunix.net/thread-1643863-1-1.html  ltgzs777
http://www.moosefs.org/  official site
http://bbs.chinaunix.net/thread-1643015-1-2.html  test tools


6.2 Test data

Performance test model 1
Test results from a fellow whose name I do not know; I am posting them first. If you see this, please message me.


Small-file performance test

Two-level 100*100 directory tree:

Single 15k.5 disk, ext3, single client process
  Create: real 0m0.762s, user 0m0.048s, sys 0m0.261s
  List:   real 0m0.179s, user 0m0.036s, sys 0m0.125s
  Delete: real 0m0.492s, user 0m0.036s, sys 0m0.456s

Single 15k.5 disk, ext3, 10 concurrent client processes (longest time)
  Create: real 0m0.724s, user 0m0.015s, sys 0m0.123s
  List:   real 0m0.057s, user 0m0.006s, sys 0m0.025s
  Delete: real 0m0.226s, user 0m0.010s, sys 0m0.070s

6 chunkservers, cache, single client process
  Create: real 0m2.084s, user 0m0.036s, sys 0m0.252s
  List:   real 0m4.964s, user 0m0.043s, sys 0m0.615s
  Delete: real 0m6.661s, user 0m0.046s, sys 0m0.868s

6 chunkservers, cache, 10 concurrent client processes (longest time)
  Create: real 0m1.422s, user 0m0.007s, sys 0m0.050s
  List:   real 0m2.022s, user 0m0.008s, sys 0m0.108s
  Delete: real 0m2.318s, user 0m0.008s, sys 0m0.136s

Two-level 1000*1000 directory tree:

Single 15k.5 disk, ext3, single client process
  Create: real 11m37.531s, user 0m4.363s, sys 0m37.362s
  List:   real 39m56.940s, user 0m9.277s, sys 0m48.261s
  Delete: real 41m57.803s, user 0m10.453s, sys 3m11.808s

Single 15k.5 disk, ext3, 10 concurrent client processes (longest time)
  Create: real 11m7.703s, user 0m0.519s, sys 0m10.616s
  List:   real 39m30.678s, user 0m1.031s, sys 0m4.962s
  Delete: real 40m23.018s, user 0m1.043s, sys 0m19.618s

6 chunkservers, cache, single client process
  Create: real 3m17.913s, user 0m3.268s, sys 0m30.192s
  List:   real 11m56.645s, user 0m3.810s, sys 1m10.387s
  Delete: real 12m14.900s, user 0m3.799s, sys 1m26.632s

6 chunkservers, cache, 10 concurrent client processes (longest time)
  Create: real 1m13.666s, user 0m0.328s, sys 0m3.295s
  List:   real 4m31.761s, user 0m0.531s, sys 0m10.235s
  Delete: real 4m26.962s, user 0m0.663s, sys 0m13.117s

Three-level 100*100*100 directory tree:

Single 15k.5 disk, ext3, single client process
  Create: real 9m51.331s, user 0m4.036s, sys 0m32.597s
  List:   real 27m24.615s, user 0m8.907s, sys 0m44.240s
  Delete: real 28m17.194s, user 0m10.644s, sys 1m34.998s

Single 15k.5 disk, ext3, 10 concurrent client processes (longest time)
  Create: real 10m22.170s, user 0m0.580s, sys 0m11.720s
  List:   real 33m32.386s, user 0m1.127s, sys 0m5.280s
  Delete: real 33m7.808s, user 0m1.196s, sys 0m10.588s

6 chunkservers, cache, single client process
  Create: real 3m21.720s, user 0m3.089s, sys 0m26.635s
  List:   real 9m26.535s, user 0m3.901s, sys 1m11.756s
  Delete: real 10m51.558s, user 0m4.186s, sys 1m26.322s

6 chunkservers, cache, 10 concurrent client processes (longest time)
  Create: real 1m23.023s, user 0m0.429s, sys 0m3.869s
  List:   real 4m10.617s, user 0m0.643s, sys 0m11.588s
  Delete: real 4m20.137s, user 0m0.649s, sys 0m14.120s

6 chunkservers, cache, 50 concurrent client processes (longest time)
  Create: real 1m26.388s, user 0m0.074s, sys 0m0.679s
  List:   real 4m37.102s, user 0m0.132s, sys 0m2.160s
  Delete: real 4m37.392s, user 0m0.132s, sys 0m2.755s

6 chunkservers, cache, 100 concurrent client processes (longest time)
  Create: real 1m29.338s, user 0m0.062s, sys 0m0.363s
  List:   real 4m54.925s, user 0m0.069s, sys 0m1.212s
  Delete: real 4m35.845s, user 0m0.068s, sys 0m1.640s

6 chunkservers, cache, remote client, 10 concurrent processes (longest time)
  Create: real 4m0.411s, user 0m2.985s, sys 0m12.287s
  List:   real 8m31.351s, user 0m4.223s, sys 0m29.800s
  Delete: real 4m3.271s, user 0m3.206s, sys 0m11.922s

Three-level 100*100*100 directory tree, five consecutive create runs:

Run 1: changelog/metadata size ~55 MB; create time real 4m0.411s, user 0m2.985s, sys 0m12.287s
Run 2: changelog/metadata size ~60 MB; create time real 4m12.309s, user 0m3.039s, sys 0m12.899s
Run 3: changelog/metadata size ~60 MB; create time real 4m14.010s, user 0m3.418s, sys 0m12.831s
Run 4: changelog/metadata size ~60 MB; create time real 4m14.214s, user 0m3.247s, sys 0m12.871s
Run 5: changelog/metadata size ~60 MB; create time real 4m14.417s, user 0m3.170s, sys 0m12.948s


Notes:

On a single disk, using multiple processes does not improve performance, because everything is stuck in I/O wait; adding processes can even burn a lot of time in scheduling.

With MFS, multiple processes do improve performance, and the main cost shifts to CPU system time. For massive numbers of small files, real-world performance should therefore be far better than a local file system.
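The exact commands behind the create/list/delete columns were not posted. For anyone who wants to reproduce a run, here is a minimal sketch against an MFS mount; the mount point /mnt/mfs, the 100*100 layout and the small test file /mnt/test (as in the appendix scripts) are assumptions:

#!/bin/bash
# Reproduce one "create / list / delete" row on a two-level 100*100 tree.
cd /mnt/mfs || exit 1

# create
time (
    for ((i=0;i<100;i++)); do
        mkdir ${i}
        for ((j=0;j<100;j++)); do
            cp /mnt/test ${i}/${j}
        done
    done
)

# list
time ls -lR . > /dev/null

# delete
time rm -rf ./*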


Performance test model 2 (thanks to QQ group comrade 痞子白)
Two clients running dd at the same time
Block size 1 MB, file size 20 GB
Client1  write: 68.4 MB/s  read: 25.3 MB/s
Client2  write: 67.5 MB/s  read: 24.7 MB/s
Aggregate throughput  write: 135.9 MB/s  read: 50.0 MB/s

Write command: dd if=/dev/zero of=/mfs/test.1 bs=1M count=20000
Read command:  dd if=/mfs/test.1 of=/dev/null bs=1M
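One caveat if you reproduce this test: right after writing the 20 GB file, part of it may still sit in the client's page cache, so the read figure can be flattered. Flushing the cache first (root required) keeps the read pass honest; a small sketch, assuming the same /mfs mount as above:

sync
echo 3 > /proc/sys/vm/drop_caches    # drop page cache, dentries and inodes on the client
dd if=/mfs/test.1 of=/dev/null bs=1M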


7. Thanks
田逸
The fellow whose name I do not know (contact me if you see this)



8. Appendix
8.1  1000 * 1000, single-client script
#!/bin/bash
# Create a two-level 1000*1000 tree: 1000 directories, each holding
# 1000 copies of the small test file /mnt/test.
for ((i=0;i<1000;i++))
do
    mkdir ${i}
    cd ${i}
    for ((j=0;j<1000;j++))
    do
        cp /mnt/test ${j}
    done
    cd ..
done
8.2  1000 * 1000 script for 100/200/1000 concurrent clients
#!/bin/bash
# Each background job acts as one "client" and creates one top-level
# directory containing 1000 copies of the test file /mnt/test.
function make_1000_dir_file {
    start=${1}
    stop=${2}
    for ((i=${start};i<${stop};i++))
    do
        mkdir ${i}
        for ((j=0;j<1000;j++))
        do
            cp /mnt/test ${i}/${j}
        done
    done
}

i=1
while [ ${i} -le 1000 ]
do
    ((n=${i}+1))
    make_1000_dir_file ${i} ${n} &    # the original post had "${i} $" here; ${n} is clearly what was meant
    ((i=${i}+1))
done
wait
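The heading mentions 100/200/1000 concurrent clients, but the script above always starts 1000 background jobs (one directory each). A hedged variant that takes the job count as a parameter and gives every job an equal slice of the 1000 directories (same function as above; it assumes 1000 is divisible by the job count, which holds for 100, 200 and 1000):

#!/bin/bash
# Usage: ./make_tree.sh <jobs>    e.g. 100, 200 or 1000 concurrent jobs
jobs=${1:-100}
per_job=$((1000 / jobs))

function make_1000_dir_file {
    for ((i=${1};i<${2};i++)); do
        mkdir ${i}
        for ((j=0;j<1000;j++)); do
            cp /mnt/test ${i}/${j}
        done
    done
}

for ((start=0; start<1000; start+=per_job)); do
    make_1000_dir_file ${start} $((start + per_job)) &
done
wait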


Posted on 2010-01-14 14:16

Continued (2)

9. Real-world operational cases
9.1 The default trash (garbage collection) time is 86400 seconds, so it is possible for your storage to fill up before the trash has been emptied. (Case provided by shinelian)

Option 1: lower the trash time and monitor storage capacity aggressively.
           In my tests, setting the trash time to 300 seconds reclaims the space correctly.

Option 2: periodically delete the files under the trash directory of the mounted meta filesystem by hand (robustness still needs testing; the space is reclaimed after deletion, but I do not know whether there are side effects).
           In my tests there seem to be no side effects; if you do hit one, please contact me in the QQ group.
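A minimal sketch of option 1 with the standard mfs tools (the data mount /mnt/mfs and the meta mount /mnt/mfsmeta are assumptions; adjust to your installation):

# check the current trash time (in seconds) for the tree
mfsgettrashtime /mnt/mfs

# lower it to 300 seconds; -r applies the new value recursively
mfssettrashtime -r 300 /mnt/mfs

# option 2 boils down to mounting the meta filesystem and purging trash by hand:
# mfsmount -m /mnt/mfsmeta -H mfsmaster
# rm -rf /mnt/mfsmeta/trash/*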


9.2 We went through the MFS 1.6.x User Guides and FAQ, discussed the parts of the documentation we did not understand with 灵犀, and reached agreement wherever our readings differed. Compared with 1.5.x, MFS 1.6.x changes the following (special thanks to QQ group members 流云风 and 灵犀):
     (1) Fixes the 1.5.x bug of too many open files during large batch operations. We also hit this during our tests: the error reported too many open files, which caused connection errors on the chunkservers. Although we kept trying to reproduce the problem in later tests, we never managed to. Having it fixed in 1.6.x removes a major issue.

      (2) Adds the metalogger server. This did not exist in 1.5.x; it provides redundancy for the master server and further strengthens the master's stability. In the MFS architecture the master has the highest requirements for stability and performance, so keeping the master stable is essential.

      (3) Improves the handling of the bad chunks present in 1.5.x. In 1.5.x, when chunk checksum verification failed on a chunkserver, the frequent outcome was that the master automatically kicked the chunkserver with the bad chunks out of the cluster. 1.6.x adds a repair facility for bad chunks, which makes them easy to fix and simplifies how bad chunks are handled.

      (4) A corrected understanding of metadata and the changelog. We used to think the changelog recorded file operations and was periodically archived into the metadata like a database log. That understanding was wrong: the changelog does record the operations on files, while the metadata records file sizes and locations. The metadata is therefore the important part; recovery is performed from the metadata file plus the most recent changelog.

      (5) The MFS documentation now states the memory and disk requirements explicitly: "In our environment (ca. 500 TiB, 25 million files, 2 million folders distributed on 26 million chunks on 70 machines) the usage of chunkserver CPU (by constant file transfer) is about 15-20% and chunkserver RAM usually consumes about 100MiB (independent of amount of data). The master server consumes about 30% of CPU (ca. 1500 operations per second) and 8GiB RAM. CPU load depends on amount of operations and RAM on number of files and folders."

      (6) It is pointed out that, in testing, adding chunkservers does not increase write speed, but it does speed up reads. When a new chunkserver is added to an existing setup, data is automatically synchronized onto it to balance and even out the load.

9.3 MFS 1.5.x data recovery example (case shared by QQ group comrade Xufeng)
            It is actually quite simple: run mfsmetarestore. If the service then refuses to start without printing any message at all, the cause is usually that the directory configured to hold the PID file has been deleted; recreate that directory and give it the right permissions, and the master comes up. I have also done a 1.5 to 1.6 upgrade this way and it worked fine.
            For details see Xufeng's blog: http://snipt.net/iamacnhero/tag/moosefs
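A minimal sketch of that recovery flow (the install prefix /usr/local/mfs, the mfs run user and the DATA_PATH are assumptions; take the real values from your mfsmaster.cfg):

#!/bin/bash
# rebuild metadata.mfs from the last metadata.mfs.back plus the changelogs (-a = automatic mode)
/usr/local/mfs/sbin/mfsmetarestore -a -d /usr/local/mfs/var/mfs

# if mfsmaster still exits silently, recreate the directory that holds its
# lock/PID file (DATA_PATH in mfsmaster.cfg) and fix its ownership
mkdir -p /usr/local/mfs/var/mfs
chown -R mfs:mfs /usr/local/mfs/var/mfs

/usr/local/mfs/sbin/mfsmaster start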


10. Production deployments (please contribute; this list is continuously updated)
http://www.gaokaochina.com  田逸




Posted on 2010-01-14 14:18

Continued (3)


11. Web GUI monitoring

Screenshots of the web GUI (CGI monitor):
01_info.png
02_servers.png
03_disks.png
04_exports.png
05_mounts.png
06_operations.png
07_master_charts.png
08_server_charts.png
gui_info.jpg
gui_most.jpg
gui_master_info.jpg
gui_server.jpg

Posted on 2010-01-14 14:19

Continued (4)

12. Official MooseFS notes on the 1.6.x release (translated for this thread by QQ group comrade Cuatre; the English original is reproduced below)


View on new features of next release v 1.6 of Moose File System

We are about to release a new version of MooseFS which would include a large number of new features and bug fixes. The new features are so significant that we decided to release it under the 1.6 version number. The newest beta files are in the GIT repository.

The key new features/changes of MooseFS 1.6 include:

General:
Removed duplicate source files.
Strip whitespace at the end of configuration file lines.

Chunkserver:
Rewritten in a multi-threaded model.
Added periodical chunk testing functionality (HDD_TEST_FREQ option).
New -v option (prints version and exits).

Master:
Added "noowner" objects flag (causes objects to belong to the current user).
Maintaining `mfsdirinfo` data online, so it doesn't need to be calculated on every request.
Filesystem access authorization system (NFS-like mfsexports.cfg file, REJECT_OLD_CLIENTS option) with ro/rw, maproot, mapall and password functionality.
New -v option (prints version and exits).

Mount:
Rewritten options parsing in a mount-like way, making it possible to use standard FUSE mount utilities (see the mfsmount manual for the new syntax). Note: the old syntax is no longer accepted and a mountpoint is now mandatory (there is no default).
Updated for FUSE 2.6+.
Added password, file data cache, attribute cache and entry cache options. By default the attribute cache and directory entry cache are enabled, the file data cache and file entry cache are disabled.
opendir() no longer reads directory contents - that is done on the first readdir() now; fixes "rm -r" on recent Linux/glibc/coreutils combinations.
Fixed mtime setting just before close() (by flushing the file on mtime change); fixes mtime preservation with "cp -p".
Added statistics accessible through the MFSROOT/.stats pseudo-file.
Changed the master access method for mfstools (the direct .master pseudo-file is replaced by .masterinfo redirection); fixes a possible mfstools race condition and allows using mfstools on a read-only filesystem.

Tools:
Units cleanup in value display (exact values, IEC-60027/binary prefixes, SI/decimal prefixes); new options: -n, -h, -H and the MFSHRFORMAT environment variable - refer to the mfstools manual for details.
mfsrgetgoal, mfsrsetgoal, mfsrgettrashtime and mfsrsettrashtime have been deprecated in favour of the new "-r" option of the mfsgetgoal, mfssetgoal, mfsgettrashtime and mfssettrashtime tools. (Note: the old and new command names look very similar but are not the same.)
The mfssnapshot utility has been replaced by mfsappendchunks (a direct descendant of the old utility) and mfsmakesnapshot (which creates "real" recursive snapshots and behaves similarly to "cp -r").
New mfsfilerepair utility, which allows partial recovery of a file with some missing or broken chunks.

CGI scripts:
First public version of the CGI scripts that allow monitoring an MFS installation from a WWW browser.
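As an illustration of the new mfsexports.cfg access-control file mentioned under "Master", a sketch of what such a file can look like; the addresses and password are placeholders and the exact option syntax should be checked against the example file shipped with MooseFS:

# mfsexports.cfg - one rule per line: <client address(es)> <exported path> <options>
192.168.1.0/24   /        rw,alldirs,maproot=0,password=secret
10.0.0.0/8       /public  ro,mapall=nobody
*                .        rw        # "." exports the meta filesystem (trash access)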


13. Official MFS FAQ (TC edition)
What average write/read speeds can we expect?
The raw reading / writing speed obviously depends mainly on the performance of the used hard disk drives and the network capacity and its topology and varies from installation to installation. The better performance of hard drives used and better throughput of the net, the higher performance of the whole system.

In our in-house commodity servers (which additionally make lots of extra calculations) and simple gigabyte Ethernet network on a petabyte-class installation
on Linux (Debian) with goal=2 we have write speeds of about 20-30 MiB/s and reads of 30-50MiB/s. For smaller blocks the write speed decreases, but reading is not much affected.


Similar FreeBSD based network has got a bit better writes and worse reads, giving overall a slightly better performance.

Does the goal setting influence writing/reading speeds?

Generally speaking,
it doesn’t. The goal setting can influence the reading speed only under certain conditions. For example, reading the same file at the same time by more than one client would be faster when the file has goal set to 2 and not goal=1.


But the situation in the real world when several computers read the same file at the same moment is very rare; therefore, the goal setting has rather little influence on the reading speeds.

Similarly, the writing speed is not much affected by the goal setting.


How well concurrent read operations are supported?

All read processes are parallel - there is no problem with concurrent reading of the same data by several clients at the same moment.

How much CPU/RAM resources are used?

In our environment (ca. 500 TiB, 25 million files, 2 million folders distributed on 26 million chunks on 70 machines) the usage of chunkserver CPU (by constant file transfer) is about 15-20% and chunkserver RAM usually consumes about 100MiB (independent of amount of data).
The master server consumes about 30% of CPU (ca. 1500 operations per second) and 8GiB RAM. CPU load depends on amount of operations and RAM on number of files and folders.

Is it possible to add/remove chunkservers and disks on fly?

You can add / remove chunkservers on the fly. But mind that it is not wise to disconnect a chunkserver if there exists a chunk with only one copy (marked in orange in the CGI monitor).
You can also disconnect (change) an individual hard drive. The scenario for this operation would be:


  • Mark the disk(s) for removal
  • Restart the chunkserver process
  • Wait for the replication (there should be no “undergoal” or “missing” chunks marked in yellow, orange or red in CGI monitor)
  • Stop the chunkserver process
  • Delete entry(ies) of the disconnected disk(s) in 'mfshdd.cfg'
  • Stop the chunkserver machine
  • Remove hard drive(s)
  • Start the machine
  • Start the chunkserver process

If you have hotswap disk(s) after step 5 you should follow these:
  • Unmount disk(s)
  • Remove hard drive(s)
  • Start the chunkserver process

If you follow the above steps, the work of the client computers will not be interrupted and the whole operation will go unnoticed by MooseFS users.
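As an illustration of step 1, the chunkserver's data paths live in mfshdd.cfg; in recent MooseFS versions a path can be marked for removal by prefixing it with an asterisk (an assumption to verify against the mfshdd.cfg manual of your version):

# mfshdd.cfg on the chunkserver
/mnt/hdd1
/mnt/hdd2
*/mnt/hdd3    # marked for removal: restart the chunkserver, then wait until no undergoal chunks remain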

My experience with clustered filesystems is that metadata operations are quite slow. How did you resolve this problem?

We have noticed the problem with slow metadata operations and we decided to cache file system structure in RAM in the metadata server. This is why metadata server has increased memory requirements.


When doing df -h on a filesystem the results are different from what I would expect taking into account actual sizes of written files.

Every chunkserver sends its own disk usage, increased by 256MB for each used partition/hdd, and the master sends the sum of these to the client as the total disk usage. If you have 3 chunkservers with 7 hdd each, your disk usage will be increased by 3*7*256MB (about 5GB). Of course it's not important in real life, when you have for example 150TB of hdd space.

There is one other thing. If you use disks exclusively for MooseFS on chunkservers df will show correct disk usage, but if you have other data on your MooseFS disks df will count your own files too.

If you want to see usage of your MooseFS files use 'mfsdirinfo' command.


Do chunkservers and metadata server do their own checksumming?

Yes, there is checksumming done by the system itself. We thought it would be CPU consuming but it is not really. The overhead is about 4B per 64KiB block, which is 4KiB per 64MiB chunk (per goal).

What sort of sizing is required for the Master  server?
The most important factor is RAM of mfsmaster machine, as the full file system structure is cached in RAM for speed. Besides RAM mfsmaster machine needs some space on HDD for main metadata file together with incremental logs.

The size of the metadata file is dependent on the number of files (not on their sizes). The size of incremental logs depends on the number of operations per hour, but length (in hours) of this incremental log is configurable.

1 million files takes approximately 300 MiB of RAM. Installation of 25 million files requires about 8GiB of RAM and 25GiB space on HDD.


When I delete files or directories the MooseFS size doesn’t change. Why?

MooseFS is not erasing files immediately to let you revert the delete operation.

You can configure for how long files are kept in trash and empty the trash manually (to release the space). There are more details here:
http://moosefs.com/pages/userguides.html#2 in section "Operations specific for MooseFS".

In short - the time of storing a deleted file can be verified by the
mfsgettrashtime command and changed with mfssettrashtime.


When I added a third server as an extra chunkserver it looked like it started replicating data to the 3rd server even though the file goal was still set to 2.

Yes. The disk usage balancer uses chunks independently, so one file could be redistributed across all of your chunkservers.

Is MooseFS 64bit compatible?

Yes!

Can I modify the chunk size?

File data is divided into fragments (chunks) with a maximum of 64MiB each. The value of 64 MiB is hard coded into system so you cannot modify its size. We based the chunk size on real-world data and it was a very good compromise between number of chunks and speed of rebalancing / updating the filesystem. Of course if a file is smaller than 64 MiB it occupies less space.

Please note systems we take care of enjoy files of size well exceeding 100GB and there is no chunk size penalty noticeable.

How do I know if a file has been successfully written in MooseFS?

First off, let's briefly discuss the way the writing process is done in file systems and what programming consequences this bears. Basically, files are written through a buffer (write cache) in all contemporary file systems. As a result, execution of the "write" command itself only transfers the data to a buffer (cache), with no actual writing taking place. Hence, a confirmed execution of the "write" command does not mean that the data has been correctly written on a disc. It is only with the correct performance of the "fsync" (or "close") command that all data kept in buffers (cache) gets physically written. If an error occurs while such buffer-kept data is being written, it could return an incorrect status for the "fsync" (or even "close", not only "write") command.
The problem is that a vast majority of programmers do not test the "close" command status (which is generally a mistake, though a very common one). Consequently, a program writing data on a disc may "assume" that the data has been written correctly, while it has actually failed.
As far as MooseFS is concerned – first, its write buffers are larger than in classic file systems (an issue of efficiency); second, write errors may be more frequent than in case of a classic hard drive (the network nature of MooseFS provokes some additional error-inducing situations). As a consequence, the amount of data processed during execution of the "close" command is often significant and if an error occurs while the data is being written, this will be returned in no other way than as an error in execution of the "close" command only.
Hence, before executing "close", it is recommended (especially when using MooseFS) to perform "fsync" after writing in a file and then check the status of "fsync" and – just in case – the status of "close" as well.
NOTE! When "stdio" is used, the "fflush" function only executes the "write" command, so correct execution of "fflush" is not enough grounds to be sure that all data has been written successfully – you should also check the status of "fclose".
One frequent situation in which the above problem may occur is redirecting a standard output of a program to a file in "shell". Bash (and many other programs) does not check the status of "close" execution and so the syntax of the "application > outcome.txt" type may wrap up successfully in "shell", while in fact there has been an error in writing the "outcome.txt" file. You are strongly advised to avoid using the above syntax. If necessary, you can create a simple program reading the standard input and writing everything to a chosen file (but with an appropriate check with the "fsync" command) and then use "application | mysaver outcome.txt", where "mysaver" is the name of your writing program instead of "application > outcome.txt".
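If writing a dedicated "mysaver" program is overkill, one hedged shell-level approximation is to pipe through GNU dd with conv=fsync, which fsyncs the output file before exiting and returns a non-zero status if the write or the fsync fails:

# instead of:  application > outcome.txt
application | dd of=outcome.txt bs=1M conv=fsync \
    || echo "write to outcome.txt failed" >&2
# note: the pipeline's exit status is dd's; check ${PIPESTATUS[0]} in bash
# if you also need the exit status of "application" itself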
Please note that the problem discussed above is in no way exceptional and does not stem directly from the characteristics of MooseFS itself. It may affect any system of files – only that network type systems are more prone to such difficulties. Technically speaking, the above recommendations should be followed at all times (also in case of classic file systems).






Posted on 2010-01-14 14:49
First reply!!!

May I ask a few questions:

1. In MFS, when a chunkserver (data storage server) fills up, can you describe the error-handling flow?

2. In distributed storage, can data structures such as trees or linked lists be supported for the data?

I have used Hadoop (in a simple way) for a few months and found it less than ideal for data analysis.

Posted on 2010-01-14 14:54

Reply to #6 liukaiyi

1. First, MFS can be expanded online. Just monitor storage utilization and alert at, say, 80% usage (a minimal monitoring sketch follows below); that covers prevention. The error-handling flow itself is something we can discuss once more people have tested it.

2. MFS is a general-purpose file system (think of it as a local ext3); it does not provide data structures.
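A minimal sketch of that kind of capacity alert, run from cron; the mount point /mnt/mfs, the 80% threshold and the mail address are all assumptions:

#!/bin/bash
MOUNTPOINT=/mnt/mfs
THRESHOLD=80

usage=$(df -P "${MOUNTPOINT}" | awk 'NR==2 {gsub(/%/,"",$5); print $5}')
if [ "${usage}" -ge "${THRESHOLD}" ]; then
    echo "MooseFS usage on ${MOUNTPOINT} is ${usage}%" \
        | mail -s "MFS capacity warning" admin@example.com
fi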

Also, feel free to join:
0. QQ group 102082446, dedicated to discussing distributed file systems; the passphrase is: i love cuer!

It seems some folks in there are working on hadoop.

Posted on 2010-01-14 14:59

Reply to #7 shinelian

Thanks.
I will find some time to set up MFS myself first; following along.....

Posted on 2010-01-14 15:13

Nicely done - a very detailed write-up. I will have to test it myself when I get some time.

Posted on 2010-01-15 12:40

Bumping my own thread; please keep following it.