Question

Benoît

Asked: 2009-06-11 02:01:15 +0800 CST

Block-level deduplication on Linux

772

NetApp provides block-level deduplication (ASIS). Do you know of any filesystem (even a FUSE-based one) on Linux (or OpenSolaris, *BSD) that provides the same functionality?

(I'm not interested in pseudo-deduplication such as hard links.)

10 Answers

3dinfluence · Answer 1 · 2009-06-11T06:13:46+08:00

Deduplication is coming to ZFS on OpenSolaris, but that functionality is not currently available.

It was prototyped by Jeff Bonwick and Bill Moore this past winter, and they are working on integrating it this summer. So it should be available in the next release of OpenSolaris, or sooner if you want to play around with the development branch.

7

MV. · Answer 2 · 2009-09-29T03:36:41+08:00

Best Answer

Check out lessFS, a data-deduplication filesystem for Linux. It is still in beta, but you can try it out:

http://www.lessfs.com/

Regards,

MV

6

Matt Simmons · Answer 3 · 2009-06-11T02:17:42+08:00

For people who may be unfamiliar with data deduplication, it is a technique whereby data is analyzed at the file (or block, I suppose) level, and identical files or blocks throughout the filesystem are replaced with a smaller token that points to a single stored copy. This has the effect of greatly shrinking the effective size on disk. It could be considered a form of copy-on-write. Read the wiki page on it.
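To make the idea concrete, here is a minimal sketch of block-level deduplication in Python. It is only an illustration, not how any particular filesystem does it: the fixed 4096-byte block size, the use of SHA-256 as the token, and the names `dedup_blocks` and `rebuild` are all assumptions made for this example.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, chosen only for illustration


def dedup_blocks(data: bytes, block_size: int = BLOCK_SIZE):
    """Split data into fixed-size blocks and keep each distinct block once.

    Returns (store, recipe): store maps a SHA-256 digest (the "token") to the
    block's bytes, and recipe is the ordered list of digests needed to
    rebuild the original data.
    """
    store = {}
    recipe = []
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)  # only the first copy is stored
        recipe.append(digest)
    return store, recipe


def rebuild(store, recipe) -> bytes:
    """Reassemble the original data from the tokens."""
    return b"".join(store[digest] for digest in recipe)


if __name__ == "__main__":
    # Five logical blocks, but only two distinct block contents.
    data = b"A" * BLOCK_SIZE * 3 + b"B" * BLOCK_SIZE + b"A" * BLOCK_SIZE
    store, recipe = dedup_blocks(data)
    assert rebuild(store, recipe) == data
    print(len(recipe), "logical blocks,", len(store), "stored blocks")
```

Hashing every block is also where the processor cost comes from: each write needs a digest computed and a lookup in the table of known blocks.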

I have not heard of any filesystem on Linux that does dedup, at either the file or the block level. Such a beast would be handy, although pretty processor-intensive.

4

jlliagre · Answer 4 · 2009-12-18T03:00:49+08:00

Deduplication is now available with ZFS on OpenSolaris (build 128a and newer).

4

Aaron · Answer 5 · 2010-05-09T05:43:28+08:00

A year later, but here is a solution for OpenBSD called Epitome. Given its liberal licensing, it could very well make it into the Linux kernel.

3

user37619 · Answer 6 · 2010-03-14T01:05:58+08:00

I just posted a project I have been working on that does inline deduplication. You can take a look at it here if you are interested. It is based on FUSE and runs on Linux.

1

James · Answer 7 · 2009-06-25T12:44:07+08:00

I don't know of any free implementations of dedup for Linux. I have seen some storage vendors recommend using an HSM (hierarchical storage management) system with a VTL (virtual tape library) that does dedup.

You could also consider an Ocarina-like system, which is not transparent but can provide better results than dedup.

0

Tudor · Answer 8 · 2010-04-17T05:20:31+08:00

So... no news about deduplication on Linux? Opendedup might be a choice, but given the Java platform it runs on, I don't want the headaches. I have tried it, yes, but the Java virtual machine and the rest do not fit well with my needs for storage response times and safety.

0

Znik · Answer 9 · 2014-06-25T05:05:57+08:00

Deduplication is available on Linux with the Btrfs and ZFS filesystems. Btrfs is developed natively on Linux and has an offline deduplication tool. By "offline" I don't mean that you must unmount the filesystem. Offline means that data is not deduplicated while it is being written; instead, you later run a tool to deduplicate what is already stored. The tool is probably still in beta. The other way is ZFS, available both via FUSE and natively: http://zfsonlinux.org/ . It does online deduplication; unfortunately this slows down writes, because everything must be calculated on the fly. You can turn this behavior off and on while the filesystem is online. After you turn deduplication off, all data that was already deduplicated stays stored as deduplicated. New writes will be stored as "duplicated". If you want to deduplicate that data in the future, you must turn deduplication back on and rewrite all the "duplicated" files.

See the documentation available on the zfsonlinux.org page. To speed up writes and reads, you can add faster devices to the storage pool (especially SSD drives, or perhaps fast USB flash drives; pay attention to device reliability).
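The on/off behavior described above can be illustrated with a toy model in Python. This is a hypothetical in-memory sketch, not how ZFS is implemented: the class `ToyPool`, its `dedup` flag, and its methods are names invented for this example, and blocks are simply byte strings.

```python
import hashlib


class ToyPool:
    """Hypothetical in-memory model of a pool with switchable inline dedup."""

    def __init__(self):
        self.dedup = False
        self.table = {}   # digest -> block index; filled only while dedup is on
        self.blocks = []  # every block ever stored (no garbage collection here)
        self.files = {}   # file name -> list of block indices

    def write(self, name, blocks):
        """Write (or rewrite) a file given as a list of byte-string blocks."""
        refs = []
        for block in blocks:
            if self.dedup:
                digest = hashlib.sha256(block).digest()
                if digest not in self.table:
                    self.blocks.append(block)
                    self.table[digest] = len(self.blocks) - 1
                refs.append(self.table[digest])
            else:
                # Dedup is off: always store a fresh copy.
                self.blocks.append(block)
                refs.append(len(self.blocks) - 1)
        self.files[name] = refs

    def physical_blocks(self):
        """Count the blocks still referenced by at least one file."""
        return len({i for refs in self.files.values() for i in refs})


if __name__ == "__main__":
    pool = ToyPool()
    x = b"x" * 4096

    pool.dedup = True
    pool.write("a", [x, x])        # stored once
    print(pool.physical_blocks())  # 1

    pool.dedup = False
    pool.write("b", [x, x])        # stored as two new copies; "a" is unchanged
    print(pool.physical_blocks())  # 3

    pool.dedup = True
    pool.write("b", [x, x])        # rewriting "b" deduplicates it
    print(pool.physical_blocks())  # 1
```

Flipping the flag alone changes nothing on disk in this model; only the rewrite of "b" with deduplication enabled brings the block count back down.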

Antoine Benkemoun · Answer 10 · 2009-06-11T02:02:28+08:00

DRBD does just that, and does it really well too! It can do Master/Slave or Master/Master :-)

-2
