According to The First Google Result for "ZFS Deduplication"
...
What to dedup: Files, blocks, or bytes?
...
Block-level dedup has somewhat higher overhead than file-level dedup when whole files are duplicated, but unlike file-level dedup, it handles block-level data such as virtual machine images extremely well.
...
ZFS provides block-level deduplication
...
According to Wikipedia's ZFS Article
ZFS uses variable-sized blocks of up to 128 kilobytes. The currently available code allows the administrator to tune the maximum block size used as certain workloads do not perform well with large blocks. If data compression (LZJB) is enabled, variable block sizes are used. If a block can be compressed to fit into a smaller block size, the smaller size is used on the disk to use less storage and improve IO throughput (though at the cost of increased CPU use for the compression and decompression operations).
I want to make sure I understand this correctly.
Assuming compression is off
If I a randomly filled file of 1GB, then I write a second file that is the same except half way through, I change one of the bytes. Will that file be deduplicated (all except for the changed byte's block?)
If I write a single byte file, will it take a whole 128 kilobytes? If not, will the blocks get larger in the event the file gets longer?
If a file takes two 64kilobyte blocks (would this ever happen?), then would an identical file get deduped after taking a single 128 kilobyte block
If a file is shortened, then part of its block would have been ignored, perhaps the data would not be reset to 0x00 bytes. Would a half used block get deduped?
ZFS deduplication works on blocks (recordlength) it does not know/care about files. Each block is checksummed using sha256 (by default changeable). If the checksum matches an other block it will just reference the same record and no new data will be written. One problem of deduplication with ZFS is that checksums are kept in memory so large pools will require a lot of memory. So you should only apply reduplication when using large record length
Assuming recordlength 128k
Yes only one block will not be duplicated.
128k will be allocated, if the file size grows above 128k more blocks will be allocated as needed.
A file will take 128k the same file will be deduplicated
If the exact same block is found yes
The variable sized blocks of ZFS are as Yavor mentioned already should not be confused randomized variable sized chunking, also called content-defined chunking or Rabin fingerprinting. Here is a small talk describing the differences.
ZFS used static, but configurable block sizes.