Let's say I have a software RAID1 array built from two partitions. I take one HDD out of the computer, put it into another computer, and copy some data to the mirrored partition.
What happens when I put the HDD back into its original place? Will the new data get mirrored to the other drive? If so, how does the controller check/notice that there is new data on the drive? Surely it can't check the whole partition on every boot.
Also, what happens when I delete a file in the same scenario as above? Will it also get deleted from the other drive, or will it get copied back?
This is on an Ubuntu machine, and by taking the drive out, I mean that the computer is completely turned off while doing this, and is only turned back on once the drive is back in its place.
If the RAID controller does not recognize the change because the power is off while doing this, is there a way to instruct it to reconstruct the array? Say the RAID is built from /dev/sda1 and /dev/sdb1: I turn off the computer, pull out the sdb drive, copy data to it, and put it back, and now I want to instruct the controller to reconstruct the array using /dev/sdb1 "as a master".
When you disconnect a drive from an active RAID array, the array will see that drive as failed within its configuration. When you insert a new drive (or the same drive), it will be treated as "new" to the array and its contents rebuilt from the known-good remaining drive, so any changes you made to its contents will be overwritten. If you do this to a drive while the array is offline, the changes you made will upset the checksums the array controller uses to track changes, and again, it will see the drive as failed and attempt a rebuild.
If you want to copy files into your array, you'll need to do so through the controller (regardless of whether it's software or hardware).
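For the Linux software-RAID case, you can watch the array mark a member as failed and follow the rebuild from userspace. A minimal sketch, assuming an md array named /dev/md0:

    cat /proc/mdstat            # one-line state of every md array, including rebuild progress
    mdadm --detail /dev/md0     # per-member view: active, faulty, spare, rebuilding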
Is this a Linux or Windows question? What is your specific implementation?
Usually, the removed drive is marked as failed, and when adding it back, you'll have to "unfail" it. This in turn usually means that the failed disk is re-initialized with all the data from the working disk. So in essence, all the changes made to the disk you removed will probably be lost in such a situation.
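On Linux md, "unfailing" the disk in practice means removing the failed member and adding it back, at which point it is re-initialized from the good disk. A rough sketch, assuming the array is /dev/md0 and the removed disk came back as /dev/sdb1:

    mdadm /dev/md0 --remove /dev/sdb1   # clear the failed slot
    mdadm /dev/md0 --add /dev/sdb1      # add it back; md rebuilds it from the surviving disk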
The two answers saying the drive will be marked failed and rebuilt are correct, and hopefully that is what will happen. It's the best-case scenario.
The other possibility is that the software does not notice, in which case it will still think the drives are in sync (this could happen, for example, if you pulled this stunt with the power off). The end result will most likely be corruption, and the only fix will be to format and restore from backup.
Remember, RAID works at the disk level; it knows nothing about the filesystem on top, just a bunch of sectors. When the filesystem requests block 10, the RAID layer knows that block 10 is stored on block 10 of both disk1 and disk2; it picks one disk or the other and reads block 10 from it. But because you modified one disk behind the array's back, block 10 on disk1 and block 10 on disk2 now differ. Oops. You can expect a mix of disk1 and disk2 on a per-block basis, including in blocks used to store filesystem metadata.
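On Linux md you can make the layer compare its mirrors and count exactly these differing blocks. A sketch, assuming the array is md0:

    echo check > /sys/block/md0/md/sync_action   # read both mirrors and compare them
    cat /sys/block/md0/md/mismatch_cnt           # non-zero means the copies disagree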
Fixing the mess
Given that a format and restore from backup is not an option, I suggest this as your best bet to recover:
(a) Immediately image both drives; backups are important. Optionally, work only on the copies.
(b) If the array has not been in read/write mode since this mistake, just pull the modified drive and rebuild with a new, blank drive.
(c) If the array has been in read/write mode, pick a drive and drop it out of the array, then rebuild onto a new drive (the sequence is sketched below).
(d) If you completely don't care which drive wins, just run echo repair > /sys/block/mdX/md/sync_action (replacing X with your array number, of course). This forces a resync; wherever the two mirrors differ, one copy (it is undefined which) is written over the other.
(e) Force an fsck on the now-rebuilt array.
(f) Do whatever you can to verify your data. For example, run debsums to check OS integrity, supplying all needed package files for things that don't have MD5 sums.
Note that the replacement drive needs to be blank, or at least have all RAID info wiped from it, otherwise the rebuild won't work right.
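For steps (b)/(c), the drop-and-rebuild sequence with Linux md looks roughly like this; the device names are assumptions, with /dev/sdc1 standing in for the blank replacement:

    mdadm /dev/md0 --fail /dev/sdb1       # drop the chosen drive out of the array
    mdadm /dev/md0 --remove /dev/sdb1
    mdadm --zero-superblock /dev/sdc1     # wipe old RAID info so md treats the replacement as blank
    mdadm /dev/md0 --add /dev/sdc1        # rebuild onto the replacement starts automatically
    cat /proc/mdstat                      # check the resync progress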
In general, when you pull a disk out of a RAID-1 array while the array is active, the array is considered out of sync.
If you re-insert the removed disk, it will go through approximately the same process as if you had inserted a completely new disk: the contents of the active drive will be copied block by block until the new drive is an exact duplicate, and then the array will return to its "normal" operating condition as a live RAID-1 array.
Presumably you're asking whether the array will detect that this is the drive that was recently removed and somehow shortcut the synchronization process. The answer to that is no; the controller will have to re-copy everything.
Note that if the array is not online when the drive is removed, you can usually safely re-attach the drive before bringing the array online without having to re-synchronize.
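As for how md decides which member is stale: on Linux, each member's superblock carries an event counter that is bumped on state changes, and a member whose counter lags the rest is not trusted. You can inspect it yourself; a sketch with assumed partition names:

    mdadm --examine /dev/sda1 | grep -i events
    mdadm --examine /dev/sdb1 | grep -i events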
Your array will become corrupted: the filesystems on the two disks will contain different data, and the first time the system tries to write a new file to one of the partitions, it will mess up not only the filesystem's tables but the data in them too.
If you have drives A and B in an mdadm RAID1 and you:
1) power off server A
2) put drive B in server B
3) write to drive B in server B, using mdadm
4) power off server B
5) boot server A with drives A and B
The most likely thing to happen is that md will notice that the metadata on A and B differs and start the array with B, considering A faulty, since B's metadata (its event count) will be more recent.
You would then add A back to the array, and B's contents would be copied over A.
I don't know why you'd want to do this, though.
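That said, if you really did want /dev/sdb1 to win, as the question asks, you could force the outcome by assembling the array degraded from sdb1 alone and then adding sda1 back. A sketch under those assumed names (note this destroys whatever is currently on /dev/sda1):

    mdadm --stop /dev/md0                       # make sure the array isn't assembled
    mdadm --assemble --run /dev/md0 /dev/sdb1   # start it degraded, from sdb1 only
    mdadm /dev/md0 --add /dev/sda1              # sdb1's contents are rebuilt onto sda1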