I often find it convenient to edit a document in one file format while storing it in another. For example, a document can be written in Markdown, converted to a Word file, and edited there. I am hoping to extract the edits and apply them back to the original Markdown file.
So, I would have A.md, converted to A.docx (e.g. using pandoc), edited to B.docx, and would somehow apply the change set between A.docx and B.docx to A.md.
My question is:
Is there a reliable, automatic way to transfer the edits made in Word (text edits) back to the text-based file?
Of course, I can just convert B.docx back to a markdown file B.md and overwrite A.md with B.md. But the conversion process often introduces irreversible changes, so that the loop A.md => A.docx => A2.md will produce an A2.md that differs from the original markdown file A.md. Some effects will be added or lost (due to different newlines, fonts, formatting, etc.). I'd like to avoid such loss and keep the final document as close to the original file A.md as possible.
Would it be possible to use diff/patch to do the following conversions in bash scripts:
1. A.md => A.docx => A2.md
2. A.docx => (edits) B.docx
3. B.docx => B.md
4. diff B.md A2.md somehow to get a portable patch
5. apply the patch file on A.md (instead of A2.md)
I have limited experience with git diff, and not much with using diff/patch directly. I was wondering if someone could help explain the command sequence needed for such a "transfer" of differences.
The bash syntax for what you ask (steps 4,5) is:
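I think you may have the files reversed in the diff? The old file (A2.md) normally comes first and the new file (B.md) second, so that the patch turns old into new. And keep in mind that patch overwrites A.md, so make a copy if you need the original.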
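To see the ordering and the overwrite in action, here is a minimal sketch of steps 4 and 5 on toy files. The file contents, the scratch directory name md-patch-demo, and the names edits.patch and A.md.bak are made up for illustration; in practice A2.md and B.md would come out of the converter. The toy files are chosen so that the round-trip difference sits well away from the edit, which is the easy case for patch.

```shell
#!/usr/bin/env bash
# Sketch only: the file contents below are made-up stand-ins for converter output.
set -eu
work="${TMPDIR:-/tmp}/md-patch-demo"   # hypothetical scratch directory
rm -rf "$work"
mkdir -p "$work"
cd "$work"

# A.md: the original markdown file.
printf '# Notes\n\n* first item\n* second item\n\nSome text.\n\nMore text.\n\nThe final sentence.\n' > A.md
# A2.md: A.md after a round trip; here the bullet markers changed.
printf '# Notes\n\n- first item\n- second item\n\nSome text.\n\nMore text.\n\nThe final sentence.\n' > A2.md
# B.md: the edited document converted back; only the last line was edited.
printf '# Notes\n\n- first item\n- second item\n\nSome text.\n\nMore text.\n\nThe final sentence, now edited.\n' > B.md

# Step 4: old file first, new file second.
# diff exits with status 1 when the files differ, which is not an error here.
diff -u A2.md B.md > edits.patch || [ $? -eq 1 ]

# Step 5: patch rewrites A.md in place, so keep a copy first.
cp A.md A.md.bak
patch A.md < edits.patch

cat A.md
```

After this runs, A.md should still have its original `*` bullets and carry the edited last line, because the hunk only touches the lines around the edit. When an edit lands next to a line that the round trip altered, the context no longer matches and patch may have to guess or reject the hunk.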
In any case, I am skeptical you will be successful with this approach.
To my mind, the Word editor introduces too much non-determinism for any automated conversion to be considered "reliable". Even if you get a script working, you could wind up having to repair it with every update of your Word editor.
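If you want to try it anyway, the whole loop (steps 1 to 5) could be wrapped up roughly like this. It is a sketch under assumptions, not something I would call reliable: transfer_edits is a hypothetical helper name, the .bak suffix is my choice, and it assumes pandoc can infer the input and output formats from the file extensions.

```shell
#!/usr/bin/env bash
# Hypothetical helper: transfer_edits ORIGINAL.md EDITED.docx
# Applies the text edits found in EDITED.docx to ORIGINAL.md in place.
transfer_edits() {
    local orig_md=$1 edited_docx=$2 tmp
    tmp=$(mktemp -d) || return 1

    # Step 1: round-trip the original so both sides of the diff carry
    # the same conversion artifacts.
    pandoc "$orig_md" -o "$tmp/A.docx" || return 1
    pandoc "$tmp/A.docx" -t markdown -o "$tmp/A2.md" || return 1

    # Step 3: convert the edited Word file back to markdown.
    pandoc "$edited_docx" -t markdown -o "$tmp/B.md" || return 1

    # Step 4: old file first, new file second. Exit status 1 only means
    # "files differ"; anything higher is a real error.
    diff -u "$tmp/A2.md" "$tmp/B.md" > "$tmp/edits.patch" || [ $? -eq 1 ] || return 1

    # Step 5: keep a backup, then patch the original in place.
    cp "$orig_md" "$orig_md.bak" || return 1
    patch "$orig_md" < "$tmp/edits.patch"
}
```

Usage would be something like `transfer_edits A.md B.docx`. If patch cannot place a hunk, it returns a non-zero status and leaves the rejected hunks in a .rej file next to the original, which you would then have to merge by hand.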