I want to run this command for all files in a directory.
tesseract /home/kong/Documents/input/248.jpg stdout --psm 1 --oem 1 --dpi 300 tsv >/home/kong/Documents/input/ocr_output/input/248.tsv
The input and output files should share the same number, e.g. 248.jpg and 248.tsv. I tried writing a Python script, but it is causing delimiter issues.
Can someone help me with this? I am a bash newbie.
This is the Python script I wrote:
comm = shlex.split(command)
out_dir = '/home/kong/Documents/input/ocr_output/input'
for file in tqdm(files):
    base_name = os.path.basename(file)
    number = base_name.split('.')[0]
    out_path = '>' + out_dir + '/' + number + '.tsv'
    comm[1] = file
    comm[-1] = out_path
    # tsv = number + '.tsv'
    with open(out_path, 'w') as f:
        subprocess.run(comm, shell=True, stdout=f)
Try this:
Just as an alternative, you can use this script with Python 3.5 or higher.
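A minimal sketch of such a script, not a definitive implementation. It assumes that `tesseract` is on the PATH, that only `*.jpg` files should be processed, and that the directories are the ones from the question. The helper names `tsv_path_for`, `build_command` and `ocr_directory` are made up for illustration. The main difference from the script in the question is that the redirect is done by passing an open file as `stdout` rather than by putting `>` into the argument list, and that `shell=True` is dropped, because with a list and `shell=True` on POSIX only the first element is run as the command.

```python
import subprocess
from pathlib import Path


def tsv_path_for(image: Path, out_dir: Path) -> Path:
    """Map e.g. 248.jpg to <out_dir>/248.tsv (same stem, new suffix)."""
    return out_dir / (image.stem + ".tsv")


def build_command(image: Path) -> list:
    """Argument list for one tesseract call; no shell, so no '>' here."""
    return ["tesseract", str(image), "stdout",
            "--psm", "1", "--oem", "1", "--dpi", "300", "tsv"]


def ocr_directory(in_dir: Path, out_dir: Path) -> list:
    """Run tesseract on every .jpg in in_dir; return the TSV paths written."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for image in sorted(in_dir.glob("*.jpg")):
        target = tsv_path_for(image, out_dir)
        # Redirect stdout by handing subprocess an open file object.
        # str() keeps open() compatible with Python 3.5.
        with open(str(target), "w") as f:
            subprocess.run(build_command(image), stdout=f, check=True)
        written.append(target)
    return written


# Example call (paths taken from the question):
# ocr_directory(Path("/home/kong/Documents/input"),
#               Path("/home/kong/Documents/input/ocr_output/input"))
```

If you want the progress bar from the question, wrapping the `sorted(...)` call in `tqdm(...)` should work the same way as in your original loop.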