Python, Comparing Two Files
Suppose I have two (huge) files. One contains a list of words. Another contains a list of words followed by some numbers; i.e., the format is like this: file 1: word1 word2 ..
Solution 1:
This will only work if both files are in the same order and the words in file 1 are strictly a subset of the words in file 2:
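def gen_overlap(file1, file2):
    # file1 and file2 are open file objects listing words in the same order.
    for word in file1:
        word = word.strip()
        if not word:
            continue
        # Advance through file2 until the line that starts with this word.
        for line in file2:
            fields = line.split()
            if fields and fields[0] == word:
                yield line
                break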
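To see why the ordering condition matters, here is a small sketch that runs a single-pass scan over in-memory sample data. The name `stream_overlap` and the sample words are made up for illustration; `io.StringIO` stands in for real files.

```python
import io

def stream_overlap(words, records):
    """Single forward pass: assumes both inputs list words in the same order."""
    for word in words:
        word = word.strip()
        if not word:
            continue
        # Keep reading records from where the previous search stopped.
        for line in records:
            fields = line.split()
            if fields and fields[0] == word:
                yield line
                break

records_text = "apple 1\nbanana 2\ncherry 3\n"

# Same order in both inputs: every word is found.
in_order = list(stream_overlap(io.StringIO("apple\ncherry\n"),
                               io.StringIO(records_text)))

# Different order: the search for "cherry" reads past "apple", so "apple" is lost.
out_of_order = list(stream_overlap(io.StringIO("cherry\napple\n"),
                                   io.StringIO(records_text)))
```

The second call returns only the `cherry` line, because the records are consumed once and never rewound.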
If the files fail either of those conditions, the best method is to build a set of all of the words in file 1 and check each line of file 2 against it:
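def gen_overlap(file1, file2):
    # Store each word as a string; line.split() would give an unhashable list.
    word_set = set(line.strip() for line in file1)
    for line in file2:
        fields = line.split()
        # Skip blank lines and compare only the leading word.
        if fields and fields[0] in word_set:
            yield line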
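A set-based generator like this takes open file objects, so the caller still has to open the files. The following is a minimal sketch of one way to wrap that up, assuming plain-text files with one entry per line; the function name `overlap_from_paths`, the file names, and the sample data are all hypothetical.

```python
import os
import tempfile

def overlap_from_paths(words_path, records_path):
    """Yield lines of records_path whose first token appears in words_path."""
    with open(words_path) as words_file:
        # Only the word list is held in memory.
        word_set = {line.strip() for line in words_file if line.strip()}
    with open(records_path) as records_file:
        for line in records_file:  # streamed one line at a time
            fields = line.split()
            if fields and fields[0] in word_set:
                yield line

# Small demonstration with temporary files (sample data is made up).
with tempfile.TemporaryDirectory() as tmp:
    words_path = os.path.join(tmp, "words.txt")
    records_path = os.path.join(tmp, "records.txt")
    with open(words_path, "w") as f:
        f.write("cherry\napple\n")
    with open(records_path, "w") as f:
        f.write("apple 1 2\nbanana 3 4\ncherry 5 6\n")
    matches = list(overlap_from_paths(words_path, records_path))
```

Because the second file is streamed, memory use depends on the size of the word list, not on the size of the file with numbers. The word order in the first file does not matter here.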
Solution 2:
This version reads both files entirely into memory, so it only suits files that fit in RAM:
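def file_comp(a_file, b_file):
    with open(a_file, 'r') as file1, open(b_file, 'r') as file2:
        # Build the word set once instead of re-splitting file 1 for every line.
        words = set(file1.read().split())
        lines = file2.read().splitlines()
    # Keep each non-blank line of file 2 whose first word is in file 1.
    return [i for i in lines if i.split() and i.split()[0] in words]

print(file_comp('file_1.txt', 'file_2.txt'))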