Python Find String in File
File operations are a common task for any developer. One task you might encounter is searching for a specific string or pattern in a file. This can be useful when working with log files, quick replacements, etc.
In this tutorial, we will learn how to use various methods and techniques to search for a specific string or pattern in a given file. We will also cover the pros and cons of each method to see which is most suitable for which use case.
Method 1 - Load File into Memory
The simplest method is to read the entire target file into memory and then search the resulting string.
This method works well if the file is small and contains plain text rather than binary or specially encoded data. In that case, it is the easiest approach to implement and use.
For example, suppose we have a text file containing servers and IP addresses as shown:
# Servers and IPs
## Database Servers
jHs8d2b: 192.168.1.10
k92mDh1: 192.168.1.11
## Web Servers
qMw8pT0: 192.168.2.20
bVz4gR5: 192.168.2.21
uYs3aH6: 192.168.2.22
## Cache Servers
fLr0vE2: 192.168.3.30
In this case, we have the server name and the corresponding IP address.
We can read the file into memory and search for the string bVz4gR5 as shown:
filename = 'servers.txt'
string_to_search = 'bVz4gR5'
with open(filename, 'r') as file:
    contents = file.read()
if string_to_search in contents:
    print(f"'{string_to_search}' found in {filename}")
else:
    print(f"'{string_to_search}' not found in {filename}")
In the code above, we define the filename we wish to search. We also define the target string or pattern that we are looking for.
The next step reads the file’s contents into memory and uses Python's in operator to check whether the search string occurs anywhere in those contents. The in operator only reports whether a match exists, not where it is.
Finally, the if...else block prints the corresponding result.
Output:
'bVz4gR5' found in servers.txt
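If you also need to know where the match is, str.find returns the character offset of the first occurrence, or -1 when there is none. The sketch below turns that offset into a line and column number. The helper name find_position and the sample string are assumptions made for illustration, not part of the standard library or the original file.

```python
def find_position(contents, string_to_search):
    """Return (line_number, column) of the first match, or None if absent.

    Both numbers are 1-based. find_position is an illustrative helper,
    not a standard library function.
    """
    offset = contents.find(string_to_search)  # -1 when there is no match
    if offset == -1:
        return None
    # The number of newlines before the match gives the line number.
    line_number = contents.count('\n', 0, offset) + 1
    # The column is the distance from the start of that line.
    line_start = contents.rfind('\n', 0, offset) + 1
    return line_number, offset - line_start + 1


# A shortened stand-in for the contents of servers.txt.
sample = "## Web Servers\nqMw8pT0: 192.168.2.20\nbVz4gR5: 192.168.2.21\n"
print(find_position(sample, 'bVz4gR5'))  # (3, 1)
```

With a real file, you would pass the result of file.read() as the first argument.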
Method 2 - Read File Line By Line
For a more memory-efficient method, we can read the target file line by line and search for the matching string in each line. Only one line is held in memory at a time, so this scales better than the previous example when working with larger files.
An example is shown below:
filename = 'servers.txt'
string_to_search = 'bVz4gR5'
found = False  # Stays False if no line contains the string
with open(filename, 'r') as file:
    for line in file:
        if string_to_search in line:
            found = True
            break  # Stop reading as soon as a match is found
if found:
    print(f"'{string_to_search}' found in {filename}")
else:
    print(f"'{string_to_search}' not found in {filename}")
Output:
'bVz4gR5' found in servers.txt
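Reading line by line also makes it easy to report which lines matched. The following is a minimal sketch of that idea using enumerate. The generator name matching_lines and the sample list are hypothetical and stand in for an open file object.

```python
def matching_lines(lines, string_to_search):
    """Yield (line_number, line) for every line containing the string.

    `lines` can be any iterable of strings, including an open text file.
    """
    for line_number, line in enumerate(lines, start=1):
        if string_to_search in line:
            # Drop the trailing newline so the line prints cleanly.
            yield line_number, line.rstrip('\n')


# A shortened stand-in for the lines of servers.txt.
sample_lines = [
    "## Web Servers\n",
    "qMw8pT0: 192.168.2.20\n",
    "bVz4gR5: 192.168.2.21\n",
]
for line_number, line in matching_lines(sample_lines, 'bVz4gR5'):
    print(f"{line_number}: {line}")  # 3: bVz4gR5: 192.168.2.21
```

Because the function is a generator, it still processes one line at a time, and you can pass it the file object from a with open(...) block directly.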
Method 3 - Using Regular Expressions
Regular expressions are one of the most powerful tools for pattern matching. We can use Python's re module to build patterns that match the target text in a file.
This may be overkill for a simple search task, but once you master it, it can handle almost any search scenario.
An example is as shown:
import re
filename = 'servers.txt'
pattern_to_search = r'bVz4gR5.*:'  # Search for 'bVz4gR5' followed by ':' in the same line.
with open(filename, 'r') as file:
    contents = file.read()
if re.search(pattern_to_search, contents):
    print(f"Pattern '{pattern_to_search}' found in {filename}")
else:
    print(f"Pattern '{pattern_to_search}' not found in {filename}")
Output:
Pattern 'bVz4gR5.*:' found in servers.txt
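Regular expressions become more useful when you want to extract part of the match rather than just detect it. The sketch below uses capture groups to pull out the IP address for a given server name. The pattern SERVER_LINE and the function lookup_ip are assumptions written for the sample file's "name: address" layout, and the pattern does not check that each number is in the 0 to 255 range.

```python
import re

# Hypothetical pattern for the "name: address" lines in the sample file.
# Group 1 captures the server name, group 2 a dotted four-part address.
SERVER_LINE = re.compile(r'^(\w+):\s*(\d{1,3}(?:\.\d{1,3}){3})\s*$')


def lookup_ip(lines, server_name):
    """Return the address listed for server_name, or None if not found."""
    for line in lines:
        match = SERVER_LINE.match(line)
        # Heading lines such as "## Web Servers" do not match and are skipped.
        if match and match.group(1) == server_name:
            return match.group(2)
    return None


# A shortened stand-in for the lines of servers.txt.
sample_lines = [
    "## Web Servers\n",
    "qMw8pT0: 192.168.2.20\n",
    "bVz4gR5: 192.168.2.21\n",
]
print(lookup_ip(sample_lines, 'bVz4gR5'))  # 192.168.2.21
```

Compiling the pattern once with re.compile avoids repeating the pattern string on every line and keeps the loop easy to read.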
Method 4 - Using MMAP Module
Python also provides the mmap module, which allows us to memory-map a file.
Memory mapping refers to the process of creating a direct byte-for-byte correspondence between a region in memory and a file or a section of a file.
Memory mapping can be fast and memory-efficient, especially with large files, because the operating system loads parts of the file on demand instead of reading everything up front.
An example usage is as shown below:
import mmap
filename = 'servers.txt'
string_to_search = 'bVz4gR5'
# Open in binary mode, since mmap works with bytes
with open(filename, 'rb') as file:
    # Memory-map the file; the with block closes the mapping when done
    with mmap.mmap(file.fileno(), 0, access=mmap.ACCESS_READ) as mmapped_file:
        # Search for the string, encoded to bytes
        found = mmapped_file.find(string_to_search.encode()) != -1
if found:
    print(f"'{string_to_search}' found in {filename}")
else:
    print(f"'{string_to_search}' not found in {filename}")
Output:
'bVz4gR5' found in servers.txt
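Since find on a memory-mapped file returns a byte offset, you can use that offset to pull out the whole matching line without reading the rest of the file. The following is a rough sketch under a few assumptions: the helper name find_line_mmap is hypothetical, the file is assumed to be non-empty (mmap raises ValueError for an empty file), and the text is assumed to be UTF-8 unless another encoding is passed.

```python
import mmap


def find_line_mmap(filename, string_to_search, encoding='utf-8'):
    """Return the first line containing the string, or None if absent.

    Assumes a non-empty file; mmap cannot map a file of length zero.
    """
    needle = string_to_search.encode(encoding)
    with open(filename, 'rb') as file:
        with mmap.mmap(file.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            offset = mm.find(needle)  # Byte offset, or -1 if not found
            if offset == -1:
                return None
            # Walk back to the previous newline to find the line start.
            line_start = mm.rfind(b'\n', 0, offset) + 1
            # Walk forward to the next newline to find the line end.
            line_end = mm.find(b'\n', offset)
            if line_end == -1:
                line_end = len(mm)  # Match is on the last line
            # Slicing copies the bytes, so the result outlives the mapping.
            return mm[line_start:line_end].decode(encoding).rstrip('\r')
```

For the sample file above, find_line_mmap('servers.txt', 'bVz4gR5') should return the line 'bVz4gR5: 192.168.2.21'.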
And there you have it.
Conclusion
In this tutorial, we learned how to use various Python methods and tools to search for a specific string or pattern in a given file. We also covered the pros and cons of each method so that you can quickly choose the one that works best for you.
See you in the next one!