Python Convert String to Binary
Strings are a fundamental building block in Python and other programming languages. In Python, a string is an immutable sequence of Unicode characters. We write a string literal by enclosing it in single (''), double (""), or triple (''' or """) quotes.
Binary, conversely, is a base-2 number system that consists of 0’s and 1’s. Binary is an essential data format as it’s what computers understand.
When working with Python programs, you might encounter instances where you need to convert a string type into its binary representation.
In this tutorial, we will walk through Python built-in functions and techniques for converting a string into its binary representation.
Method 1 - Using bytearray + bin
The first method is to convert the input string into a bytearray object. The bytearray data type is a built-in Python type that represents a mutable sequence of bytes, each in the range 0-255.
To convert a string into a bytearray object, we call the bytearray() constructor as shown:
bytearray(string, encoding)
This should return the string as a sequence of bytes.
For example:
>>> string = "Hello, world!"
>>> print(bytearray(string, 'utf-8'))
bytearray(b'Hello, world!')
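To make the "mutable sequence of bytes" description more concrete, here is a minimal sketch. The variable names (text, data, byte_values, modified) are illustrative and not part of the original example.

```python
# Encode a short string; each element of the bytearray is an integer from 0 to 255.
text = "Hi!"
data = bytearray(text, "utf-8")

byte_values = list(data)   # the integer value of every byte

# bytearray is mutable (unlike str or bytes), so a byte can be replaced in place.
data[0] = ord("h")
modified = data.decode("utf-8")

print(byte_values)  # [72, 105, 33]
print(modified)     # hi!
```

Iterating over a bytearray yields plain integers, which is why each byte can be passed straight to bin() in the next step.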
Once we have converted the string into bytes, we can use a for loop to iterate over each byte and call the bin() function on it to get its binary representation.
Finally, we append each resulting binary string to a list, as shown below:
>>> string = "Hello, world!"
>>> byte_array = bytearray(string, 'utf-8')
>>> bin_list = []
>>> for byte in byte_array:
...     bin_rep = bin(byte)
...     bin_list.append(bin_rep)
...
>>> print(bin_list)
Output:
['0b1001000', '0b1100101', '0b1101100', '0b1101100', '0b1101111', '0b101100', '0b100000', '0b1110111', '0b1101111', '0b1110010', '0b1101100', '0b1100100', '0b100001']
As you can see from the output above, we convert the string Hello, world! into its binary representation and store the result as a list.
How to Remove the 0b Prefix
As you can see in the example above, each resulting value carries the prefix 0b, which denotes that the number is written in binary rather than decimal.
However, since we already know this, the prefix is unnecessary and makes the output harder to read. We can remove it by slicing each binary string starting at index 2.
We can also use the join() method to combine the binary representations into a single string.
>>> bin_list = []
>>> for byte in byte_array:
...     bin_rep = bin(byte)
...     bin_list.append(bin_rep[2:])
...
>>> print(' '.join(bin_list))
Output:
1001000 1100101 1101100 1101100 1101111 101100 100000 1110111 1101111 1110010 1101100 1100100 100001
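Note that bin() drops leading zeros, so the groups above have different widths (for example, 101100 for the comma). If fixed-width 8-bit groups are preferred, one possible approach is to pad each value with zfill(). The helper name to_padded_bits below is hypothetical and only serves as an illustration.

```python
def to_padded_bits(text, encoding="utf-8"):
    """Return the bytes of `text` as space-separated 8-digit binary groups."""
    groups = []
    for byte in bytearray(text, encoding):
        # bin() yields e.g. '0b101100'; drop the prefix, then left-pad to 8 digits.
        groups.append(bin(byte)[2:].zfill(8))
    return " ".join(groups)


padded = to_padded_bits("A,")
print(padded)  # 01000001 00101100
```

Fixed-width groups are easier to compare by eye and can be split back into bytes even if the spaces are removed.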
Method 2 - Using the Format and Bytearray
We can also use Python’s bytearray() constructor to convert a given string into a sequence of bytes, where each character of the string is represented by one or more bytes.
Next, we can call the format(x, 'b') function on each byte to convert it to its binary representation.
An example is as shown below:
>>> string = "GeekBits"
>>> result = ' '.join(format(x, 'b') for x in bytearray(string, 'utf-8'))
>>> print(result)
In the code above, we use bytearray(string, 'utf-8') to convert the string into a sequence of bytes.
Next, we use the generator expression format(x, 'b') for x in bytearray(...) to convert each byte into its binary form.
Output:
1000111 1100101 1100101 1101011 1000010 1101001 1110100 1110011
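The format specification also accepts a width, so '08b' pads each byte to eight digits. The sketch below uses that to show a round trip from text to bits and back. The helper names encode_bits and decode_bits are hypothetical, and the sketch assumes the bit groups are separated by whitespace.

```python
def encode_bits(text):
    """Encode text as UTF-8 and render each byte as an 8-digit binary group."""
    return " ".join(format(byte, "08b") for byte in text.encode("utf-8"))


def decode_bits(bits):
    """Reverse encode_bits(): parse each group as base 2, then decode the bytes."""
    return bytes(int(group, 2) for group in bits.split()).decode("utf-8")


bits = encode_bits("GeekBits")
print(bits)
print(decode_bits(bits))  # GeekBits
```

Being able to decode the output again is a quick way to check that no information was lost in the conversion.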
Method 3 - Using Python Ord and Format Methods
If you are unfamiliar, Python provides the ord() function, which returns the Unicode code point of an input character.
We can use this function instead of bytearray() to convert the characters of the input string into their Unicode code points.
Finally, we can use the format() function to convert them into binary and join() to combine them into a single string, as we did in the previous example.
>>> string = "GeekBits"
>>> result = ' '.join(format(ord(x), 'b') for x in string)
>>> print(result)
Output:
1000111 1100101 1100101 1101011 1000010 1101001 1110100 1110011
In this case, the ord() function receives each character as we loop over the string and converts it into its Unicode code point.
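For ASCII text such as "GeekBits", ord() and UTF-8 encoding give the same values. For other characters they differ, because ord() works on code points while bytearray() works on encoded bytes. A small sketch with an illustrative single-character string:

```python
text = "\u00e9"  # the single character 'é' (U+00E9)

# ord() works on code points: one value per character.
code_point_bits = " ".join(format(ord(ch), "b") for ch in text)

# Encoding to UTF-8 first works on bytes: this character becomes two bytes.
utf8_bits = " ".join(format(b, "b") for b in text.encode("utf-8"))

print(code_point_bits)  # 11101001
print(utf8_bits)        # 11000011 10101001
```

Which form is appropriate depends on whether the binary output should describe the characters themselves or the bytes that would be stored or transmitted.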
Method 4 - Using Python Bin, Map, and Bytearray() Methods
Another technique we can use to convert a string into binary combines three built-ins: bin(), map(), and bytearray().
How it works
We can use the map() function to apply bin() to each byte produced by bytearray(). This gives us the binary equivalent of each byte.
Finally, we can pass the iterator returned by map() to the list() constructor to convert it into a list. We can then use a list comprehension to strip the prefix and build a binary string from the result.
An example is as shown:
>>> string = "GeekBits"
>>> result = ' '.join([x[2:] for x in list(map(bin, bytearray(string, 'utf-8')))])
>>> print(result)
In the example above, we start by converting the string into a sequence of bytes using bytearray(string, 'utf-8').
Next, we convert each byte into its binary string using map(bin, ...).
We also strip the 0b prefix from each binary string using a list comprehension that slices from index 2: [x[2:] for x in ...].
Lastly, we join the binary strings with spaces using ' '.join(...) and print the result.
Output:
1000111 1100101 1100101 1101011 1000010 1101001 1110100 1110011
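Since map() already returns an iterator, the intermediate list() call is optional. The sketch below compares the two forms; the variable names lazy_result and list_result are illustrative.

```python
text = "GeekBits"

# map() returns a lazy iterator, so it can feed the generator expression directly.
lazy_result = " ".join(b[2:] for b in map(bin, bytearray(text, "utf-8")))

# The same pipeline with the intermediate list kept.
list_result = " ".join([x[2:] for x in list(map(bin, bytearray(text, "utf-8")))])

print(lazy_result == list_result)  # True
```

Both forms produce the same string. The lazy version simply avoids building a temporary list, which may matter for very long inputs.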
Method 5 - Using the Bitarray Library
We can also use the third-party bitarray library to convert a string into its binary representation.
Start by installing the bitarray library using pip:
pip install bitarray
Next, import the library and use it to convert a string into binary as shown:
>>> from bitarray import bitarray
>>> string = "GeekBits"
>>> a = bitarray()
>>> a.frombytes(string.encode('utf-8'))
>>> print(a.to01())
Output:
0100011101100101011001010110101101000010011010010111010001110011
You can learn more about the bitarray library at the link below:
https://pypi.org/project/bitarray
Method 6 - Using BinAscii
We can also use the binascii module in Python, which contains several functions for converting between binary data and various ASCII-encoded binary representations.
One of those functions is hexlify(), which converts binary data into its hexadecimal representation. We can then pass that result to int() with a base of 16 to convert it into an integer object.
Lastly, we can convert the integer into binary using the bin() function.
>>> import binascii
>>> string = "GeekBits"
>>> b = bytes(string, 'utf-8')
>>> result = bin(int(binascii.hexlify(b), 16))
>>> print(result[2:])
Output:
100011101100101011001010110101101000010011010010111010001110011
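Note that the output above has 63 digits rather than 64, because bin() drops the leading zero of the first byte. The sketch below shows one way to restore it, along with int.from_bytes() as an alternative route to the same integer. The variable names are illustrative.

```python
import binascii

data = "GeekBits".encode("utf-8")

# Route 1: hexlify, then parse the hex digits as one large integer.
via_hex = bin(int(binascii.hexlify(data), 16))[2:]

# Route 2: int.from_bytes() builds the same integer without the hex step.
via_int = bin(int.from_bytes(data, "big"))[2:]

# bin() drops leading zeros, so pad back to 8 digits per byte if needed.
padded = via_int.zfill(len(data) * 8)

print(via_hex == via_int)         # True
print(len(via_hex), len(padded))  # 63 64
```

Padding matters if the bit string will later be split back into bytes, since each byte is expected to occupy exactly eight digits.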
Conclusion
In this tutorial, you learned various powerful methods and techniques that you can use to convert a string into its binary representation in Python. You can try each of the methods and see which suits you best.
If you enjoyed our tutorials, be sure to subscribe and share.