Python 3.7 introduced a new feature: `dataclasses`.

For reference, a class is basically a blueprint for creating objects. An example of a class could be a country: we would use the `Country` class to create various instances, such as Monaco and Gambia.

When initializing values, the properties supplied to the constructor (like population, languages, and so on) are copied into each object instance:

    class Country:
        def __init__(self, name: str, population: int, continent: str, official_lang: str):
            self.name = name
            self.population = population
            self.continent = continent
            self.official_lang = official_lang

    # The constructor takes four arguments, so each call supplies all of them
    smallestEurope = Country("Monaco", 37623, "Europe", "French")
    smallestAsia = Country("Maldives", 552595, "Asia", "Dhivehi")
    smallestAfrica = Country("Gambia", 2521126, "Africa", "English")

If you have ever worked with object-oriented programming (OOP) in languages like Java and Python, then you should already be familiar with classes.

A `dataclass`, however, comes with the basic class functionality already implemented, decreasing the time spent writing code.

In this article, we'll delve further into what `dataclasses` in Python are, how to manipulate object fields, how to sort and compare `dataclasses`, and more.

Note that because this feature was released in Python 3.7, you must have Python 3.7 or later installed on your local machine to use it.

What is a Python dataclass?
As mentioned previously, Python `dataclasses` are very similar to normal classes, but they come with common class functionality already implemented, which significantly decreases the amount of boilerplate code you have to write.

An example of such boilerplate is the `__init__` method.

In the `Country` class example, we had to manually define the `__init__` method, which gets called when you initialize the class. A normal class that stores values on its instances needs a method like this, which means writing a lot of repetitive code.

The Python `dataclass` comes with this method already defined. So, you can write the same `Country` class without manually defining a constructor.

Under the hood, the `@dataclass` decorator generates this method from the declared fields, and Python calls it when you initialize the object with new properties.

Note that `__init__` is not the only method provided. `__repr__` (representation) and `__eq__` (equal to) are also generated by default, while ordering methods like `__lt__` (less than) and `__gt__` (greater than) are generated when you pass `order=True` to the decorator.

Using the normal Python class
When working with a normal class in Python, we need longer code to implement the base methods.

Consider the `Country` class again. In the code block below, you can see a couple of methods, starting with the `__init__` method. This method initializes attributes like the country name, population count, continent, and official language on a `Country` instance.

`__repr__` returns the string representation of a class instance. This prints the attributes of each class instance in string form.

`__lt__` compares the population of two `Country` instances and returns `True` if the present instance has a smaller population, while `__eq__` returns `True` if they both have the same population count:

    class Country:
        def __init__(self, name: str, population: int, continent: str, official_lang: str = "English"):
            self.name = name
            self.population = population
            self.continent = continent
            self.official_lang = official_lang

        def __repr__(self):
            # !r wraps the string attributes in quotes, matching the output shown below
            return (
                f"Country(name={self.name!r}, population={self.population}, "
                f"continent={self.continent!r}, official_lang={self.official_lang!r})"
            )

        def __lt__(self, other):
            return self.population < other.population

        def __eq__(self, other):
            return self.population == other.population

    smallestAfrica = Country("Gambia", 2521126, "Africa", "English")
    smallestEurope = Country("Monaco", 37623, "Europe", "French")
    smallestAsia1 = Country("Maldives", 552595, "Asia", "Dhivehi")
    smallestAsia2 = Country("Maldives", 552595, "Asia", "Dhivehi")

    print(smallestAfrica)
    # Country(name='Gambia', population=2521126, continent='Africa', official_lang='English')
    print(smallestAsia1 < smallestAfrica)  # True
    # No __gt__ is defined, so Python falls back to the reflected __lt__ call
    print(smallestAsia1 > smallestAfrica)  # False

Using the Python dataclass
To use Python's `dataclass` in your code, import the `dataclass` decorator from the `dataclasses` module and apply it on top of the class. This injects the base class functionality into our class automatically.

In the following example, we'll create the same `Country` class, but with far less code:

    from dataclasses import dataclass

    @dataclass(order=True)
    class Country:
        name: str
        population: int
        continent: str
        official_lang: str

    smallestAfrica = Country("Gambia", 2521126, "Africa", "English")
    smallestEurope = Country("Monaco", 37623, "Europe", "French")
    smallestAsia1 = Country("Maldives", 552595, "Asia", "Dhivehi")
    smallestAsia2 = Country("Maldives", 552595, "Asia", "Dhivehi")

    print(smallestAfrica)
    # Country(name='Gambia', population=2521126, continent='Africa', official_lang='English')
    print(smallestAsia1 == smallestAsia2)  # True
    print(smallestAsia1 < smallestAfrica)  # False

Observe that we didn't define a constructor method on the `dataclass`; we just defined the fields.

We also omitted helpers like `__repr__` and `__eq__`. Despite the omission of these methods, the class still runs normally.

Note that for less than (`<`), `dataclass` uses the default method for comparing objects: the fields are compared in the order they are declared, as if each object were a tuple, so the comparison above is decided by `name` rather than `population`. Later on in this article, we will learn how to customize object comparison for better results.

Manipulating object fields using the field() function
The `dataclasses` module also provides a function called `field()`. This function gives you fine-grained control over the class fields, allowing you to customize them as you wish.

For example, we can exclude the `continent` field when calling the representation method by passing `field()` a `repr` parameter and setting the value to `False`:

    from dataclasses import dataclass, field

    @dataclass
    class Country:
        name: str
        population: int
        continent: str = field(repr=False)  # omits the field from __repr__
        official_lang: str

    smallestEurope = Country("Monaco", 37623, "Europe", "French")
    print(smallestEurope)

This code then outputs in the CLI:

    Country(name='Monaco', population=37623, official_lang='French')

By default, `repr` is always set to `True`.

Here are some other parameters that can be taken in by `field()`.

init parameter
The `init` parameter specifies whether an attribute should be included as an argument to the constructor during initialization. If you set a field to `init=False`, then you must omit the attribute during initialization. Otherwise, a `TypeError` will be raised:

    from dataclasses import dataclass, field

    @dataclass
    class Country:
        name: str
        population: int
        continent: str
        official_lang: str = field(init=False)  # Do not pass this attribute to the constructor

    smallestEurope = Country("Monaco", 37623, "Europe", "English")  # But you did, so error!
    print(smallestEurope)

Running this code raises a `TypeError`, because the generated constructor accepts only `name`, `population`, and `continent`.
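As a minimal sketch of how an `init=False` field can be handled, the snippet below catches the error and then builds the object correctly. Assigning `official_lang` after construction is only one option chosen for illustration; the variable names are assumptions, not part of the original example.

```python
from dataclasses import dataclass, field


@dataclass
class Country:
    name: str
    population: int
    continent: str
    official_lang: str = field(init=False)  # not a constructor argument


# Passing a fourth positional argument is rejected by the generated __init__.
try:
    Country("Monaco", 37623, "Europe", "French")
except TypeError as exc:
    error_message = str(exc)
    print("TypeError:", error_message)

# Omitting the field works; the attribute can be assigned afterwards.
monaco = Country("Monaco", 37623, "Europe")
monaco.official_lang = "French"
print(monaco)
# Country(name='Monaco', population=37623, continent='Europe', official_lang='French')
```

Until `official_lang` is assigned, the attribute does not exist on the instance, so printing the object before that assignment would fail with an `AttributeError`.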
default parameter

The `default` parameter specifies a default value for a field in case a value is not provided during initialization:

    from dataclasses import dataclass, field

    @dataclass
    class Country:
        name: str
        population: int
        continent: str
        official_lang: str = field(default="English")  # If you omit the value, English will be used

    smallestEurope = Country("Monaco", 37623, "Europe")  # Omitted, so English is used
    print(smallestEurope)

This code then outputs in the CLI:

    Country(name='Monaco', population=37623, continent='Europe', official_lang='English')

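The sketch below shows two related behaviors: an explicit argument overrides the default, and a field without a default cannot be declared after one that has a default. `BadCountry` is a hypothetical class that exists only to trigger the error.

```python
from dataclasses import dataclass, field


@dataclass
class Country:
    name: str
    population: int
    continent: str
    official_lang: str = field(default="English")


gambia = Country("Gambia", 2521126, "Africa")          # default used
monaco = Country("Monaco", 37623, "Europe", "French")  # default overridden
print(gambia.official_lang)  # English
print(monaco.official_lang)  # French

# A field without a default cannot follow one that has a default;
# the decorator raises TypeError while the class is being created.
try:
    @dataclass
    class BadCountry:  # hypothetical class, only here to trigger the error
        official_lang: str = field(default="English")
        name: str
except TypeError as exc:
    ordering_error = str(exc)
    print("TypeError:", ordering_error)
```

This ordering rule is why fields with default values are usually declared last.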
repr parameter

The `repr` parameter specifies whether the field should be included (`repr=True`) or excluded (`repr=False`) from the string representation, as generated by the `__repr__` method:

    from dataclasses import dataclass, field

    @dataclass
    class Country:
        name: str
        population: int
        continent: str
        official_lang: str = field(repr=False)  # This field will be excluded from the string representation

    smallestEurope = Country("Monaco", 37623, "Europe", "French")
    print(smallestEurope)

This code then outputs in the CLI:

    Country(name='Monaco', population=37623, continent='Europe')

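One detail worth checking is that `repr=False` only hides the field from the printed form. The sketch below, with two illustrative instances, suggests the value is still stored on the object and still takes part in equality checks.

```python
from dataclasses import dataclass, field


@dataclass
class Country:
    name: str
    population: int
    continent: str
    official_lang: str = field(repr=False)  # hidden from __repr__ only


monaco_fr = Country("Monaco", 37623, "Europe", "French")
monaco_en = Country("Monaco", 37623, "Europe", "English")

print(monaco_fr)                # Country(name='Monaco', population=37623, continent='Europe')
print(monaco_fr.official_lang)  # French
print(monaco_fr == monaco_en)   # False, official_lang is still compared
```

Two objects can therefore print identically and still compare as unequal.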
Modifying fields after initialization with __post_init__
The `__post_init__` method is called just after initialization. In other words, it is called after the object receives values for its fields, such as `name`, `continent`, `population`, and `official_lang`.

For example, we will use the method to determine if we are going to migrate to a country or not, based on the country's official language:

    from dataclasses import dataclass, field

    @dataclass
    class Country:
        name: str
        population: int
        continent: str = field(repr=False)  # Excludes the continent field from the string representation
        will_migrate: bool = field(init=False)  # Initialize without the will_migrate attribute
        official_lang: str = field(default="English")  # Sets default language. Attributes with default values must appear last

        def __post_init__(self):
            # Use assignment (=) here; a comparison (==) would leave will_migrate unset
            if self.official_lang == "English":
                self.will_migrate = True
            else:
                self.will_migrate = False

After the object initializes with values, we perform a check from inside `__post_init__` to see if the `official_lang` field is set to `English`. If so, we set the `will_migrate` property to `True`. Otherwise, we set it to `False`.
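As a usage sketch of the check described above, the snippet below builds two illustrative instances and prints them. The condensed assignment in `__post_init__` is a stylistic choice made here, not the only way to write it.

```python
from dataclasses import dataclass, field


@dataclass
class Country:
    name: str
    population: int
    continent: str = field(repr=False)
    will_migrate: bool = field(init=False)
    official_lang: str = field(default="English")

    def __post_init__(self):
        # Runs right after the generated __init__ has assigned the other fields.
        self.will_migrate = self.official_lang == "English"


gambia = Country("Gambia", 2521126, "Africa")
monaco = Country("Monaco", 37623, "Europe", "French")
print(gambia)
# Country(name='Gambia', population=2521126, will_migrate=True, official_lang='English')
print(monaco)
# Country(name='Monaco', population=37623, will_migrate=False, official_lang='French')
```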
Sort and compare dataclasses with sort_index
Another functionality of `dataclasses` is the ability to create a custom order for comparing objects and sorting lists of objects.

For example, we can compare two countries by their population numbers. In other words, we want to say that one country is greater than another country if, and only if, its population count is greater than the other's:

    from dataclasses import dataclass, field

    @dataclass(order=True)
    class Country:
        sort_index: int = field(init=False)  # Declared first, so it is compared first
        name: str
        population: int = field(repr=True)
        continent: str
        official_lang: str = field(default="English")  # Sets default value for official language

        def __post_init__(self):
            self.sort_index = self.population

    smallestEurope = Country("Monaco", 37623, "Europe")
    smallestAsia = Country("Maldives", 552595, "Asia")
    smallestAfrica = Country("Gambia", 2521126, "Africa")

    print(smallestAsia < smallestAfrica)  # True
    print(smallestAsia > smallestAfrica)  # False

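To enable comparison and sorting in a Python `dataclass`, you must pass `order=True` to `@dataclass`. This enables the default comparison functionality, which compares fields in the order they are declared.

Since we want to compare by population count, we declare `sort_index` as the first field and assign the `population` value to it after initialization, from inside the `__post_init__` method.

You can also sort a list of objects using a particular field as the `sort_index`. For example, we can sort a list of countries by their population count:

    from dataclasses import dataclass, field

    @dataclass(order=True)
    class Country:
        sort_index: int = field(init=False)
        name: str
        population: int = field(repr=True)
        continent: str
        official_lang: str = field(default="English")

        def __post_init__(self):
            self.sort_index = self.population

    europe = Country("Monaco", 37623, "Europe", "French")
    asia = Country("Maldives", 552595, "Asia", "Dhivehi")
    africa = Country("Gambia", 2521126, "Africa", "English")
    sAmerica = Country("Suriname", 539000, "South America", "Dutch")
    nAmerica = Country("St Kitts and Nevis", 55345, "North America", "English")
    oceania = Country("Nauru", 11000, "Oceania", "Nauruan")

    mylist = [europe, asia, africa, sAmerica, nAmerica, oceania]
    mylist.sort()
    print(mylist)  # Prints the countries sorted by population count, as shown below

This code then outputs the following in the CLI (printed as a single line, wrapped here for readability):

    [Country(sort_index=11000, name='Nauru', population=11000, continent='Oceania', official_lang='Nauruan'),
     Country(sort_index=37623, name='Monaco', population=37623, continent='Europe', official_lang='French'),
     Country(sort_index=55345, name='St Kitts and Nevis', population=55345, continent='North America', official_lang='English'),
     Country(sort_index=539000, name='Suriname', population=539000, continent='South America', official_lang='Dutch'),
     Country(sort_index=552595, name='Maldives', population=552595, continent='Asia', official_lang='Dhivehi'),
     Country(sort_index=2521126, name='Gambia', population=2521126, continent='Africa', official_lang='English')]
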
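With `order=True`, every field takes part in the comparison, so two countries with equal `sort_index` values would be ordered by `name` next. One possible variation, sketched below, marks the remaining fields with `compare=False` (a standard `field()` parameter) so that only `sort_index` decides the order. The instances and the `sorted(..., key=...)` alternative are illustrative additions rather than part of the original example.

```python
from dataclasses import dataclass, field


@dataclass(order=True)
class Country:
    sort_index: int = field(init=False, repr=False)
    name: str = field(compare=False)
    population: int = field(compare=False)
    continent: str = field(compare=False)
    official_lang: str = field(default="English", compare=False)

    def __post_init__(self):
        self.sort_index = self.population


suriname = Country("Suriname", 539000, "South America", "Dutch")
nauru = Country("Nauru", 11000, "Oceania", "Nauruan")
monaco = Country("Monaco", 37623, "Europe", "French")

ranked = sorted([suriname, nauru, monaco])
print([country.name for country in ranked])  # ['Nauru', 'Monaco', 'Suriname']

# Without order=True, the same ordering is available through a key function.
by_population = sorted([suriname, nauru, monaco], key=lambda c: c.population)
print(by_population == ranked)  # True
```

The trade-off is that `==` now looks only at `sort_index` as well, so two different countries with the same population would compare as equal.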
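Don't want the `dataclass` to be tampered with? You can freeze the class by passing `frozen=True` to the decorator. Because a frozen instance rejects ordinary attribute assignment, `__post_init__` has to set `sort_index` through `object.__setattr__`:

    from dataclasses import dataclass, field

    @dataclass(order=True, frozen=True)
    class Country:
        sort_index: int = field(init=False)
        name: str
        population: int = field(repr=True)
        continent: str
        official_lang: str = field(default="English")

        def __post_init__(self):
            # self.sort_index = ... would raise FrozenInstanceError on a frozen class
            object.__setattr__(self, "sort_index", self.population)
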
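To see what freezing does in practice, the sketch below tries to change a field on a frozen instance. It assumes `sort_index` is set through `object.__setattr__`, the usual workaround for initializing a field inside `__post_init__` on a frozen class; the instance and the new population value are illustrative.

```python
from dataclasses import dataclass, field, FrozenInstanceError


@dataclass(order=True, frozen=True)
class Country:
    sort_index: int = field(init=False)
    name: str
    population: int
    continent: str
    official_lang: str = "English"

    def __post_init__(self):
        # Plain assignment is blocked on a frozen instance, so bypass the
        # generated __setattr__ for this one-time initialization.
        object.__setattr__(self, "sort_index", self.population)


monaco = Country("Monaco", 37623, "Europe", "French")

try:
    monaco.population = 40000
except FrozenInstanceError:
    print("Country instances are read-only")

print(monaco.population)  # 37623
```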
Wrapping up
A Python `dataclass` is a very powerful feature that drastically reduces the amount of code in class definitions. The module provides most of the basic class methods already implemented. You can customize the fields in a `dataclass` and restrict certain actions.