Understanding Python dataclasses - LogRocket Blog

 
The Python 3.7 release saw a new feature introduced: dataclasses.
For reference, a class is basically a blueprint for creating objects. For example, we could define a Country class and use it to create various instances, such as Monaco and Gambia.
When initializing values, the properties supplied to the constructor (like population, languages, and so on) are copied into each object instance:
class Country:
    def __init__(self, name: str, population: int, continent: str, official_lang: str):
        self.name = name
        self.population = population
        self.continent = continent
        self.official_lang = official_lang

# Every constructor argument, including official_lang, must be supplied
smallestEurope = Country("Monaco", 37623, "Europe", "French")
smallestAsia = Country("Maldives", 552595, "Asia", "Dhivehi")
smallestAfrica = Country("Gambia", 2521126, "Africa", "English")

If you have ever worked with object-oriented programming (OOP) in programming languages like Java and Python, then you should already be familiar with classes.
A dataclass, however, comes with the basic class functionalities already implemented, decreasing the time spent writing code.
In this article, we’ll delve further into what dataclasses in Python are, how to manipulate object fields, how to sort and compare dataclasses, and more.
Note that because dataclasses were introduced in Python 3.7, you must have Python 3.7 or later installed on your local machine to use them.

What is a Python dataclass?

As mentioned previously, Python dataclasses are very similar to normal classes, but they come with common class functionality already implemented, which significantly decreases the amount of boilerplate code you have to write.
An example of such boilerplate is the __init__ method.
In the Country class example, you can observe that we had to manually define the __init__ method, which gets called when you initialize the class. For every normal class whose instances need their own attributes, you have to write a method like this yourself, which means a lot of repetitive code.
The Python dataclass comes with this method already defined. So, you can write the same Country class without manually defining a constructor.
Under the hood, the @dataclass decorator generates this method from the fields you declare, and Python calls it whenever you create a new instance.
Note that __init__ is not the only method provided. Utility methods like __repr__ (representation) and __eq__ (equal to) are also generated by default, while ordering methods such as __lt__ (less than) and __gt__ (greater than) are generated when you pass order=True to the decorator.
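As a minimal sketch of what the decorator generates, the following compares two small classes. Point and OrderedPoint are hypothetical names used only for illustration; they are not part of the Country example.

```python
from dataclasses import dataclass


@dataclass
class Point:  # hypothetical example class
    x: int
    y: int


@dataclass(order=True)
class OrderedPoint:  # same fields, but ordering methods are requested
    x: int
    y: int


p = Point(1, 2)          # the generated __init__ accepts the declared fields
print(p)                 # generated __repr__: Point(x=1, y=2)
print(p == Point(1, 2))  # generated __eq__: True

# Ordering methods are only generated when order=True is passed.
print("__lt__" in vars(Point))         # False
print("__lt__" in vars(OrderedPoint))  # True
print(OrderedPoint(1, 2) < OrderedPoint(1, 3))  # True
```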

Using the normal Python class

When working with a normal class in Python, we have longer code to implement the base methods.
Consider the Country class again. In the code block below, you can see a couple of methods, starting with the __init__ method. This method initializes attributes like the country name, population count, continent, and official language on a Country instance.
__repr__ returns the string representation of a class instance, listing its attributes in string form.
__lt__ compares the population of two Country instances and returns True if the present instance has the smaller population, while __eq__ returns True if they both have the same population count:

class Country:
    def __init__(self, name: str, population: int, continent: str, official_lang: str = "English"):
        self.name = name
        self.population = population
        self.continent = continent
        self.official_lang = official_lang

    def __repr__(self):
        # !r adds quotes around the string attributes
        return (
            f"Country(name={self.name!r}, population={self.population}, "
            f"continent={self.continent!r}, official_lang={self.official_lang!r})"
        )

    def __lt__(self, other):
        return self.population < other.population

    def __eq__(self, other):
        return self.population == other.population

smallestAfrica = Country("Gambia", 2521126, "Africa", "English")
smallestEurope = Country("Monaco", 37623, "Europe", "French")
smallestAsia1 = Country("Maldives", 552595, "Asia", "Dhivehi")
smallestAsia2 = Country("Maldives", 552595, "Asia", "Dhivehi")

print(smallestAfrica)
# Country(name='Gambia', population=2521126, continent='Africa', official_lang='English')
print(smallestAsia1 < smallestAfrica)  # True
print(smallestAsia1 > smallestAfrica)  # False

Using the Python dataclass

To use Python’s dataclass in your code, simply import the dataclass decorator from the dataclasses module and apply it on top of the class. This injects the base class functionalities into our class automatically.
In the following example, we’ll create the same Country class, but with far less code:
from dataclasses import dataclass

@dataclass(order=True)
class Country:
    name: str
    population: int
    continent: str
    official_lang: str

smallestAfrica = Country("Gambia", 2521126, "Africa", "English")
smallestEurope = Country("Monaco", 37623, "Europe", "French")
smallestAsia1 = Country("Maldives", 552595, "Asia", "Dhivehi")
smallestAsia2 = Country("Maldives", 552595, "Asia", "Dhivehi")

print(smallestAfrica)
# Country(name='Gambia', population=2521126, continent='Africa', official_lang='English')
print(smallestAsia1 == smallestAsia2)  # True
print(smallestAsia1 < smallestAfrica)  # False
Observe that we didn’t define a constructor method on the dataclass; we just defined the fields.
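We also omitted helpers like __repr__ and __eq__. Despite the omission of these methods, the class still runs normally.
Note that for less than (<), the dataclass uses its default comparison: with order=True, instances are compared as tuples of their fields in the order they are defined, so name is compared first here. That is why the Maldives is not considered less than Gambia even though its population is smaller. Later on in this article, we will learn how to customize object comparison for better results.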
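To illustrate the default ordering, here is a small sketch using a trimmed-down Country class (only two fields, an assumption made to keep the example short). It uses astuple() from the dataclasses module to show that the comparison matches a plain tuple comparison:

```python
from dataclasses import dataclass, astuple


@dataclass(order=True)
class Country:  # trimmed-down version of the article's class
    name: str
    population: int


maldives = Country("Maldives", 552595)
gambia = Country("Gambia", 2521126)

# order=True compares instances field by field, in definition order,
# so the name is compared before the population.
print(astuple(maldives) < astuple(gambia))  # False, because "Maldives" > "Gambia"
print(maldives < gambia)                    # False, the same result
```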

Manipulating object fields using the field() function

The dataclass module also provides a function called field(). This function gives you fine-grained control over the class fields, allowing you to manipulate and customize them as you wish.
For example, we can exclude the continent field when calling the representation method by passing it a repr parameter and setting the value to False:
from dataclasses import dataclass, field

@dataclass
class Country:
    name: str
    population: int
    continent: str = field(repr=False)  # omits the field from the representation
    official_lang: str

smallestEurope = Country("Monaco", 37623, "Europe", "French")
print(smallestEurope)
# Country(name='Monaco', population=37623, official_lang='French')
By default, repr is set to True.
Here are some other parameters that can be taken in by field().

init parameter

The init parameter specifies whether an attribute should be included as an argument to the constructor during initialization. If you set a field to init=False, then you must omit the attribute during initialization. Otherwise, a TypeError will be thrown:
from dataclasses import dataclass, field

@dataclass
class Country:
    name: str
    population: int
    continent: str
    official_lang: str = field(init=False)  # Do not pass this attribute to the constructor

smallestEurope = Country("Monaco", 37623, "Europe", "English")  # But you did, so error!
print(smallestEurope)
Running this code raises a TypeError in the CLI, because the constructor receives one more argument than it accepts.
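The sketch below shows both sides of this behavior. It combines init=False with a default value, which is an assumption added here so that the field still has a value when it is left out of the constructor:

```python
from dataclasses import dataclass, field


@dataclass
class Country:
    name: str
    population: int
    continent: str
    # Assumption for this sketch: a default keeps the excluded field populated
    official_lang: str = field(init=False, default="English")


try:
    Country("Monaco", 37623, "Europe", "French")  # one argument too many
except TypeError as error:
    print(f"TypeError: {error}")

monaco = Country("Monaco", 37623, "Europe")  # official_lang is left out
print(monaco.official_lang)  # English
```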

default parameter

The default parameter is passed to specify a default value for a field in case a value is not provided during initialization:
from dataclasses import dataclass, field

@dataclass
class Country:
    name: str
    population: int
    continent: str
    official_lang: str = field(default="English")  # If you omit the value, English will be used

smallestEurope = Country("Monaco", 37623, "Europe")  # Omitted, so English is used
print(smallestEurope)
This code then outputs in the CLI:
Country(name='Monaco', population=37623, continent='Europe', official_lang='English')
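One related case is worth a brief sketch: field() also accepts a default_factory parameter, which is the usual way to give a field a mutable default such as a list. The languages field below is a hypothetical addition for illustration and is not part of the article's Country class.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Country:
    name: str
    # default_factory is called for each new instance, so lists are not shared
    languages: List[str] = field(default_factory=list)  # hypothetical field


monaco = Country("Monaco")
gambia = Country("Gambia")
monaco.languages.append("French")

print(monaco.languages)  # ['French']
print(gambia.languages)  # []
```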

repr parameter

The repr parameter specifies whether the field should be included (repr=True) or excluded (repr=False) from the string representation generated by the __repr__ method:
from dataclasses import dataclass, field

@dataclass
class Country:
    name: str
    population: int
    continent: str
    official_lang: str = field(repr=False)  # This field will be excluded from the string representation

smallestEurope = Country("Monaco", 37623, "Europe", "French")
print(smallestEurope)
This code then outputs in the CLI:
Country(name='Monaco', population=37623, continent='Europe')

Modifying fields after initialization with __post_init__

The __post_init__ method is called just after initialization. In other words, it is called after the object receives values for its fields, such as name, continent, population, and official_lang.
For example, we will use the method to determine if we are going to migrate to a country or not, based on the country’s official language:
from dataclasses import dataclass, field

@dataclass
class Country:
    name: str
    population: int
    continent: str = field(repr=False)  # Excludes the continent field from string representation
    will_migrate: bool = field(init=False)  # Initialize without will_migrate attribute
    official_lang: str = field(default="English")  # Sets default language. Attributes with default values must appear last

    def __post_init__(self):
        # Use assignment (=), not comparison (==), to set the field
        if self.official_lang == "English":
            self.will_migrate = True
        else:
            self.will_migrate = False

After the object initializes with values, we perform a check from inside __post_init__ to see if the official_lang field is set to English. If so, we set the will_migrate property to True. Otherwise, we set it to False.
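As a usage sketch, the trimmed-down class below (fewer fields than the article's version, an assumption made for brevity) shows will_migrate being filled in automatically for two instances:

```python
from dataclasses import dataclass, field


@dataclass
class Country:
    name: str
    will_migrate: bool = field(init=False)  # computed, never passed in
    official_lang: str = "English"

    def __post_init__(self):
        # Runs right after the generated __init__ has set the other fields
        self.will_migrate = self.official_lang == "English"


gambia = Country("Gambia")
monaco = Country("Monaco", "French")

print(gambia.will_migrate)  # True
print(monaco.will_migrate)  # False
```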

Sort and compare dataclasses with sort_index

Another functionality of dataclasses is the ability to create a custom order for comparing objects and sorting lists of objects.
For example, we can compare two countries by their population numbers. In other words, we want to say that one country is greater than another country if, and only if, its population count is greater than the other country’s:
from dataclasses import dataclass, field

@dataclass(order=True)
class Country:
    sort_index: int = field(init=False)  # Declared first, so it is compared first
    name: str
    population: int = field(repr=True)
    continent: str
    official_lang: str = field(default="English")  # Sets default value for official language

    def __post_init__(self):
        self.sort_index = self.population

smallestEurope = Country("Monaco", 37623, "Europe")
smallestAsia = Country("Maldives", 552595, "Asia")
smallestAfrica = Country("Gambia", 2521126, "Africa")

print(smallestAsia < smallestAfrica)  # True
print(smallestAsia > smallestAfrica)  # False
To enable comparison and sorting in a Python dataclass, you must pass order=True to the @dataclass decorator. This enables the default comparison functionality, which compares fields in the order they are defined.
Since we want to compare by population count, we declare sort_index as the first field and assign the population value to it after initialization, from inside the __post_init__ method.
You can also sort a list of objects using a particular field as the sort_index. For example, let’s sort a list of countries by their population count:
from dataclasses import dataclass, field

@dataclass(order=True)
class Country:
    sort_index: int = field(init=False)
    name: str
    population: int = field(repr=True)
    continent: str
    official_lang: str = field(default="English")

    def __post_init__(self):
        self.sort_index = self.population

europe = Country("Monaco", 37623, "Europe", "French")
asia = Country("Maldives", 552595, "Asia", "Dhivehi")
africa = Country("Gambia", 2521126, "Africa", "English")
sAmerica = Country("Suriname", 539000, "South America", "Dutch")
nAmerica = Country("St Kitts and Nevis", 55345, "North America", "English")
oceania = Country("Nauru", 11000, "Oceania", "Nauruan")

mylist = [europe, asia, africa, sAmerica, nAmerica, oceania]
mylist.sort()
print(mylist)  # This will print the list of countries sorted by population count
This code then prints the countries in ascending order of population in the CLI: Nauru, Monaco, St Kitts and Nevis, Suriname, Maldives, and Gambia.
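Because the ordering methods are ordinary comparison methods, built-ins such as sorted() and min() can use them too. The sketch below uses a trimmed-down Country class (an assumption for brevity) and hides sort_index from the representation with repr=False:

```python
from dataclasses import dataclass, field


@dataclass(order=True)
class Country:
    sort_index: int = field(init=False, repr=False)  # compared first
    name: str
    population: int

    def __post_init__(self):
        self.sort_index = self.population


countries = [
    Country("Maldives", 552595),
    Country("Monaco", 37623),
    Country("Gambia", 2521126),
]

largest_first = sorted(countries, reverse=True)  # returns a new list
print([c.name for c in largest_first])  # ['Gambia', 'Maldives', 'Monaco']
print(min(countries).name)              # Monaco
```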
Don’t want the dataclass to be tampered with? You can freeze the class by simply passing a frozen=True value to the decorator:
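from dataclasses import dataclass, field

@dataclass(order=True, frozen=True)
class Country:
    sort_index: int = field(init=False)
    name: str
    population: int = field(repr=True)
    continent: str
    official_lang: str = field(default="English")

    def __post_init__(self):
        # A frozen instance rejects normal assignment, even inside __post_init__,
        # so the value is set through object.__setattr__ instead
        object.__setattr__(self, "sort_index", self.population)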
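To see what freezing does in practice, here is a short sketch with a trimmed-down Country class (an assumption for brevity). Assigning to a field of a frozen instance raises FrozenInstanceError, which the dataclasses module exports:

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Country:
    name: str
    population: int


monaco = Country("Monaco", 37623)

try:
    monaco.population = 40000  # assignment is blocked on a frozen instance
except FrozenInstanceError as error:
    print(f"FrozenInstanceError: {error}")

print(monaco.population)  # 37623, the value is unchanged
```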

Wrapping up

A Python dataclass is a very powerful feature that drastically reduces the amount of code in class definitions. The module provides most of the basic class methods already implemented. You can customize the fields in a dataclass and restrict certain actions.
