The normalize() function
Each base section defined using the NOMAD schema has a set of public functions that can be used at any point when reading and parsing files in NOMAD. The normalize(archive, logger) function is a special case that warrants an in-depth description.
The MetainfoNormalizer runs this function within the NOMAD infrastructure in the following order:
- A child section's normalize() function is run before its parent's normalize() function.
- For sibling sections, the normalize() functions are executed in ascending order of the normalizer_level attribute. If normalizer_level is not set, or if it is the same for two different sections, the order is given by the order in which the attributes are defined in the parent section.
- Using super().normalize(archive, logger) runs the inherited section's normalize() function.
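The first two rules can be sketched with a toy traversal in plain Python. This is only an illustration of the ordering, not the MetainfoNormalizer implementation: `ToySection` and `run_normalize` are hypothetical names, and a default level of 0 for sections without an explicit normalizer_level is an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ToySection:
    """Hypothetical stand-in for a section; not a NOMAD class."""

    name: str
    normalizer_level: int = 0  # assumed default when no level is set
    children: list[ToySection] = field(default_factory=list)  # definition order


def run_normalize(section: ToySection, calls: list[str]) -> list[str]:
    """Record the order in which normalize() would be called under the rules above."""
    # sorted() is stable, so siblings with equal levels keep their definition order.
    for child in sorted(section.children, key=lambda s: s.normalizer_level):
        run_normalize(child, calls)
    # The parent is recorded only after all of its children.
    calls.append(section.name)
    return calls


root = ToySection(
    "root",
    children=[
        ToySection("a", normalizer_level=1),
        ToySection("b", normalizer_level=0, children=[ToySection("b_child")]),
        ToySection("c", normalizer_level=1),
    ],
)
order = run_normalize(root, [])
print(order)  # ['b_child', 'b', 'a', 'c', 'root']
```

Here `b` goes first because it has the lowest level, its own child is handled before it, and `a` and `c` share a level so they keep their definition order.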
Let's see some examples. Imagine having the following Section and SubSection structure:
from nomad.datamodel.data import ArchiveSection
from nomad.metainfo import SubSection


class Section1(ArchiveSection):
    normalizer_level = 1

    def normalize(self, archive, logger):
        # some operations here
        pass


class Section2(ArchiveSection):
    normalizer_level = 0

    def normalize(self, archive, logger):
        super().normalize(archive, logger)
        # Some operations here or before `super().normalize(archive, logger)`


class ParentSection(ArchiveSection):
    sub_section_1 = SubSection(Section1.m_def, repeats=False)
    sub_section_2 = SubSection(Section2.m_def, repeats=True)

    def normalize(self, archive, logger):
        super().normalize(archive, logger)
        # Some operations here or before `super().normalize(archive, logger)`
Now, MetainfoNormalizer is run on the ParentSection. Applying rule 1, the normalize() functions of the ParentSection's children are executed first. Their order is established by rule 2 through the normalizer_level attribute: all the Section2 normalize() functions are run first (note that sub_section_2 is a list of sections), followed by Section1.normalize(). The order of execution is therefore:
Section2.normalize()
Section1.normalize()
ParentSection.normalize()
If we do not assign a value to Section1.normalizer_level and Section2.normalizer_level, Section1.normalize() runs before Section2.normalize(), because of the order of the SubSection attributes in ParentSection. In this case the order is:
Section1.normalize()
Section2.normalize()
ParentSection.normalize()
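The two sibling orderings above can be reproduced with Python's stable `sorted()`. This is a minimal sketch rather than NOMAD code: the `sibling_order` helper and the (name, level) tuples are hypothetical, and treating an unset level as a shared default of 0 is an assumption.

```python
def sibling_order(siblings):
    """Sort (name, level) pairs by level; ties keep their definition order."""
    return [name for name, _ in sorted(siblings, key=lambda pair: pair[1])]


# Definition order in ParentSection: sub_section_1 (Section1), then sub_section_2 (Section2).
with_levels = [("Section1", 1), ("Section2", 0)]
without_levels = [("Section1", 0), ("Section2", 0)]  # assumed shared default

print(sibling_order(with_levels))     # ['Section2', 'Section1']
print(sibling_order(without_levels))  # ['Section1', 'Section2']
```

Because the sort is stable, equal levels never reorder siblings, which is why the definition order in the parent decides the tie.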
By inspecting the normalize() functions and applying rule 3, we can establish whether ArchiveSection.normalize() is run. In Section1.normalize() it is not, while in the other sections, Section2 and ParentSection, it is.
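Rule 3 is ordinary Python inheritance, so it can be checked without NOMAD installed. In the sketch below, `BaseSection` is a hypothetical stand-in for ArchiveSection that only logs that it ran, and a plain list stands in for the logger; the real ArchiveSection.normalize() does more than this.

```python
class BaseSection:
    """Hypothetical stand-in for ArchiveSection; it only logs that it ran."""

    def normalize(self, archive, logger):
        logger.append(f"BaseSection.normalize via {type(self).__name__}")


class Section1(BaseSection):
    def normalize(self, archive, logger):
        # No super() call: the inherited normalize() is skipped.
        logger.append("Section1.normalize")


class Section2(BaseSection):
    def normalize(self, archive, logger):
        super().normalize(archive, logger)  # inherited logic runs first
        logger.append("Section2.normalize")


log = []  # a plain list stands in for the logger
Section1().normalize(None, log)
Section2().normalize(None, log)
print(log)
# ['Section1.normalize', 'BaseSection.normalize via Section2', 'Section2.normalize']
```

Placing the `super()` call first or last in an override decides whether the inherited logic runs before or after the section's own operations.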