Reading a sub-region of a particle catalogue (ReadRegion class)¶
In many cases, we only need to load properties for particles within a small, well-defined sub-volume of the simulation. Loading the entire catalogue entry (with SplitFile) is not very efficient in this case, both in terms of speed and memory usage. Instead, the ReadRegion class can be used to only load particles within the region of interest.
Note
This functionality requires that the simulation outputs are stored in a particular order, and that a particle map has been set up. This is the case for all snap- and snipshots from the HYDRO runs of the Hydrangea simulations, but not for the other C-EAGLE simulations, nor for any of the DM-only counterparts. The ReadRegion class can still be used in those cases, but it will default to reading the full catalogue entry.
General usage¶
The first step is to set up an instance of the class, specifying the particle type, selection volume, and possibly other parameters (see hydrangea.ReadRegion below for all options). Once this is done, any property of the selected particles can be accessed directly as an attribute of the class instance (e.g. gas.Coordinates for the particle coordinates, assuming the instance is named gas). By default, data is returned in “astronomically sensible” units; other systems (e.g. cgs) can be specified through the read_units parameter.
For convenience, methods and properties are also provided to read and sum particle properties directly, connect particles to structures, and obtain various time-stamps of the catalogue.
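The on-demand attribute access described above can be pictured with a small stand-alone sketch. It does not use hydrangea: `LazyReader`, `fake_catalogue`, and the loader callable are hypothetical names chosen for illustration, and the real class may implement this differently.

```python
class LazyReader:
    """Toy stand-in for a reader that loads entries on first access."""

    def __init__(self, loader):
        # `loader` is a hypothetical callable: dataset name -> data
        self._loader = loader
        self.n_reads = 0

    def __getattr__(self, name):
        # Python only calls this when `name` is not yet a regular attribute.
        if name.startswith('_'):
            raise AttributeError(name)
        try:
            value = self._loader(name)
        except KeyError:
            raise AttributeError(name) from None
        self.n_reads += 1
        setattr(self, name, value)  # cache: later accesses skip the loader
        return value


# Made-up catalogue content, standing in for data on disk
fake_catalogue = {'Mass': [1.0, 2.0, 3.0]}
gas = LazyReader(fake_catalogue.__getitem__)

first = gas.Mass    # triggers one read
second = gas.Mass   # served from the cached attribute
```

Because the value is stored as a regular attribute after the first read, repeated access costs no further I/O in this sketch.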
Examples¶
Below are a few examples of how to use the ReadRegion class; all of them assume that 'snapshot_file' has been set up to point to one of the files of the snapshot catalogue to read from. More complete use cases are provided in e.g. the star_density.py and sf_history.py scripts in the examples directory.
Read all gas particle star formation rates within 100 pkpc from (100.0, 100.0, 100.0) pMpc (but accept particles at larger radii):
gas = hydrangea.ReadRegion(snapshot_file, 0, [100.0, 100.0, 100.0], 0.1)
print(gas.StarFormationRate)
As above, but using the exact keyword to only include particles strictly within the selected sphere:
gas = hydrangea.ReadRegion(snapshot_file, 0, [100.0, 100.0, 100.0], 0.1, exact=True)
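Conceptually, an exact spherical selection amounts to a distance cut on the particle coordinates. The sketch below illustrates that idea in plain Python with made-up positions; `select_in_sphere` is a hypothetical helper, and whether the library treats particles exactly on the boundary as inside is an assumption here.

```python
import math


def select_in_sphere(coordinates, centre, radius):
    """Return indices of particles within `radius` of `centre`."""
    # Assumption: particles exactly on the boundary count as inside.
    return [i for i, pos in enumerate(coordinates)
            if math.dist(pos, centre) <= radius]


# Made-up positions in pMpc
coords = [
    (100.00, 100.00, 100.00),   # at the centre
    (100.05, 100.00, 100.00),   # 0.05 pMpc away: inside
    (100.09, 100.09, 100.00),   # ~0.127 pMpc away: outside
]
inside = select_in_sphere(coords, (100.0, 100.0, 100.0), 0.1)
```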
Specifying the centre and size of the selection region in data units (co-moving h^-1 Mpc) instead:
gas = hydrangea.ReadRegion(snapshot_file, 0, [1084.0, 1084.0, 1084.0], 0.1,
                           coordinate_units='data')
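For orientation, a length in co-moving h^-1 Mpc relates to proper Mpc through the expansion factor a and the dimensionless Hubble parameter h. The sketch below assumes the usual convention x_proper = x_data * a / h; the function name and the value of h are illustrative assumptions, not taken from the library.

```python
def comoving_h_to_proper(x_data, aexp, h):
    """Convert a length from co-moving h^-1 Mpc to proper Mpc."""
    return x_data * aexp / h


H_EXAMPLE = 0.6777   # illustrative value of h only (assumption)

# The anchor coordinate from the example above, evaluated at a = 1
x_pmpc = comoving_h_to_proper(1084.0, aexp=1.0, h=H_EXAMPLE)
```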
Load star particles (part_type == 4) within a box of size (1.0, 0.5, 3.0) pMpc, whose lower corner is at (100.0, 120.0, 110.0) pMpc:
stars = hydrangea.ReadRegion(snapshot_file, 4, [100.0, 120.0, 110.0], [1.0, 0.5, 3.0],
                             shape='box', anchor_style='bottom')
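The way shape, anchor_style, and size combine can be summarised by computing the corners of the box that encloses the region. The sketch below is a hypothetical helper, not part of hydrangea. It assumes that size is a half-side-length (or radius) for a centred region, and, following the box example above, the full side length when anchor_style is 'bottom'.

```python
def region_bounds(anchor, size, shape='sphere', anchor_style='centre'):
    """Return (lower, upper) corners of the box enclosing the region."""
    shape = shape.lower()
    anchor_style = anchor_style.lower()
    if shape in ('sphere', 'cube'):
        extent = [float(size)] * 3          # one value for all dimensions
    elif shape == 'box':
        extent = [float(s) for s in size]   # one value per dimension
    else:
        raise ValueError(f'unknown shape: {shape}')

    if anchor_style == 'bottom' and shape != 'sphere':
        # Assumption: `size` is the full side length in this case.
        lower = [float(a) for a in anchor]
        upper = [a + e for a, e in zip(anchor, extent)]
    elif anchor_style in ('centre', 'center', 'bottom'):
        # Centred region; anchor_style has no effect for a sphere.
        lower = [a - e for a, e in zip(anchor, extent)]
        upper = [a + e for a, e in zip(anchor, extent)]
    else:
        raise ValueError(f'unknown anchor_style: {anchor_style}')
    return lower, upper


box_lower, box_upper = region_bounds(
    [100.0, 120.0, 110.0], [1.0, 0.5, 3.0], shape='box', anchor_style='bottom')
```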
Reference¶
hydrangea.ReadRegion: Instantiate the class
Reading catalogue entries¶
hydrangea.ReadRegion.read_data(): Explicitly read a catalogue entry
hydrangea.ReadRegion.total_in_region(): Total or average property of particles
hydrangea.ReadRegion.get_unit_conversion(): Obtain unit conversion factors
hydrangea.ReadRegion.m_baryon: Initial baryon mass
hydrangea.ReadRegion.m_dm: DM particle mass
Linking particles to structures¶
hydrangea.ReadRegion.GroupIndex: Emulate group index catalogue entry
hydrangea.ReadRegion.SubhaloIndex: Emulate subhalo index catalogue entry
hydrangea.ReadRegion.in_subhalo(): Identify members of a particular subhalo
hydrangea.ReadRegion.subfind_file: Subfind file associated to the catalogue
Catalogue time-stamps¶
hydrangea.ReadRegion.aexp: Expansion factor of catalogue
hydrangea.ReadRegion.lookback_time: Lookback time in Gyr from z = 0 to the catalogue
hydrangea.ReadRegion.redshift: Redshift of the catalogue
hydrangea.ReadRegion.time: Age of the Universe at the catalogue
-
class hydrangea.ReadRegion(file_name, part_type, anchor, size, shape='sphere', anchor_style='centre', verbose=1, exact=False, units='astro', coordinate_units=None, read_units=None, map_file=None, periodic=False, load_full=False, join_threshold=100, bridge_threshold=100, bridge_gap=0.5, sim_type='Eagle')¶ Set up a region for efficient reading of data from snapshot files.
This class can be called with several parameters to allow easy setup of commonly encountered selection regions (sphere, cube, box). Internally, the particle map generated by MapMaker is then read and processed into a (typically small) number of segments to read in.
Once set up, any catalogue entry can be accessed as an attribute of the class instance; entries are loaded when first accessed. As an alternative, entries can also be read explicitly with the read_data() method.
Parameters:
- file_name (string) – The path of the file containing the data to read. If the data is spread over multiple files, it can point to any one of them.
- part_type (int) – The particle type code to read (0=gas, 1=DM, 4=stars, 5=BHs)
- anchor (ndarray (3)) – The coordinates of the ‘anchor point’ of the selection region (its centre or corner, depending on anchor_style). Its units are specified by coordinate_units (see below).
- size (float or ndarray) – The extent of the selection region. The format of this parameter depends on the region shape: a single float for 'sphere' or 'cube', or a sequence of three values for 'box'. For a sphere, this specifies its radius, for a cube its half-side-length, and for a box the half-side-lengths in the x, y, and z dimension, respectively (note the different interpretations for cube and box if anchor_style is set to 'bottom', as described below).
- shape (string, optional) – The shape of the region to read from. Valid options are 'sphere' (default), 'cube', or 'box' (all case-insensitive).
- anchor_style (string, optional) – Specifies the location of the anchor within the selection region. Default is 'centre' ('center' also accepted). The alternative, 'bottom', places the anchor on the bottom x, y, z corner of the box or cube instead. This parameter has no effect for a sphere.
- verbose (int, optional) – Frequency of log messages (default: 1 ==> few)
- exact (bool, optional) – Only load particles lying exactly within the specified region, at extra cost (internally reads particle positions). If False (default), typically also some particles slightly outside the selected region are loaded.
- units (str, optional) – Specifies the unit system for both the anchor and size input, and any data read from the selected region. Options are:
  - 'data': as on file
  - 'clean': as ‘data’, but with a and h factors removed
  - 'astro': ‘astronomically sensible’ units (e.g. pMpc, M_sun)
  - 'si': the standard SI unit system
  - 'cgs': alternative unit system favoured by astronomers
  Capitalization is ignored for all unit names.
- coordinate_units (str or None, optional) – Unit system in which to interpret the input coordinates. If None (default), the same as units is used. Capitalization is ignored.
- read_units (str or None, optional) – Unit system to which to convert read data (default: same as units). Capitalization is ignored.
- map_file (string, optional) – Location of the particle map file to use. If None (default), it is assumed to be 'ParticleMap.hdf5' in the same directory as file_name.
- periodic (bool, optional) – Assume that the simulation volume is periodic and completely tiled with particle map cells (default: False). This option is not applicable to any of the Hydrangea/C-EAGLE simulations.
- load_full (bool, optional) – Load entire particle catalogue irrespective of specified region (default: False). If exact is True, the data will still be cut to the exact shape of the selected region afterwards.
- join_threshold (int, optional) – Threshold number of segments above which directly adjoining segments are joined to speed up the reading (default: 100).
- bridge_threshold (int, optional) – Threshold number of segments remaining after directly neighbouring ones are joined to perform a second join round. In this, elements separated by a small gap are also joined (speeding up the reading at the expense of reading slightly more particles). Default: 100.
- bridge_gap (float, optional) – Maximum size of gaps to bridge in the second joining round. Two segments are joined if the ratio between their combined and bridged lengths is greater than bridge_gap (default: 0.5).
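The two joining rounds controlled by join_threshold, bridge_threshold, and bridge_gap can be pictured with the following sketch. It is one possible reading of the parameter descriptions, not the library's implementation. Segments are modelled as half-open index ranges (start, end), and "combined length" is taken as the summed length of the two segments, "bridged length" as the span from the start of the first to the end of the second. All function names are hypothetical.

```python
def join_adjoining(segments):
    """Round 1: merge segments that touch directly (end == next start)."""
    joined = []
    for start, end in sorted(segments):
        if joined and joined[-1][1] == start:
            joined[-1] = (joined[-1][0], end)
        else:
            joined.append((start, end))
    return joined


def bridge_gaps(segments, bridge_gap=0.5):
    """Round 2: also merge segments separated by a sufficiently small gap."""
    bridged = []
    for start, end in sorted(segments):
        if bridged:
            prev_start, prev_end = bridged[-1]
            combined = (prev_end - prev_start) + (end - start)
            span = end - prev_start          # length including the gap
            if combined / span > bridge_gap:
                bridged[-1] = (prev_start, end)
                continue
        bridged.append((start, end))
    return bridged


def reduce_segments(segments, join_threshold=100, bridge_threshold=100,
                    bridge_gap=0.5):
    """Apply each round only if the segment count exceeds its threshold."""
    if len(segments) > join_threshold:
        segments = join_adjoining(segments)
    if len(segments) > bridge_threshold:
        segments = bridge_gaps(segments, bridge_gap)
    return segments


segments = [(0, 10), (10, 20), (22, 30), (100, 105)]
round_one = join_adjoining(segments)
round_two = bridge_gaps(round_one)
```

In this toy input, the gap of 2 elements between (0, 20) and (22, 30) is bridged, at the cost of reading those 2 extra elements, while the distant segment (100, 105) stays separate.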
Note
Setting up very large regions (>~ 20 Mpc) is not very efficient. Therefore, when the code detects that > 1M cells would have to be checked and potentially loaded, the region setup is abandoned and the entire particle catalogue will be read in. Full reading is also enforced when the sub-selection contains more than 40% of the full catalogue.
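The fallback rule in this note can be written as a short decision function. This is a sketch of the stated thresholds only: the function name, argument names, and the way the quantities are counted are assumptions.

```python
MAX_CELLS = 1_000_000   # "> 1M cells" from the note above
MAX_FRACTION = 0.4      # "more than 40% of the full catalogue"


def should_read_full(n_cells_to_check, n_selected, n_total):
    """Return True if the whole catalogue should be read instead."""
    if n_cells_to_check > MAX_CELLS:
        return True
    return n_total > 0 and n_selected / n_total > MAX_FRACTION
```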
-
GroupIndex¶ Emulate a non-existing group index for all particles (attribute is computed on-the-fly when first accessed).
-
SubhaloIndex¶ Emulate a non-existing subhalo index for particles (attribute is computed on-the-fly when first accessed).
-
aexp¶ Expansion factor of the data set.
-
get_unit_conversion(dataset_name, units_name)¶ Get appropriate factor to convert data units to other system.
Parameters:
- dataset_name (str) – The name of the data set for which to obtain the conversion factor (including possible containing groups, but not the base group).
- units_name (str) – The unit system to calculate the conversion factor for. Options are (case-insensitive):
  - 'data' –> Exactly as stored in file (i.e. no conversion)
  - 'clean' –> As in file, but without a and h factors
  - 'astro' –> Astronomically useful units (e.g. M_Sun, pMpc)
  - 'si' –> SI units
  - 'cgs' –> CGS units
Returns: data_to_other (float) – The conversion factor for the specified unit system. The ‘raw’ data as read from the file(s) must be multiplied with this value to obtain the magnitude in the target system.
Note
In particular in SI and CGS units, overflow issues may occur for 32-bit floats, because these have a maximum value of ~3.4e38.
Examples
>>> import hydrangea as hy
>>> snap_file = hy.objects.Simulation(index=0).get_snap_file(29)
>>> stars = hy.SplitFile(snap_file, part_type=4)  # or hy.ReadRegion
>>> stars.get_unit_conversion('Mass', 'astro')
14755791648.22193
>>> stars.get_unit_conversion('CentreOfPotential', 'SI')
4.553162166150214e+22
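To make the overflow caveat concrete, the sketch below applies the 'Mass' conversion factor from the example above to a made-up raw value and checks whether the result would still fit in a 32-bit float. It uses only the standard library; `fits_in_float32`, the raw mass, and the approximate solar mass constant are illustrative assumptions.

```python
import struct

# Largest finite IEEE 754 single-precision value (bit pattern 0x7f7fffff)
FLOAT32_MAX = struct.unpack('<f', bytes.fromhex('ffff7f7f'))[0]


def fits_in_float32(value):
    """Return True if `value` lies within the finite float32 range."""
    return abs(value) <= FLOAT32_MAX


mass_factor_astro = 14755791648.22193   # 'Mass' -> 'astro', from the example
raw_mass = 1.2e-4                       # made-up value in data units
mass_msun = raw_mass * mass_factor_astro

MSUN_IN_G = 1.989e33                    # approximate solar mass in grams
mass_cgs = mass_msun * MSUN_IN_G        # Python floats are 64-bit: no overflow here
```

In this example the mass in solar masses fits comfortably, whereas the same mass in grams exceeds the float32 range, so such data would need a 64-bit array.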
-
in_subhalo(subhalo_index, subhalo_file=None)¶ Identify members of a subhalo within the reader.
-
lookback_time¶ Lookback time to the data set from z = 0 [Gyr].
-
m_baryon¶ Initial baryon mass (only for snap-/snipshots).
-
m_dm¶ DM particle mass (only for snap-/snipshots).
-
read_data(dataset_name, units=None, verbose=None, exact=None, file_name=None, pt_name=None, single_file=False, store=False, trial=False, data_type=None)¶ Read a specified dataset within a previously set up region.
Using this function provides an alternative to accessing data sets as attributes, with more flexibility on e.g. the units (for instance, data can be read in another unit system than what was specified as read_units during instantiation).
Parameters:
- dataset_name (string) – The dataset to read from, including groups where appropriate. The leading 'PartType[x]' must however not be included!
- units (str or None, optional) – Specifies the unit system for the output. Options are:
  - 'data': as on file
  - 'clean': as ‘data’, but with a and h factors removed
  - 'astro': ‘astronomically sensible’ units (e.g. pMpc, M_sun)
  - 'si': the standard SI unit system
  - 'cgs': alternative unit system favoured by astronomers
  If None (default), the class value read_units is used. Capitalization is ignored for all unit names.
- verbose (int, optional) – Frequency of log messages. If None (default), use class value.
- exact (bool, optional) – Only return data for particles lying in the exact specified selection region (default: class value).
- file_name (string, optional) – Specifies an alternative path to read data from. This is useful for reading data from ancillary catalogues. By default (None), the file name used to set up the class instance is used.
- pt_name (string, optional) – Specifies an alternative particle-type group name. By default (None), 'PartType[x]' is used.
- single_file (bool, optional) – Assume that data resides in an un-split file (default: False). This is used only for ancillary catalogues.
- store (str or None or False, optional) – Store the retrieved array as an attribute with this name. If None, the (full) name of the data set is used, with ‘/’ replaced by ‘__’. Default: False (do not store).
- trial (bool, optional) – Attempt to read the data set. If it does not yield the expected number of elements for any one file or total, return None. If False (default), enter debug mode in this case.
- data_type (str, optional) – Store read data in an array of this data type. If None (default), this is determined from the HDF5 data set.
Returns: data (array) – The data read for the particles in the selected region.
Note
The selection of which particles to load has already been done when the ReadRegion object was instantiated.
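The naming conventions described for dataset_name, pt_name, and store can be illustrated with two small helpers. These are hypothetical functions written for this sketch, not library calls. They assume that the particle-type group is named 'PartType' followed by the type code, and the data set name in the usage lines is only an example.

```python
def full_dataset_path(dataset_name, part_type, pt_name=None):
    """Prefix the data set name with its particle-type group."""
    group = pt_name if pt_name is not None else f'PartType{part_type}'
    return f'{group}/{dataset_name}'


def default_store_name(dataset_name):
    """Attribute name used when store=None: '/' replaced by '__'."""
    return dataset_name.replace('/', '__')


path = full_dataset_path('ElementAbundance/Hydrogen', part_type=0)
attribute = default_store_name('ElementAbundance/Hydrogen')
```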
-
redshift¶ Redshift of the data set.
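Redshift and expansion factor are tied by the standard relation a = 1 / (1 + z). A minimal sketch of the conversion in both directions (the function names are illustrative, not part of the library):

```python
def aexp_from_redshift(z):
    """Expansion factor a for a given redshift z."""
    return 1.0 / (1.0 + z)


def redshift_from_aexp(aexp):
    """Redshift z for a given expansion factor a."""
    return 1.0 / aexp - 1.0
```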
-
subfind_file¶ Subfind catalogue file associated to a snapshot.
This must be set explicitly by the user. It can point to the output’s own subfind file (if it exists), but does not need to: in the latter case, it allows matching particles to structures at another point in time.
-
time¶ Age of the Universe at the time of the data set [Gyr].
-
total_in_region(dataset_name, average=False, weight_quant=None, units='astro')¶ Compute the total or average of a quantity [convenience function].
Parameters:
- dataset_name (string) – The (full) name of the data set to process, including potential groups that contain it (but not the base group).
- average (bool, optional) – Compute the average over particles instead of the sum (default: False, i.e. compute the sum).
- weight_quant (string, optional) – The (full) name of a data set to use as weights. If None (default), no weighting is performed. Supplying a weight_quant implicitly also sets average=True. It is the user’s responsibility to ensure that the weights do not sum to zero.
- units (str, optional) – Unit system to convert output to, as for read_data().
Returns: sum (array) – The sum or average over all particles in the selection region.
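The three modes (sum, plain average, weighted average) can be spelled out in plain Python. This sketch operates on lists rather than on catalogue data, and `total_or_average` is a hypothetical stand-in that only mirrors the documented semantics, including the implicit averaging when weights are given.

```python
def total_or_average(values, average=False, weights=None):
    """Sum, mean, or weighted mean of `values`."""
    if weights is not None:
        # Supplying weights implies averaging.
        weight_sum = sum(weights)
        if weight_sum == 0:
            raise ZeroDivisionError('weights sum to zero')
        return sum(v * w for v, w in zip(values, weights)) / weight_sum
    total = sum(values)
    return total / len(values) if average else total


sfr = [1.0, 2.0, 3.0]      # made-up per-particle values
mass = [1.0, 1.0, 2.0]     # made-up weights

total = total_or_average(sfr)
mean = total_or_average(sfr, average=True)
weighted = total_or_average(sfr, weights=mass)
```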