Filetype-Specific Accessors

soundDENA includes Accessor instances for the following filetypes. (They’re available as soundDENA.<filetype>, e.g. soundDENA.nvspl(sites), soundDENA.srcid.all(sites), etc.) This page documents each of their parse() methods, that is, how they behave on data from a single site. That behavior is then applied to multiple sites using __call__(), or the per-site results are concatenated using all().

Important

Remember, you will be using these wrapped up in instances of soundDENA.Accessor. Though this page implies that soundDENA.nvspl returns a DataFrame, soundDENA.nvspl(sites) is actually an iterator, which yields such a DataFrame for every site.
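
To make the iterator-versus-DataFrame distinction concrete, here is a minimal sketch of the pattern described above. SketchAccessor, fake_parse, and the site names are hypothetical stand-ins, not soundDENA's actual classes or signatures; only the general shape (a per-site parse(), a lazy __call__(), a concatenating all()) is taken from this page.

```python
import pandas as pd


class SketchAccessor:
    """Hypothetical stand-in for an Accessor; not soundDENA's real class."""

    def __init__(self, parse):
        self.parse = parse  # function that handles data from a single site

    def __call__(self, sites):
        # Lazily yield one parsed DataFrame per site.
        for site in sites:
            yield self.parse(site)

    def all(self, sites):
        # Concatenate the per-site results, keyed by site name.
        return pd.concat(list(self(sites)), keys=list(sites))


def fake_parse(site):
    # Assumption: a real parse() would read files; this fabricates two rows.
    index = pd.date_range("2013-06-29", periods=2, freq="D", name="date")
    return pd.DataFrame({"dbA": [30.0, 32.5]}, index=index)


accessor = SketchAccessor(fake_parse)
sites = ["SITE1", "SITE2"]  # hypothetical site identifiers

per_site = accessor(sites)      # an iterator, not a DataFrame
frames = list(per_site)         # materialized: one DataFrame per site
combined = accessor.all(sites)  # one DataFrame with a two-level index
```

In this sketch, calling the accessor does no work until the iterator is consumed, which is why the result has to be looped over (or passed to list()) before any DataFrame methods can be used.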

soundDENA.nvspl

Read all the NVSPL files in a directory into a single pandas DataFrame, indexed by date.

To speed up the process during testing, set the interval: only every ith file will be read. You can also specify only the columns you need, reducing import time and memory footprint.

Keyword Arguments:
Parameters:	filepaths (list of str or pathlib.Path) – List of paths to NVSPL files, all of which will be read into one DataFrame. Assumed to be sorted.
	interval (int, default 1) – Only every ith file in the directory will be imported.
	onlyColumns (sequence of column names (str) or indices (int), default None) – Only these columns will be imported. Use None to read all columns. If names are strings, they must be the original names in the NVSPL file, like “12p5h”. If “STime” or 1 is not included, it will be added automatically, as these are necessary for indexing by date.
	quiet (boolean, default True) – If True, do not print progress while reading files.
Returns:	DataFrame – Indexed by date, with frequency column names as decimals instead of “12p5h”
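
As a rough illustration of the interval and onlyColumns behavior described above, the sketch below uses two hypothetical helpers. select_files and ensure_time_column are not part of soundDENA, the file names are invented, and only column names (not integer indices) are handled.

```python
def select_files(filepaths, interval=1):
    """Keep every `interval`-th path from a sorted list (hypothetical helper)."""
    return list(filepaths)[::interval]


def ensure_time_column(only_columns):
    """Add "STime" to a list of column names if it is missing (hypothetical helper)."""
    if only_columns is None:
        return None  # None means "read all columns"
    columns = list(only_columns)
    if "STime" not in columns:
        columns.insert(0, "STime")  # needed to index the result by date
    return columns


# Invented file names, purely for illustration.
paths = ["NVSPL_{:02d}.txt".format(i) for i in range(10)]
subset = select_files(paths, interval=4)   # files 0, 4 and 8
wanted = ensure_time_column(["dbA"])       # "STime" is prepended
```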

soundDENA.srcid

Read a SPLAT SRCID file into a pandas DataFrame.

The nvsplDate, hr, and secs columns are combined into a single DatetimeIndex for the DataFrame and dropped. The len column (length of the noise event) is converted to a pandas Timedelta.

Returns:	DataFrame
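
The column handling described above can be sketched with plain pandas. This is not the reader's actual code: the sample rows, the date format, the srcID column, the index name, and the assumption that secs and len are both in seconds are all hypothetical.

```python
import pandas as pd

# Hypothetical rows; the date format and the units are assumptions.
raw = pd.DataFrame({
    "nvsplDate": ["2013-06-29", "2013-06-29"],
    "hr": [7, 13],
    "secs": [90, 3599],
    "len": [45, 120],
    "srcID": [1.1, 1.2],
})

# Combine the three time columns into one timestamp per row.
timestamps = (
    pd.to_datetime(raw["nvsplDate"])
    + pd.to_timedelta(raw["hr"], unit="h")
    + pd.to_timedelta(raw["secs"], unit="s")
)

# Drop the source columns and use the timestamps as the index.
events = raw.drop(columns=["nvsplDate", "hr", "secs"])
events.index = pd.DatetimeIndex(timestamps, name="date")

# Convert the event length to a Timedelta.
events["len"] = pd.to_timedelta(events["len"], unit="s")
```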

soundDENA.loudevents

Read a LOUDEVENTS file into a pandas Panel.

The items axis (axis 0) is [“above”, “all”, “percent”]. So you’d use events["above"] to get a sub-DataFrame of events that exceeded $L_{nat}$, where rows are indexed by date, and columns from 0 to 23 hours.
The major_axis (axis 1) is date: pandas DateTime objects
The minor_axis (axis 2) is hour: 0 to 23

Returns:	Panel
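
To picture the three axes, the sketch below mimics the layout with a dict of DataFrames instead of a Panel (Panel was removed from pandas in version 0.25, so the sketch avoids it). The counts are invented, and computing "percent" as above divided by all is an assumption about what that item holds.

```python
import pandas as pd

dates = pd.date_range("2013-06-29", periods=3, freq="D", name="date")
hours = pd.RangeIndex(24, name="hour")

# Invented counts; a real file would supply these values.
events = {
    "above": pd.DataFrame(1, index=dates, columns=hours),
    "all": pd.DataFrame(4, index=dates, columns=hours),
}
# Assumption: "percent" is the share of events that were above the threshold.
events["percent"] = 100 * events["above"] / events["all"]

above = events["above"]  # rows: date, columns: hour 0 to 23
noon = events["percent"].loc[pd.Timestamp("2013-06-30"), 12]
```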

soundDENA.dailypa

Read a DAILYPA (percent time audible) file into a MultiIndexed pandas DataFrame.

The result will be indexed on two levels: date and srcid. This allows for interesting sub-indexing, such as:

>> data.loc["2013-06-29", :]
-> just srcid rows for 6/29/13, and all columns

>> data.loc[(slice(None), "Total_All"), :]
-> all dates, but just "Total_All" srcid rows, and all columns

>> data.loc[(slice(None), "Total_All"), "00-23h"]
-> all dates, just "Total_All" srcid rows, and only the 00-23h column.
   (Basically, a Series of total percent time audible per day.)

>> data.loc[(slice(None), slice("1.1", "1.3")), "00h":"23h"]
-> all dates, but just srcid rows between 1.1 and 1.3, and only columns from 00h to 23h.

For more, read the pandas docs for hierarchical indexing.

Returns:	MultiIndexed DataFrame
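
The selections above can be tried on a small stand-in frame. The values here are all zeros and the srcid labels are assumed to be strings; a real DAILYPA file may differ in both respects.

```python
import pandas as pd

dates = pd.to_datetime(["2013-06-29", "2013-06-30"])
srcids = ["1.1", "1.2", "1.3", "Total_All"]  # assumed to be strings
columns = ["{:02d}h".format(h) for h in range(24)] + ["00-23h"]

index = pd.MultiIndex.from_product([dates, srcids], names=["date", "srcid"])
# Sorting the index keeps label-based slicing on the srcid level well defined.
data = pd.DataFrame(0.0, index=index, columns=columns).sort_index()

# Just the srcid rows for one date, and all columns.
one_day = data.loc[pd.Timestamp("2013-06-29"), :]

# All dates, but just the "Total_All" rows.
totals = data.loc[(slice(None), "Total_All"), :]

# A Series of the 00-23h column for the "Total_All" rows.
daily_total = data.loc[(slice(None), "Total_All"), "00-23h"]

# srcid rows between 1.1 and 1.3, hourly columns only.
subset = data.loc[(slice(None), slice("1.1", "1.3")), "00h":"23h"]
```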

soundDENA.metrics

Read all tables from a metrics file.

Returns an object (a named tuple, really) with attributes for each metric in the file, as well as metadata, which is a dict of the colon-separated key-value pairs in the file’s header. Missing metrics are stored as None. (SPLAT-related metrics such as noiseFreeInterval are often missing.)

Each metric that is present has two attributes of its own: data and n. data contains a pandas Panel of that table’s data. n contains a DataFrame or Series of Timedeltas giving the length of the dataset, by season and table type.

In other words, the returned object is structured:

metrics
    metadata: {"Day": "07:00:00 to 18:59:59", "Source of Interest": "Aircraft", ...}
    hourlyMedian
        data: Panel
        n: DataFrame
    frequency
        data: Panel
        n: DataFrame
    ambient
        data: Panel
        n: DataFrame
    ...
    ...
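
Because missing metrics are stored as None, code that consumes this object usually needs to check before touching data or n. The sketch below mimics the outline with collections.namedtuple; the Table and Metrics types, the available helper, and the placeholder values are hypothetical and only cover three of the fields.

```python
from collections import namedtuple

# Hypothetical types mirroring the outline above; values are placeholders.
Table = namedtuple("Table", ["data", "n"])
Metrics = namedtuple("Metrics", ["metadata", "hourlyMedian", "noiseFreeInterval"])

metrics = Metrics(
    metadata={"Day": "07:00:00 to 18:59:59", "Source of Interest": "Aircraft"},
    hourlyMedian=Table(data="<table data>", n="<dataset lengths>"),
    noiseFreeInterval=None,  # missing metrics are stored as None
)


def available(metrics):
    """Return the names of the metrics that are present (not None)."""
    return [
        name
        for name in metrics._fields
        if name != "metadata" and getattr(metrics, name) is not None
    ]


present = available(metrics)
```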

A primary purpose of this reader is to combine multiple tables of related data in the metrics file into single structures. (For example, Median Hourly Metrics could have four tables, for dBA and dBT in both Summer and Winter. These are combined into a single Panel, making it easy to perform complex selections across the tables—e.g. dBA in both seasons.)

Here’s how these data Panels are indexed:

For metrics composed of multiple tables:

Labels axis: Season (“Winter”, “Summer”, ...)

Items axis: Table type (“dBA” and “dBT”; “night” and “day”; “l90”, “lnat”, and “l50”; ...)

Major axis: Table columns (“12.5Hz” to “20000Hz”; 0 to 23; “Lmin”, “L099”, “Lnat”, ...)

Minor axis: Table rows (“L090”, “Lnat”, “L050”; “Day”, “Night”, “overall”; 0 to 23; 1.1, 1.2, 1.3, ...)

So these are accessed data.loc[ <season>, <tableType>, <columns>, <rows> ]

For metrics composed of just one table:

Items axis: Season (“Winter”, “Summer”, ...)

Major axis: Table columns (“12.5Hz” to “20000Hz”; 0 to 23; “Lmin”, “L099”, “Lnat”, ...)

Minor axis: Table rows (“L090”, “Lnat”, “L050”; “Day”, “Night”, “overall”; 0 to 23; 1.1, 1.2, 1.3, ...)

So these are accessed data.loc[ <season>, <columns>, <rows> ]

And the n DataFrames or Series are indexed:

For metrics composed of multiple tables (DataFrame):

Columns: Season (“Winter”, “Summer”, ...)

Rows: Table type (“dBA” and “dBT”; “night” and “day”; “l90”, “lnat”, and “l50”; ...)

So these are accessed n.loc[ <season>, <tableType> ]

For metrics composed of just one table (Series):

Rows: Season (“Winter”, “Summer”, ...)

So these are accessed n[ <season> ]

Examples (where the object returned from this function is stored as metrics):

>> metrics.noiseFreeInterval.data
-> a Panel of the SPLAT Noise Free Interval (sec) table for each season, indexed by [season, percentile, hour]

>> metrics.noiseFreeInterval.n
-> a Series of the number of days used to compute the noise free interval metric, with one row per season

>> metrics.hourlyMedian.data
-> a Panel4D of the Median Hourly Metrics tables for dBA and dBT for each season, indexed by [season, spl weighting, percentile, hour]

>> metrics.hourlyMedian.data.Summer.dBA
>> metrics.hourlyMedian.data.loc["Summer", "dBA"]
-> a DataFrame subselecting the dBA item from the Summer label in the Median Hourly Metrics panel
-> (essentially, just the original ``Median Hourly Metrics (dBA), Summer`` table in the metrics file)

>> metrics.hourlyMedian.data.loc["Winter", :, "Leq", 0:12]
-> a DataFrame of median Leq values from the hours 0-12 (rows) for both dBA and dBT (columns) in Winter

>> metrics.hourlyMedian.data.loc[:, :, "Leq", 0:12].mean(axis="items")
-> a DataFrame of the mean across all seasons of median Leq values from the hours 0-12 (columns) for both dBA and dBT (rows)

Special Cases:

Hour axes have the 'h' removed, so they are just integers 0-23.

The frequency, ambient, and percentTimeAbove metrics have the additional row "overall" added along with "Day" and "Night". This is the logarithmic mean SPL for both day and night.
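
For readers unfamiliar with averaging decibel values, here is a minimal sketch of a logarithmic mean. It assumes the term means the usual energy average (convert to linear energy, average, convert back) with day and night weighted equally; this page does not state the exact formula the reader uses, and log_mean is a hypothetical helper.

```python
import math


def log_mean(levels_db):
    """Energy-average a sequence of sound levels given in decibels."""
    energies = [10 ** (level / 10) for level in levels_db]
    return 10 * math.log10(sum(energies) / len(energies))


# Hypothetical Day and Night levels in dB.
overall = log_mean([50.0, 60.0])
```

Under these assumptions the result sits closer to the louder value than the arithmetic mean would, because the average is taken on the energy scale rather than the decibel scale.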

Raises:	`TypeError` – If the version of the file does not match the reader
	`ValueError` – If the header cannot be parsed
	`OSError` – Unhandled; raised if the file is missing